Overview
comstruct can receive emails containing invoices and automatically extract, classify, and process their attachments. This enables a hands-off workflow: forward supplier invoices to a dedicated address and comstruct takes care of the rest. The email processing endpoint powers two main scenarios:
Standard email forwarding
Forward invoices to comstruct. Attachments are extracted, classified by AI, and turned into invoices.
Staple scan processing
Send scanned multi-invoice PDFs. comstruct segments them into individual invoices using AI page-boundary detection.
How it works
When an email arrives at the processing endpoint, comstruct runs the following pipeline:
Step-by-step
- Email parsing — The raw email (MIME/RFC 822) is parsed. Inline images (signatures, logos) are filtered out. Attachments are split into supported and unsupported categories.
- Format conversion — TIFF attachments are automatically converted to PDF. Failed conversions are preserved as additional documents.
- Routing — Based on the number and type of supported attachments, comstruct decides the most efficient processing path (see Processing modes).
- Invoice creation — Each identified invoice enters the standard comstruct processing pipeline: AI-assisted data extraction, supplier matching, and approval workflows.
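The email parsing step can be sketched with Python's standard `email` package. This is a minimal illustration, not comstruct's implementation: the function name `split_attachments` is hypothetical, the supported MIME types are copied from the formats table on this page, and treating "inline disposition plus image type" as the test for signatures and logos is an assumption.

```python
from email import message_from_bytes, policy

# MIME types treated as invoice candidates, copied from the formats table on this page.
SUPPORTED_TYPES = {"application/pdf", "application/xml", "text/xml", "image/tiff", "image/tif"}


def split_attachments(raw_email: bytes):
    """Parse a raw RFC 822 message and split its attachments into two lists.

    Hypothetical helper: returns (supported, unsupported), each a list of
    (filename, content_type) tuples.
    """
    msg = message_from_bytes(raw_email, policy=policy.default)
    supported, unsupported = [], []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container part, holds no payload of its own
        filename = part.get_filename()
        if filename is None:
            continue  # plain-text or HTML body, not an attachment
        is_inline = part.get_content_disposition() == "inline"
        if is_inline and part.get_content_maintype() == "image":
            continue  # assumed rule for signatures and logos
        content_type = part.get_content_type()
        target = supported if content_type in SUPPORTED_TYPES else unsupported
        target.append((filename, content_type))
    return supported, unsupported
```

Calling `split_attachments(raw_bytes)` on a message with one PDF, one inline logo, and one ZIP file would place the PDF in the first list, the ZIP in the second, and drop the logo.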
Supported document formats
Invoice documents (processed as invoices)
| Format | MIME types | Notes |
|---|---|---|
| PDF | application/pdf | Primary format; best AI extraction results |
| XML | application/xml, text/xml | Processed as XRechnung when valid electronic invoice XML is detected |
| TIFF | image/tiff, image/tif | Automatically converted to PDF before processing |
Additional documents (preserved alongside invoices)
Non-invoice attachments are uploaded and linked to the resulting invoice as reference documents. This includes:
| Format | Examples |
|---|---|
| Office documents | .doc, .docx, .xls, .xlsx |
| Images | .jpg, .png, .bmp, .webp, .gif |
| Archives | .zip |
| Forwarded emails | .eml, message/rfc822 |
Processing modes
Standard email processing
Standard processing determines the best path based on the number and type of supported attachments:
Single PDF attachment
The most common case. The PDF is sent directly to the invoice processing pipeline; no classification is needed.
Result: One invoice created immediately.
Multiple or mixed attachments
When an email contains more than one supported document, or a mix of PDFs and other types, all attachments are sent to the AI classification queue.
The classifier groups documents into:
- Invoice documents — each group becomes a separate invoice
- Supporting documents — linked to the relevant invoice(s)
XML-only attachments (XRechnung)
When the email contains only XML attachments with valid electronic invoice content (XRechnung / EN 16931), comstruct:
- Validates the XML as a recognized electronic invoice format
- Generates a human-readable PDF from the structured data
- Creates an invoice with both the PDF and original XML attached
No supported attachments
When an email has no processable attachments (e.g., only images or Office files), comstruct still creates a placeholder invoice so the email is not lost:
- A placeholder PDF is generated
- All uploadable attachments are linked as additional documents
- The email body (sender, subject, date, text) is rendered as a separate PDF and attached
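The four standard paths above can be summarized as a small routing function. This is a sketch under stated assumptions, not the actual routing code: the `Attachment` type, the `kind` labels, and the returned route names are all hypothetical. It also assumes that TIFF files have already been converted to PDF, that XML files have already been validated as electronic invoices, and that unsupported files next to a single PDF do not trigger classification.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    filename: str
    kind: str  # hypothetical labels: "pdf", "xml", or "other"


def choose_route(attachments: list[Attachment]) -> str:
    """Pick a processing path from the supported attachments only."""
    supported = [a for a in attachments if a.kind in ("pdf", "xml")]
    if not supported:
        return "placeholder"  # nothing processable: create a placeholder invoice
    if all(a.kind == "xml" for a in supported):
        return "xrechnung"  # XML only (assumed valid): render a PDF from the XML
    if len(supported) == 1:
        return "direct"  # single PDF: skip classification
    return "classify"  # several or mixed documents: AI classification queue
```

For example, `choose_route([Attachment("a.pdf", "pdf"), Attachment("a.xml", "xml")])` returns `"classify"`, because the email mixes two supported types.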
Staple scan processing
Staple scan mode is designed for scanning workflows where multiple paper invoices are fed through a scanner in one batch, producing a single multi-page PDF. When staple scan mode is active:
- Only PDF attachments are processed (non-PDF supported docs are ignored)
- Each PDF is analyzed by AI to detect page boundaries between individual invoices
- The PDF is split into segments — one per detected invoice
- Each segment is sent directly to invoice processing (classification is skipped, since all pages are known to be invoices)
Enable staple scan mode by setting the x-staple-scan: true header on the request.
Fallback: If segmentation fails or the PDF has only one page, the entire PDF is treated as a single invoice.
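Once page boundaries are known, the split itself is simple bookkeeping. The sketch below assumes the AI step returns the 0-based page indices where a new invoice starts; that output shape and the function name `segment_pages` are assumptions for illustration. An empty result stands in for a failed segmentation.

```python
def segment_pages(page_count: int, invoice_starts: list[int]) -> list[range]:
    """Turn detected invoice start pages (0-based) into one page range per invoice."""
    # Keep only boundaries that fall strictly inside the document.
    starts = sorted({s for s in invoice_starts if 0 < s < page_count})
    if page_count <= 1 or not starts:
        # Fallback: treat the entire PDF as a single invoice.
        return [range(0, page_count)]
    bounds = [0, *starts, page_count]
    return [range(a, b) for a, b in zip(bounds, bounds[1:])]
```

With five pages and detected starts at pages 0, 2, and 3, the function yields three segments covering pages 0-1, page 2, and pages 3-4. Page order inside each segment is preserved.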
Email metadata preservation
comstruct extracts metadata from the original email and uses it throughout processing:
| Metadata field | Usage |
|---|---|
| From | Sender identification; helps with supplier matching |
| Subject | Stored for reference and searchability |
| Date | Original email timestamp |
| Body text | When no invoice attachments are found, the email body is rendered as a PDF and attached to the placeholder invoice |
AI-powered classification
When an email contains multiple attachments, comstruct uses AI (Gemini) to intelligently group and classify them:
- Invoice vs. supporting document — The classifier determines which documents are actual invoices and which are supplementary (e.g., delivery notes, cover letters, specs)
- Grouping — Multiple pages or files that belong to the same invoice are grouped together
- XML–PDF matching — When both XML (XRechnung) and PDF versions of an invoice are present, they are matched by normalized invoice number and associated as a single invoice
- Duplicate detection — If a document has already been processed (matched by document ID), it is skipped to prevent duplicate invoices
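The XML-PDF matching step can be illustrated as follows. The page does not say how invoice numbers are normalized, so the rule used here (uppercase, then drop everything except letters and digits) is purely an assumption, and both function names are hypothetical.

```python
import re


def normalize_invoice_number(value: str) -> str:
    """Assumed normalization: uppercase and keep only A-Z and 0-9."""
    return re.sub(r"[^A-Z0-9]", "", value.upper())


def match_xml_to_pdf(xml_numbers: dict[str, str], pdf_numbers: dict[str, str]) -> dict[str, str]:
    """Map each XML filename to the PDF filename with the same normalized number.

    Both arguments map a filename to the invoice number extracted from that file.
    XML files without a matching PDF are left out of the result.
    """
    pdf_by_number = {normalize_invoice_number(n): name for name, n in pdf_numbers.items()}
    matches = {}
    for xml_name, number in xml_numbers.items():
        key = normalize_invoice_number(number)
        if key in pdf_by_number:
            matches[xml_name] = pdf_by_number[key]
    return matches
```

Under this rule, "RE-2024/001" in an XML file and "re 2024 001" in a PDF both normalize to "RE2024001" and would be linked as one invoice.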
Fallback behavior
If AI classification fails for any reason, comstruct falls back to a safe default: each PDF attachment is processed as a separate invoice, with XML and unsupported documents attached to all resulting invoices. This ensures no invoice is lost.
Queue and retry behavior
Email processing uses a job queue to handle classification asynchronously when needed:
| Setting | Value |
|---|---|
| Max attempts | 5 |
| Backoff strategy | Exponential, starting at 10 seconds |
| Concurrency | Configurable (default: 1 worker) |
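To get a feel for how long a failing job stays in the queue, the settings above can be turned into a retry schedule. The table gives only the starting delay, so the doubling factor used here is an assumption, as is the idea that the first attempt runs without delay.

```python
MAX_ATTEMPTS = 5          # from the table above
BASE_DELAY_SECONDS = 10   # from the table above


def retry_delays(max_attempts: int = MAX_ATTEMPTS, base: int = BASE_DELAY_SECONDS) -> list[int]:
    """Delay in seconds before each retry, assuming the delay doubles every time."""
    # Five attempts means one initial run plus four retries.
    return [base * 2 ** i for i in range(max_attempts - 1)]
```

Under these assumptions the retries wait 10, 20, 40, and 80 seconds, so a job that keeps failing is given up after roughly 150 seconds of waiting plus processing time.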
Integration with SendGrid
The email processing endpoint is designed to receive SendGrid Inbound Parse webhook payloads. SendGrid forwards incoming emails as multipart form-data, and comstruct extracts the raw email content from the email field.
Setup
- Configure a SendGrid Inbound Parse webhook pointing to your comstruct instance
- Set the MX records for your forwarding domain to SendGrid
- Include authentication headers (x-api-key) in the webhook configuration
- Optionally set x-staple-scan: true for scan-dedicated addresses
Request format
The endpoint accepts a raw body (up to 32 MB) containing multipart form-data. The email field must contain the complete raw email in RFC 822 / MIME format.
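For local testing without SendGrid, a request of this shape can be assembled with the Python standard library. This is a sketch only: the helper name `build_inbound_request` is hypothetical, the endpoint URL must come from your own comstruct setup, and reading "32 MB" as 32 × 1024 × 1024 bytes is an assumption. A real SendGrid payload carries additional form fields that are omitted here.

```python
import urllib.request
import uuid

MAX_BODY_BYTES = 32 * 1024 * 1024  # assumed interpretation of the 32 MB limit


def build_inbound_request(
    url: str, api_key: str, raw_email: bytes, staple_scan: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST carrying the raw email."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="email"',
        b"",
        raw_email,  # complete RFC 822 / MIME message
        f"--{boundary}--".encode(),
        b"",
    ])
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("request body exceeds the 32 MB limit")
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "x-api-key": api_key,
    }
    if staple_scan:
        headers["x-staple-scan"] = "true"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately makes it easy to inspect the headers and body first.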
Best practices
Email forwarding setup
- Use a dedicated email address per forwarding purpose (e.g., one for regular invoices, another for staple scans)
- Configure email rules to forward invoices automatically — avoid manual forwarding where possible
- Ensure the forwarding preserves original attachments (avoid inline-only forwarding)
Document quality
- PDF yields the best AI extraction results — prefer it over scanned images
- Use 300 DPI or higher for scanned documents
- Ensure documents are not password-protected
- Avoid extremely large attachments — the endpoint accepts up to 32 MB total
Staple scans
- Feed invoices in order — comstruct detects boundaries but preserves page sequence
- Use clear page separations between invoices
- Single-page invoices work best; multi-page invoices within a staple scan are also supported
XRechnung / electronic invoices
- Send XML files as standard attachments (not inline)
- When sending both PDF and XML versions, use matching invoice numbers so comstruct can link them automatically
- Supported formats: XRechnung (EN 16931 compliant)
Related endpoints
Single invoice upload
Upload a single PDF invoice directly via API.
Email invoice
Upload a single raw PDF with email-style headers (project, tenant).
Invoice list
Query and filter processed invoices.
Invoice callback
Receive status updates from ERP systems.