Email Data Source
The Email connector allows you to extract structured data from email attachments. It monitors an email inbox and automatically processes CSV files attached to incoming emails, converting them into a data stream that can be loaded into your destination.
Overview
This connector is ideal for scenarios where you receive regular data reports via email with CSV attachments, such as:
- Automated reports from partners or vendors
- Daily/weekly data exports from systems that only support email delivery
- Marketing performance reports sent via email
- Financial statements or transaction reports delivered as email attachments
Prerequisites
- A data provider or system that can send reports via email with CSV attachments (directly attached or inside a ZIP file)
Supported File Types
The Email connector processes the following attachment types:
| File Type | Description |
|---|---|
| .csv | CSV files are parsed directly |
| .zip | ZIP archives containing CSV files are automatically extracted and processed |
How It Works
- Scheduled Runs: On each connection run (based on your configured schedule), the connector checks the inbox for new emails
- Attachment Extraction: The connector identifies and extracts supported attachments from any emails received since the last run
- Data Parsing: CSV files are parsed, with each column becoming a field in the data stream
- Schema Detection: The schema is automatically inferred from the CSV header row
- Incremental Processing: Only new emails (based on timestamp) are processed in subsequent runs
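The steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not the connector's actual implementation: the function name `extract_csv_rows` is hypothetical, the email's `Date` header stands in for the received timestamp, and attachments are assumed to be UTF-8 encoded.

```python
import csv
import io
from datetime import datetime
from email import message_from_bytes, policy
from email.utils import parsedate_to_datetime


def extract_csv_rows(raw_email: bytes, last_run: datetime) -> list[dict[str, str]]:
    """Return rows from the CSV attachments of one email, or [] if it is not new."""
    msg = message_from_bytes(raw_email, policy=policy.default)

    # Incremental processing: skip emails at or before the last run's cursor.
    # Assumption: the Date header is timezone-aware, like last_run.
    received = parsedate_to_datetime(str(msg["Date"]))
    if received <= last_run:
        return []

    rows: list[dict[str, str]] = []
    for part in msg.iter_attachments():
        filename = part.get_filename() or ""
        if not filename.lower().endswith(".csv"):
            continue  # ZIP handling is left out of this sketch
        # decode=True undoes the base64 transfer encoding and returns bytes.
        text = part.get_payload(decode=True).decode("utf-8")
        # The header row supplies the field names; every value stays a string.
        rows.extend(csv.DictReader(io.StringIO(text, newline="")))
    return rows
```

Calling the function with the raw bytes of a message and the timestamp of the previous run returns one dictionary per CSV row, keyed by the header names.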
Source Setup Guide
- In the Extract platform, navigate to Sources → Add Source → Email
- Provide a Table Name for the resulting data stream
- Click Save. A dedicated email address is automatically provisioned for your source
- Copy the provisioned email address and configure your data provider to send reports to this address
Connection Setup Guide
Once you've connected the Email source to a destination, configure:
- Connection Pull Schedule: How frequently Extract checks for new emails
- Destination-specific settings: Dataset name, target schema, etc. (depending on your destination)
- Schema Migration Policy: How Extract handles schema changes from new CSV files
Data Stream Fields
In addition to the columns from your CSV files, the connector automatically adds the following metadata fields:
| Field | Type | Description |
|---|---|---|
| _email_sender | String | The email address of the sender |
| _email_attachment_filename | String | The filename of the processed attachment |
| _email_timestamp | DateTime | The timestamp when the email was received |
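As a rough picture of what a record looks like once these fields are attached, the following sketch merges them into a parsed CSV row. The helper name `add_email_metadata` is hypothetical and the connector's internal representation may differ.

```python
from datetime import datetime, timezone


def add_email_metadata(row: dict, sender: str, filename: str,
                       received: datetime) -> dict:
    """Return a copy of a CSV row with the three metadata fields added."""
    return {
        **row,
        "_email_sender": sender,
        "_email_attachment_filename": filename,
        "_email_timestamp": received,
    }


record = add_email_metadata(
    {"id": "1", "amount": "9.50"},
    sender="reports@example.com",          # illustrative sender address
    filename="daily_report.csv",           # illustrative attachment name
    received=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```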
Email Data Source FAQ
Q: How is the data schema determined?
A: The connector uses the most recent email's CSV attachment to determine the schema. The first row of the CSV file is treated as column headers, which become field names in your destination.
Q: What happens if CSV schemas change between emails?
A: The schema is expected to remain consistent. If a new email contains a CSV with different columns, the connection may fail, depending on your Schema Migration Policy settings.
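If you control the system that produces the reports, you can check for column drift before sending. The sketch below compares a CSV header row against the columns you expect; `diff_columns` is a hypothetical helper and is not part of Extract.

```python
import csv
import io


def diff_columns(expected: list[str], csv_text: str) -> dict[str, list[str]]:
    """Report columns added to or removed from a CSV relative to `expected`."""
    header = next(csv.reader(io.StringIO(csv_text, newline="")), [])
    return {
        "added": [c for c in header if c not in expected],
        "removed": [c for c in expected if c not in header],
    }


drift = diff_columns(["id", "amount"], "id,total,currency\n1,9.50,USD\n")
```

An empty `added` and `removed` list means the header matches the expected columns.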
Q: Are all data types treated as strings?
A: Yes, all values from CSV files are imported as strings. You can apply transformations in your destination to cast values to appropriate data types.
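Casting normally happens in the destination, for example in a SQL transformation. If you post-process the loaded rows in Python instead, the same idea might look like the sketch below. The column names and the `CASTS` mapping are illustrative assumptions, and empty strings are treated as missing values.

```python
from datetime import date
from decimal import Decimal

# Hypothetical mapping from column name to the function that parses it.
CASTS = {
    "order_id": int,
    "amount": Decimal,
    "order_date": date.fromisoformat,
}


def cast_row(row: dict, casts: dict = CASTS) -> dict:
    """Convert string values to typed values; unmapped columns stay strings."""
    typed = {}
    for column, value in row.items():
        caster = casts.get(column)
        if caster is None:
            typed[column] = value
        elif value == "":
            typed[column] = None  # treat empty cells as missing
        else:
            typed[column] = caster(value)
    return typed
```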
Q: What happens with nested ZIP files?
A: The connector recursively processes ZIP files. If a ZIP contains another ZIP with CSV files inside, all CSVs will be extracted and processed.
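Recursive extraction of this kind can be sketched with Python's standard `zipfile` module. This only illustrates the behavior described above, not the connector's code; the generator name `iter_csv_files` is hypothetical.

```python
import io
import zipfile
from typing import Iterator


def iter_csv_files(name: str, data: bytes) -> Iterator[tuple[str, bytes]]:
    """Yield (filename, content) for every CSV in an attachment, descending into ZIPs."""
    lower = name.lower()
    if lower.endswith(".csv"):
        yield name, data
    elif lower.endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                # Recurse: the member may itself be a CSV or another ZIP.
                yield from iter_csv_files(info.filename, archive.read(info))
    # Any other file type is ignored.
```

Files are yielded in the order they appear in each archive, so CSVs inside a nested ZIP come out at the position of that nested ZIP.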
Q: Are there file size limits?
A: Yes, individual attachments are limited to 100 MB. Attachments exceeding this limit will be skipped.
Q: What email formats are supported?
A: The connector supports standard MIME-encoded emails with base64-encoded attachments. Most email providers and systems use this format by default.
Q: How do I send test emails?
A: Simply send an email with a CSV attachment to the email address provisioned for your source. The connector will process it on the next scheduled run.
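If you prefer to script the test, a message with a base64-encoded CSV attachment can be built with Python's standard `email` package. This is a minimal sketch: `build_test_email` is a hypothetical helper, and the addresses shown are placeholders for your own sender and the address provisioned for your source.

```python
from email.message import EmailMessage


def build_test_email(sender: str, recipient: str, csv_text: str,
                     filename: str = "test_report.csv") -> EmailMessage:
    """Build a MIME email carrying one CSV attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Test report"
    msg.set_content("Test CSV attached.")
    # Passing bytes makes the attachment base64-encoded by default.
    msg.add_attachment(csv_text.encode("utf-8"), maintype="text",
                       subtype="csv", filename=filename)
    return msg


message = build_test_email(
    "me@example.com",               # placeholder sender
    "provisioned@example.com",      # placeholder for the provisioned address
    "id,amount\n1,9.50\n",
)
```

The resulting message can then be handed to any SMTP client, for instance `smtplib.SMTP(...).send_message(message)` with your own mail server's host and credentials.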
Best Practices
- Consistent Schema: Ensure that all CSV files sent to the inbox have the same column structure
- Clear Naming: Use descriptive filenames for attachments to make it easier to identify data in the _email_attachment_filename field
- Regular Monitoring: Check connection runs periodically to ensure emails are being processed correctly
Troubleshooting
| Issue | Possible Cause | Solution |
|---|---|---|
| No data extracted | Attachment is not CSV or ZIP | Ensure attachments are in supported formats |
| Missing columns | Schema mismatch | Verify the CSV header row matches expected columns |
| Duplicate data | Emails reprocessed | Check cursor state and ensure emails have unique timestamps |
| Connection fails | Empty inbox or no valid attachments | Verify emails are being received and contain valid CSV attachments |