
Reverse ETL Connections

info

Reverse ETL connections in Extract let you sync data from your data warehouse into external applications so you can operationalize it across your business tools. Whether you're sending conversion events to ad platforms, syncing audiences to marketing tools, or pushing enriched customer data into CRMs or support platforms, Reverse ETL helps you turn insights into action.

With Extract Reverse ETL, you can:

  • Load and transform data from your own warehouse

  • Map it to the required schema of your chosen destination

  • Deliver it seamlessly and efficiently to third-party applications

👉 Check out our catalog of supported Reverse ETL sources and destinations to see where you can send your data.

Step 1: Set Up Your Source

Before you can start syncing data to external applications, you’ll need to configure your source—the origin of the data you want to send. In Reverse ETL connections, your source can be:

  • A supported database (e.g., Snowflake, BigQuery, Redshift)

  • A file or cloud storage service (e.g., Google Sheets, Amazon S3)

Dataset Configuration

During source setup, you’ll be required to define datasets. These datasets represent the data you want to sync and will eventually become streams that are mapped to your destination’s schema.

How to Configure Datasets

  • Click "Add" in the Source Setup form.
  • Provide a name for your dataset.
  • Write an SQL query that defines the dataset you want to use. This query should return the data you'd like to send to your destination.
  • Google Sheets - If you're using Google Sheets as your data source, each tab in the sheet is automatically treated as a separate dataset, so no SQL query is required.
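
As an illustration of the kind of query a dataset might use, the sketch below runs a hypothetical query against an in-memory SQLite database that stands in for a warehouse. The `customers` table, its columns, and the filter are assumptions made for this example, not names Extract requires, and the SQL dialect of your warehouse may differ.

```python
import sqlite3

# Hypothetical dataset query: one row per customer to sync.
# Table and column names are illustrative assumptions only.
DATASET_QUERY = """
SELECT
    customer_id,
    LOWER(TRIM(email)) AS email,
    plan,
    lifetime_value
FROM customers
WHERE email IS NOT NULL
ORDER BY customer_id
"""


def preview_dataset(conn: sqlite3.Connection) -> list:
    """Run the dataset query and return its rows for a local sanity check."""
    return conn.execute(DATASET_QUERY).fetchall()


# In-memory SQLite stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, email TEXT, plan TEXT, lifetime_value REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (2, "  Bob@Example.com ", "pro", 420.0),
        (1, "alice@example.com", "free", 0.0),
        (3, None, "pro", 99.0),  # filtered out: no email to sync
    ],
)

rows = preview_dataset(conn)
for row in rows:
    print(row)
```

Previewing the result locally like this is one way to confirm that the query returns exactly the columns you intend to map before you save the dataset.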

Step 2: Create and Configure a Connection

Once your source is ready, the next step is to create a Reverse ETL connection.

  • Go to the Connections tab and click "Add New Connection."
  • Select your source and a Reverse ETL destination from the list of supported apps.
  • After selecting both, navigate to the Setup tab to configure your sync frequency—this determines how often your data will be pushed to the destination.

Step 3: Map Your Data

Now that your connection is configured, the next step is to map your streams (datasets) to the destination’s data contract.

A data contract defines the schema and rules required to sync data into the selected destination. This includes the structure of the data, required fields, expected formats, and any special conditions (like hashing requirements).
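
To make the idea concrete, here is a minimal sketch that models a data contract as a list of field rules, applies a column-to-field mapping, and checks one record against the contract. The `FieldRule` structure, the `MAPPING` dictionary, and all field names are hypothetical; they only illustrate the concepts named above (required fields, expected formats, hashing requirements) and do not reflect Extract's actual contract format.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldRule:
    """One destination field in a hypothetical data contract."""
    name: str
    required: bool = False
    pattern: Optional[str] = None   # expected format as a regular expression
    must_be_hashed: bool = False    # assumed here to mean a SHA-256 hex digest


SHA256_HEX = re.compile(r"[0-9a-f]{64}")

# Illustrative contract; real contracts are defined per destination.
CONTRACT = [
    FieldRule("email", required=True, must_be_hashed=True),
    FieldRule("country", pattern=r"[A-Z]{2}"),
]

# Hypothetical mapping from dataset columns to destination fields.
MAPPING = {"email_sha256": "email", "country_code": "country"}


def apply_mapping(row: dict, mapping: dict) -> dict:
    """Rename dataset columns to the destination field names."""
    return {dest: row[src] for src, dest in mapping.items() if src in row}


def validate(record: dict, contract: list) -> list:
    """Return a list of human-readable contract violations for one record."""
    errors = []
    for rule in contract:
        value = record.get(rule.name)
        if value is None:
            if rule.required:
                errors.append(f"{rule.name}: required field is missing")
            continue
        if rule.must_be_hashed and not SHA256_HEX.fullmatch(str(value)):
            errors.append(f"{rule.name}: expected a SHA-256 hex digest")
        if rule.pattern and not re.fullmatch(rule.pattern, str(value)):
            errors.append(f"{rule.name}: does not match {rule.pattern}")
    return errors


good_row = {"email_sha256": "a" * 64, "country_code": "US"}
bad_row = {"email_sha256": "alice@example.com", "country_code": "usa"}

print(validate(apply_mapping(good_row, MAPPING), CONTRACT))
print(validate(apply_mapping(bad_row, MAPPING), CONTRACT))
```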

How Mapping Works

  1. Navigate to the Streams tab.
  2. You’ll see the datasets you configured in Step 1 listed as streams.
  3. For each stream you want to sync:
     • Select the appropriate data contract to define the expected schema.
     • Map your dataset fields to the destination’s fields based on the contract.
  4. When you're done, click "Save Changes" to apply your mapping configuration.

warning

Some destination fields require hashed values (e.g., email addresses for audience matching). If your fields are not already hashed, you can choose to have Extract perform the hashing for you—this is the recommended option. If the required fields are not hashed properly, the connection run will fail.
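
If you prefer to hash values in your source instead, a sketch like the following may help. It assumes the destination expects a lowercase hex SHA-256 digest of the trimmed, lowercased email address. That is a common convention for audience matching, but it is an assumption here; check the destination's data contract for the exact algorithm and normalization rules.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize an email (trim, lowercase) and return its SHA-256 hex digest.

    Both the normalization and the choice of SHA-256 are assumptions
    made for this sketch.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_sha256_hex(value: str) -> bool:
    """Check whether a value already looks like a lowercase SHA-256 hex digest."""
    return len(value) == 64 and all(c in "0123456789abcdef" for c in value)


digest = hash_email("  Alice@Example.com ")
print(digest)
print(is_sha256_hex(digest))
```

Normalizing before hashing matters because two spellings of the same address would otherwise produce different digests and fail to match on the destination side.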

Optional: Enable the Diffing Mechanism

Some destinations require the source data to indicate what changed since the last sync—specifically whether a record was added, updated, or deleted.

Two Ways to Handle Diffs

  1. Use Extract’s Built-In Diffing Mechanism - Enable this feature during stream setup to let Extract automatically compare the current dataset with the previous one and generate the necessary change flags.

     Requirements for Using Extract's Diffing Mechanism:

     • Sorted Input: The data stream must be sorted by the same primary key that Extract uses to compare records.
     • Consistent Schema: The primary key field must be present in every record and must maintain a consistent data type across all records.

  2. Manage Diffs Manually in Your Source - If you choose not to use Extract’s diffing mechanism, your dataset must include a field called "diff_result" that identifies the change type for each record ("DiffAdded", "DiffUpdated", or "DiffRemoved").

warning

Failing to provide proper diff information may result in sync errors or in records being rejected by the destination.
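
The comparison between the current and previous datasets can be pictured as a merge over two snapshots that are both sorted by primary key, which is why sorted input and a consistently typed key matter. The sketch below illustrates that general technique and is not Extract's implementation; the function name, the `id` key, the record shape, and the choice to skip unchanged records are all assumptions. It emits the same "diff_result" values listed above, so it also shows what a manually managed dataset would need to contain.

```python
from typing import Iterable, Iterator


def diff_sorted(previous: Iterable[dict], current: Iterable[dict],
                key: str = "id") -> Iterator[dict]:
    """Yield changed records tagged with a diff_result field.

    Both inputs must be sorted ascending by `key`. Unchanged records are
    skipped (an assumption of this sketch).
    """
    prev_iter, cur_iter = iter(previous), iter(current)
    prev, cur = next(prev_iter, None), next(cur_iter, None)
    while prev is not None or cur is not None:
        if cur is None or (prev is not None and prev[key] < cur[key]):
            # Key exists only in the previous snapshot.
            yield {**prev, "diff_result": "DiffRemoved"}
            prev = next(prev_iter, None)
        elif prev is None or cur[key] < prev[key]:
            # Key exists only in the current snapshot.
            yield {**cur, "diff_result": "DiffAdded"}
            cur = next(cur_iter, None)
        else:
            # Same key in both snapshots: report only if the record changed.
            if cur != prev:
                yield {**cur, "diff_result": "DiffUpdated"}
            prev, cur = next(prev_iter, None), next(cur_iter, None)


previous = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "c@example.com"},
]
current = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b2@example.com"},
    {"id": 4, "email": "d@example.com"},
]

changes = list(diff_sorted(previous, current))
for change in changes:
    print(change)
```

Because the merge advances through both snapshots in a single pass, an unsorted stream or a key whose type varies between records would cause records to be misclassified, which matches the requirements listed for the built-in mechanism.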