Databricks
This destination loads data into Databricks Delta tables using a Databricks SQL warehouse. Extract stages Parquet files in a Unity Catalog volume, then loads them into Delta tables using COPY INTO.
Setup Guide
What this destination expects
This destination writes Parquet files to a Unity Catalog volume and then loads them into Delta tables with COPY INTO through a Databricks SQL warehouse.
You need:
- A Databricks workspace URL
- A Databricks SQL warehouse ID
- A Databricks personal access token (PAT)
- A target Unity Catalog catalog and schema
- A Unity Catalog staging volume name (for example, staging)
Step 1 - Create a SQL warehouse
Open SQL Warehouses in your Databricks workspace, create a warehouse if you do not already have one, and copy the ID of a running warehouse.
Step 2 - Choose the target catalog and schema
Choose the catalog and schema where Extract should create Delta tables.
If they do not exist yet, Extract will try to create them automatically during the first load, as long as your Databricks user has permission to create catalogs and schemas.
Example SQL:
CREATE CATALOG IF NOT EXISTS main;
CREATE SCHEMA IF NOT EXISTS main.extract;
Step 3 - Choose a staging volume
Extract stages Parquet files in a Unity Catalog volume before loading them into Delta tables.
If the configured volume does not exist yet, Extract will try to create it automatically during the first load, as long as your Databricks user has permission to create volumes.
Example SQL:
CREATE VOLUME IF NOT EXISTS workspace.extract.staging;
Extract derives the staging root automatically from the connector settings:
/Volumes/{catalog}/{schema}/{staging_volume_name}
For example, with:
- catalog = workspace
- schema = extract
- staging_volume_name = staging
Extract stages files under:
/Volumes/workspace/extract/staging
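The path derivation above can be sketched in a few lines of Python. This is a minimal illustration of the documented pattern, not Extract's actual code; the function name `staging_root` and the validation it performs are assumptions made for the example.

```python
def staging_root(catalog: str, schema: str, staging_volume_name: str) -> str:
    """Build the staging root as /Volumes/{catalog}/{schema}/{staging_volume_name}."""
    parts = (catalog, schema, staging_volume_name)
    for part in parts:
        # Assumed sanity check: reject empty components and embedded slashes.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return "/Volumes/" + "/".join(parts)


print(staging_root("workspace", "extract", "staging"))
# → /Volumes/workspace/extract/staging
```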
Step 4 - Create a personal access token
In the Databricks workspace, create a PAT and use it as the connector access token.
The token needs permission to:
- Use the selected SQL warehouse
- Create catalogs, schemas, and volumes if they do not already exist
- Create and update tables in the target catalog and schema
- Read and write files under the derived staging volume path
Step 5 - Configure the destination in Extract
Fill in:
- Workspace URL
- Access Token
- SQL Warehouse ID
- Catalog
- Schema
- Staging Volume Name
- Table Prefix (optional)
Authentication
This destination authenticates to Databricks using a personal access token (PAT).
- Header used: Authorization: Bearer <token>
- Keep the token scoped to the minimum permissions required for the target catalog/schema and staging volume.
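To show how the bearer header is attached to a request, here is a rough sketch that builds (but does not send) a call to the Databricks SQL Statement Execution API. It assumes the `/api/2.0/sql/statements` endpoint; whether Extract uses this endpoint internally is not stated here. The function name `build_statement_request` and the sample workspace URL, token, and warehouse ID are hypothetical.

```python
import json
import urllib.request


def build_statement_request(workspace_url: str, access_token: str,
                            warehouse_id: str, statement: str) -> urllib.request.Request:
    """Prepare a POST request that would run one SQL statement on a warehouse."""
    # Assumed endpoint: the SQL Statement Execution API.
    url = workspace_url.rstrip("/") + "/api/2.0/sql/statements"
    body = json.dumps({"warehouse_id": warehouse_id, "statement": statement})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            # The PAT is sent as a bearer token on every request.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; nothing is sent over the network here.
req = build_statement_request(
    "https://example.cloud.databricks.com/",
    "dapi-example-token",
    "abc123",
    "SELECT 1",
)
print(req.full_url)
print(req.get_header("Authorization"))
```

Passing the prepared request to `urllib.request.urlopen` would send it. Load the token from a secret store or environment variable rather than hard-coding it.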
Configuration reference
| Field | Type | Required | Description |
|---|---|---|---|
| Workspace URL | string | ✅ | Your Databricks workspace URL (for example, https://dbc-xxxxxxxx-xxxx.cloud.databricks.com). |
| Access Token | string | ✅ | Databricks personal access token (PAT) used to authenticate API and SQL warehouse requests. |
| SQL Warehouse ID | string | ✅ | The ID of the Databricks SQL warehouse used to run DDL/DML and COPY INTO. |
| Catalog | string | ✅ | Unity Catalog catalog where Extract will create and load tables. |
| Schema | string | ✅ | Unity Catalog schema where Extract will create and load tables. |
| Staging Volume Name | string | ✅ | Unity Catalog volume name used for staging Parquet files. The staging root is derived as /Volumes/{catalog}/{schema}/{staging_volume_name}. |
| Table Prefix | string | optional | Prefix applied to all destination table names (useful for namespacing multiple syncs into the same schema). |
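As an illustration of the table above, the settings could be modeled as a small Python dataclass that rejects blank required fields. This is only a sketch: the class name `DatabricksDestinationConfig` and the snake_case attribute names are assumptions for the example, not identifiers used by Extract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabricksDestinationConfig:
    # Required fields, mirroring the configuration reference.
    workspace_url: str
    access_token: str
    sql_warehouse_id: str
    catalog: str
    schema: str
    staging_volume_name: str
    # Optional field; an empty prefix leaves table names unchanged.
    table_prefix: str = ""

    def __post_init__(self) -> None:
        required = (
            "workspace_url", "access_token", "sql_warehouse_id",
            "catalog", "schema", "staging_volume_name",
        )
        for name in required:
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")


config = DatabricksDestinationConfig(
    workspace_url="https://example.cloud.databricks.com",
    access_token="dapi-example-token",
    sql_warehouse_id="abc123",
    catalog="workspace",
    schema="extract",
    staging_volume_name="staging",
)
```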
Data model and loading behavior
- File format: Parquet (staged in the configured Unity Catalog volume)
- Table format: Delta
- Load mechanism: COPY INTO loads the staged Parquet files into a temporary table; the data is then merged or inserted into the final table, depending on the selected load mode.
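The first half of the load step can be pictured as a COPY INTO statement that reads Parquet files from the staging volume. The sketch below only assembles such a statement as a string; the statement Extract actually issues may carry additional options. The helper name `copy_into_sql`, the temporary table name, and the run directory are hypothetical.

```python
def copy_into_sql(target_table: str, staged_path: str) -> str:
    """Assemble a COPY INTO statement that loads staged Parquet files."""
    # Assumed safeguard: refuse paths that would need escaping in a SQL literal.
    if "'" in staged_path or "\\" in staged_path:
        raise ValueError(f"unsupported character in path: {staged_path!r}")
    return (
        f"COPY INTO {target_table}\n"
        f"FROM '{staged_path}'\n"
        "FILEFORMAT = PARQUET"
    )


print(copy_into_sql(
    "workspace.extract.orders_tmp",                  # hypothetical temporary table
    "/Volumes/workspace/extract/staging/run_0001",   # hypothetical run directory
))
```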
Extract may create the following automatically (if permissions allow):
- Catalog and schema (if missing)
- Staging volume (if missing)
- Destination tables (if missing)
Streams
Each stream is written to a Delta table in:
{catalog}.{schema}.{table_prefix}{stream_table_name}
Notes:
- Extract may create a per-run temporary table during loading (used to stage the COPY INTO results before applying the final write to the destination table).
- Table and column names are sanitized/quoted as needed to be compatible with Databricks SQL and Unity Catalog.
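The naming pattern and the quoting note can be sketched as follows. Databricks SQL delimits identifiers with backticks and escapes an embedded backtick by doubling it; the exact sanitization rules Extract applies are not described here, so this example shows quoting only. The helper names `quote_identifier` and `destination_table` are assumptions.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def destination_table(catalog: str, schema: str, stream_table_name: str,
                      table_prefix: str = "") -> str:
    """Build {catalog}.{schema}.{table_prefix}{stream_table_name}, quoted per part."""
    return ".".join([
        quote_identifier(catalog),
        quote_identifier(schema),
        quote_identifier(table_prefix + stream_table_name),
    ])


print(destination_table("workspace", "extract", "orders", table_prefix="shop_"))
# → `workspace`.`extract`.`shop_orders`
```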