Databricks

This destination loads data into Databricks Delta tables using a Databricks SQL warehouse. Extract stages Parquet files in a Unity Catalog volume, then loads them into Delta tables using COPY INTO.

Setup Guide

What this destination expects

This destination writes Parquet files to a Unity Catalog volume and then loads them into Delta tables with COPY INTO through a Databricks SQL warehouse.

You need:

A Databricks workspace URL
A Databricks SQL warehouse ID
A Databricks personal access token (PAT), Databricks user OAuth app connection, or Databricks service principal OAuth secret
A target Unity Catalog catalog and schema
A Unity Catalog staging volume name (for example, staging)

Step 1 - Create a SQL warehouse

Open SQL Warehouses in your Databricks workspace, create a warehouse if you do not already have one, and copy the ID of a running warehouse for Extract to use.

Step 2 - Choose the target catalog and schema

Choose the catalog and schema where Extract should create Delta tables.

If they do not exist yet, Extract will try to create them automatically during the first load, as long as your Databricks user has permission to create catalogs and schemas.

If your Databricks user does not have catalog, schema, or volume creation permissions, enable the "Do not attempt to create catalogs, schemas, or volumes automatically" option, then create the catalog, schema, and staging volume yourself before running the sync.

Example SQL:

CREATE CATALOG IF NOT EXISTS workspace;
CREATE SCHEMA IF NOT EXISTS workspace.extract;

Step 3 - Choose a staging volume

Extract stages Parquet files in a Unity Catalog volume before loading them into Delta tables.

If the configured volume does not exist yet, Extract will try to create it automatically during the first load, as long as your Databricks user has permission to create volumes.

Example SQL:

CREATE VOLUME IF NOT EXISTS workspace.extract.staging;

Extract derives the staging root automatically from the connector settings:

/Volumes/{catalog}/{schema}/{staging_volume_name}

For example, with:

catalog = workspace
schema = extract
staging_volume_name = staging

Extract stages files under:

/Volumes/workspace/extract/staging
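To confirm that the volume exists and that the derived path resolves before the first sync, a quick check along these lines can be run in the Databricks SQL editor. This is a minimal sketch that assumes the example values above (workspace, extract, staging); substitute your own names.

```sql
-- Assumes catalog = workspace, schema = extract, staging_volume_name = staging.
-- List the volumes in the target schema; the staging volume should appear here.
SHOW VOLUMES IN workspace.extract;

-- Show metadata for the staging volume.
DESCRIBE VOLUME workspace.extract.staging;

-- List the contents of the derived staging root.
LIST '/Volumes/workspace/extract/staging';
```

A newly created volume typically has no files to list. If the statements fail with a permission error, the identity running them is likely missing privileges on the catalog, schema, or volume.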

Step 4 - Choose an authentication method

Access Token is the default authentication option. In your Databricks workspace, create a PAT and enter it as the connector's Access Token.

For OAuth (User), create a Databricks app connection:

Go to Databricks account settings.
Open App connections.
Click Add connection.
Under Access scopes, choose SQL.
Set the redirect URL to https://api.extract.to/oauth/callback/databricks.
Copy the client ID and client secret into Extract, then sign in with Databricks.

For OAuth M2M / Service Principal, create an OAuth secret for a Databricks service principal:

In Databricks, go to Settings.
Open Identity and access.
Click Add service principal.
Add an existing service principal or create a new one.
Open the service principal and go to Secrets.
Click Generate secret.
Set Lifetime to 730 days, or your organization's preferred maximum.
Copy the generated Client ID and Client Secret.
In Extract, choose Authentication Type: OAuth M2M / Service Principal.
Paste the values into Service Principal Client ID and Service Principal Client Secret.
In Databricks, go to SQL Warehouses.
Select the warehouse Extract will use.
Open Permissions.
Add the service principal from the previous steps.
Grant Can use.

The selected identity needs permission to:

Use the selected SQL warehouse
Create catalogs, schemas, and volumes if automatic creation is enabled
Create and update tables in the target catalog and schema
Read and write files under the derived staging volume path

To grant catalog permissions in the Databricks UI:

Go to Catalog.
Select the target catalog.
Open Permissions.
Click Grant.
Select the privileges the destination requires, such as USE CATALOG, USE SCHEMA, CREATE TABLE, and READ VOLUME and WRITE VOLUME for the staging volume.
Select the principal that Extract authenticates as, for example the service principal from the OAuth M2M setup.
Save the grant.
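The same grants can also be expressed in SQL. The statements below are a sketch, not the exact set Extract requires: they assume the example names workspace, extract, and staging, and `<principal>` is a placeholder to replace with the user email, group name, or service principal application ID that Extract authenticates as.

```sql
-- Placeholder principal: replace <principal> before running.
-- Access to the target catalog and schema, plus table creation.
GRANT USE CATALOG ON CATALOG workspace TO `<principal>`;
GRANT USE SCHEMA, CREATE TABLE ON SCHEMA workspace.extract TO `<principal>`;

-- Read and update tables in the schema that the principal does not own.
GRANT SELECT, MODIFY ON SCHEMA workspace.extract TO `<principal>`;

-- Read and write staged Parquet files in the staging volume.
GRANT READ VOLUME, WRITE VOLUME ON VOLUME workspace.extract.staging TO `<principal>`;

-- Only needed if automatic creation is left enabled and the objects are missing.
GRANT CREATE SCHEMA ON CATALOG workspace TO `<principal>`;
GRANT CREATE VOLUME ON SCHEMA workspace.extract TO `<principal>`;

-- Review what the principal has been granted on the schema.
SHOW GRANTS `<principal>` ON SCHEMA workspace.extract;
```

Granting at the schema level keeps the identity scoped to the target schema rather than the whole catalog. Creating a catalog automatically would additionally require the CREATE CATALOG privilege on the metastore.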

Step 5 - Configure the destination in Extract

Fill in:

Workspace URL
Authentication Type
Access Token, OAuth user client fields and sign-in, or OAuth M2M service principal client fields
SQL Warehouse ID
Catalog
Schema
Staging Volume Name
Table Prefix (optional)

Authentication

This destination can authenticate to Databricks with:

Access Token: a Databricks personal access token (PAT).
OAuth (User): a Databricks app connection that issues OAuth tokens for SQL access after a user signs in.
OAuth M2M / Service Principal: a Databricks service principal OAuth secret for machine-to-machine authentication.

Extract sends the resulting token in the Authorization: Bearer <token> request header. Keep the selected identity scoped to the minimum permissions required for the target catalog, schema, and staging volume.

Configuration reference

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Workspace URL | string | ✅ | Your Databricks workspace URL (for example, `https://dbc-xxxxxxxx-xxxx.cloud.databricks.com`). |
| Authentication Type | option | ✅ | Choose `Access Token`, `OAuth (User)`, or `OAuth M2M / Service Principal`. Access Token is the default option. |
| Access Token | password | required for Access Token | Databricks personal access token (PAT) used to authenticate API and SQL warehouse requests. |
| OAuth Client ID | string | required for OAuth (User) | Client ID from the Databricks app connection. |
| OAuth Client Secret | password | required for OAuth (User) | Client secret from the Databricks app connection. |
| Authentication | OAuth | required for OAuth (User) | Databricks OAuth sign-in. |
| Service Principal Client ID | string | required for OAuth M2M / Service Principal | Client ID from the Databricks service principal OAuth secret. |
| Service Principal Client Secret | password | required for OAuth M2M / Service Principal | Client secret from the Databricks service principal OAuth secret. |
| SQL Warehouse ID | string | ✅ | The ID of the Databricks SQL warehouse used to run DDL/DML and `COPY INTO`. |
| Do not attempt to create catalogs, schemas, or volumes automatically | boolean | optional | If enabled, Extract skips automatic `CREATE CATALOG`, `CREATE SCHEMA`, and `CREATE VOLUME` statements. The configured catalog, schema, and staging volume must already exist. |
| Catalog | string | ✅ | Unity Catalog catalog where Extract will create and load tables. |
| Schema | string | ✅ | Unity Catalog schema where Extract will create and load tables. |
| Staging Volume Name | string | ✅ | Unity Catalog volume name used for staging Parquet files. The staging root is derived as `/Volumes/{catalog}/{schema}/{staging_volume_name}`. |
| Table Prefix | string | optional | Prefix applied to all destination table names (useful for namespacing multiple syncs into the same schema). |

Data model and loading behavior

File format: Parquet (staged in the configured Unity Catalog volume)
Table format: Delta
Load mechanism: COPY INTO loads the staged Parquet files into a temporary table. Extract then merges or inserts that data into the final table, depending on the selected load mode.
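The load flow can be pictured roughly as follows. This is an illustrative sketch only, not the statements Extract actually issues: the table names (orders, _tmp_orders), the key column (id), and the run_0001/orders subdirectory under the staging root are all hypothetical.

```sql
-- Hypothetical names throughout; Extract's real temporary tables and file layout may differ.
-- 1. Create an empty placeholder Delta table whose schema is inferred during COPY INTO.
CREATE TABLE IF NOT EXISTS workspace.extract._tmp_orders;

-- 2. Load the staged Parquet files into the temporary table.
COPY INTO workspace.extract._tmp_orders
FROM '/Volumes/workspace/extract/staging/run_0001/orders/'
FILEFORMAT = PARQUET
COPY_OPTIONS ('mergeSchema' = 'true');

-- 3a. Merge-style load: upsert into the final table on an assumed key column.
MERGE INTO workspace.extract.orders AS t
USING workspace.extract._tmp_orders AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- 3b. Append-style load (alternative to 3a): insert every staged row.
-- INSERT INTO workspace.extract.orders SELECT * FROM workspace.extract._tmp_orders;

-- 4. Drop the temporary table once the final write has been applied.
DROP TABLE IF EXISTS workspace.extract._tmp_orders;
```

Loading into a temporary table first means a failed or partial COPY INTO does not touch the destination table; only the final merge or insert changes it.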

Extract may create the following automatically (if permissions allow):

Catalog and schema (if missing)
Staging volume (if missing)
Destination tables (if missing)

When the "Do not attempt to create catalogs, schemas, or volumes automatically" option is enabled, Extract skips catalog, schema, and staging volume creation. It may still create destination and temporary Delta tables in the configured schema.

Streams

Each stream is written to a Delta table in:

{catalog}.{schema}.{table_prefix}{stream_table_name}

Notes:

Extract may create a per-run temporary table during loading (used to stage the COPY INTO results before applying the final write to the destination table).
Table and column names are sanitized/quoted as needed to be compatible with Databricks SQL and Unity Catalog.
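As a worked example of the naming pattern, assume catalog = workspace, schema = extract, a hypothetical Table Prefix of raw_, and a hypothetical stream named orders. The stream would then land in workspace.extract.raw_orders, which can be checked after a sync with statements like these:

```sql
-- Hypothetical prefix (raw_) and stream name (orders); adjust to your configuration.
-- List destination tables in the schema that share the prefix.
SHOW TABLES IN workspace.extract LIKE 'raw_*';

-- Confirm the table is stored in Delta format (see the format column).
DESCRIBE DETAIL workspace.extract.raw_orders;

-- Spot-check the row count after a load.
SELECT COUNT(*) AS row_count FROM workspace.extract.raw_orders;
```

If a stream name contains characters that are not valid in an identifier, the resulting table name may differ from the raw stream name because of the sanitization described above.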
