Skip to main content

Postgres

Overview

Extract connects to Postgres as a source by running SQL queries you define and streaming the results as datasets. Each dataset is a named SQL query that becomes a stream in Extract. Queries can run in Full Refresh mode or Incremental Changes mode when you provide an incremental timestamp column.

Prerequisites

  1. A reachable Postgres instance (self-hosted or managed).
  2. A database user with permission to connect and run SELECT queries on the schemas/tables you want to extract.
  3. Network access from Extract to your Postgres host (allowlist as needed).

Setup Guide

Step 1 - Create a read-only service user

Connect as an admin and create a dedicated user:

CREATE USER extract_reader WITH PASSWORD 'replace-with-strong-password';

Step 2 - Grant SELECT access

Grant access to the schemas and tables you want Extract to query:

GRANT CONNECT ON DATABASE <database> TO extract_reader;
GRANT USAGE ON SCHEMA <schema> TO extract_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA <schema> TO extract_reader;
GRANT SELECT ON FUTURE TABLES IN SCHEMA <schema> TO extract_reader;

Step 3 (Optional) - Enable TLS

If your Postgres instance requires TLS, set Use TLS to true in the connector settings.

Extract connects with sslmode=prefer and does not verify server certificates. If you require certificate verification, use trusted networks and/or an SSH tunnel.

Step 4 - Configure the connector in Extract

Fill in the connection fields:

  • Hostname - Your Postgres host or IP.
  • Port - Default 5432.
  • Username - The service user created above.
  • Password - The service user password.
  • Database - The database to query.

Then define at least one Dataset (see below), save, and run a test sync.

Configuration Parameters

Required Parameters

  • Hostname - Postgres server hostname or IP address.
  • Port - Postgres server port (default 5432).
  • Username - Database username.
  • Password - Database password.
  • Database - Target database to query.
  • Datasets - List of SQL queries that define your streams.

Optional Parameters

  • Use TLS - Enable TLS for the Postgres connection (default false).

Dataset Configuration

Each dataset is a SQL query that defines a stream. Configure the following fields:

  • Dataset Name - Unique name for the stream (used as the stream title).
  • SQL Query - The SELECT query to run. Extract prepares this statement to infer columns.
  • Primary Key (optional) - Column name used for dedupe and diffing.
  • Incremental Timestamp Field (optional) - Column name used for incremental syncs.

Query Guidelines

  • Use a SELECT query that returns a stable schema.
  • Avoid non-deterministic expressions unless required.
  • If you enable incremental sync, the timestamp field must be a Postgres TIMESTAMP or TIMESTAMPTZ column (recommend TIMESTAMPTZ).

Extract Modes

Full Refresh

Extract runs the dataset query as-is.

If diffing is enabled and a primary key is defined, Extract orders results by the primary key and computes changes. For stable ordering across runs, Extract uses:

  • Numeric primary keys: ORDER BY <primary_key> ASC
  • Non-numeric primary keys: ORDER BY <primary_key>::text COLLATE "C" ASC

This ensures a consistent sort order for diffing, especially for text-like keys.

Incremental Changes

If you set Incremental Timestamp Field, Extract filters results by:

WITH query AS (<your query>)
SELECT * FROM query WHERE <timestamp_field> > '<last_cursor_value>'

On the first run (no cursor yet), Extract runs:

WITH query AS (<your query>)
SELECT * FROM query

The cursor updates to the max timestamp seen in each run.

Incremental timestamp field requirements

The Incremental Timestamp Field must:

  • Exist in the dataset query output (the selected columns).
  • Be typed as TIMESTAMP or TIMESTAMPTZ.

If the field is missing, or is a different type (for example DATE or TEXT), the sync will fail with an error indicating the field must be TIMESTAMP/TIMESTAMPTZ.

Additional Information

Data Type Handling

Extract maps common Postgres types to JSON-friendly types, including:

  • BOOL -> boolean
  • VARCHAR, TEXT -> string
  • VARCHAR[], TEXT[] -> array of strings
  • INT2, INT4, INT8 -> integer
  • FLOAT4, FLOAT8 -> number
  • JSON, JSONB -> json
  • DATE -> string (YYYY-MM-DD)
  • TIMESTAMP, TIMESTAMPTZ -> string (RFC 3339)
  • UUID -> string

Other/unknown types are treated as strings.

Diffing Requirements

Diffing is only available when a primary key is defined on the dataset. If you want change detection, ensure the query includes a stable unique key.