Postgres (CDC)
Overview
Extract's Postgres (CDC) connector captures row-level changes from Postgres logical replication (pgoutput) and streams inserts, updates, deletes, and truncates into Extract.
Use this connector when you need ongoing change capture. If you only need scheduled query-based extraction, use the standard Postgres source connector.
How It Works
- Extract validates your Postgres CDC settings and server capabilities.
- Optional initial snapshot reads current table state.
- Extract tails WAL changes from a logical replication slot using
pgoutput. - Extract applies table filters and writes records to selected streams.
Compatibility and Requirements
- Postgres
15+(primary/standard CDC). wal_level=logical.max_replication_slots >= 1.max_wal_senders >= 1.- A database user with login, table read permissions, and replication capability.
- Network access from Extract to the database host (directly or via SSH tunnel).
You can verify the required server settings with:
SHOW server_version_num;
SHOW wal_level;
SHOW max_replication_slots;
SHOW max_wal_senders;
Step 1 - Create a Dedicated CDC User
CREATE USER extract_cdc WITH PASSWORD 'replace-with-strong-password';
GRANT CONNECT ON DATABASE <database> TO extract_cdc;
GRANT USAGE ON SCHEMA <schema> TO extract_cdc;
GRANT SELECT ON ALL TABLES IN SCHEMA <schema> TO extract_cdc;
GRANT SELECT ON FUTURE TABLES IN SCHEMA <schema> TO extract_cdc;
ALTER ROLE extract_cdc WITH REPLICATION;
If you use auto-managed publications, ensure this user can create/alter publications.
Step 2 - Decide Publication and Slot Ownership
Option A: Let Extract manage them (recommended)
Use defaults:
create_publication=truecreate_replication_slot=truepublication_name=extract_cdc_pubreplication_slot_name=extract_cdc_slot
Extract will create missing objects and keep publication tables aligned with selected streams.
Option B: Manage them manually
Set both creation flags to false and create objects yourself.
Example:
CREATE PUBLICATION extract_cdc_pub FOR TABLE public.orders, public.customers;
SELECT * FROM pg_create_logical_replication_slot('extract_cdc_slot', 'pgoutput', false);
If manual mode is enabled and objects are missing, the connector fails fast at runtime.
Step 3 - Configure Replica Identity
For safe update/delete CDC behavior, tracked tables should use REPLICA IDENTITY FULL.
Manual approach:
ALTER TABLE public.orders REPLICA IDENTITY FULL;
Or set connector parameter replica_identity_override=FULL to apply it automatically on selected tables.
Notes:
- Only
FULLis supported. replica_identity_overridecannot be applied whenread_replica=true.
Step 4 - Configure Connector Parameters in Extract
Account Connector Parameters
hostport(default5432)usernamepassworddatabasetls(defaultfalse)use_ssh_tunneland SSH fields (ssh_host,ssh_port,ssh_user,ssh_key)
Connection Parameters
replication_slot_namepublication_namecreate_publicationcreate_replication_slotslot_temporaryread_replicainitial_snapshotresnapshotlag_warning_consecutive_runs(default3)include_schemas,exclude_schemasinclude_tables,exclude_tablesreplica_identity_override
Read Replica Mode
Set read_replica=true only when you are intentionally decoding from a replica.
Requirements and constraints:
- Replica host must support standby logical decoding (Postgres 16+).
create_publication=falseandcreate_replication_slot=falseare required.- Publication/slot must already exist and be valid for the target database.
replica_identity_overrideis not applied in read-replica mode.
Snapshot and Resnapshot Behavior
initial_snapshot=true: performs a baseline snapshot before WAL streaming.resnapshot=true: clears CDC state and forces a new snapshot on the next run.- If a slot becomes invalidated/lost, Extract forces a resnapshot for recovery.
Table Selection Rules
Table filters are applied in this order:
include_schemas(if provided)exclude_schemasinclude_tables(if provided, supportsschema.tableor baretable)exclude_tables(supportsschema.tableor baretable)
Use fully qualified names (schema.table) when possible to avoid ambiguity.
Records and CDC Metadata
For change events, the connector emits table records and includes CDC metadata fields:
_extract_commit_timestamp: commit timestamp for the change (when available from the replication stream)_extract_cdc_lsn: WAL LSN associated with the change_extract_deleted:truefor delete events, otherwisefalse/unset
Truncate operations are also captured. Depending on your destination and modeling, truncates may be represented as control events rather than row-level records.
Validation and Monitoring Queries
Confirm publication members
SELECT schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'extract_cdc_pub'
ORDER BY schemaname, tablename;
Check slot health and WAL retention
SELECT
slot_name,
active,
restart_lsn,
confirmed_flush_lsn,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE slot_name = 'extract_cdc_slot';
Troubleshooting
-
wal_level must be 'logical' for CDC- Set
wal_level=logicaland restart Postgres.
- Set
-
Replication slot '<slot>' does not exist and create_replication_slot=false- Create the slot manually or enable slot creation.
-
Publication '<pub>' does not exist and create_publication=false- Create the publication manually or enable publication creation.
-
read_replica=true requires create_publication=false and create_replication_slot=false- Disable both creation flags in read-replica mode.
-
read_replica requires Postgres 16+ for standby logical decoding- Use a primary host or upgrade replica version/capabilities.
-
CDC requires REPLICA IDENTITY FULL for update safety- Set
replica_identity_override=FULLor applyALTER TABLE ... REPLICA IDENTITY FULLmanually.
- Set
Best Practices
- Use a dedicated replication slot per Extract connection.
- Keep runs frequent to prevent excessive WAL retention.
- Prefer explicit table lists for production CDC.
- Avoid temporary slots (
slot_temporary=true) unless you explicitly want ephemeral state.