| Type | Write-ahead log (WAL) Replication |
| Supported Replication | Initial Sync, Continuous Sync |
| Authentication Type | Password Authentication |
Setting up PostgreSQL CDC for ELT & CDC
ELT & CDC uses WAL based logical replication with thepgoutput decoding plugin for PostgreSQL sources. It requires a database user with the LOGIN and REPLICATION roles. The sync user does not need to own your tables, and does not need CREATE permission on the database or schema.
Supported providers
- Amazon RDS PostgreSQL
- Amazon Aurora PostgreSQL
- Azure Single Node PostgreSQL
- Self-hosted (generic) PostgreSQL
- Google Cloud SQL for PostgreSQL
Requirements
- PostgreSQL versions 10.x or above.
- A database user with the
LOGINandREPLICATIONroles. - Tables with a
PRIMARY KEY. - Master (primary) instances of PostgreSQL. Log based replication only works for master instances.
Features
| Feature | Supported | Notes |
|---|---|---|
| Full (Historical) sync | Yes | |
| Incremental sync | Yes | |
| Replicate DELETE | Yes | |
| UPSERT | Yes | |
| Append only mode | Yes | Can be specified at a table level |
| Exclude tables | Yes | |
| Exclude columns | Yes | |
| SSL support | Yes | |
| SSH tunnel | Yes | SSH tunnel guide |
How setup works
ELT & CDC supports two setup models for PostgreSQL. Which one applies is determined automatically from the privileges of the user you connect with.| Model | When it applies | What you do |
|---|---|---|
| Connector-managed | The sync user has CREATE on the database and schema, owns the synced tables, and owns any existing publication. | Nothing. The connector creates and maintains the publication, replication slot, and heartbeat table automatically. |
| Customer-managed (no ownership) | The sync user lacks one or more of the privileges above. | Run the setup SQL once (see below). The connector never changes table ownership; it only reads, streams from the replication slot, and ticks the heartbeat. |
Setup steps (customer-managed)
Use these steps when the sync user does not own the tables. Run the SQL as a privileged user (for example, the table owner or an administrator).Create the sync user
Create a dedicated user with On Amazon RDS and Aurora, grant replication through the managed role instead:
LOGIN, then assign the REPLICATION role.Grant read privileges
Grant the sync user access to the schema and read access to the tables you want to replicate. No
CREATE privilege is required.Create the heartbeat table
Create the heartbeat table the connector uses to keep the replication slot advancing while your synced tables are idle, and grant the sync user write access to it.
Create the publication
Create a publication that includes your synced tables and the heartbeat table. Choose the form that matches your privileges and PostgreSQL version (see the table below).
Publication forms
The publication can be created in three forms. Integrate.io detects an existing publication’s form and uses it; for new ones, pick based on the privileges available.| Form | SQL | Requirements | Notes |
|---|---|---|---|
| List | CREATE PUBLICATION ... FOR TABLE <tables> | PostgreSQL 10+, table owner | Explicitly lists tables and the heartbeat table. Supports ALTER PUBLICATION ... ADD TABLE. |
| Schema | CREATE PUBLICATION ... FOR TABLES IN SCHEMA <schema> | PostgreSQL 15+, superuser | Auto-includes current and future tables in the schema (including the heartbeat table). |
| All tables | CREATE PUBLICATION ... FOR ALL TABLES | Superuser | Auto-includes current and future tables in the database. |
Heartbeat
PostgreSQL uses theconfirmed_flush_lsn to determine which WAL segments can be safely removed, so if it’s not updated because the tables in your publication aren’t changing, the WAL storage can grow significantly.
To ensure that the confirmed_flush_lsn advances even when there are no transactions for the selected tables, the connector creates (or uses) a heartbeat table. It inserts a single record and updates it every 10 minutes while the pipeline is running.
Frequently asked questions (FAQs)
Does the sync user need to own my tables?
Does the sync user need to own my tables?
No. ELT & CDC supports a customer-managed model that requires only
LOGIN and REPLICATION on the user, plus USAGE on the schema and SELECT/REFERENCES on the tables. Table ownership and CREATE on the database or schema are not required. If the user happens to have those privileges, the connector manages the publication, slot, and heartbeat table for you automatically.What SQL do I need to run?
What SQL do I need to run?
Integrate.io inspects your database during setup and generates only the statements that are still missing (grants, heartbeat table, publication, and replication slot). If everything is already in place, or the user can fully self-manage, no manual SQL is needed.
What happens when I de-select a table?
What happens when I de-select a table?
For a customer-managed publication, the connector only adds missing tables; it never drops tables you previously published, because it does not own the publication. A de-selected table left in the publication is harmless (it adds some extra WAL decoding overhead but causes no data loss). To remove the overhead, drop it manually with
ALTER PUBLICATION <publication_name> DROP TABLE <schema>.<table>;.How does Integrate.io handle Postgres replication slots?
How does Integrate.io handle Postgres replication slots?
The connector creates a replication slot when the initial sync starts, and then listens for changes from it once the initial sync completes. When you delete or archive a pipeline, the connector removes the replication slot from your Postgres database if it can still access it.
How are large JSONB objects handled?
How are large JSONB objects handled?
Large JSONB objects map to
TEXT/VARCHAR/STRING columns and might be truncated, depending on the destination.For tables that use TOAST storage (typically tables with large JSONB or TEXT columns), set their REPLICA IDENTITY to FULL so the connector can sync the data correctly during continuous sync.