Sources - PostgreSQL

Description

The World's Most Advanced Open Source Relational Database

Type

CDC/Binlog Replication

Supported Replication

Initial Sync

Continuous Sync

Authentication Type

Password Authentication

Setting up PostgreSQL CDC for ELT & CDC

ELT & CDC uses WAL based logical replication - pgoutputdecoding for PostgreSQL database source and requires a user with LOGIN and REPLICATION role.

Supported providers

Requirements

  • PostgreSQL  versions 10.x or above.
  • Database user with CREATE permission on the schema and database to be synced.
  • Tables with PRIMARY KEY .
  • Sync user should be the OWNER of the tables.
  • The ownership requirement is for adding the tables to the PUBLICATION that we create in order for receiving replication events from the database. For more info, click here.
  • Master instances of Postgres (log based replication only works for master instances).

Features

Feature
Supported Notes
Full (Historical) sync Yes
Incremental sync
Yes
Replicate DELETE
Yes
UPSERT Yes
Append only mode
Yes
Can be specified at a table level
Exclude tables
Yes
Exclude columns
Yes
SSL Support
Yes
Secure tunnel
Yes Supported on AWS

Frequently Asked Questions (FAQs)

How do you handle Postgres replication slots?

We create a replication slot when the initial sync starts. After the initial sync completes, we start listening for changes from it. Once a pipeline is deleted or archived by you, we try to remove the replication slot from your Postgres DB provided that we can still access it.
How do you handle large JSONB objects?

Large JSONB objects are mapped as TEXT/VARCHAR/STRING columns and might be truncated (depending on destination). Note about the TOAST table - if there are tables which use the toast tables (usually tables with huge JSONB or TEXT columns) then their REPLICA IDENTITY (https://www.postgresql.org/docs/10/sql-altertable.html#SQL-CREATETABLE-REPLICA-IDENTITY) needs to be set to FULL for us to properly sync the data in case of continuous sync.