Connection Setup
Configure your Amazon Redshift connection in Integrate.io, then add the Amazon Redshift (Snapshot CDC) source component to your pipeline and select that connection. See the Snapshot CDC Source reference for full configuration details.
Snapshot Storage
Amazon Redshift Snapshot CDC uses File Based snapshot storage. The snapshot of your previous run is stored as a Parquet file on Integrate.io managed cloud storage, which means:
- Only read access to your Redshift cluster is required.
- No snapshot tables are created in your source database.
- On the first run, every current record is treated as upserted because there is no previous snapshot to compare against.
Database (table-based) snapshot storage is only available for SQL Server connections. For Amazon Redshift, File Based storage is the only option, and you do not need to configure an S3 connection of your own.
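The first-run behavior described above can be illustrated with a minimal sketch. This is not Integrate.io's implementation: the function names `load_previous_snapshot` and `run_once`, the `id` column, and the use of a local JSON file in place of a managed Parquet file are all assumptions made to keep the example runnable with the standard library only.

```python
import json
import os
import tempfile


def load_previous_snapshot(path):
    """Return the previous run's rows keyed by id, or {} when no snapshot exists."""
    if not os.path.exists(path):
        return {}  # first run: nothing to compare against
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def run_once(current_rows, snapshot_path):
    """Return rows that are new or changed, then overwrite the stored snapshot.

    Hypothetical helper: assumes every row is a dict with an "id" column.
    """
    previous = load_previous_snapshot(snapshot_path)
    # JSON object keys are strings, so look rows up by str(id).
    upserted = [
        row for row in current_rows
        if previous.get(str(row["id"])) != row
    ]
    # The snapshot lives in a file, so nothing is written to the source database.
    with open(snapshot_path, "w", encoding="utf-8") as fh:
        json.dump({str(row["id"]): row for row in current_rows}, fh)
    return upserted


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "snapshot.json")
    rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    print(len(run_once(rows, path)))  # first run: both rows are upserted
    print(len(run_once(rows, path)))  # second run with identical data: none
```

Because the comparison only reads the source rows and writes to a separate file, the sketch mirrors why read access to the cluster is sufficient.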
Change Detection
Choose how the component decides whether a row has changed:
- Primary Key: matches rows by a unique identifier column and detects updates by comparing the remaining column values. Best when the table has a reliable key.
- Composite Hash: builds a hash from all or selected columns and compares hashes between runs. Best when the table has no reliable primary key.
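The two methods can be contrasted with a short sketch. It is illustrative only and does not reflect how the component is implemented internally: the function names, the SHA-256 hash, and the separator used when joining column values are assumptions, and the example covers new and changed rows only.

```python
import hashlib


def changed_by_primary_key(previous, current, key):
    """Match rows on `key`; a row counts as changed if it is new or any value differs."""
    prev_by_key = {row[key]: row for row in previous}
    return [row for row in current if prev_by_key.get(row[key]) != row]


def row_hash(row, columns=None):
    """Hash all columns (sorted by name) or only the selected ones."""
    cols = sorted(row) if columns is None else columns
    # A unit separator between fields keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join(f"{c}={row[c]!r}" for c in cols)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_by_composite_hash(previous, current, columns=None):
    """Return current rows whose hash did not appear in the previous snapshot."""
    prev_hashes = {row_hash(r, columns) for r in previous}
    return [r for r in current if row_hash(r, columns) not in prev_hashes]


if __name__ == "__main__":
    prev = [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Rome"}]
    cur = [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Bern"}, {"id": 3, "city": "Lima"}]
    print(changed_by_primary_key(prev, cur, "id"))
    print(changed_by_composite_hash(prev, cur))
```

Note that hashing only a subset of columns means changes in the excluded columns go undetected in this sketch, which is the trade-off to weigh when selecting columns for the hash.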
Full Configuration Reference
Snapshot CDC Source reference: configuration steps, change detection methods, query mode, best practices, troubleshooting, and limitations that apply to every Snapshot CDC source.