History Mode (SCD Type 2)

Overview

History Mode implements Slowly Changing Dimension Type 2 (SCD Type 2) tracking for your data pipelines. When enabled, Integrate.io maintains a complete history of all changes to your records, allowing you to see not just the current state of your data, but every version that has ever existed.

This feature is ideal for:

  • Audit compliance - Track every change made to sensitive data
  • Historical analysis - Analyze how data has changed over time
  • Point-in-time queries - Reconstruct the state of your data at any moment in history
  • Data lineage - Understand the evolution of your records

How It Works

When History Mode is enabled for a table, Integrate.io automatically adds three tracking columns to your destination:

Column Type Description
_iio_active BOOLEAN TRUE for the current version, FALSE for historical versions
_iio_start TIMESTAMP When this version became active
_iio_end TIMESTAMP When this version became inactive (9999-12-31 for active records)

How _iio_start Is Populated

The _iio_start timestamp indicates when a record version became active. The value depends on the sync mode:

CDC Sync (Continuous Changes)

For CDC operations, _iio_start uses the actual transaction timestamp from the source database:

Source Timestamp Source
MySQL Binlog event timestamp
SQL Server CDC transaction time

This means _iio_start reflects when the change actually happened in your source database, not when Integrate.io processed it.

Initial Sync

During the initial full-table sync, there is no CDC event. In this case, _iio_start is set to the import timestamp - when the sync batch started. All records in the same batch share the same _iio_start value.

Record Lifecycle

INSERT - New record added

When a new record is inserted in the source:

  • A new row is created with _iio_active = true
  • _iio_start is set to the transaction timestamp
  • _iio_end is set to 9999-12-31 23:59:59

UPDATE - Existing record modified

When a record is updated in the source:

  • The current active row is marked inactive: _iio_active = false, _iio_end = transaction_timestamp
  • A new row is inserted with the updated values and _iio_active = true

DELETE - Record removed

When a record is deleted in the source:

  • The current active row is marked inactive: _iio_active = false, _iio_end = transaction_timestamp
  • No new row is inserted (soft delete - history preserved)

Supported Sources & Destinations

⚠️ Note: History Mode is only available for specific source and destination combinations that support CDC (Change Data Capture).

Supported Sources

Source Supported Notes
MySQL ✅ Yes Requires binary logging enabled
SQL Server ✅ Yes Requires CDC enabled on database and tables
Other Sources ❌ No Not supported at this time

Supported Destinations

Destination Supported
Amazon Redshift ✅ Yes
Snowflake ✅ Yes
Other Destinations ❌ No

Configuration

Enabling History Mode

  1. Navigate to your pipeline's Schema Settings page
  2. Find the History Mode toggle in the table header or individual table rows
  3. Enable History Mode for the desired tables

ℹ️ Important: History Mode can only be enabled for tables that are not yet selected for sync. Once a table has started syncing, History Mode cannot be enabled.

Restrictions

  • History Mode must be enabled before the initial sync
  • Cannot be enabled for tables that have already started syncing
  • Requires a primary key on the source table

Example Scenario

After Initial Sync

Destination table with History Mode columns:

id username email status _iio_active _iio_start _iio_end
1 alice alice@example.com active true 2024-01-01 10:00:00 9999-12-31 23:59:59
2 bob bob@example.com active true 2024-01-01 10:00:00 9999-12-31 23:59:59
3 charlie charlie@example.com inactive true 2024-01-01 10:00:00 9999-12-31 23:59:59

After UPDATE (id=1 status changed)

UPDATE users SET status = 'inactive' WHERE id = 1;
id username email status _iio_active _iio_start _iio_end
1 alice alice@example.com active false 2024-01-01 10:00:00 2024-01-15 14:30:00
1 alice alice@example.com inactive true 2024-01-15 14:30:00 9999-12-31 23:59:59
2 bob bob@example.com active true 2024-01-01 10:00:00 9999-12-31 23:59:59
3 charlie charlie@example.com inactive true 2024-01-01 10:00:00 9999-12-31 23:59:59

After DELETE (id=2 removed)

DELETE FROM users WHERE id = 2;
id username email status _iio_active _iio_start _iio_end
1 alice alice@example.com active false 2024-01-01 10:00:00 2024-01-15 14:30:00
1 alice alice@example.com inactive true 2024-01-15 14:30:00 9999-12-31 23:59:59
2 bob bob@example.com active false 2024-01-01 10:00:00 2024-02-01 09:00:00
3 charlie charlie@example.com inactive true 2024-01-01 10:00:00 9999-12-31 23:59:59

Note: After deletion, id=2 still exists with _iio_active = false for historical queries.