Documentation Index
Fetch the complete documentation index at: https://www.integrate.io/docs/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Many data warehouse tables (e.g., BigQuery views, external tables, or tables created without constraints) do not define primary key columns. Without a primary key, Integrate.io cannot track rows for deduplication or incremental sync. The Surrogate Key feature solves this by automatically generating a deterministic_iio_id column, a SHA-256 hash computed from all user data columns in each row. This hash serves as a synthetic primary key, enabling these tables to be selected and replicated.
Supported Sources
| Source | Supported | Notes |
|---|---|---|
| BigQuery | Yes | Enabled by default |
How It Works
Hash Generation
For each row in a table without a primary key, Integrate.io:
The hash is deterministic: the same row data always produces the same
_iio_id, regardless of which BigQuery API is used to read it.
Example
Consider a BigQuery tableusers with no primary key:
Source Table (BigQuery)
| name | |
|---|---|
| Alice Johnson | alice@example.com |
| Bob Smith | bob@example.com |
| Carol Davis | null |
_iio_id for each row.
Destination Table (after replication)
| _iio_id | name | |
|---|---|---|
bd5d05a59bdf4667... | Alice Johnson | alice@example.com |
22cae61171dea987... | Bob Smith | bob@example.com |
486af6f1101a528b... | Carol Davis | null |
_iio_id column is marked as the primary key in the destination schema, enabling the destination sink to create the appropriate constraint.
How each hash is computed
Columns are sorted alphabetically:email, name.
For Alice Johnson, the hash input is:
SHA-256 → bd5d05a59bdf4667ea52b80fa88d2900c371850a825f538aaf388f6123065333
For Carol Davis, the null email is represented as a null byte (\\0):
SHA-256 → 486af6f1101a528b970ae6a33dae64ba65ae9f1eef7b3e1d609cd41ee258df1c
Limitations
- Append-only sync: Surrogate key tables only capture new rows. Updates and deletes are not tracked because there is no stable identifier to match rows across syncs.
- Schema changes: If columns are added or removed from the source table, the
_iio_idhash for existing rows will change (since different columns are included in the hash). This effectively treats them as new rows after the schema change.