Skip to main content

Documentation Index

Fetch the complete documentation index at: https://www.integrate.io/docs/llms.txt

Use this file to discover all available pages before exploring further.

Overview

Many data warehouse tables (e.g., BigQuery views, external tables, or tables created without constraints) do not define primary key columns. Without a primary key, Integrate.io cannot track rows for deduplication or incremental sync. The Surrogate Key feature solves this by automatically generating a deterministic _iio_id column, a SHA-256 hash computed from all user data columns in each row. This hash serves as a synthetic primary key, enabling these tables to be selected and replicated.

Supported Sources

SourceSupportedNotes
BigQueryYesEnabled by default

How It Works

Hash Generation

For each row in a table without a primary key, Integrate.io:
1
Collects all user data columns (excludes only the _iio_id column itself)
2
Sorts column names alphabetically for deterministic ordering
3
Normalizes each value to a consistent string representation
4
Computes a SHA-256 hash of the joined string
5
Outputs a 64-character lowercase hex string as _iio_id
The hash is deterministic: the same row data always produces the same _iio_id, regardless of which BigQuery API is used to read it.

Example

Consider a BigQuery table users with no primary key:

Source Table (BigQuery)

nameemail
Alice Johnsonalice@example.com
Bob Smithbob@example.com
Carol Davisnull
This table has no primary key columns, so Integrate.io automatically generates _iio_id for each row.

Destination Table (after replication)

_iio_idnameemail
bd5d05a59bdf4667...Alice Johnsonalice@example.com
22cae61171dea987...Bob Smithbob@example.com
486af6f1101a528b...Carol Davisnull
The _iio_id column is marked as the primary key in the destination schema, enabling the destination sink to create the appropriate constraint.

How each hash is computed

Columns are sorted alphabetically: email, name. For Alice Johnson, the hash input is:
alice@example.com \\x1F Alice Johnson
SHA-256bd5d05a59bdf4667ea52b80fa88d2900c371850a825f538aaf388f6123065333 For Carol Davis, the null email is represented as a null byte (\\0):
\\0 \\x1F Carol Davis
SHA-256486af6f1101a528b970ae6a33dae64ba65ae9f1eef7b3e1d609cd41ee258df1c

Limitations

  • Append-only sync: Surrogate key tables only capture new rows. Updates and deletes are not tracked because there is no stable identifier to match rows across syncs.
  • Schema changes: If columns are added or removed from the source table, the _iio_id hash for existing rows will change (since different columns are included in the hash). This effectively treats them as new rows after the schema change.
Last modified on May 12, 2026