# MySQL source for ELT & CDC

> How to set up and configure MySQL as a CDC source in Integrate.io ELT & CDC, including supported providers, requirements, and features.

|                           |                                                        |
| :------------------------ | :----------------------------------------------------- |
| **Description**           | MySQL is the world's most popular open source database |
| **Type**                  | CDC/Binlog Replication                                 |
| **Supported Replication** | Initial Sync, Continuous Sync                          |
| **Authentication Type**   | Password Authentication                                |

## Setting up MySQL CDC for ELT & CDC

ELT & CDC uses [binlog replication](https://dev.mysql.com/doc/refman/8.0/en/binlog-replication-configuration-overview.html) for MySQL database sources. The binary log records all changes to your database, and Integrate.io reads these log events to capture inserts, updates, and deletes in near real-time. This approach has minimal impact on your source database since it reads from the replication stream rather than querying tables directly.

We recommend syncing from a read replica to keep replication load off your primary database instance.

### Supported providers

* [Amazon RDS MySQL](/cdc/amazon-rds-mysql/)
* [Amazon Aurora MySQL](/cdc/aurora-mysql/)
* [Google Cloud SQL MySQL](/cdc/google-cloud-sql/)
* [Azure MySQL](/cdc/azure-mysql/)
* [Self-hosted (generic) MySQL](/cdc/self-hosted-mysql/)

Select your provider above for platform-specific setup instructions covering read-replica creation, user configuration, and binary log settings.

### Requirements

* MySQL version **5.7.x** or later.
* Binary logging enabled with `binlog_format` set to `ROW` and `binlog_row_image` set to `FULL`.
* Tables with a `PRIMARY KEY`.
* Superuser access for creating a sync user. The sync user itself only needs `SELECT`, `RELOAD`, `REPLICATION SLAVE`, `REPLICATION CLIENT`, and `LOCK TABLES` privileges.
* Binlog retention set to a minimum of 1 day (86400 seconds). We recommend 7 days (604800 seconds) to allow time for recovery from any sync interruptions.
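
The binary log and version requirements above can be checked before creating a pipeline. The following is a minimal sketch, not part of Integrate.io: it assumes you have already loaded the output of `SHOW GLOBAL VARIABLES` into a plain dictionary, and the function name `check_cdc_requirements` is hypothetical. It does not check primary keys, privileges, or binlog retention.

```python
import re

# Standard MySQL system variables and the values the requirements call for.
# The input dict is assumed to hold `SHOW GLOBAL VARIABLES` output as strings.
REQUIRED_SETTINGS = {
    "log_bin": "ON",
    "binlog_format": "ROW",
    "binlog_row_image": "FULL",
}
MIN_VERSION = (5, 7)


def check_cdc_requirements(variables: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []

    # The version string may carry a suffix such as "-log", so parse only major.minor.
    match = re.match(r"(\d+)\.(\d+)", variables.get("version", ""))
    if match is None:
        problems.append("could not parse MySQL version")
    elif (int(match.group(1)), int(match.group(2))) < MIN_VERSION:
        problems.append(f"MySQL {match.group(0)} is older than 5.7")

    for name, expected in REQUIRED_SETTINGS.items():
        actual = variables.get(name, "").upper()
        if actual != expected:
            problems.append(f"{name} is {actual or 'unset'}, expected {expected}")

    return problems
```

For example, passing `{"version": "8.0.36", "log_bin": "ON", "binlog_format": "MIXED", "binlog_row_image": "FULL"}` returns a single problem reporting that `binlog_format` is `MIXED` rather than `ROW`.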

### Features

| Feature                | Supported | Notes                                |
| :--------------------- | :-------- | :----------------------------------- |
| Full (Historical) sync | Yes       |                                      |
| Incremental sync       | Yes       |                                      |
| Replicate DELETE       | Yes       |                                      |
| UPSERT                 | Yes       |                                      |
| Append only mode       | Yes       | Can be specified at a table level    |
| Exclude tables         | Yes       |                                      |
| Exclude columns        | Yes       |                                      |
| SSL Support            | Yes       |                                      |
| SSH tunnel             | Yes       | [SSH Tunnel Guide](/cdc/ssh-tunnel/) |
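
To illustrate how UPSERT, DELETE replication, and append-only mode generally differ, here is a small in-memory sketch. It is an assumption-laden illustration, not a description of Integrate.io's internals: the event shape, the `id` primary key column, the `_op` marker, and the function `apply_events` are all hypothetical.

```python
def apply_events(events, append_only=False):
    """Apply change events to an in-memory target and return the resulting rows.

    Each event is a hypothetical dict of the form
    {"op": "insert" | "update" | "delete", "row": {...}},
    where "row" always contains the primary key column "id".
    """
    if append_only:
        # Append-only: every change becomes a new record; nothing is overwritten or removed.
        return [dict(event["row"], _op=event["op"]) for event in events]

    target = {}
    for event in events:
        key = event["row"]["id"]
        if event["op"] == "delete":
            # Replicated DELETE: remove the row with this primary key, if present.
            target.pop(key, None)
        else:
            # Inserts and updates both become an upsert keyed on the primary key.
            target[key] = event["row"]
    return list(target.values())
```

In this sketch the primary key is what lets an update or delete find the row it applies to, which is one reason CDC sources typically require one.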

### Binlog retention

Binlog files must be retained long enough for Integrate.io to read changes since the last sync checkpoint. If binlog files are purged before the pipeline can read them, the pipeline will need to perform a full resync.
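
The reasoning above can be sketched as a simple time comparison. This is a simplified model with a hypothetical function name: MySQL purges whole binlog files rather than individual events, so the real cutoff depends on when files rotate and on your provider.

```python
from datetime import datetime, timedelta, timezone


def checkpoint_is_recoverable(
    last_checkpoint: datetime, now: datetime, retention_seconds: int
) -> bool:
    """Return True if changes since the checkpoint should still be in the binlog."""
    # If more time has passed than the retention window, the needed binlog
    # files may already be purged and a full resync would be required.
    return now - last_checkpoint <= timedelta(seconds=retention_seconds)


# Example: a pipeline paused for three days.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
paused_at = now - timedelta(days=3)
one_day = checkpoint_is_recoverable(paused_at, now, 86400)      # 1-day retention
seven_days = checkpoint_is_recoverable(paused_at, now, 604800)  # 7-day retention
```

Under this model, a three-day interruption exceeds the 1-day minimum but falls within the recommended 7-day window.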

The recommended retention settings depend on your provider:

| Provider         | Setting                                                      | Recommended Value |
| :--------------- | :----------------------------------------------------------- | :---------------- |
| Amazon RDS       | `binlog retention hours` (via `mysql.rds_set_configuration`) | 168 (7 days)      |
| Aurora MySQL     | `binlog retention hours` (via `mysql.rds_set_configuration`) | 168 (7 days)      |
| Google Cloud SQL | Automated backups with point-in-time recovery enabled        | 7 days            |
| Azure MySQL      | `binlog_expire_logs_seconds` server parameter                | 604800 (7 days)   |
| Self-hosted      | `binlog_expire_logs_seconds` in MySQL config                 | 604800 (7 days)   |
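
The values in the table are the same retention period expressed in different units. As a quick worked example (the helper name `retention_values` is hypothetical), this converts a number of days into the hours used by `binlog retention hours` and the seconds used by `binlog_expire_logs_seconds`:

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400


def retention_values(days: int) -> dict[str, int]:
    """Express a retention period in the units each setting expects."""
    seconds = days * SECONDS_PER_DAY
    return {
        "binlog retention hours": seconds // SECONDS_PER_HOUR,  # RDS and Aurora
        "binlog_expire_logs_seconds": seconds,  # Azure and self-hosted
    }
```

Calling `retention_values(7)` gives 168 hours and 604800 seconds, matching the recommended values above.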

### Frequently Asked Questions (FAQs)

<AccordionGroup>
  <Accordion title="Do I need to use a read-replica?">
    A read replica is optional but recommended, because syncing from a replica keeps replication load off your primary database. Syncing directly from the primary instance also works; the sync user performs only lightweight read operations.
  </Accordion>

  <Accordion title="What happens if binary logging is not enabled?">
    Binary logging is required for CDC. Without it, Integrate.io cannot capture changes. You will need to enable it and restart your MySQL instance before setting up the pipeline. Each provider guide includes the specific steps.
  </Accordion>

  <Accordion title="What MySQL engines are supported?">
    InnoDB is the recommended and most commonly used engine. MyISAM tables can also be replicated, but MyISAM does not support transactions, and InnoDB offers better crash recovery.
  </Accordion>

  <Accordion title="Can I sync specific tables only?">
    Yes. When configuring your pipeline, you can select which tables to include. You can also exclude specific columns from synced tables.
  </Accordion>
</AccordionGroup>

## Related

<CardGroup cols={2}>
  <Card title="Amazon RDS MySQL" icon="arrow-right" href="/cdc/amazon-rds-mysql" horizontal />

  <Card title="Aurora MySQL" icon="arrow-right" href="/cdc/aurora-mysql" horizontal />

  <Card title="Azure MySQL" icon="arrow-right" href="/cdc/azure-mysql" horizontal />

  <Card title="Google Cloud SQL MySQL" icon="arrow-right" href="/cdc/google-cloud-sql" horizontal />

  <Card title="Self-hosted MySQL" icon="arrow-right" href="/cdc/self-hosted-mysql" horizontal />

  <Card title="MySQL PrivateLink" icon="arrow-right" href="/cdc/privatelink-set-up" horizontal />
</CardGroup>
