Change data capture (CDC) is an essential feature for making modern data integration workflows faster and more efficient. By identifying which records have changed since the last extraction, CDC lets your data pipelines ingest only the information they need. Rather than continuously polling an API, you can replicate and integrate data as changes occur.

Related Reading: When to Use Change Data Capture

CDC is especially valuable for large repositories such as the Salesforce CRM, which stores your customer data and helps you manage relationships with your customers and prospects. Improving your Salesforce change data capture can save your company valuable time and money during data integration. In this article, we’ll investigate four ways to make better use of Salesforce CDC.

1. Understand How It Works

The first step toward improving Salesforce CDC is to understand how the process works and how to configure it. Below is a simplified description:

  1. When a Salesforce record changes, whether it belongs to a standard object or a custom object, Salesforce CDC publishes an event notification (a change event, which is built on platform events) in near real time. For example, an AccountChangeEvent indicates that an Account record has changed.
  2. This notification is sent to an event bus, where an external CDC tool can observe it (usually by subscribing to a specified event channel).
  3. The CDC tool uses these notifications to determine which Salesforce records need to be ingested again and processes these records. The CDC payload is then integrated into your centralized data warehouse.
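
The three steps above can be sketched with a toy, in-memory event bus. This is only an illustration of the publish/subscribe flow, not the Salesforce API: the `EventBus` class, the `on_account_change` handler, and the sample record IDs are all hypothetical.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Toy in-memory stand-in for an event bus (hypothetical, not a Salesforce API)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, event: dict) -> None:
        # Deliver the event to every handler subscribed to this channel.
        for handler in self._subscribers[channel]:
            handler(event)


# Step 3: the CDC tool tracks which record IDs must be ingested again.
records_to_ingest = set()


def on_account_change(event: dict) -> None:
    records_to_ingest.update(event["recordIds"])


bus = EventBus()

# Step 2: the tool subscribes to an event channel (the name is illustrative).
bus.subscribe("AccountChangeEvent", on_account_change)

# Step 1: records change and notifications are published.
bus.publish("AccountChangeEvent", {"recordIds": ["001A"], "changeType": "UPDATE"})
bus.publish("AccountChangeEvent", {"recordIds": ["001A", "001B"], "changeType": "UPDATE"})

print(sorted(records_to_ingest))
```

Collecting IDs in a set means a record that changes several times between pipeline runs is still ingested only once.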

For more information, check out Trailhead's module on "Change Data Capture Basics."

The CDC method that Salesforce uses is known as “trigger-based CDC”: change events are triggered when a particular change occurs. (To learn more about the different CDC methods, such as using timestamps or metadata, check out our article "The Importance of CDC for ETL.")

Salesforce publishes change events automatically for the objects you select, so no custom code is needed to emit them. Inside Salesforce, Apex triggers can subscribe to change events and run custom actions after a Salesforce record is modified.

The ChangeEventHeader contains information about change data capture events in Salesforce. Below are some of the fields in this class:

  • recordIds: The ID number(s) of the changed record(s).
  • changedFields: The fields that have been changed after updating a record, such as the LastModifiedDate system field.
  • changeType: The type of change (e.g. CREATE, UPDATE, DELETE, or UNDELETE).
  • changeOrigin: The ID of the client that requested the change.
  • transactionKey: A unique string for each Salesforce transaction.
  • sequenceNumber: The sequence of the change within the transaction, starting at 1.
  • commitTimestamp: The date and time when the change occurred.
  • commitUser: The ID of the user that requested the change.
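
To make these fields concrete, here is a minimal Python sketch that models the header as a dataclass and parses one sample payload. The key spellings, the sample values, and the assumption that the commit timestamp arrives as milliseconds since the Unix epoch are illustrative; check them against the events your org actually emits.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeEventHeader:
    """Illustrative model of the header fields listed above."""

    record_ids: list
    changed_fields: list
    change_type: str
    change_origin: str
    transaction_key: str
    sequence_number: int
    commit_timestamp: int  # assumed: milliseconds since the Unix epoch
    commit_user: str

    @classmethod
    def from_payload(cls, header: dict) -> "ChangeEventHeader":
        # Map the camelCase keys of the payload onto snake_case attributes.
        return cls(
            record_ids=list(header["recordIds"]),
            changed_fields=list(header.get("changedFields", [])),
            change_type=header["changeType"],
            change_origin=header.get("changeOrigin", ""),
            transaction_key=header["transactionKey"],
            sequence_number=int(header["sequenceNumber"]),
            commit_timestamp=int(header["commitTimestamp"]),
            commit_user=header["commitUser"],
        )

    def committed_at(self) -> datetime:
        return datetime.fromtimestamp(self.commit_timestamp / 1000, tz=timezone.utc)


# Made-up sample values, for illustration only.
sample = {
    "recordIds": ["001xx0000000001"],
    "changedFields": ["Phone", "LastModifiedDate"],
    "changeType": "UPDATE",
    "changeOrigin": "",
    "transactionKey": "txn-0001",
    "sequenceNumber": 1,
    "commitTimestamp": 1700000000000,
    "commitUser": "005xx0000000001",
}

header = ChangeEventHeader.from_payload(sample)
print(header.change_type, header.record_ids, header.committed_at().isoformat())
```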

As you can see, the Salesforce CDC process involves quite a lot of technical detail. For this reason, many businesses choose a dedicated Salesforce data integration tool that can automatically handle these concerns behind the scenes.

2. Know Your Use Case

Change data capture is mainly used for ETL (extract, transform, load) workloads, but it has other applications as well. Make sure that you’re using Salesforce CDC in the manner that best fits your needs and business use cases.

For example, CDC techniques can be used to perform real-time data synchronization between two distributed Salesforce databases or data stores. When information in one location changes, CDC captures the differences and sends the logs to the other database, which can be immediately populated with the changes.
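
As a rough illustration of this kind of synchronization, the sketch below applies a stream of change events to a replica held in a plain dictionary. The event shape (a `header` plus a `fields` mapping) and the `apply_change` helper are assumptions made for the example, and treating UNDELETE as an upsert is a simplification.

```python
def apply_change(replica: dict, event: dict) -> None:
    """Apply one change event to the replica, keyed by record ID."""
    header = event["header"]
    change_type = header["changeType"]
    for record_id in header["recordIds"]:
        if change_type == "DELETE":
            # Ignore deletes for records the replica never saw.
            replica.pop(record_id, None)
        elif change_type in ("CREATE", "UPDATE", "UNDELETE"):
            # Merge only the fields carried by the event into the stored record.
            replica.setdefault(record_id, {}).update(event.get("fields", {}))
        else:
            raise ValueError(f"unsupported change type: {change_type}")


replica = {"001A": {"Name": "Acme"}}

events = [
    {"header": {"recordIds": ["001A"], "changeType": "UPDATE"}, "fields": {"Phone": "555-0100"}},
    {"header": {"recordIds": ["001B"], "changeType": "CREATE"}, "fields": {"Name": "Globex"}},
    {"header": {"recordIds": ["001A"], "changeType": "DELETE"}},
]

for event in events:
    apply_change(replica, event)

print(replica)
```

Because each event carries only the differences, the target never has to re-read the full record from the source.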

Another Salesforce CDC use case is data auditing of your CRM software. The logs produced by CDC can be used by auditors to more easily reconstruct your database activity, ensuring that your organization is complying with all applicable laws and regulations.
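
One way an audit view might be assembled from stored change event headers is sketched below: events are grouped by transaction, transactions are ordered by commit time, and each transaction's changes are ordered by sequence number. The `audit_trail` function, the key spellings, and the sample data are hypothetical.

```python
from collections import defaultdict


def audit_trail(headers: list) -> list:
    """Return (transaction key, ordered change descriptions) pairs, oldest first."""
    by_transaction = defaultdict(list)
    for header in headers:
        by_transaction[header["transactionKey"]].append(header)

    # Order transactions by their earliest commit timestamp.
    ordered = sorted(
        by_transaction.items(),
        key=lambda item: min(h["commitTimestamp"] for h in item[1]),
    )

    trail = []
    for key, group in ordered:
        # Within a transaction, the sequence number gives the order of changes.
        steps = sorted(group, key=lambda h: h["sequenceNumber"])
        trail.append(
            (key, [f'{h["sequenceNumber"]} {h["changeType"]} by {h["commitUser"]}' for h in steps])
        )
    return trail


# Made-up headers, deliberately out of order.
stored_headers = [
    {"transactionKey": "t2", "sequenceNumber": 1, "commitTimestamp": 2000, "changeType": "CREATE", "commitUser": "u2"},
    {"transactionKey": "t1", "sequenceNumber": 2, "commitTimestamp": 1000, "changeType": "UPDATE", "commitUser": "u1"},
    {"transactionKey": "t1", "sequenceNumber": 1, "commitTimestamp": 1000, "changeType": "CREATE", "commitUser": "u1"},
]

for transaction_key, steps in audit_trail(stored_headers):
    print(transaction_key, steps)
```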

Whatever your Salesforce CDC use case, Integrate.io can help. The Integrate.io platform is a robust solution for data integration that makes it simple to build production-ready data pipelines. Integrate.io’s FlyData CDC tool has been built from the ground up to help users sync their data across their IT environment.

3. Use Salesforce CDC Connectors

Third-party companies such as Confluent have developed Salesforce CDC connectors, making it easier for users to consume this information. For example, Confluent’s Kafka Connect Salesforce Change Data Capture Source connector helps monitor and collect Salesforce CDC information and write these events to an Apache Kafka topic. (This connector is also available for Confluent’s fully managed cloud Kafka service.)

Although Salesforce CDC connectors can be highly useful, they’re not the right fit for every organization. Connectors usually work only for a single source and a single target, which makes them brittle and inflexible if you want to modify your data workflows. This means that you'll have to spend more time down the line reconfiguring them, sacrificing valuable hours that you could use to find data-driven insights.

Instead, what you'll likely want to look for is a Salesforce connector that's part of a larger ETL tool (see the next section). For example, Integrate.io is launching its own Salesforce CDC connector as one component of its data integration platform, which offers over 140 pre-built integrations.

4. Use an ETL Tool

Perhaps the easiest way to improve your Salesforce CDC workflow is to enlist the help of a dedicated ETL tool to send this data to an external system. Modern ETL solutions are equipped with automated CDC functionality to work with your data schema and detect information that has changed since the last pipeline execution.
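
To show what "changed since the last pipeline execution" can mean in practice, here is a minimal sketch of timestamp-based change detection using a stored watermark. It illustrates the general idea only and is not how any particular ETL product is implemented; the `run_pipeline` function, the `last_modified` key, and the sample rows are hypothetical.

```python
from datetime import datetime, timezone


def run_pipeline(rows: list, watermark: datetime) -> tuple:
    """Return the rows modified after the watermark and the advanced watermark."""
    changed = [row for row in rows if row["last_modified"] > watermark]
    # If nothing changed, keep the old watermark so no changes are skipped later.
    new_watermark = max((row["last_modified"] for row in changed), default=watermark)
    return changed, new_watermark


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up source rows.
source_rows = [
    {"id": "A", "last_modified": utc(2023, 12, 31)},
    {"id": "B", "last_modified": utc(2024, 1, 2)},
    {"id": "C", "last_modified": utc(2024, 1, 3)},
]

last_run = utc(2024, 1, 1)
changed_rows, last_run = run_pipeline(source_rows, last_run)
print([row["id"] for row in changed_rows], last_run.isoformat())
```

Running the pipeline again with the advanced watermark returns no rows until the source changes, which is what keeps each execution small.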

Integrate.io, for example, offers both log-based and trigger-based CDC, depending on your use case. Integrate.io’s FlyData CDC tool handles all your concerns behind the scenes so that your data integration workflows always ingest precisely the data you need—and none of the data you don’t.

How Integrate.io Can Help With Salesforce Change Data Capture

Salesforce CDC is much easier when you have a dedicated ETL platform to streamline and automate the process. Looking for a tool that can help with your Salesforce change data capture needs? Look no further than Integrate.io. The Integrate.io platform is a powerful, feature-rich, user-friendly data integration solution that makes it easy to build automated data pipelines to your cloud data warehouse.

Integrate.io comes with more than 140 pre-built connectors and integrations—including Salesforce. What’s more, it’s simple for anyone, regardless of technical skill level, to use these connectors inside Integrate.io’s no-code, drag-and-drop visual interface.

The Integrate.io ETL platform makes Salesforce CDC a snap. Want to learn more? Get in touch with our team of data integration experts today for a chat about your business needs and objectives, or to start your 7-day demo of the Integrate.io platform.