To migrate data pipelines from Matillion to a modern ETL platform, first audit all existing Matillion packages to inventory their sources, destinations, and transformation logic. Then recreate the pipelines in the new platform in priority order, run both platforms in parallel for one full cycle to validate output parity, and cut over by disabling Matillion schedules once parity is confirmed.

This guide is for data engineers and engineering leads running Matillion ETL or Matillion Data Productivity Cloud (MDPC) who are evaluating a migration because of pricing complexity, limited API connector support, or the need for visual workflow orchestration. After reading, you will have a step-by-step process to move all Matillion packages to a new platform without breaking downstream dashboards or missing pipeline dependencies.

Migrating data pipelines from Matillion requires a phased approach: inventory first, recreate in priority order, validate with a parallel run, then cut over one pipeline at a time. Teams that follow this sequence are less likely to miss dependencies or break downstream dashboards than teams that attempt a full cutover at once.

The Problem With Matillion Migrations

Matillion packages can number in the dozens or hundreds across a mature data environment. Migrating all of them at once is not realistic, and the tooling does not make it easier.

Matillion's JSON export format is proprietary. It does not map directly to other platforms, and a one-click migration tool does not exist for most Matillion-to-X migrations. The JSON files work as reference documentation, not as importable assets in a new system.

Teams that attempt big-bang migrations (shut down Matillion, rebuild everything before going live) routinely miss dependencies, break downstream dashboards, and spend more calendar time than planned. A package that looks self-contained often turns out to feed three other packages and two dashboards that no one documented.

What works: a phased migration starting with the least complex pipelines, running both platforms in parallel during the transition, and moving schedules to the new platform only after validation passes.

What You'll Need

  • Matillion account access: read access to all packages you want to migrate, including the ability to export JSON files from the Matillion UI
  • Exported JSON files: one per Matillion package, exported via File > Export in the Matillion interface
  • Inventory spreadsheet: a tracking document with one row per package to record migration status throughout the process
  • New ETL platform: a provisioned account on the platform you are migrating to (Integrate.io is a common destination for Matillion migrations, covered in detail in Steps 3 and 4)
  • Source and destination credentials: database passwords, API keys, OAuth tokens, and SSH keys for every system connected to your Matillion packages

To learn how Integrate.io can help automate your pipelines, contact our team to discuss your use case with a sales engineer.

How to Migrate Data Pipelines From Matillion to a Modern ETL Platform: Step-by-Step

Step 1: Export and Inventory All Matillion Packages

Before touching the new platform, get a complete picture of what you are moving. Skipping this step is the most common reason migrations stall.

What to do:

  • Log into Matillion and export each package as a JSON file using File > Export in the Matillion UI; store all exports in a shared folder your team can access
  • Create an inventory spreadsheet with one row per package and these columns: package name, source system(s), destination system(s), transformation types used (SQL transformation, Python script, or connector-specific step), schedule (cron expression or manual), average run duration, last successful run date, and downstream dependencies (dashboards, reports, or dbt models that consume this pipeline's output)
  • Classify each package by migration complexity: Simple for straight extract-load with no or minimal transformation; Medium for field mapping, type casting, and joins; Complex for custom SQL transformations, Python steps, multi-branch orchestration, or proprietary Matillion-specific components
  • Sort the spreadsheet by complexity, simple packages first, to set your migration order

Output of this step: A complete package inventory with complexity ratings and a prioritized migration order, ready to drive the rest of the project.
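
The classification and sorting rules above can also be applied in code once the spreadsheet has been filled in. The following is a minimal sketch, not a Matillion-aware parser: the `Package` fields, the transformation-type labels, and the sample package names are all hypothetical and stand in for whatever your inventory actually records.

```python
from dataclasses import dataclass, field

# Rank used to sort the migration order: simple packages first.
COMPLEXITY_RANK = {"Simple": 0, "Medium": 1, "Complex": 2}

# Hypothetical labels for the "transformation types used" column.
COMPLEX_TYPES = {"custom_sql", "python", "multi_branch", "matillion_specific"}
MEDIUM_TYPES = {"field_mapping", "type_cast", "join"}


@dataclass
class Package:
    name: str
    sources: list[str]
    destinations: list[str]
    transformation_types: set[str] = field(default_factory=set)
    downstream: list[str] = field(default_factory=list)


def classify(pkg: Package) -> str:
    """Apply the Simple / Medium / Complex rules from this step."""
    if pkg.transformation_types & COMPLEX_TYPES:
        return "Complex"
    if pkg.transformation_types & MEDIUM_TYPES:
        return "Medium"
    return "Simple"


def migration_order(packages: list[Package]) -> list[Package]:
    """Sort simple packages first; sorted() is stable, so ties keep input order."""
    return sorted(packages, key=lambda p: COMPLEXITY_RANK[classify(p)])


# Made-up sample rows for illustration only.
inventory = [
    Package("orders_enriched", ["postgres"], ["snowflake"], {"python", "join"}),
    Package("crm_accounts", ["salesforce"], ["snowflake"]),
    Package("web_events", ["s3"], ["snowflake"], {"type_cast"}),
]

for pkg in migration_order(inventory):
    print(f"{classify(pkg):8} {pkg.name}")
```

A package that uses any complex transformation type is rated Complex even if it also uses medium ones, which matches the intent of migrating the riskiest packages last.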

Step 2: Identify and Document All Connector Dependencies

Knowing your source and destination systems is not enough. You need to know which specific connectors Matillion uses for each package, because connector availability in the new platform determines whether a package can be recreated directly or requires a workaround.

What to do:

  • For each package in the inventory, document the connector it uses on the source side; note which connectors are Matillion-native (Salesforce, NetSuite, HubSpot) versus generic database or file connectors (Snowflake, Postgres, S3, SFTP); mark any package where you are not certain the new platform has a matching connector as a flag item
  • Identify which Matillion packages use the proprietary ELT component (transformations that run as SQL inside Snowflake or Redshift) versus packages that extract and transform before loading; ELT packages may be better candidates for migrating to dbt rather than a new ETL platform, since the transformation logic belongs in the warehouse
  • Add a "connector dependency" column to the inventory spreadsheet with the source connector name and a flag for Matillion-native connectors that need direct equivalents

Output of this step: A connector dependency map for every package, with flags identifying where the new platform's connector library needs to cover Matillion-native connectors.
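
As an illustration of the flagging logic, the sketch below builds a connector dependency map from a package-to-connector mapping. The `SUPPORTED_BY_NEW_PLATFORM` set is an assumption that you would fill in from the new platform's own connector catalog, and the package names are invented.

```python
# Hypothetical: connector names the new platform is confirmed to support.
# Fill this in from the platform's own connector catalog.
SUPPORTED_BY_NEW_PLATFORM = {"salesforce", "hubspot", "postgres", "snowflake", "s3", "sftp"}

# Connectors this guide treats as Matillion-native rather than generic.
MATILLION_NATIVE = {"salesforce", "netsuite", "hubspot"}


def connector_dependency(source_connector: str) -> dict[str, object]:
    """Build the 'connector dependency' cell for one inventory row."""
    connector = source_connector.strip().lower()
    return {
        "connector": connector,
        "matillion_native": connector in MATILLION_NATIVE,
        # Flag anything without a confirmed match so it is reviewed
        # before pipeline recreation starts.
        "flag": connector not in SUPPORTED_BY_NEW_PLATFORM,
    }


# Made-up package-to-connector mapping for illustration only.
packages = {"crm_accounts": "Salesforce", "erp_invoices": "NetSuite", "web_events": "S3"}

dependency_map = {name: connector_dependency(c) for name, c in packages.items()}
flagged = sorted(name for name, dep in dependency_map.items() if dep["flag"])
print(flagged)
```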

Step 3: Set Up the New ETL Platform and Recreate Connector Credentials

With the inventory complete, set up the new platform before starting any pipeline recreation. Connector configuration is the most time-consuming part; do not underestimate it.

What to do:

  • Provision the new ETL platform and configure workspace settings: environments, user permissions, and secrets management
  • Create a connector for every source and destination system in the inventory; use the same credentials your Matillion environment uses where possible, and create new service accounts only where Matillion was using personal credentials that should not be shared
  • Run a live connection test for each connector before starting pipeline recreation; a failed test at this stage is far cheaper to fix than discovering a broken connection mid-migration
  • Document each connector in the inventory spreadsheet with a pass or fail status

Where Integrate.io helps: Integrate.io covers the connector types Matillion teams most commonly need to replace: OAuth-based connections for Salesforce and HubSpot, SSH tunnel support for on-premise databases, and a secrets manager for API keys and database passwords. A dedicated onboarding engineer joins setup sessions to configure connectors and answer questions about Matillion-specific component equivalents.

Output of this step: A fully configured new ETL platform with live connection tests passing for every source and destination system in the Matillion inventory.
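
If your platform's connection tests can be triggered from a script, the pass or fail status can be collected in one place. This is a rough sketch under that assumption: each check is a plain callable that raises on failure, and the two stub checks below (an in-memory SQLite query and a deliberately failing function) are placeholders for real connector tests, not any platform's API.

```python
import sqlite3
from typing import Callable


def run_connection_tests(checks: dict[str, Callable[[], None]]) -> dict[str, str]:
    """Run every check and record 'pass' or 'fail: <reason>' per connector."""
    results: dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "pass"
        except Exception as exc:  # any failure is recorded rather than raised
            results[name] = f"fail: {exc}"
    return results


def check_warehouse_stub() -> None:
    # Stand-in for a real warehouse connection test.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("SELECT 1")
    finally:
        conn.close()


def check_crm_stub() -> None:
    # Stand-in for a connector whose credentials are rejected.
    raise ConnectionError("authentication failed")


results = run_connection_tests({
    "warehouse_stub": check_warehouse_stub,
    "crm_stub": check_crm_stub,
})
for name, status in results.items():
    print(f"{name}: {status}")
```

Recording the failure reason next to the status makes it easier to fix credentials before pipeline recreation begins.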

Step 4: Recreate Matillion Packages in Priority Order, Starting With Simple Packages

Work through the migration in three phases that match your complexity ratings. Do not skip ahead to complex packages before the simple ones are live and validated.

What to do:

  • Phase 1 (Simple packages): Recreate all straight extract-load packages first; these typically take 30 to 60 minutes each and build team familiarity with the new canvas; use the exported Matillion JSON as a reference for source fields and destination mappings, but translate the logic manually rather than attempting a JSON import
  • Phase 2 (Medium packages): Recreate packages with field mapping, type casting, and joins; the Matillion JSON shows the mapping logic clearly enough to serve as a reference document; apply the same logic using the new platform's transformation canvas
  • Phase 3 (Complex packages): Tackle packages with custom SQL, Python steps, or multi-step orchestration last; for packages using Matillion's ELT SQL transformation, rewrite the logic as SQL in the new platform's SQL step, or transition it to dbt if the logic is warehouse-bound; Python components can be replaced using the new platform's inline expression or scripting support
  • For each recreated pipeline, update the inventory spreadsheet to note which Matillion package it replaces and the date it was recreated

Where Integrate.io helps: Integrate.io's transformation canvas covers field mapping, type casting, filtering, and SQL expressions that handle the majority of Matillion medium-complexity package logic. Packages using Matillion's Python components can be replaced using Integrate.io's inline Python expression support in the SELECT component. The dedicated onboarding engineer continues to assist with pipeline recreation sessions for complex packages through this phase.

Output of this step: Recreated pipelines covering all simple and medium-complexity Matillion packages, with complex packages in progress, each mapped back to its original Matillion package in the inventory.

Step 5: Run Both Platforms in Parallel for One Full Pipeline Cycle

Before disabling any Matillion schedules, run each recreated pipeline alongside its Matillion counterpart for one complete scheduling cycle. This parallel run is the validation gate that protects downstream dashboards and data consumers.

What to do:

  • For daily pipelines, run both the Matillion package and the new platform's pipeline on the same day and compare output; for weekly pipelines, run both for one week before evaluating
  • After each parallel run, compare output in the destination: check row count from the Matillion run versus the new platform's run; spot-check 20 to 50 records by primary key to verify field-level accuracy
  • Verify that derived or calculated fields (date differences, conditional flags, aggregations) produce the same result in both platforms
  • Record the comparison result per pipeline in the inventory spreadsheet as Pass or Fail; a Fail means the pipeline needs investigation before proceeding

Output of this step: A validation record for each pipeline showing row count match and spot-checked field accuracy, with a clear Pass or Fail status for each.
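
The row count check and primary-key spot check can be sketched as follows. This assumes each platform's output has already been read into a list of dictionaries (for example, from separate staging tables), that the primary key column is named `id`, and that keys are sortable; the function name and sample rows are hypothetical.

```python
import random


def compare_outputs(old_rows, new_rows, key="id", sample_size=20, seed=0):
    """Compare row counts, then spot-check a random sample of shared keys."""
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}

    row_count_match = len(old_rows) == len(new_rows)
    # Keys present in only one of the two outputs.
    missing_keys = sorted(old_by_key.keys() ^ new_by_key.keys())

    shared = sorted(old_by_key.keys() & new_by_key.keys())
    rng = random.Random(seed)  # fixed seed so a failed check can be repeated
    sample = rng.sample(shared, min(sample_size, len(shared)))
    # Field-level comparison, including derived or calculated fields.
    mismatched_keys = sorted(k for k in sample if old_by_key[k] != new_by_key[k])

    passed = row_count_match and not missing_keys and not mismatched_keys
    return {
        "row_count_match": row_count_match,
        "missing_keys": missing_keys,
        "mismatched_keys": mismatched_keys,
        "status": "Pass" if passed else "Fail",
    }


# Made-up outputs: the new pipeline computes is_late differently for id 2.
matillion_rows = [
    {"id": 1, "amount": 10.0, "is_late": False},
    {"id": 2, "amount": 25.5, "is_late": True},
    {"id": 3, "amount": 7.25, "is_late": False},
]
new_platform_rows = [
    {"id": 1, "amount": 10.0, "is_late": False},
    {"id": 2, "amount": 25.5, "is_late": False},
    {"id": 3, "amount": 7.25, "is_late": False},
]

result = compare_outputs(matillion_rows, new_platform_rows)
print(result["status"], result["mismatched_keys"])
```

Here the row counts match, but the spot check still fails on a calculated flag, which is the kind of discrepancy a count-only comparison would miss.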

Step 6: Cut Over by Disabling Matillion Schedules One Pipeline at a Time

Once a pipeline passes parallel validation, disable its Matillion schedule. Do this one pipeline at a time, not in batches.

What to do:

  • For each pipeline that passed parallel validation, disable its Matillion schedule; do not delete the package at this stage, just deactivate the schedule so the new platform becomes the sole running version
  • Enable the corresponding schedule in the new ETL platform immediately after disabling the Matillion schedule; do not leave a gap where neither platform runs the pipeline
  • Monitor the first two independent runs of each newly live pipeline to confirm the output stays consistent with what the parallel run validated (expected row counts and correct spot-checked fields)
  • Update the inventory spreadsheet to mark each pipeline as "live on new platform" with the cutover date
  • Notify downstream teams (dashboard owners, dbt maintainers, BI analysts) that the pipeline has migrated so they can flag any data discrepancies they observe

Output of this step: Pipelines fully cut over to the new platform, Matillion schedules disabled, and downstream stakeholders informed of each cutover.
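
The cutover rules in this step (only after a Pass, old schedule off before new schedule on, record the date) can be expressed as a small guard. This is a sketch that tracks state in memory only: the `PipelineState` fields are hypothetical, and in practice the two flag changes would be the actions you take in each platform's scheduling settings.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PipelineState:
    name: str
    validation: str  # "Pass" or "Fail" from the parallel run
    matillion_schedule_on: bool = True
    new_schedule_on: bool = False
    cutover_date: Optional[date] = None


def cut_over(pipeline: PipelineState, today: date) -> PipelineState:
    """Cut over one pipeline, refusing if it has not passed validation."""
    if pipeline.validation != "Pass":
        raise ValueError(f"{pipeline.name}: parallel validation has not passed")
    # Disable the old schedule first so both are never active together,
    # then enable the new one right away so there is no gap.
    pipeline.matillion_schedule_on = False
    pipeline.new_schedule_on = True
    pipeline.cutover_date = today
    return pipeline


# Made-up pipeline for illustration only.
orders = cut_over(PipelineState("orders_daily", "Pass"), date(2024, 3, 1))
print(orders)
```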

Step 7: Decommission Matillion Packages and Archive the JSON Exports

Once all pipelines are live on the new platform and have completed at least two weeks of stable runs, remove the Matillion packages and close out the project.

What to do:

  • Delete each Matillion package from the Matillion environment after confirming it has been stable on the new platform for at least two weeks; retain the exported JSON files in version control or cloud storage for at least 12 months as an archive in case the original logic needs to be referenced later
  • Update your team's runbook to remove Matillion references and document the new platform's pipeline names, schedules, and owners
  • Cancel the Matillion subscription once all packages are decommissioned; confirm with finance that all three billing components (cloud compute, Matillion license, and warehouse compute) are addressed in the cancellation

Output of this step: A fully decommissioned Matillion environment with all packages archived, the new platform running all pipelines, and the team's runbook updated with current pipeline names, schedules, and owners.
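
The two-week stability rule is simple date arithmetic. The sketch below assumes one reasonable reading of "stable": a failed run after cutover restarts the two-week window. The function name and dates are illustrative.

```python
from datetime import date, timedelta

STABLE_PERIOD = timedelta(weeks=2)


def ready_to_decommission(cutover_date: date, failed_run_dates: list[date], today: date) -> bool:
    """True once two weeks have passed since cutover or the latest failed run."""
    # Assumption: any failure after cutover restarts the stability window.
    stable_since = max([cutover_date, *failed_run_dates])
    return today - stable_since >= STABLE_PERIOD


cutover = date(2024, 3, 1)
print(ready_to_decommission(cutover, [], date(2024, 3, 14)))
print(ready_to_decommission(cutover, [], date(2024, 3, 15)))
print(ready_to_decommission(cutover, [date(2024, 3, 10)], date(2024, 3, 15)))
```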

Common Mistakes to Avoid

  • Attempting a big-bang migration: shutting down Matillion before the new platform is ready breaks downstream dashboards and data consumers immediately; always run both platforms in parallel until validation passes for every pipeline
  • Trying to import Matillion JSON packages directly: Matillion's export format is proprietary; even when import tools exist, they produce pipelines that require extensive manual review; treat the JSON as reference documentation and translate the logic manually
  • Skipping the downstream dependency audit: a pipeline that feeds a dashboard, a dbt model, and three other pipelines has four potential breakage points; identify all downstream consumers before cutting over each pipeline, not after
  • Migrating complex packages before simple ones: complex packages take longer, have more failure modes, and are harder to validate; starting with simple packages builds team confidence in the new platform and surfaces connector issues early when they are cheaper to fix
  • Leaving both schedules active after cutover: if the Matillion schedule and the new platform's schedule both keep running the same pipeline against the same destination, the destination table can receive duplicate data; disable the Matillion schedule immediately before you enable the new platform's schedule so the two never overlap
  • Discarding the JSON archives too early: original pipeline logic is frequently needed months after migration when investigating a data discrepancy or designing a new pipeline that builds on the same source; keep the exports in version control for at least 12 months after decommissioning

Conclusion

Migrating data pipelines from Matillion to a modern ETL platform is a phased process: inventory all packages and classify by complexity, recreate in priority order starting with simple packages, validate output in a parallel run, cut over one pipeline at a time, and decommission after two weeks of stable runs. The seven steps above give data engineering teams a repeatable sequence that protects downstream consumers regardless of package count.

Integrate.io is a common destination for Matillion migrations because it covers the connector types and transformation logic that Matillion packages rely on, with fixed-fee pricing that replaces Matillion's three-part cost structure (cloud compute, license, and warehouse compute billed separately). Dedicated onboarding engineers join setup and pipeline recreation sessions to reduce the time spent in the most labor-intensive phase.

Teams that complete a Matillion migration can expect to spend less time on pipeline maintenance. The new platform's connector library and visual canvas replace the Matillion-specific proprietary steps that required specialized knowledge to build and debug, making pipelines accessible to a broader set of engineers on the team.
