
Using components: Python Transformation

The Python Transformation component allows you to run custom Python code as part of your Integrate.io ETL pipeline. It is designed to transform records without changing the schema of the input data.

Overview

  • Accepts an array of JSON records as input
  • Requires a defined transform function
  • Input and output schemas must match exactly
  • Useful for string manipulation, cleaning, enrichment, or any custom logic that doesn't modify schema
  • You can remove records from the array (e.g. for filtering), but:
    • Every record you return must follow the original schema
    • You may return an empty array, but not a different schema

Example Transformation Code

def transform(event, context):
    """
    Transforms input records by converting all string values to uppercase.

    IMPORTANT: The output schema must match the input schema exactly.
    Any structural change to the data will cause errors in the pipeline.

    Args:
        event (list): Array of records to process. Each record is a dictionary.
        context: Lambda context object (not used in this example).

    Returns:
        list: Transformed records with the same schema as the input. May be empty.

    Raises:
        Exception: Re-raised if any error occurs during processing.
    """
    try:
        results = []

        for record in event:
            transformed_record = {}
            for key, value in record.items():
                # Uppercase string values; leave all other types untouched.
                if isinstance(value, str):
                    transformed_record[key] = value.upper()
                else:
                    transformed_record[key] = value
            results.append(transformed_record)

        return results

    except Exception:
        # Re-raise so the pipeline surfaces the error; add logging here if needed.
        raise

Example Input and Output

Input

[
  { "id": 1, "name": "Belgian Waffles", "price": 5.95 },
  { "id": 2, "name": "Pancakes", "price": 4.95 }
]

Output

[
  { "id": 1, "name": "BELGIAN WAFFLES", "price": 5.95 },
  { "id": 2, "name": "PANCAKES", "price": 4.95 }
]

 You may remove some records entirely (e.g. to filter invalid ones), but every returned record must exactly match the input schema. You can return an empty array, but not a different structure.
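
For instance, a minimal filtering sketch, assuming (as in the example above) that each record carries a numeric price field, might drop records with a missing or non-positive price:

def transform(event, context):
    """
    Keeps only records with a positive numeric price.
    The records that are returned keep exactly the same fields and types
    as the input; only the number of records changes.
    """
    results = []
    for record in event:
        price = record.get("price")
        # Keep the record only if its price is a positive number.
        if isinstance(price, (int, float)) and price > 0:
            results.append(record)
    # An empty list is a valid result if every record was filtered out.
    return results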

Configuration

Batch Size

Define the number of records per batch. The total payload size should not exceed 6 MB.
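
As a rough local check (not part of the component itself, and only an approximation of the actual payload), you can serialize a sample batch and measure its size before settling on a batch size:

import json

def estimated_payload_bytes(records):
    # Approximate the payload size by measuring the UTF-8 encoded JSON.
    return len(json.dumps(records).encode("utf-8"))

sample_batch = [{"id": 1, "name": "Belgian Waffles", "price": 5.95}] * 1000
print(estimated_payload_bytes(sample_batch))  # keep this comfortably below 6 MB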

Test Code

You can test your transformation directly in the Package Designer:

  1. Provide a sample payload in table format.
  2. Click Run Code.
  3. View the transformed output below.

The test table mirrors the schema of the connected input component, making it easy to verify transformations before deploying.
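
You can also call the transform function directly in a local Python session before pasting it into the Package Designer; a minimal sketch using the uppercase example above (passing None for the unused context argument):

sample_payload = [
    {"id": 1, "name": "Belgian Waffles", "price": 5.95},
    {"id": 2, "name": "Pancakes", "price": 4.95},
]

# The context argument is not used by the example, so None is sufficient here.
print(transform(sample_payload, None))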

Variables

You can use package, secret, and global variables via Python import syntax, for example:

from package_variables import variable_name

Note that package_variables also includes secret and global variables. All variables are evaluated before being passed to the Lambda function.
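
For example, a sketch that reads a hypothetical package variable named min_price and uses it to filter records inside the transform function (the variable name is illustrative only):

from package_variables import min_price  # hypothetical variable name

def transform(event, context):
    """
    Keeps only records whose price meets the threshold supplied by the
    min_price package variable. The schema of returned records is unchanged.
    """
    threshold = float(min_price)  # convert in case the variable arrives as a string
    return [record for record in event if record.get("price", 0) >= threshold]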

Best Practices

  • Always ensure the output matches the input schema (field names and types).
  • It is valid to return fewer records or an empty array if you're filtering data.
  • Catch and log exceptions to help debug errors during transformation.
  • For complex transformations, break logic into helper functions within the same script, as in the sketch below.
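
A sketch combining the last two points, assuming each record has a name string field: the string logic lives in a helper, and any failure is logged before being re-raised:

import logging

logger = logging.getLogger(__name__)

def normalize_name(value):
    # Helper: trim surrounding whitespace and uppercase a name value.
    return value.strip().upper()

def transform(event, context):
    """
    Uppercases the name field of each record, delegating the string
    handling to a helper and logging any failure before re-raising it.
    """
    try:
        results = []
        for record in event:
            transformed_record = dict(record)
            transformed_record["name"] = normalize_name(record["name"])
            results.append(transformed_record)
        return results
    except Exception:
        # Log the failure for easier debugging, then re-raise so the
        # pipeline still reports the error.
        logger.exception("Python Transformation failed")
        raise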