The Python Transformation component allows you to run custom Python code as part of your Integrate.io ETL pipeline. It is designed to transform records without changing the schema of the input data.
Overview
- Accepts an array of JSON records as input
- Requires a defined transform function
- Input and output schemas must match exactly
- Useful for string manipulation, cleaning, enrichment, or any custom logic that doesn't modify schema
- You can remove records from the array (e.g. for filtering), but:
  - If returned, records must follow the original schema
  - You may return an empty array, but not a different schema
Example Transformation Code
import json

def transform(event, context):
    """
    Transforms input records by converting all string values to uppercase.

    IMPORTANT: The output schema must match the input schema exactly.
    Any structural changes to the data will cause errors in the pipeline.

    Args:
        event (list): Array of records to process. Each record is a dictionary.
        context: Lambda context object (not used in this example)

    Returns:
        list: Transformed records with the same schema as input. Can be empty.

    Raises:
        Exception: If any error occurs during processing
    """
    try:
        data = event
        results = []
        for record in data:
            transformed_record = {}
            for key, value in record.items():
                if isinstance(value, str):
                    transformed_record[key] = value.upper()
                else:
                    transformed_record[key] = value
            results.append(transformed_record)
        return results
    except Exception as e:
        raise e
Example Input and Output
Input
[
{ "id": 1, "name": "Belgian Waffles", "price": 5.95 },
{ "id": 2, "name": "Pancakes", "price": 4.95 }
]
Output
[
{ "id": 1, "name": "BELGIAN WAFFLES", "price": 5.95 },
{ "id": 2, "name": "PANCAKES", "price": 4.95 }
]
You may remove some records entirely (e.g. to filter invalid ones), but every returned record must exactly match the input schema. You can return an empty array, but not a different structure.
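The filtering case can be sketched as follows. This is a minimal illustration, not a prescribed pattern: the is_valid helper and its rule (non-empty name, non-negative price) are hypothetical and only make sense for the sample records used on this page. Surviving records are returned unchanged, so their fields and types still match the input.

```python
def is_valid(record):
    """Hypothetical rule: name must be a non-empty string, price non-negative."""
    name = record.get("name")
    price = record.get("price")
    return (
        isinstance(name, str)
        and name.strip() != ""
        and isinstance(price, (int, float))
        and price >= 0
    )


def transform(event, context):
    # Drop invalid records; the remaining records are passed through as-is,
    # so the output schema is identical to the input schema.
    return [record for record in event if is_valid(record)]


sample = [
    {"id": 1, "name": "Belgian Waffles", "price": 5.95},
    {"id": 2, "name": "", "price": 4.95},
    {"id": 3, "name": "Pancakes", "price": -1.0},
]
filtered = transform(sample, None)
```

If every record fails the check, the function returns an empty list, which is still a valid result.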
Configuration
Batch Size
Define the number of records per batch. The total payload size should not exceed 6 MB.
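One way to pick a batch size is to estimate it from a few representative records. The sketch below is a rough local calculation, not a platform feature: it assumes the limit applies to the JSON-serialized batch, treats 6 MB as 6 * 1024 * 1024 bytes, and uses a hypothetical safety_factor to leave headroom for records larger than the sample.

```python
import json

# Assumption: the 6 MB limit is measured on the serialized JSON payload.
MAX_PAYLOAD_BYTES = 6 * 1024 * 1024


def estimate_batch_size(sample_records, safety_factor=0.8):
    """Estimate how many records fit in one batch, based on the largest sample record."""
    if not sample_records:
        raise ValueError("need at least one sample record")
    # Size of each record once serialized as UTF-8 JSON.
    sizes = [len(json.dumps(r).encode("utf-8")) for r in sample_records]
    largest = max(sizes)
    # Reserve 2 bytes for the enclosing brackets and 2 bytes per record
    # for the ", " separator that json.dumps writes between items.
    budget = int(MAX_PAYLOAD_BYTES * safety_factor) - 2
    return max(1, budget // (largest + 2))


sample = [{"id": 1, "name": "Belgian Waffles", "price": 5.95}]
batch_size = estimate_batch_size(sample)
```

Because real records vary in size, treat the result as an upper bound to start from and lower it if batches fail.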
Test Code
You can test your transformation directly in the Package Designer:
- Provide a sample payload in table format.
- Click Run Code.
- View the transformed output below.
The test table mirrors the schema of the connected input component, making it easy to verify transformations before deploying.
Variables
You can access package, secret, and global variables using Python import syntax, for example:
from package_variables import variable_name
Note that package_variables also includes secret and global variables. All variables are evaluated before they are passed to the Lambda function.
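As an illustration, an imported variable can drive a filter without changing the schema. In this sketch, min_price is a hypothetical package variable, and the try/except fallback exists only so the snippet also runs outside the platform. The type of an evaluated variable is not stated here, so the code converts it defensively with float().

```python
try:
    # Inside the pipeline: min_price is a hypothetical package variable.
    from package_variables import min_price
except ImportError:
    # Local fallback so the sketch can be run outside the platform.
    min_price = "5.00"


def transform(event, context):
    # Assumption: the variable may arrive as a string, so convert it explicitly.
    threshold = float(min_price)
    # Keep records at or above the threshold; records themselves are unchanged.
    return [record for record in event if record["price"] >= threshold]


sample = [
    {"id": 1, "name": "Belgian Waffles", "price": 5.95},
    {"id": 2, "name": "Pancakes", "price": 4.95},
]
result = transform(sample, None)
```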
Best Practices
- Always ensure the output matches the input schema (field names and types).
- It is valid to return fewer records or an empty array if you're filtering data.
- Catch and log exceptions to help debug errors during transformation.
- For complex transformations, break logic into helper functions within the same script.
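The practices above can be combined in one script. The following is a sketch under stated assumptions: clean_name and same_schema are hypothetical helpers, the "name" field comes from the sample data on this page, and Python's standard logging module is used without any claim about where the platform surfaces log output.

```python
import logging

logger = logging.getLogger(__name__)


def clean_name(value):
    """Helper: trim surrounding whitespace and collapse repeated spaces."""
    return " ".join(value.split())


def same_schema(original, transformed):
    """Helper: True if both records have the same keys and the same value types."""
    return original.keys() == transformed.keys() and all(
        type(original[key]) is type(transformed[key]) for key in original
    )


def transform(event, context):
    results = []
    for record in event:
        try:
            cleaned = dict(record)  # copy so the input record is not mutated
            if isinstance(cleaned.get("name"), str):
                cleaned["name"] = clean_name(cleaned["name"])
            # Guard against accidental schema changes before returning.
            if not same_schema(record, cleaned):
                raise ValueError("output schema differs from input schema")
            results.append(cleaned)
        except Exception:
            # Log the failing record with a traceback, then re-raise.
            logger.exception("Failed to transform record: %r", record)
            raise
    return results


sample = [{"id": 1, "name": "  Belgian   Waffles ", "price": 5.95}]
output = transform(sample, None)
```

Re-raising after logging keeps the failure visible to the pipeline instead of silently dropping records.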