Once you define a package, you can verify it and, as in any development lifecycle, fix any errors and re-verify until the package is ready to run as a job on a cluster. See the following links for information on using packages:
  • Creating a new package
  • Creating a new package from a template
  • Working in the package designer
  • Using and setting variables in your packages
  • Validating a package
  • Using pattern matching in source component paths
  • Using ISO 8601 date/time functions
  • Using functions in components
  • Components:
    Component | Description
    Amazon Redshift Source | Read data stored in an Amazon Redshift table or view, or returned by a query.
    Bing Ads Source | Read Bing Ads report data.
    Database Source | Read data stored in a database table or view, or returned by a query.
    Facebook Ads Insights Source | Read Facebook Ads Insights report data.
    File Storage Source | Read data stored in one or more files in object stores such as Amazon S3, Google Cloud Storage, or Azure Blob Storage, or on file servers such as SFTP.
    Google Ads Source | Read Google Ads report data.
    Google Analytics Source | Read Google Analytics report data.
    Google Analytics (GA4) Source | Read Google Analytics 4 (GA4) report data.
    Google BigQuery Source | Read data stored in a Google BigQuery table or returned by a query.
    Google Cloud Spanner Source | Read data stored in a Google Cloud Spanner table or returned by a query.
    MongoDB Source | Read data stored in a MongoDB collection.
    NetSuite Source | Read NetSuite standard and custom records (tables) using the NetSuite JDBC drivers (SuiteAnalytics Connect).
    Salesforce Source | Read Salesforce Sales Cloud standard and custom objects using the Bulk API.
    REST API Source | Read data from HTTP endpoints such as REST web services. Use the REST API source component to define the authentication method, request parameters, and response fields to use in the package.
    Aggregate Transformation | Use the Aggregate transformation to group the input dataset by one or more fields and apply aggregate functions such as Count, Average, Minimum, and Maximum.
    Assert Transformation | Use the Assert transformation to make sure that all data in the source complies with the conditions you specify in the component. If a record does not comply, the job fails and a message is added to the error log.
    Clone Transformation | Use the Clone component to split a dataflow into two dataflows in order to apply multiple transformations to the same data.
    Cross Join Transformation | Use the Cross Join transformation to combine records from two different inputs. The cross join returns the Cartesian product of records from the two inputs. That is, it produces records that combine each record from the left input with each record from the right input.
    Distinct Transformation | Use the Distinct transformation to filter out duplicate records that have the same values in all fields, leaving only unique records. For example, you might need to filter out users’ double-clicks in events.
    Filter Transformation | Use the Filter transformation to filter input data by defining conditions that must be met by records in the input.
    Join Transformation | Use the Join transformation to combine records from two different inputs. The Join component can be used to add information from one data source to another data source, or to filter data that exists in both data sources or in only one of them.
    Limit Transformation | Use the Limit transformation to limit the number of records in the output for the entire dataset, or per partition or group within the dataset.
    Rank Transformation | Use the Rank component to sort input data by one or more fields, in ascending or descending order, and add a rank field that reflects the sort order.
    Select Transformation | Use the Select transformation to choose which fields from the input will be available in the next component and transform them using expressions in order to parse input data, enrich it, extract information from it, or manipulate it.
    Sort Transformation | Use the Sort component to sort input data by one or more fields, in ascending or descending order.
    Union Transformation | Use the Union transformation to combine records from two inputs with the same schema (same fields and data types).
    Window Transformation | Use the Window component to apply window functions to incoming data, similar to window functions in SQL. These functions let you rank or distribute data and compute moving averages, running totals, and other useful values. The output of the Window component contains all records and fields from the input data flow, with the addition of the calculated window functions.
    Sample Transformation | Use the Sample component to return a randomly selected percentage of the records from the input.
    Cube Transformation | Use the Cube and Rollup component to group the input dataset by combinations of fields and apply aggregate functions such as Count, Average, Minimum, and Maximum.
    Amazon Redshift Destination | Use the Amazon Redshift destination component to store the output of a data flow in an Amazon Redshift table.
    Database Destination | Use the Database destination component to store the output of a data flow in a relational database table.
    File Storage Destination | Use the File Storage destination component to store the output of a data flow into files in a designated directory on a file server (SFTP, HDFS) or object store (Amazon S3, Google Cloud Storage, Azure Blob Storage).
    Google BigQuery Destination | Use the Google BigQuery destination component to store the output of a data flow in a BigQuery table.
    Google Spanner Destination | Use the Google Spanner destination component to store the output of a data flow in a Google Spanner table.
    MongoDB Destination | Use the MongoDB destination component to store the output of a data flow in a MongoDB collection.
    Salesforce Destination | Use the Salesforce destination component to store the output of a data flow in a Salesforce Sales Cloud object.
    Snowflake Destination | Use the Snowflake destination component to store the output of a data flow in a Snowflake table.
    Salesforce SOAP Destination | Use the Salesforce SOAP destination component to store the output of a data flow in a Salesforce Sales Cloud object using a Salesforce SOAP connection.
    NetSuite SOAP Destination | Use the NetSuite SOAP destination component to store the output of a data flow in a NetSuite cloud object using a NetSuite SOAP connection.
    Facebook Ads Destination | Use the Facebook Ads destination component to store the output of a data flow in a Facebook Ads cloud object.
    Google Ads Destination | Use the Google Ads destination component to store the output of a data flow in a Google Ads cloud object.
    TikTok Ads Destination | Use the TikTok Ads destination component to store the output of a data flow in a TikTok Ads cloud object.
    HubSpot Destination | Use the HubSpot destination component to store the output of a data flow in a HubSpot cloud object.
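To make a few of the transformation descriptions concrete, here is a minimal plain-Python sketch of what the Cross Join, Distinct, and Rank transformations compute. It is only an illustration of the semantics described in the list: the function names (`cross_join`, `distinct`, `rank`), the sample records, and the use of lists of dictionaries are all assumptions made for this example, not part of the product. In particular, how the actual Rank component handles ties is not modeled here.

```python
from itertools import product


def cross_join(left, right):
    """Cartesian product: pair every left record with every right record."""
    return [{**l, **r} for l, r in product(left, right)]


def distinct(records):
    """Keep the first occurrence of each record; drop exact duplicates.

    Two records are duplicates when they have the same values in all fields.
    Assumes field values are hashable (strings, numbers, and so on).
    """
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def rank(records, field, descending=False):
    """Sort by one field and add a 1-based 'rank' field reflecting the order.

    Simplification: tied values receive consecutive ranks in sorted order.
    """
    ordered = sorted(records, key=lambda r: r[field], reverse=descending)
    return [{**rec, "rank": i} for i, rec in enumerate(ordered, start=1)]


# Hypothetical sample data for illustration only.
users = [{"user": "ann"}, {"user": "bob"}]
plans = [{"plan": "free"}, {"plan": "pro"}]
pairs = cross_join(users, plans)  # 2 x 2 = 4 combined records

clicks = [
    {"user": "ann", "page": "/home"},
    {"user": "ann", "page": "/home"},  # double-click duplicate
    {"user": "bob", "page": "/home"},
]
unique_clicks = distinct(clicks)  # the duplicate "ann" record is dropped

scores = [{"user": "ann", "score": 7}, {"user": "bob", "score": 9}]
ranked = rank(scores, "score", descending=True)  # "bob" gets rank 1
```

Note that the cross join output grows as the product of the two input sizes, which is why it is usually applied to small inputs.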
Last modified on April 20, 2026