# ETL: Aggregate Transformation

> Use the Aggregate transformation to group data by fields and apply functions like Count, Average, Min, and Max in your ETL pipeline.

<Frame>
  <img src="https://mintcdn.com/integrateio/2ttHYDu3EKov-VoY/images/creating-packages/using-components-aggregate-transformation/image-1.png?fit=max&auto=format&n=2ttHYDu3EKov-VoY&q=85&s=5d1550dddc7d4eb59129a6313cb5c6fa" alt="Aggregate component configuration panel" width="1200" height="828" data-path="images/creating-packages/using-components-aggregate-transformation/image-1.png" />
</Frame>

## Grouping fields

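Select `Treat entire input as one group` to apply the aggregate functions to the whole input and output a single record, or select `Group input data by field values` to choose the key fields to group by. With grouping, the output contains one record for each distinct combination of key field values.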

<Frame>
  <img src="https://mintcdn.com/integrateio/2ttHYDu3EKov-VoY/images/creating-packages/using-components-aggregate-transformation/image-2.png?fit=max&auto=format&n=2ttHYDu3EKov-VoY&q=85&s=8e883d3d1129222428943e505540f29d" alt="Treat entire input as one group option selected" width="1200" height="828" data-path="images/creating-packages/using-components-aggregate-transformation/image-2.png" />
</Frame>

<Frame>
  <img src="https://mintcdn.com/integrateio/2ttHYDu3EKov-VoY/images/creating-packages/using-components-aggregate-transformation/image-3.png?fit=max&auto=format&n=2ttHYDu3EKov-VoY&q=85&s=be4f55ce06e6cc2ce4a373bdfadb6550" alt="Group input data by field values with key fields selected" width="1200" height="828" data-path="images/creating-packages/using-components-aggregate-transformation/image-3.png" />
</Frame>
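
The difference between the two grouping options can be sketched in plain Python. This is only an illustration of the grouping semantics, not how the component is implemented; the sample records, the field names (`country`, `amount`), and the helper functions `aggregate_all` and `aggregate_by` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are invented for illustration.
records = [
    {"country": "US", "amount": 10},
    {"country": "US", "amount": 30},
    {"country": "DE", "amount": 5},
]


def aggregate_all(rows):
    """Mimics 'Treat entire input as one group': exactly one output record."""
    return [{
        "total_amount": sum(r["amount"] for r in rows),
        "row_count": len(rows),
    }]


def aggregate_by(rows, key_fields):
    """Mimics 'Group input data by field values': one record per distinct key."""
    groups = defaultdict(list)
    for r in rows:
        # The grouping key is the tuple of values of the selected key fields.
        groups[tuple(r[k] for k in key_fields)].append(r)

    output = []
    for key, members in groups.items():
        rec = dict(zip(key_fields, key))  # key fields are carried to the output
        rec["total_amount"] = sum(m["amount"] for m in members)
        rec["row_count"] = len(members)
        output.append(rec)
    return output


one_group = aggregate_all(records)
by_country = aggregate_by(records, ["country"])
print(one_group)   # a single record covering all three inputs
print(by_country)  # one record for "US" and one for "DE"
```

In this sketch the output aliases (`total_amount`, `row_count`) and the grouping field (`country`) share one record, which is why their names cannot collide.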

## Aggregate functions

For each aggregation, select the aggregate function and its input arguments (see the list below), and assign it an output alias. The names of the grouping fields and the output aliases must all be unique.

### Aggregate functions list

* **Count** - returns the number of non-null values in the field you specify in the field column, according to the groupings. Return value data type is long.
* **Count Distinct** - returns the number of unique values in the field you specify in the field column, according to the groupings. Return value data type is long.
* **Count All** - returns the number of records, according to the groupings. Return value data type is long.
* **HLL** - uses the HyperLogLog++ algorithm to return a cardinality estimate or an approximate number of distinct values in the field you specify, according to the groupings. Return value data type is long.
* **Average** - returns the average for numeric fields you specify in the field column, according to the groupings. See the following table for return value data types:
  | Argument field data type | Return value data type |
  | :----------------------- | :--------------------- |
  | int, long                | long                   |
  | float, double            | double                 |
* **Sum** - returns the sum for numeric fields you specify in the field column, according to the groupings. See the following table for return value data types:
  | Argument field data type | Return value data type |
  | :----------------------- | :--------------------- |
  | int, long                | long                   |
  | float, double            | double                 |
* **Min** - returns the minimum value for the field you specify in the field column, according to the groupings. Return value data type is the same as the input argument's data type.
* **Min By** - finds the minimum value in the field you specify in the field column, according to the groupings, and returns the value of the projected field from the same record. Return value data type is the same as the projected field's data type.
* **Max** - returns the maximum value for the field you specify in the field column, according to the groupings. Return value data type is the same as the input argument's data type.
* **Max By** - finds the maximum value in the field you specify in the field column, according to the groupings, and returns the value of the projected field from the same record. Return value data type is the same as the projected field's data type.
* **VAR** - returns the statistical variance for all values in the field you specify in the field column, according to the groupings. Return value data type is double.
* **VARP** - returns the statistical variance for the population of all values in the field you specify in the field column, according to the groupings. Return value data type is double.
* **STDEV** - returns the statistical standard deviation for all values in the field you specify in the field column, according to the groupings. Return value data type is double.
* **STDEVP** - returns the statistical standard deviation for the population of all values in the field you specify in the field column, according to the groupings. Return value data type is double.
* **Collect** - returns a collection (bag) of the values in the field you specify in the field column, according to the groupings. The bag can be manipulated further in a Select component using bag functions. Return value data type is bag.
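
To make the differences between some of these functions concrete, here is a minimal Python sketch that evaluates Count, Count Distinct, Count All, Min By, Max By, and Collect over a single group. It illustrates the descriptions above and is not the platform's implementation. The sample rows and the field names `price` and `sku` are hypothetical, and two behaviors are assumptions that the list does not state: that Count Distinct ignores nulls, and that Min By and Max By skip records whose compared field is null.

```python
# One hypothetical group of records; None stands in for a null value.
rows = [
    {"price": 5.0, "sku": "X1"},
    {"price": None, "sku": "X2"},
    {"price": 7.5, "sku": "X3"},
    {"price": 5.0, "sku": "X4"},
    {"price": 2.5, "sku": "X5"},
]

prices = [r["price"] for r in rows]
non_null_prices = [p for p in prices if p is not None]

# Count: number of non-null values in the chosen field.
count = len(non_null_prices)

# Count Distinct: number of unique values (assumption: nulls are ignored).
count_distinct = len(set(non_null_prices))

# Count All: number of records in the group, nulls included.
count_all = len(rows)

# Min By / Max By: locate the record with the extreme `price`, then
# return that record's projected field (`sku` here).
# Assumption: records with a null `price` are skipped.
priced_rows = [r for r in rows if r["price"] is not None]
min_by_sku = min(priced_rows, key=lambda r: r["price"])["sku"]
max_by_sku = max(priced_rows, key=lambda r: r["price"])["sku"]

# Collect: gather the field's values for the group. A bag is unordered;
# a Python list is used here only for readability.
collected_skus = [r["sku"] for r in rows]

print(count, count_distinct, count_all)
print(min_by_sku, max_by_sku)
print(collected_skus)
```

The null record is what separates Count from Count All in this sketch, and the repeated `5.0` is what separates Count from Count Distinct.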

## Related

<CardGroup cols={2}>
  <Card title="Window Transformation" icon="arrow-right" href="/etl/using-components-window-transformation" horizontal />

  <Card title="Distinct Transformation" icon="arrow-right" href="/etl/using-components-distinct-transformation" horizontal />

  <Card title="Sort Transformation" icon="arrow-right" href="/etl/using-components-sort-transformation" horizontal />
</CardGroup>
