# ETL: Processing Different Encodings

> Process source files with non-UTF-8 character encodings in Integrate.io ETL by configuring the encoding option to read CSV, TSV, and text files correctly.

While the Integrate.io [ETL platform](https://www.integrate.io/blog/ai-etl-tools/) works primarily with UTF-8 encoded data, other character encodings can be processed using the steps shown in this example:

<Columns cols={2}>
  <Frame>
    <img src="https://mintcdn.com/integrateio/fpWCvrjvoCDC-WOb/images/how-do-i/image-25.webp?fit=max&auto=format&n=fpWCvrjvoCDC-WOb&q=85&s=a969ca01f6fed1f1e17ba6880da99d3a" alt="Dataflow for converting non-UTF-8 encoded data with transformation steps" width="195" height="1139" data-path="images/how-do-i/image-25.webp" />
  </Frame>
</Columns>

This dataflow uses the functions detailed in the following steps. Replace the encoding and field delimiters shown here with the ones specific to your use case.

<Steps>
  <Step>
    Read the data as raw records with the binary data type.
  </Step>

  <Step>
    Convert the byte array data in the given encoding to a string type using [`ByteArrayToString`](/etl/bytearraytostring/)`(body, 'UTF-16LE')`.
  </Step>

  <Step>
    Split the data from step 2 using [`STRSPLITTOBAG`](/etl/strsplittobag/)`(body,'\n')`, then apply [`Flatten()`](/etl/flatten/) to get individual records or lines.
  </Step>

  <Step>
    Remove headers as applicable (for example, if the data comes from an API) with the [filter transformation](/etl/using-components-filter-transformation/). The text matches (regex) option can be useful here.
  </Step>

  <Step>
    Split each line on the relevant delimiter using [`CSVSPLIT`](/etl/csvsplit/)`(line, '\t')`.
  </Step>

  <Step>
    Extract the required fields from the tuple as `line.$0`, `line.$1`, `line.$2`, and so on.
  </Step>
</Steps>
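
As a rough analogy outside the platform, the steps above can be sketched in plain Python. This is not how Integrate.io implements these functions; it only mirrors the order of operations so the logic can be tested locally. The function name `parse_encoded_tsv`, the sample data, and the `header_pattern` regex are hypothetical and chosen for illustration.

```python
import csv
import re


def parse_encoded_tsv(raw, encoding="utf-16-le", delimiter="\t",
                      header_pattern=r"^id\t"):
    """Mirror the dataflow steps on an in-memory byte string.

    All names and defaults here are illustrative assumptions,
    not part of the Integrate.io API.
    """
    # Step 2: decode the raw bytes using the source encoding.
    text = raw.decode(encoding)
    # A byte order mark, if present, survives "utf-16-le" decoding; drop it.
    text = text.lstrip("\ufeff")

    # Step 3: split into individual lines and skip empty ones.
    lines = [ln.rstrip("\r") for ln in text.split("\n") if ln.strip()]

    # Step 4: filter out header lines with a regex (hypothetical pattern).
    data_lines = [ln for ln in lines if not re.match(header_pattern, ln)]

    # Step 5: split each line on the delimiter, honoring quoted fields.
    rows = csv.reader(data_lines, delimiter=delimiter)

    # Step 6: pick the required fields by position (like line.$0, line.$1, line.$2).
    return [(row[0], row[1], row[2]) for row in rows]


# Sample input: tab-separated text encoded as UTF-16LE.
sample = "id\tname\tcity\n1\tZoë\tKöln\n2\tAna\tLyon\n".encode("utf-16-le")
records = parse_encoded_tsv(sample)
```

Running a small sample like this before building the dataflow can help confirm that the chosen encoding and delimiter match the source file.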
