> ## Documentation Index
> Fetch the complete documentation index at: https://www.integrate.io/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# ETL: Processing JSON Data

> Parse, extract, and transform JSON data in Integrate.io ETL pipelines using built-in functions like JsonExtract, JsonStringToMap, and ToBag.

## Background

[JSON](https://www.json.org/) is the de-facto format for semi-structured data storage and interchange. The basic data types in JSON are number, string, boolean, object, and array. Objects and arrays allow you to have nested data structures. This makes JSON ideal for storing any data structure, but also requires extra work to handle. JSON data can be stored as either of the following:

* files where each file contains a single object
* files that contain multiple objects delimited by newlines

MongoDB documents are read as JSON objects. Many Rest API sources also return JSON objects, arrays, or line-delimited Jsons.

## Processing JSON Objects in Integrate.io ETL - Tutorial

Integrate.io ETL allows you to [process JSON](https://www.integrate.io/blog/processing-json-data-on-the-cloud/) objects and extract data from them in various ways. Throughout this tutorial, we'll be using the following JSON example (with indentation and line breaks to make it human-readable):

```json expandable theme={null}
{
    "id": 152,
    "name": "Adam",
    "surname": "Thompson",
    "birthdate": "1980-05-02T00:00:00Z",
    "active": true,
    "billing_address": {
    "street": "780 California st.",
        "city": "San Francisco",
        "zipcode": "94103",
        "country": "USA"
    },
    "products": [
        {
            "product_id": 27
            ,"plan_id": 6
        },
        {
            "product_id": 22
            ,"plan_id": 7
        }
    ],
    "location_data":
    {
        "city": "Pittsburgh",
        "country": "USA",
        "coordinates":
        {
            "longitude": 80.0154,
            "latitude": 40.3964
        }
    }
}
```

## Loading JSON Data from files

When loading JSON data from files, simply select the JSON record type and proceed to the next step to select the input fields.

<Frame>
  <img src="https://mintcdn.com/integrateio/HSGmKiO3soPLENmi/images/how-do-i/image-14.webp?fit=max&auto=format&n=HSGmKiO3soPLENmi&q=85&s=e1d8059b324622b723869f6790c38b42" alt="File Storage source with JSON record type selected" width="1052" height="1221" data-path="images/how-do-i/image-14.webp" />
</Frame>

Integrate.io ETL automatically detects fields at the first level of the JSON hierarchy. Complex types such as JSON objects or arrays are automatically transformed into Integrate.io ETL's complex data types - **map** and **bag** respectively, which allows you to easily work with these fields as you will see below. Integrate.io ETL also allows you to use [JSONPath](https://goessner.net/articles/JsonPath/) to traverse JSON objects in a dynamic fashion. In some cases, you may prefer to use JsonPath to extract data from nested objects and arrays at the first level of the hierarchy. To do this, change the data type at the source from **map** or **bag** to **string**.

## Navigating objects

In our example, location\_data is a JSON object at the first level of the hierarchy and is automatically transformed into a map. The following examples show how to extract field values from a map in a Select component:

`location_data#'city'` returns the string value `'Pittsburgh'`

`location_data#'coordinates'` returns the nested JSON object as a string value `'{"longitude": 80.0154,"latitude": 40.3964}'`, which you may also want to break into its elements. There are two ways to process deeper nested objects:

1. You can transform the JSON string into a map again, and extract a field value:

```sql theme={null}
JsonStringToMap(location_data#'coordinates') -> coordinates
```

```sql theme={null}
coordinates#'longitude' -> longitude
```

2. You can use JsonExtractScalar to extract a field value:

```sql theme={null}
JsonExtractScalar(location_data#'coordinates','$.longitude')
```

<Frame>
  <img src="https://mintcdn.com/integrateio/HSGmKiO3soPLENmi/images/how-do-i/image-15.webp?fit=max&auto=format&n=HSGmKiO3soPLENmi&q=85&s=ca1936cf6389c64ec5931abc09a42bcf" alt="Select component extracting nested JSON fields using map and JsonExtractScalar" width="959" height="386" data-path="images/how-do-i/image-15.webp" />
</Frame>

Generally speaking, you should [transform a JSON object to a map](/etl/jsonstringtomap/) if you're extracting multiple fields from it. Use [JsonExtractScalar](/etl/jsonextractscalar/) if you are only extracting a single field or a deeply nested field. Sometimes, fields in an object are unknown and may be used as key-value arrays. In this case, you may want to extract the object's key names and values and flatten the object:

```sql theme={null}
billing_address = Flatten(MapToBag(billing_address))
```

```sql theme={null}
key = billing_address#'key'
value = billing_address#'value'
```

## Navigating arrays

JSON arrays at the first level of the hierarchy are transformed into bags. Bags are collections of data items that you can either transform to [bag functions](/etl/integrateio-etl-functions/#complex) or flatten out to items using the [Flatten](/etl/flatten/) operator so that each item in the array becomes a separate record. If the JSON array items are JSON objects (and not scalar values like string, number, boolean), the result of flattening an array are JSON object strings from which you can extract data from again by using [JsonStringToMap](/etl/jsonstringtomap/) or JsonExtract\* functions. For example:

```sql theme={null}
product = Flatten(products)
```

```sql theme={null}
product = JsonStringToMap(product)
```

```sql theme={null}
product_id = product#'product_id'
plan_id = product#'plan_id'
```

JsonExtract\* functions also allow you to extract data from specific items within the array. If the product field is defined as string instead of Json array, you can use the following JsonPath expression to get the first product id:

```sql theme={null}
ToJson(products)JsonExtractScalar(products,'$.[0].product_id')
```

## Checking if a string is a valid JSON

The JsonExtract\* functions fail if the first argument is not a valid JSON. JsonStringToMap and JsonStringToBag are less stringent and simply return null if the argument is not a valid JSON. Apply the appropriate function to a string and compare it to null in a Filter component or in a CASE statement (within a Select component) in order to check if a field is a valid JSON.

### Example 1:

```sql theme={null}
CASE WHEN JsonStringToMap('{"id":1,"name":"John"}') is null THEN 'not valid' ELSE 'valid json object' END
```

returns 'valid JSON object'

### Example 2:

```sql theme={null}
CASE WHEN JsonStringToMap('{"id":1,"name"}') is null THEN 'not valid' ELSE 'valid json object' END
```

returns 'not valid'

### Example 3:

```sql theme={null}
CASE WHEN JsonStringToBag('[1,2,3]') is null THEN 'not valid' ELSE 'valid json array' END
```

returns 'valid JSON array'

### Example 4:

```sql theme={null}
CASE WHEN JsonStringToBag('{"id":1,"name":"John"}') is null THEN 'not valid' ELSE 'valid json array' END
```

returns 'not valid' (because it's not a json array).
