Allowing Integrate.io ETL access to my data on Shopify

Integrate.io ETL can read your Shopify data. This article details creating the Shopify connection and the process of building a data pipeline to read Shopify data in Integrate.io ETL. There are also pre-built templates to read data from Shopify. Please see Create a package from a template for instructions on using a template.

To create a Shopify connection in Integrate.io ETL:

  1. Click the Connections icon (lightning bolt) on the top left menu.
  2. To create a connection, click New connection.
  3. Choose Shopify. 
  4. Enter the Shopify store name you'd like to connect to and click Authenticate.
  5. If prompted, sign in to Shopify.
  6. Click Connect to authorize Integrate.io ETL access to your Shopify account.
  7. In the new Shopify connection window, name the connection and click Create Shopify connection.

To modify Shopify connections in Integrate.io ETL:

  1. Click the Connections icon (lightning bolt) on the top left menu.
  2. Click a connection to open it. Make any necessary changes, then click Reconnect and Save changes. To exit the Shopify connection window without making changes, click Back to connections (grey tab on the left side of the window).
  3. To delete a Shopify connection, click on the three vertical dots on the far right of the connection listing and select the Delete connection option.

To build a data pipeline to read Shopify customer data in Integrate.io ETL:

REST API Component: shopify_customers

  • Authentication - Click Connection and then select your Shopify connection. If you haven't created your Shopify connection yet, click + New and follow the instructions above.
  • URL - Enter the URL for the Customers endpoint of the Shopify API: https://$shop_name/admin/api/2020-04/customers.json?limit=250&updated_at_min=$last_updated. Replace the variables $shop_name and $last_updated with your shop name and a timestamp marking the earliest update time of the customers you want to read. Make sure the method is set to GET.
  • Pagination - Check the Use pagination box and select Automatic from the Pagination scheme drop-down menu.
  • Response - Make sure the JSON response type is selected. Edit the Base record JSONPath Expression field to read: $.customers[*]
  • Input fields - Click Select all to move all of the Available fields to the Selected fields, or choose individual fields by clicking the + icon next to the field name in the Available Fields column. Then click Save.
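
For readers who want to see what the URL and Base record settings amount to, here is a minimal Python sketch. It only fills in the two URL variables and picks out the records that $.customers[*] selects from a response body. The store domain, timestamp, and sample response are hypothetical values for illustration, and the helper names are not part of Integrate.io ETL or Shopify. It makes no network calls; authentication and pagination are left to the component.

```python
import json
from urllib.parse import urlencode


def build_customers_url(shop_name: str, last_updated: str, limit: int = 250) -> str:
    """Fill in the $shop_name and $last_updated variables of the Customers URL."""
    query = urlencode({"limit": limit, "updated_at_min": last_updated})
    return f"https://{shop_name}/admin/api/2020-04/customers.json?{query}"


def base_records(response_body: str) -> list:
    """Plain-Python counterpart of the JSONPath $.customers[*]: one record per customer."""
    return json.loads(response_body)["customers"]


# Hypothetical store domain and timestamp, used only for illustration.
url = build_customers_url("example-store.myshopify.com", "2020-04-01T00:00:00Z")

# Hypothetical response body shaped like {"customers": [...]}.
sample_body = '{"customers": [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]}'
records = base_records(sample_body)
```

Note that urlencode percent-encodes the colons in the timestamp, which keeps the query string valid.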

Clone Component

  • No setup required. The clone component allows you to perform multiple transformations on the same input data.

Select Component: flatten_customer_addresses (left-hand pipeline)

  • This section of the pipeline passes through the customer id and addresses fields, flattens the array of addresses, and builds out a customer address table. Click in the first text field in the Expression column and select "id" from the list of fields (#1). Type "id" in the Alias column (#2), or click the magic wand icon (#3). Then click the plus icon to the right of the Alias value to add another field (#4). In the Expression column for the second field, select "addresses" (#5). Pass the addresses field into the Flatten function like this: Flatten(addresses) (#6). Type "address" in the Alias column (#7). For more information on the Flatten function see this article. Click Save.
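
To make the effect of Flatten(addresses) concrete, the following Python sketch produces one output row per address, repeating the customer id on each row. It illustrates the idea only and is not Integrate.io ETL's implementation. The sample data is hypothetical, and it assumes each address arrives as a JSON string, which is what the JsonStringToMap step in the next component suggests.

```python
def flatten_addresses(customers):
    """Emit one {"id", "address"} row per element of each customer's addresses list."""
    rows = []
    for customer in customers:
        # In this sketch, a customer with no addresses contributes no rows.
        for address in customer.get("addresses") or []:
            rows.append({"id": customer["id"], "address": address})
    return rows


# Hypothetical input: two customers, one with two addresses and one with none.
customers = [
    {"id": 1, "addresses": ['{"id": 10, "city": "Ottawa"}', '{"id": 11, "city": "Toronto"}']},
    {"id": 2, "addresses": []},
]
flattened = flatten_addresses(customers)
```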

Select Component: map_customer_addresses (left-hand pipeline)

  • Click Autofill to bring in all the fields from the previous component. Pass the address field into the JsonStringToMap function like this: JsonStringToMap(address). Type "address" in the Alias column. For more information on the JsonStringToMap function see this article. Click Save.
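
As a rough analogy, JsonStringToMap(address) turns a JSON string into a key/value map, much like parsing a JSON object into a Python dict. The sketch below is an assumption-based illustration. The helper json_string_to_map and the sample row are hypothetical, and the real function's handling of nulls or malformed input may differ.

```python
import json


def json_string_to_map(value):
    """Parse a JSON object string into a dict; pass None through unchanged."""
    if value is None:
        return None
    parsed = json.loads(value)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object")
    return parsed


# Hypothetical flattened row whose address is still a JSON string.
row = {"id": 1, "address": '{"id": 10, "first_name": "Ada", "city": "Ottawa"}'}
mapped = {"id": row["id"], "address": json_string_to_map(row["address"])}
```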

Select Component: parse_customer_address (left-hand pipeline)

  • Click Autofill to bring in all the fields from the previous component. Parse the nested address fields using this syntax: field_name#'key'. For example, address#'first_name'. Explicitly cast any non-string fields to the appropriate data type. For example, (long)address#'id'. For more information on parsing JSON data see this article. Click Save.
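
The map lookups and casts described above can be pictured in Python as dictionary lookups followed by explicit type conversions. This is a sketch under assumptions. The output column names (customer_id, address_id) and the sample values are hypothetical, and it assumes map values may arrive as strings, which is why the id is converted explicitly.

```python
def to_long(value):
    """Counterpart of a (long) cast: convert to int, leaving missing values as None."""
    return None if value is None else int(value)


def parse_customer_address(row):
    """Pull individual keys out of the address map, like address#'first_name'."""
    address = row["address"] or {}
    return {
        "customer_id": to_long(row["id"]),
        "address_id": to_long(address.get("id")),   # like (long)address#'id'
        "first_name": address.get("first_name"),    # like address#'first_name'
        "city": address.get("city"),
    }


# Hypothetical mapped row; the address id is a string until it is cast.
parsed = parse_customer_address(
    {"id": 1, "address": {"id": "10", "first_name": "Ada", "city": "Ottawa"}}
)
```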

Destination Component: customers_addresses_destination

  • The template shows a Redshift destination component; however, if you'd prefer to use a different destination, delete the Redshift component and select a destination component of your choice.
  • Choose target connection - Select your target connection. If you haven't created your connection yet, click + New.
  • Destination properties - Fill in the target schema and table, select an operation type, and set any pre-action or post-action SQL and advanced options you need.
  • Schema mapping - Click Auto-fill to bring in all of the fields. If you've selected a Merge operation type, click the Key box next to the merge key field(s). Click Save.

Select Component: parse_customer_address (right-hand pipeline)

  • This section of the pipeline passes through all of the fields except the addresses field and builds out a customers table. Add another Select component after the Clone component. Click Autofill to bring in all the fields from the previous component. Then click the x icon next to the addresses field to remove it. Parse the nested fields inside the default_address field using this syntax: field_name#'key'. For example, default_address#'first_name'. Setting up this component is very similar to setting up the parse_customer_address component on the left-hand pipeline. For more information on parsing JSON data see this article. Click Save.
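
The right-hand branch can be sketched the same way: drop the addresses field and lift selected keys out of default_address into their own columns. The output column names and sample record below are hypothetical. Replacing default_address with its parsed columns is a choice made for this illustration, not something the component requires.

```python
def build_customer_row(customer):
    """Keep every field except addresses and expand keys from default_address."""
    row = {key: value for key, value in customer.items() if key != "addresses"}
    # Illustration choice: swap the nested map for flat, individually named columns.
    default_address = row.pop("default_address", None) or {}
    row["default_address_first_name"] = default_address.get("first_name")  # default_address#'first_name'
    row["default_address_city"] = default_address.get("city")
    return row


# Hypothetical customer record.
customer_row = build_customer_row(
    {
        "id": 1,
        "email": "a@example.com",
        "addresses": ['{"id": 10}'],
        "default_address": {"first_name": "Ada", "city": "Ottawa"},
    }
)
```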

Destination Component: customers_destination (right-hand pipeline)

  • Follow the same steps you used to set up the customers_addresses_destination component on the left-hand pipeline.