Allowing ETL access to my data on Google Sheets

ETL can access your Google Sheets data. This article describes how to create a Google Sheets connection and how to build a data pipeline that reads Google Sheets data in ETL. Pre-built templates are also available to read data from or write data to a Google Sheet; see Create a package from a template for instructions on using a template.

To create a Google Sheets connection in ETL:

  1. Click the Connections icon (lightning bolt) on the top left menu.
  2. To create a connection, click New connection.

  3. Choose Google Sheets.
  4. Click Authenticate. (Sign in to Google Sheets if prompted.)
  5. Click Connect to authorize ETL access to your Google Sheets account.
  6. In the New Google Sheets connection window, name the connection and click Create Google Sheets connection.

To modify Google Sheets connections in ETL:

  1. Click the Connections icon (lightning bolt) on the top left menu.
  2. Click a connection to re-connect if your password has changed or the token has expired.
  3. To delete a Google Sheets connection, click on the three vertical dots on the far right of the connection listing and select the Delete connection option. 

To build a data pipeline to read Google Sheets data in ETL:

REST API Component: read_spreadsheet

  • Authentication - Click Connection and select your Google Sheets connection. If you haven't created a Google Sheets connection yet, click + New and follow the instructions above.
  • URL - Enter the URL for the Google Sheets API: https://sheets.googleapis.com/v4/spreadsheets/$spreadsheetId/values/$sheetName. Make sure the method is set to GET.
  • Sheet ID and name - Replace the variables $spreadsheetId and $sheetName with the values specific to your spreadsheet. You can find the spreadsheet ID in the spreadsheet's URL and the sheet name on the tab at the bottom left.
  • Response - Type $.values[*] in the Base record JSONPath Expression field. If the first record in your sheet contains field names, type $.values[1:] instead to skip the header row.
  • Input fields - Click Select all to move the values over to Selected fields, then click Save.
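To illustrate what this component does behind the scenes, here is a minimal sketch of the request URL and the JSONPath selection, assuming the standard Google Sheets API v4 values endpoint. The spreadsheet ID and the sample response below are hypothetical placeholders.

```python
# Build the GET URL the REST API component calls (placeholders, not real IDs).
spreadsheet_id = "1AbCdEfGhIjKlMnOp"   # hypothetical spreadsheet ID
sheet_name = "Sheet1"
url = (
    "https://sheets.googleapis.com/v4/spreadsheets/"
    f"{spreadsheet_id}/values/{sheet_name}"
)

# A sample of the JSON the API returns. ETL applies the JSONPath
# $.values[*] to it, or $.values[1:] when row 1 holds field names.
response = {
    "range": "Sheet1!A1:C3",
    "values": [
        ["id", "name", "amount"],   # header row
        ["1", "alice", "10.5"],
        ["2", "bob", "7.0"],
    ],
}

header = response["values"][0]    # what $.values[0] would select
records = response["values"][1:]  # what $.values[1:] selects as base records
```

Each element of `records` becomes one base record in the pipeline, which is why the header row is skipped with `$.values[1:]` when it contains field names.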

Select Component: bag_to_tuple

  • Pass the values field into the BagToTuple function (for more information on the BagToTuple function, see this article): BagToTuple(values). Click Save.

Select Component: parse_data

  • Parse the individual fields in the Expression column using the tuple syntax: values.$0, values.$1, and so on. Assign field names in the Alias column. When you parse the tuple, the items are returned as byte array datatypes, so explicitly cast them to the correct datatype. In the template, all fields are cast to chararray (string); if you prefer a different datatype, change chararray to one of the supported datatypes. See this doc for supported datatypes and methods of casting and parsing. Click Save.
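A sketch of the parse step in Python, assuming the hypothetical id/name/amount columns used above. Positional access stands in for values.$0, values.$1, etc., and the casts stand in for the explicit datatype casts in the Expression column:

```python
# Output of the BagToTuple step: one ordered tuple per sheet row.
row = ("1", "alice", "10.5")

# Parse positional fields and cast each to its target datatype.
# Field names (Alias column) and types are illustrative.
record = {
    "id": int(row[0]),        # like (int)values.$0
    "name": str(row[1]),      # like (chararray)values.$1
    "amount": float(row[2]),  # like (double)values.$2
}
```

Without the casts, every field would stay a byte array, which is why the template casts everything to chararray by default.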

Destination Component: sheet_data

  • The template shows a Redshift destination component; however, if you'd prefer to use a different destination, delete the Redshift component and select a destination component of your choice.
  • Choose target connection - Select your target connection. If you haven't created your connection yet, click + New.
  • Destination properties - Fill in the target schema and table, select an operation type, and optionally set pre- or post-action SQL and advanced options.
  • Schema mapping - Click Auto-fill to bring in all of the fields. If you've selected a Merge operation type, click the Key box next to the merge key field(s). Click Save.