Using components: Amazon Redshift Source

Use the Amazon Redshift source component to read data from an Amazon Redshift table or view, or from the results of a query. The source component uses Amazon Redshift's UNLOAD statement to write the data to files in Amazon S3 and then reads those files.
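
As a rough illustration of the UNLOAD step, the sketch below builds a statement of the general form UNLOAD ('query') TO 's3-prefix' IAM_ROLE 'role'. The options the component actually passes to UNLOAD are not documented here, so the function name, bucket, and role ARN are hypothetical placeholders.

```python
def build_unload_statement(select_sql: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return an UNLOAD statement that exports the result of select_sql to S3."""
    # UNLOAD receives the query as a quoted string, so any single quote
    # inside the query has to be doubled.
    escaped = select_sql.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'"
    )


# Hypothetical bucket and role, used only to show the shape of the statement.
statement = build_unload_statement(
    "SELECT id, name FROM public.products WHERE prod_color = 'red'",
    "s3://example-bucket/unload/products_",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

Note how the literal 'red' becomes ''red'' once the query is embedded in the UNLOAD string.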

Connection

Select an existing Amazon Redshift connection or create a new one (for more information, see Allowing Integrate.io ETL access to my Redshift cluster).

Source Properties

  • Access mode - select table to extract an entire table or view, or select query to execute a SQL query.
  • Source schema - the source table's schema. If empty, the default schema is used.
  • Source table/view - the table or view name from which the data will be imported.
  • where clause - optional. Predicates to add to the WHERE clause of the SQL query that is built to read the data from the database. Omit the WHERE keyword itself.
    Good: prod_category = 1 AND prod_color = 'red'
    Bad: WHERE prod_category = 1 AND prod_color = 'red'
  • Query - type in a SQL query. Make sure to name all columns uniquely.
  • Null string - NULL values in string columns are replaced with the string specified here. By default, NULL values appear as empty strings.
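
The reason for leaving out the WHERE keyword can be sketched as follows. This is only an illustration of how a predicate might be attached to a generated query, not the component's actual implementation; the function name append_where and the base query are assumptions.

```python
from typing import Optional


def append_where(base_query: str, predicate: Optional[str]) -> str:
    """Attach an optional predicate to a generated SELECT statement."""
    if predicate is None or not predicate.strip():
        return base_query
    cleaned = predicate.strip()
    # The WHERE keyword is added here, so a predicate that already starts
    # with it would produce "WHERE WHERE ..." and fail to parse.
    if cleaned.split(None, 1)[0].upper() == "WHERE":
        raise ValueError("omit the WHERE keyword from the predicate")
    return f"{base_query} WHERE {cleaned}"


# Hypothetical base query for the table public.products.
base = "SELECT * FROM public.products"
print(append_where(base, "prod_category = 1 AND prod_color = 'red'"))
```

Passing the "Bad" form from the list above raises ValueError instead of producing an invalid query.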

Source Schema

After defining the source table, view, or query, select the fields to use in the source.

With table access mode, the fields you select are used to build the query that will be executed to read the data.
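
A minimal sketch of how selected fields could translate into a query in table access mode. The exact SQL the component generates is not documented here; build_select and quote_identifier are hypothetical helpers, and quoting every identifier is an assumption made for safety.

```python
from typing import Sequence


def quote_identifier(name: str) -> str:
    # Wrap an identifier in double quotes; embedded double quotes are doubled.
    return '"' + name.replace('"', '""') + '"'


def build_select(schema: str, table: str, fields: Sequence[str]) -> str:
    """Build a SELECT statement from the chosen fields."""
    if not fields:
        raise ValueError("select at least one field")
    columns = ", ".join(quote_identifier(field) for field in fields)
    target = quote_identifier(table)
    # An empty schema leaves the table unqualified, so the default schema applies.
    if schema:
        target = f"{quote_identifier(schema)}.{target}"
    return f"SELECT {columns} FROM {target}"


print(build_select("public", "products", ["id", "name"]))
```

Only the fields you select appear in the column list, so deselecting a field means that column is not read at all.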

With query access mode, select all the fields that are defined in the query and make sure to use the same column names.
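
Because query access mode relies on column names matching and being unique, a quick check such as the one below can catch clashes before the job runs. It is a hypothetical helper, and it assumes names are compared case-insensitively, which is Redshift's default behavior for identifiers.

```python
from collections import Counter
from typing import Iterable, List


def duplicate_column_names(names: Iterable[str]) -> List[str]:
    """Return the column names that occur more than once, ignoring case."""
    counts = Counter(name.lower() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


# Two joined tables that both expose an "id" column would clash here.
print(duplicate_column_names(["id", "name", "ID"]))
```

A clash is usually resolved by aliasing one of the columns in the query, for example with AS.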

Define the data type for each field. Use the following table when matching Redshift data types to Integrate.io ETL data types.

Amazon Redshift            Integrate.io ETL
varchar, nvarchar, text    String
smallint, int              Integer
bigint                     Long
decimal, real              Float
double precision           Double
timestamp, date            DateTime
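
The mapping above can also be expressed as a lookup table, which may be handy when preparing field definitions in bulk. This is a sketch only: the dictionary simply restates the table, and integrate_io_type is a hypothetical helper that raises KeyError for any type the table does not list.

```python
# Restates the Redshift to Integrate.io ETL type table above.
REDSHIFT_TO_INTEGRATE_IO = {
    "varchar": "String",
    "nvarchar": "String",
    "text": "String",
    "smallint": "Integer",
    "int": "Integer",
    "bigint": "Long",
    "decimal": "Float",
    "real": "Float",
    "double precision": "Double",
    "timestamp": "DateTime",
    "date": "DateTime",
}


def integrate_io_type(redshift_type: str) -> str:
    """Look up the Integrate.io ETL type for a Redshift column type."""
    # Drop a length or precision suffix such as "(256)" and normalize case.
    base = redshift_type.split("(", 1)[0].strip().lower()
    return REDSHIFT_TO_INTEGRATE_IO[base]


print(integrate_io_type("VARCHAR(256)"))
```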