Connection Setup

Connect Integrate.io ETL to your Redshift cluster to write data to it or to read data from it. Watch this short video walkthrough or read below for step-by-step instructions.

In Redshift console

1
Start by navigating to the Redshift console —> Clusters screen and clicking on the Properties tab.
Redshift cluster Properties tab in the AWS console
2
Scroll down to the Network and Security section. Make sure that the cluster's Publicly Accessible setting is set to Yes. Then, click on the VPC Security Group to verify and/or modify the security rules.
Network and Security section with Publicly Accessible setting and VPC Security Group
3
In the Security Group screen, select the Inbound tab.
Security Group Inbound rules tab
4
There should be rules for the IP addresses listed here. If those rules need to be altered or don’t exist, click Edit.
Inbound rules with IP address entries for Integrate.io access
5
Edit any existing rules or click Add Rule to add a new rule. For each rule, select the type of database and enter the Redshift port. Then click Save.
Adding a new inbound rule with database type and Redshift port
To configure your cluster to accept only SSL-encrypted connections, first access the parameter group and edit it to set require_ssl to true.
Redshift parameter group with require_ssl set to true
Then, navigate to Config —> Workload management. If you created the cluster with the default parameter group, create a new parameter group and modify the cluster to associate it with that parameter group: click Edit on the cluster homepage, then go to Database Configurations to associate the parameter group with the current cluster.
Associating a parameter group with the Redshift cluster
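After the cluster restarts with require_ssl in effect, you can spot-check that new sessions actually negotiate SSL by querying Redshift's connection log (a system table visible to superusers):

```sql
-- sslversion is empty for non-SSL connections.
SELECT recordtime, username, sslversion
FROM stl_connection_log
WHERE event = 'authenticated'
ORDER BY recordtime DESC
LIMIT 10;
```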

To verify or modify the security rules of an instance on EC2-Classic (without VPC)

In your Redshift Cluster Security Group, modify a rule or add a new rule for each IP address listed here:
EC2-Classic Security Group with CIDR/IP authorization fields
  • In the Connection Type dropdown, choose CIDR/IP.
  • In the CIDR/IP to Authorize field, enter the IP addresses from this list.

Create a Redshift user

  • Create a Redshift user.
  • Grant it the following permissions:
    • If you intend to only append data into a table, give the user minimal permissions required to execute the COPY command.
    • If you intend to merge data into a table, give the user minimal permissions required to execute the COPY command, create a table, and insert and update to your target tables.
    • Note that TRUNCATE requires Integrate.io ETL to be either the owner of the target table or a superuser.
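The user and grants above can be sketched in SQL. The user name (integrateio), schema, and target table below are hypothetical placeholders; adjust them to your environment:

```sql
-- Hypothetical user name and target table.
CREATE USER integrateio PASSWORD '<StrongPassword1>';

-- Append-only loads: COPY needs INSERT on the target table
-- (SELECT is useful for validating loaded data).
GRANT SELECT, INSERT ON public.target_table TO integrateio;

-- Merge loads additionally need to create staging tables in the schema
-- and to update/delete rows in the target table.
GRANT CREATE, USAGE ON SCHEMA public TO integrateio;
GRANT UPDATE, DELETE ON public.target_table TO integrateio;
```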

To define a connection in Integrate.io ETL to Amazon Redshift

1
Click Connections (the lightning bolt) on the left side of the dashboard.
2
Click New connection.
Connections page with New connection button
3
Select Amazon Redshift from the Analytical Database category.
Selecting Amazon Redshift from the Analytical Database category
4
Type a name for the connection.
5
If you allow direct access from Integrate.io ETL’s IP addresses, enter the hostname and port. If direct access is not allowed, read more about setting a tunnel connection here.
6
Enter the user name and password you created for Integrate.io ETL to use.
7
Enter the default database to use. If you leave it empty, the user’s default database will be used.
8
Set the region to the AWS region in which the Redshift cluster was created. If the region requires AWS Signature v4 (see list here) you may need our support team’s help with allowing Integrate.io ETL access to read from this Amazon Redshift connection.
9
Click Test connection. If the credentials are correct, a message appears confirming that the connection test was successful.
10
Click Create connection. The connection is created and appears in the list of connections.
Completed Redshift connection form with test confirmation
Newly created Redshift connection in the connections list

Use the Amazon Redshift source component to read data stored in an Amazon Redshift table or view, or returned by a query. The source component uses Amazon Redshift's UNLOAD statement to export data to files in Amazon S3 and then reads those files.
Amazon Redshift source component in the pipeline designer
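For reference, the statement the component issues is conceptually similar to the sketch below. The query, S3 path, and IAM role are placeholders, and the exact statement Integrate.io ETL generates may differ:

```sql
UNLOAD ('SELECT prod_id, prod_category, prod_color FROM public.sales')
TO 's3://example-bucket/unload/sales_'
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole'
DELIMITER ','
ALLOWOVERWRITE;
```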

Connection

Select an existing Amazon Redshift connection or create a new one (for more information, see Allowing Integrate.io ETL access to my Redshift cluster).

Source Properties

Redshift source properties with access mode, schema, table, and where clause
  • Access mode - select table to extract an entire table/view or query to execute a query.
  • Source schema - the source table’s schema. If empty, the default schema is used.
  • Source table/view - the table or view name from which the data will be imported.
  • Where clause - optional. You can add predicates to the WHERE clause of the SQL query that is built to read the data from the database. Make sure to omit the keyword WHERE.
    Good: prod_category = 1 AND prod_color = 'red'
    Bad: WHERE prod_category = 1 AND prod_color = 'red'
  • Query - type in a SQL query. Make sure to name all columns uniquely.
  • Null string - NULL values in string columns are replaced with the string specified here. By default, NULL values appear as empty strings.
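In query access mode, naming every column uniquely matters most when the query joins tables that share column names. A hypothetical example (table and column names are illustrative):

```sql
-- Alias columns so every output column name is unique.
SELECT o.id   AS order_id,
       c.id   AS customer_id,
       c.name AS customer_name
FROM orders o
JOIN customers c ON c.id = o.customer_id;
```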

Source Schema

Source schema with field selection and data type mapping
After defining the source table/view/query, select the fields to use in the source. With table access mode, the fields you select are used to build the query that reads the data. With query access mode, select all the fields that are defined in the query and make sure to use the same column names. Then define the data type for each field. Use the following table when matching Redshift data types to Integrate.io ETL data types.
Amazon Redshift            Integrate.io ETL
varchar, nvarchar, text    String
smallint, int              Integer
bigint                     Long
decimal, real              Float
double precision           Double
timestamp, date            DateTime
Last modified on April 20, 2026