This section describes how to enable connectivity and create one or more connections to Amazon Redshift.
- Read: Supported
- Write: Supported
Before you begin, verify that your environment meets the following requirements:
NOTE: If you are connecting to any relational source of data, you must add the appropriate IP addresses to your whitelist for those resources.
Access to Redshift requires:
- Each user must be able to access S3
- S3 is the base storage layer
If the credentials used to connect to S3 do not provide access to Redshift, you can create an independent IAM role to provide access from Redshift to S3. If this separate role is available, the connection uses it instead.
NOTE: There may be security considerations with using an independent role to govern this capability.
- The IAM role must contain the required S3 permissions. See Required AWS Account Permissions.
- The cluster should be assigned this IAM role. For more information, see https://docs.aws.amazon.com/redshift/latest/mgmt/authorizing-redshift-service.html.
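The authoritative permission list is in Required AWS Account Permissions. As an illustrative sketch only, an S3 policy attached to such an IAM role typically needs actions along these lines (the bucket name is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
      "Resource": "arn:aws:s3:::your-staging-bucket"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::your-staging-bucket/*"
    }
  ]
}
```

Confirm the exact actions against Required AWS Account Permissions before applying a policy.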
- You can publish any specific job once to Redshift through the export window. See Publishing Dialog.
- The cluster with which you are integrating must be hosted in a public subnet.
- When publishing to Redshift through the Publishing dialog, output must be in Avro or JSON format. This limitation does not apply to direct writing to Redshift.
- Management of nulls:
- Nulls are displayed as expected in the application.
- When jobs are run, the UNLOAD SQL command in Redshift converts all nulls to empty strings. Null values appear as empty strings in generated results, which can be confusing. This is a known issue with Redshift.
- No schema validation is performed as part of writing results to Redshift.
- Credentials and permissions are not validated when you are modifying the destination for a publishing job.
- For Redshift, no validation is performed to determine whether the target is a view; views are not a supported target.
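The empty-string behavior noted above comes from UNLOAD's default serialization: for delimited output, null values are unloaded as zero-length strings. A sketch of a workaround using UNLOAD's NULL AS option (the table, bucket, and role names are placeholders):

```sql
-- Without NULL AS, nulls in delimited output are written as empty strings.
UNLOAD ('SELECT id, name FROM example_table')
TO 's3://example-staging-bucket/exports/example_table_'
IAM_ROLE 'arn:aws:iam::111122223333:role/example-redshift-role'
NULL AS '\\N';  -- write nulls as \N so they remain distinguishable from ''
```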
You can create Redshift connections through the following methods.
Tip: SSL connections are recommended. Details are below.
Create connection through application
Any user can create a Redshift connection through the application.
- Login to the application.
- In the menu, click the Connections icon.
- In the Create Connection page, click the Redshift connection card.
Specify the properties for your database connection:
- Host: Hostname of the Redshift cluster. NOTE: This value must be the full hostname of the cluster, which may include region information.
- Port: Port number used to access the Redshift cluster. Default is 5439.
- Connect String Options: Insert any connection options as a string here. See below.
- Database: The Redshift database to which to connect on the cluster.
- Credential Type: Options:
  - Basic authentication with optional IAM role ARN: Basic authentication credentials specified in this window are used to connect to the database. Additional permissions may be governed by any ARN specified in the IAM role used for the account. Use this option if you plan to specify a database username/password combination as part of the connection.
  - IAM Role: Access is governed by the IAM role associated with the user's account.
- Username: Username with which to connect to the database.
- Password: Password associated with the username.
- IAM Role ARN for Redshift/S3 connectivity: (Optional) You can specify an IAM role ARN that enables role-based connectivity between Redshift and the S3 bucket that is used as intermediate storage during bulk COPY/UNLOAD operations. Example format: arn:aws:iam::<account-id>:role/<role-name>
For more information on the other options, see Create Connection Window.
Enable SSL connections
To enable SSL connections to Redshift, you must first enable them on your Redshift cluster. For more information, see https://docs.aws.amazon.com/redshift/latest/mgmt/connecting-ssl-support.html.
In your connection to Redshift, add the following string to your Connect String Options:
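The exact option string did not survive in this copy of the page. With the Amazon Redshift JDBC driver, SSL is commonly enabled with the ssl option; following the semicolon convention described below, that would look like the fragment here, but verify the required string against your driver documentation:

```
;ssl=true;
```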
Save your changes.
The properties that you provide are inserted into the following URL, which is used to establish the connection:
Connect string options
The connect string options are optional. If you are passing additional properties and values to complete the connection, the connect string options must be structured in the following manner:
- `<prop>`: the name of the property
- `<val>`: the value for the property
- `;`: any set of connect string options must begin and end with a semi-colon
- `;`: all additional property names must be prefixed with a semi-colon
- `=`: property names and values must be separated with an equal sign (`=`)
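As a quick sanity check of the format above, here is a minimal sketch in Python that assembles options into that structure (the option names shown are placeholders, not options the platform requires):

```python
def build_connect_string(options):
    """Assemble connect string options as ';prop1=val1;prop2=val2;'.

    Each property is prefixed with a semi-colon, names and values are
    separated by '=', and the whole string begins and ends with ';'.
    """
    if not options:
        return ""
    return ";" + ";".join(f"{prop}={val}" for prop, val in options.items()) + ";"

print(build_connect_string({"ssl": "true", "tcpKeepAlive": "true"}))
# -> ;ssl=true;tcpKeepAlive=true;
```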
Access through AWS key-secret
The following example connection URL uses an AWS key/secret combination (IAM user) to access Redshift:
- `<redshift_clustername>`: the name of the cluster
- `<region_name>`: region identifier where the cluster is located
- `<port_number>`: port number to use to access the cluster
- `<database_name>`: name of the Redshift database to which to connect
- `<access_key_value>`: identifier for the AWS key
- `<secret_key_value>`: identifier for the AWS secret
- `<database_user_name>`: user identifier for connecting to the database
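The example URL itself was not preserved in this copy. As an illustrative sketch only, assuming the Amazon Redshift JDBC driver's `jdbc:redshift:iam://` scheme and the semicolon-delimited option convention described above, the placeholders might be assembled like this (cluster name, region, and credential values here are hypothetical):

```python
# Hypothetical placeholder values; substitute your own.
cluster = "examplecluster"
region = "us-west-2"
port = 5439
database = "dev"
access_key = "<access_key_value>"
secret_key = "<secret_key_value>"
db_user = "<database_user_name>"

# Assumed URL template; verify against your driver documentation and the
# connection actually generated by the application.
url = (
    f"jdbc:redshift:iam://{cluster}.{region}.redshift.amazonaws.com:{port}/{database}"
    f";AccessKeyID={access_key};SecretAccessKey={secret_key};DbUser={db_user};"
)
print(url)
```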
Access through IAM role and temporary credentials
The following example connection URL uses an AWS key/secret combination with temporary credentials:
This connection uses the following driver:
API: API Reference
For more information, see https://docs.aws.amazon.com/redshift/latest/mgmt/troubleshooting-connections.html.
Import a dataset from Redshift. Add it to a flow, and specify a publishing action. Run a job.
NOTE: When publishing to Redshift through the Publishing dialog, output must be in Avro or JSON format. This limitation does not apply to direct writing to Redshift.
Using Redshift Connections
Uses of Redshift
You can use Redshift for the following tasks:
- Create datasets by reading from Redshift tables.
- Write job results to Redshift tables.
- Publish data ad hoc to Redshift.
Before you begin using Redshift
- Read Access: Your Redshift administrator must configure read permissions. Your administrator should provide a database for uploads to your Redshift datastore.
- Write Access: You can write and publish job results to Redshift.
- SSL is required.
Storing data in Redshift
Your Redshift administrator should provide database access for storing datasets. Users should know where shared data is located and where personal data can be saved without interfering with or confusing other users.
NOTE: Source data in Redshift is not modified. Datasets sourced from Redshift are read without modification from their source locations.
Reading from Redshift
You can create a dataset from a table or view stored in Redshift.
NOTE: The Redshift cluster must be in the same region as the default S3 bucket.
NOTE: If a Redshift connection has an invalid iamRoleArn, you can browse, import datasets, and open the data in the Transformer page. However, any jobs executed using this connection fail. If the iamRoleArn is invalid, the only samples that you can generate are Quick Random samples; other sampling jobs fail.
For more information, see Database Browser.
Writing to Redshift
NOTE: You cannot publish to a Redshift database that is empty. The database must contain at least one table.
You can write back data to Redshift using one of the following methods:
Data validation issues:
- No validation of the connection or of any required permissions is performed during job execution, so you may be permitted to launch a job even if you do not have sufficient connectivity or permissions to access the data. The corresponding publish job then fails at runtime.
- Prior to publication, no validation is performed on whether the target is a table or a view, so a job launched against a view fails at runtime.
Supported Versions: n/a