This section describes how you interact with your Redshift data warehouse.
You can use Redshift for the following tasks:
Write to Redshift tables with your job results.
Ad-hoc publication of data to Redshift.
Enable S3 Sources: Redshift integration requires the following:
Read Access: Your Redshift administrator must configure read permissions. Your administrator should provide a database for upload to your Redshift datastore.
Write Access: You can write and publish job results to Redshift.
SSL is required.
Your Redshift administrator should provide database access for storing datasets. Users should know where shared data is located and where personal data can be saved without interfering with or confusing other users.
You can create a dataset from a table or view stored in Redshift.
NOTE: The Redshift cluster must be in the same region as the default S3 bucket.
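The region requirement above can be checked before a connection is created. The following is a minimal sketch, assuming the cluster uses the standard provisioned endpoint form `<cluster>.<id>.<region>.redshift.amazonaws.com` and that you already know the region of the default S3 bucket. The function names `cluster_region` and `same_region` are hypothetical helpers, not part of the product.

```python
from typing import Optional


def cluster_region(endpoint: str) -> Optional[str]:
    """Extract the AWS region from a Redshift cluster endpoint.

    Assumes the standard provisioned-cluster host form:
    <cluster>.<id>.<region>.redshift.amazonaws.com[:port]
    Returns None for any host that does not match that shape
    (for example, serverless or China-partition endpoints).
    """
    host = endpoint.split(":", 1)[0].lower()  # drop an optional :port suffix
    parts = host.split(".")
    if len(parts) >= 6 and parts[-3:] == ["redshift", "amazonaws", "com"]:
        return parts[-4]
    return None


def same_region(endpoint: str, bucket_region: str) -> bool:
    """Return True only if the cluster region is known and equals the bucket region."""
    region = cluster_region(endpoint)
    return region is not None and region == bucket_region.lower()


if __name__ == "__main__":
    # Example endpoint is illustrative only.
    ep = "examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439"
    print(cluster_region(ep), same_region(ep, "us-west-2"))
```

An endpoint that does not match the assumed form yields `None`, so the check fails closed rather than guessing a region.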
NOTE: If a Redshift connection has an invalid iamRoleArn, you can browse, import datasets, and open the data in the Transformer page. However, any jobs executed using this connection fail. If the iamRoleArn is invalid, the only samples that you can generate are Quick Random samples; other sampling jobs fail.
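Because an invalid iamRoleArn only surfaces when a job runs, it can help to catch obvious mistakes earlier. The sketch below checks only the syntactic shape of an IAM role ARN (`arn:<partition>:iam::<12-digit account ID>:role/<name>`). It does not confirm that the role exists or that Redshift can assume it. The function name `looks_like_iam_role_arn` is a hypothetical helper.

```python
import re

# Syntactic shape of an IAM role ARN, with an optional path before the role name.
# This is a format check only; it cannot tell whether the role actually exists.
_IAM_ROLE_ARN = re.compile(
    r"arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[A-Za-z0-9_+=,.@\-/]+"
)


def looks_like_iam_role_arn(value: str) -> bool:
    """Return True if value has the general form of an IAM role ARN."""
    return _IAM_ROLE_ARN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    # Account ID and role name below are placeholders.
    print(looks_like_iam_role_arn("arn:aws:iam::123456789012:role/redshift-copy-role"))
    print(looks_like_iam_role_arn("arn:aws:s3:::my-bucket"))
```

A value that passes this check can still be rejected at job time, for example if the role lacks the required permissions.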
For more information, see Redshift Browser.
NOTE: You cannot publish to a Redshift database that is empty. The database must contain at least one table.
You can write back data to Redshift using one of the following methods:
As needed, you can publish results to Redshift for previously executed jobs.
NOTE: You cannot re-publish results to Redshift if the original job published to Redshift. However, if the dataset was transformed but publication to Redshift failed, you can publish from the Publishing dialog.
NOTE: To publish to Redshift, the source results must be in Avro or JSON format.
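The publishing restrictions in the notes above can be summarized as a pre-check. This is a minimal sketch for illustration only: `JobResult`, its fields, and `publish_blockers` are hypothetical names, and the caller is assumed to supply the number of tables in the target database.

```python
from dataclasses import dataclass
from typing import List

# Result formats that the notes above list as publishable to Redshift.
PUBLISHABLE_FORMATS = {"avro", "json"}


@dataclass
class JobResult:
    """Hypothetical summary of a completed job (not a product data type)."""
    output_format: str            # e.g. "avro", "json", "csv"
    published_to_redshift: bool   # True if the original job already published


def publish_blockers(result: JobResult, target_table_count: int) -> List[str]:
    """Return the reasons an ad-hoc publish would be refused; empty means none found."""
    blockers = []
    if result.output_format.lower() not in PUBLISHABLE_FORMATS:
        blockers.append("source results must be in Avro or JSON format")
    if result.published_to_redshift:
        blockers.append("original job already published to Redshift")
    if target_table_count < 1:
        blockers.append("target database must contain at least one table")
    return blockers


if __name__ == "__main__":
    print(publish_blockers(JobResult("csv", False), target_table_count=0))
```

Returning a list of reasons rather than a single boolean lets a caller report every failed condition at once.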
Data Validation issues: