Amazon Redshift

Connection Type

ODBC (64-bit)

Type of Support

Read & Write, In-Database

Validated On

Database Version: 1.0.35649

ODBC Client Version: v2

For more information about the Simba Athena ODBC driver, see the Simba ODBC documentation.

Note

To avoid error when saving your workflow to Server, select the Encrypt Password For: All Users of This Machine checkbox in Simba Amazon Redshift ODBC Driver DSN Setup.

Alteryx Tools Used to Connect

Standard Workflow Processing

Input Data ToolInput Data Tool

Output Data ToolOutput Data Tool

In-database Workflow Processing

Configure an ODBC Connection

In the ODBC Data Source Administrator...

Select the Redshift driver and select Configure.
Enter your ConnectionSettings and credentials.
In the AdditionalOptions area, select the Retrieve Entire Results Into Memory option.
Note
This setting would fetch the entire dataset into the physical memory. If the physical memory is low, this setting is subjected to change based on the data volume and the available physical memory, and you may need to involve your DBA for a recommended setting.
Select OK to save the connection.

Configure an Amazon Redshift Bulk Connection

To use the bulk connection via the Output Data tool...

Select the Write to File or Database dropdown and select Other Databases > Amazon Redshift Bulk.
Select a Data Source Name (or select ODBC Admin to create one). See ODBC and OLEDB Database Connections.
(Optional) Enter a User Name and Password.
In the AmazonS3 section, enter or paste your AWS Access Key and AWS Secret Key to access the data for upload.
In the Secret Key Encryption dropdown, select an encryption option:
- Hide: Hide the password using minimal encryption.
- Encrypt for Machine: Any user on the computer is able to fully use the connection.
- Encrypt for User: The signed-in user can use the connection on any computer.
In the Endpoint dropdown, select Default to allow Amazon to determine the endpoint automatically based on the bucket you select. To specify an endpoint for private S3 deployments, or if you know a specific bucket region, you can alternately select an endpoint (S3 region), enter a custom endpoint, or select from one of ten previously-entered custom endpoints.
If the Bucket you select is not in the region of the endpoint you specify, this error occurs: "The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint." Select Default to clear the error.
(Optional) Select Use Signature V4 for Authentication to use Signature Version 4 instead of the default Signature Version 2. This increases security, but connection speeds might be slower. This option is automatically enabled for regions requiring Signature Version 4.
Regions That Require Signature Version 4: Regions created after January 30, 2014 support only Signature Version 4. These regions require Signature Version 4 authentication:
- US East (Ohio) Region
- Canada (Central) Region
- Asia Pacific (Mumbai) Region
- Asia Pacific (Seoul) Region
- EU (Frankfurt) Region
- EU (London) Region
- China (Beijing) Region
Select a Server-SideEncryption method for uploading to an encrypted Amazon S3 bucket. For more information on Amazon S3 encryption methods, see the Amazon Simple Storage Service Developer Guide.
- None (Default): No encryption method is used.
- SSE-KMS: Use server-side encryption with AWS KMS-managed keys. Optionally provide a KMSKeyID. When you select this method, Use Signature V4 for Authentication is enabled by default.
In Bucket Name, enter the name of the AWS bucket in which your data objects are stored.

Optionally select Use Redshift Spectrum to connect to Spectrum tables.

Field mapping

When bulk loading data to Amazon Redshift, data is written to incorrect fields when order of fields in workflow output is different from the one in the Redshift database. To workaround this:

Select the Append Field Map option in the Output Data tool to configure it, even if you don't change the default setting. In the workflow XML of the Output Data tool, this will populate the <AppendMapping mode="ByName" /> tag.
Change Output Option to Overwrite Table (Drop).

Configure Output Options

You can optionally specify or adjust the following Redshift options. For more information, see the Amazon Redshift Database Developer Guide.

Note

To create Spectrum tables with the Output Data tool, specify both the schema and table name.

spectrum_schema.tablename

Distribution Key is ignored if 'Key' is not selected for Distribution Style. Sort Key is ignored if 'None' is selected for Sort Style.

PrimaryKey: Select columns for the Primary Key and adjust the order of columns.
DistributionStyle: Select Even, Key, or All.
DistributionKey: Select a column for the Distribution Key.
SortStyle: Select None, Compound, or Interleaved.
SortKey: Select columns for the Sort Key and adjust the order of columns.
Enable Vacuum and Analyze Operations: (Bulk connections only) Enabled by default. When enabled, VACUUM and ANALYZE maintenance commands are executed after a bulk load APPEND to the Redshift database.
Size of Bulk Load Chunks (1 MB to 102400 MB): To increase upload performance, large files are split into smaller files with a specified integer size, in megabytes. The default value is 128.
Enable backslash (\) as escape character: (Bulk connections only) Enabled by default. When enabled, a character that immediately follows a backslash character is loaded as column data, even if that character normally is used for a special purpose (for example, delimiter character, quotation mark, embedded newline character, or escape character).

Important

An identifiers are folded to lowercase in the database. In query results, tables and column names are returned as lowercase by default. For more information, see Amazon Names and Identifiers documentation.

Known Issues

When inserting a record into an Amazon Redshift table with a column containing an attribute IDENTITY(seed, step), the value of the first column to be inserted is null instead of the value being passed into the Output Data tool.
Redshift Server does not support executing multiple queries in a single prepared statement.