- Snowflake is an S3-based data warehouse service hosted in the cloud. Auto-scaling, automatic failover, and other features simplify the deployment and management of your enterprise's data warehouse. For more information, see https://www.snowflake.com.
- S3 base storage layer: Snowflake access requires installation of Trifacta software in the AWS infrastructure and use of S3 as the base storage layer, which must be enabled. See Set Base Storage Layer.
Integration: Your Trifacta instance is connected to an EMR cluster. See Configure for EMR.
- Deployment: Trifacta platform is deployed in EC2.
- PUBLIC schema: If you do not create an external staging database:
PUBLICschema is required in your default database.
- If you do not provide a stage database, then a temporary stage is created for you under the
PUBLICschema in the default database.
- S3 bucket: The user-created stage must point to the same S3 bucket as the default bucket in use by Trifacta Wrangler Enterprise.
- Same region: The Snowflake cluster must be in the same region as the default S3 bucket.
- IAM role requirements: If you are accessing AWS and Snowflake using IAM roles, please verify that the appropriate permissions have been assigned to the role to access Snowflake and its backing S3 buckets. For more information, see Required AWS Account Permissions.
- Staging database: Snowflake supports the use of a stage for reading and writing data to S3 during job executions.
NOTE: If a stage is not deployed, then the user must have write permissions to the default database, which is used instead for staging your data in Snowflake. These permissions must be included in the AWS credentials applied to the user account.
For more information, see Enable Snowflake Connections.
Pre-requisites for OAuth 2.0
If you are connecting to your Snowflake deployment using OAuth 2.0 authentication, additional configuration is required:
- OAuth 2.0 must be enabled and configured for use in the product. For more information, see Enable OAuth 2.0 Authentication.
- OAuth 2.0 requirements:
- Create a security integration in your Snowflake deployment.
- Create an OAuth 2.0 client in the Trifacta application that connects using the security integration.
- For more information, see OAuth 2.0 for Snowflake.
- You cannot perform ad-hoc publication to Snowflake.
- SSO connections are not supported.
You can create Snowflake connections through the following methods.
Create through application
Any user can create a Snowflake connection through the application.
- Login to the application.
- In the left nav bar, click the Connections icon.
- In the Create Connection page, click the Snowflake connection card.
Specify the properties for your Snowflake database connection. The following parameters are specific to Snowflake connections:
NOTE: In Snowflake connections, property values are case-sensitive. Snowflake-related locations are typically specified in capital letters.
Snowflake account to use. Suppose your hostname is the following:
Your account name is the following:
NOTE: Your full account name might include additional segments that identify the region and cloud platform where your account is hosted.
The name of the warehouse to use when connected. This value can be an empty string.
If specified, the warehouse should be an existing warehouse for which the default role has privileges.
If you have deployed a Snowflake stage for managing file conversion to tables, you can enter its name here. A stage is a database object that points to an external location on S3. It must be an external stage containing access credentials.
If a stage is used, then this value is typically the schema and the name of the stage. Example value:
If a stage is not specified, a temporary stage is created using the current user's AWS credentials.
NOTE: Without a defined stage, you must have write permissions to the database from which you import. This database is used to create the temporary stage.
For more information on stages, see https://docs.snowflake.net/manuals/sql-reference/sql/create-stage.html.
Select the type of credentials to provide with the connection:
|Database for Stage|
(optional) If you are using a Snowflake stage, you can specify a database other than the default one to host the stage.
NOTE: If you are creating a read-only connection to Snowflake, this field is required. The accessing user must have write permission to the specified database.
If no value is specified, then your stage must be in the default database.
For more information, see Create Connection Window.
Create via API
For more information, see https://api.trifacta.com/ee/8.2/index.html#operation/createConnection
The properties that you provide are inserted into the following URL, which connects Trifacta Wrangler Enterprise to the connection:
<database>= name of the default database to which to connect. This value can be empty.
Connect string options
The connect string options are optional. If you are passing additional properties and values to complete the connection, the connect string options must be structured in the following manner:
<prop>: the name of the property
<val>: the value for the property
&: any set of connect string options must begin with an ampersand (
=: property names and values must be separated with an equal sign (
Disable SSL connections
By default, connections to Snowflake use SSL. To disable, please add the following string to your Connect String Options:
Connect through proxy
If you require connection to Snowflake through a proxy server, additional Connect String Options are required. For more information, see https://docs.snowflake.net/manuals/user-guide/jdbc-configure.html#specifying-a-proxy-server-in-the-jdbc-connection-string.
This connection uses the following driver:
- Driver version:
- Driver documentation: https://docs.snowflake.com/en/user-guide/jdbc.html
|Null values in some columns for all rows|
When there are spaces/special characters in columns names, null values can be inserted for all rows in the column. The workaround is to remove any special characters and spaces from column names.
For more information, see Snowflake Browser.
For more information about interacting with data on Redshift, see Using Snowflake.
Import a dataset from Snowflake. Add it to a flow, and specify a publishing action back to Snowflake. Run a job.
For more information, see Verify Operations.
This page has no comments.