Welcome to Trifacta® Wrangler Pro!
- Administrators should complete the first section to set up the product for use.
- After setup is complete, individual users should complete the second section to get started using the product.
Step 1 - Begin process
You can begin using the product in either of the following ways, which are described in the following sections:
Option | Description |
---|---|
Register for Free Trial | Sign up for a free trial of the product, which provides limited access to the full product. See below. |
Create Workspace | If you have licensed the full Trifacta Wrangler Pro product, you begin by submitting a request to Trifacta Support. See below. |
Register for Free Trial
To begin the process, an administrator should complete the registration form available here: https://www.trifacta.com/gated-form/free-trial-redshift/.
Limitations:
- 100 Trifacta Compute Units
- 10 users
After you submit the registration form, an email is sent to your provided email address to confirm registration.
NOTE: This process can take up to 24 hours to complete.
Key fields:
Field | Description |
---|---|
Email | This email address will receive a registration email, which contains a link that you must follow to complete registration. |
Current AWS services utilized | Please add a comma-separated list of the AWS services that are currently used by your organization. Example: AWS, S3, EC2, Redshift, VPC |
Primary AWS region | The region you select should be the same as your S3 and Redshift storage locations, if possible. NOTE: If you are integrating with Redshift, the region for your Redshift resources must be in the same location as your default S3 bucket, which is specified later. |
Create Workspace
When you are ready to create your workspace, contact Trifacta Support.
Key considerations:
- Number of workspace members
- Data volumes
- Primary AWS region
After the workspace has been created, an email is sent to your registered address with next steps.
Step 2 - Choose an AWS access mode
Trifacta Wrangler Pro supports the following access modes:
Method | Description |
---|---|
workspace | Access to AWS resources is granted through a single set of credentials, which are configured by an administrator and shared by everyone in the workspace. Tip: This configuration is easiest to manage. After the administrator configures credentials, all invited members can immediately access the product. However, all workspace users have the same permissions, which may be problematic for security reasons. |
per-user | Each user must enter their own configuration settings in the Storage Config page after login. Tip: This method is more secure. However, each user must enter his or her own AWS credentials to access the product, which requires extra steps. These steps are described later for non-admin users. |
Each of the above modes can be managed through one of the following credential methods:
- IAM role that provides access to the designated bucket(s)
Tip: This method is recommended.
- Access Key / Secret Key pair
NOTE: For this method, you should create a new service account. Avoid generating credentials using your existing AWS account, since it grants access to more resources than required by the Trifacta service.
Step 3 - Take note of S3 encryption method (if in use)
Trifacta Wrangler Pro supports the following types of encryption. Check whether any of the following encryption methods is enabled in your S3 environment.
- None
- SSE-S3
- SSE-KMS
NOTE: If some form of S3 encryption is enabled, additional configuration is required. The method of encryption must be provided to the product so that it can communicate with your S3 resources. If per-user authentication is in use, individual users must configure the appropriate setting in their accounts.
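As background on what the encryption setting controls, the following minimal sketch maps the three options listed above to the value that S3 expects in the x-amz-server-side-encryption request header. The function name and dictionary are hypothetical and are not part of the product; the product applies the setting for you once it is configured.

```python
# Illustrative mapping only: names here are hypothetical, not product APIs.
# The header values are the ones S3 uses for each server-side encryption type.
SSE_HEADER_VALUES = {
    "None": None,          # no server-side encryption header is sent
    "SSE-S3": "AES256",    # keys managed by S3
    "SSE-KMS": "aws:kms",  # keys managed by AWS KMS
}


def sse_header(encryption_type: str) -> dict:
    """Return the extra S3 request header for the given encryption type."""
    if encryption_type not in SSE_HEADER_VALUES:
        raise ValueError(f"Unsupported encryption type: {encryption_type}")
    value = SSE_HEADER_VALUES[encryption_type]
    return {} if value is None else {"x-amz-server-side-encryption": value}


if __name__ == "__main__":
    for option in SSE_HEADER_VALUES:
        print(option, sse_header(option))
```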
Step 4 - Create policy to grant access to S3 bucket
To use your own S3 bucket(s) with Trifacta Wrangler Pro, create a policy and assign it to either the user or the IAM role selected to grant access to AWS resources. In this section, you create the policy. It is applied in a later step.
- For more information on creating policies, see https://console.aws.amazon.com/iam/home#/policies.
Below is an example policy template. You should use this template to create the policy.
NOTE: Do not simply reuse a predefined AWS policy or one of your existing policies, as it is likely to grant access to more resources than required.
Template Notes:
- One of the statements grants access to the public demo asset buckets.
- Replace <my_default_S3_bucket> with the name of your default S3 bucket.
- To grant access to multiple buckets within your account, you can extend the resources list to accommodate the additional buckets.
Policy Template
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:DeleteObject",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<my_default_S3_bucket>",
        "arn:aws:s3:::<my_default_S3_bucket>/*"
      ]
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::aws-saas-samples-prod",
        "arn:aws:s3:::aws-saas-samples-prod/*",
        "arn:aws:s3:::aws-saas-datasets",
        "arn:aws:s3:::aws-saas-datasets/*",
        "arn:aws:s3:::3fac-data-public",
        "arn:aws:s3:::3fac-data-public/*",
        "arn:aws:s3:::trifacta-public-datasets",
        "arn:aws:s3:::trifacta-public-datasets/*"
      ]
    }
  ]
}
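If you need to grant access to several of your own buckets, one way to extend the resources list is to generate the policy document rather than edit it by hand. The following is a minimal sketch, not an official tool: the function names and the example bucket names are hypothetical, while the actions and the demo bucket names are copied from the template.

```python
import json

# Public demo asset buckets, copied from the policy template.
DEMO_BUCKETS = [
    "aws-saas-samples-prod",
    "aws-saas-datasets",
    "3fac-data-public",
    "trifacta-public-datasets",
]


def bucket_arns(buckets):
    """Return the bucket ARN and the object-level ARN for each bucket name."""
    arns = []
    for name in buckets:
        arns.append(f"arn:aws:s3:::{name}")
        arns.append(f"arn:aws:s3:::{name}/*")
    return arns


def build_s3_policy(my_buckets):
    """Build the template policy for one or more of your own buckets."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                    "s3:GetBucketLocation",
                ],
                "Resource": bucket_arns(my_buckets),
            },
            {
                "Sid": "VisualEditor1",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": bucket_arns(DEMO_BUCKETS),
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket names; replace them with your own.
    policy = build_s3_policy(["my-default-bucket", "my-second-bucket"])
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy editor in the AWS console. Serializing with the json module also guards against syntax slips such as a missing comma between resource entries.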
Step 5 - Update policy to accommodate SSE-KMS if necessary
If any accessible bucket is encrypted with SSE-KMS, another policy must be deployed. See https://docs.aws.amazon.com/kms/latest/developerguide/iam-policies.html.
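As a rough illustration of what such a policy statement can look like, the sketch below builds a statement that allows use of one KMS key. The exact set of KMS actions that the product requires is not listed in this guide; kms:Decrypt and kms:GenerateDataKey are assumed here because S3 typically needs them to read and write SSE-KMS objects. The function name, Sid, and key ARN are hypothetical. Confirm the final policy against the AWS documentation linked above.

```python
import json


def build_kms_statement(key_arn: str) -> dict:
    """Build a policy statement that allows use of a single KMS key.

    The action list is an assumption, not a documented product requirement.
    """
    return {
        "Sid": "AllowUseOfKmsKey",
        "Effect": "Allow",
        "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
        "Resource": [key_arn],
    }


if __name__ == "__main__":
    # Hypothetical key ARN; replace with the ARN of the key on your bucket.
    statement = build_kms_statement(
        "arn:aws:kms:us-west-2:123456789012:key/example-key-id"
    )
    print(json.dumps({"Version": "2012-10-17", "Statement": [statement]}, indent=2))
```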
Step 6 - Add policy for Redshift access
If you are connecting to Redshift databases through your workspace, you can enable access by creating a GetClusterCredentials policy. This policy is additive to the S3 access policies. All of these policies can be captured in a single IAM role.
Example:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GetClusterCredsStatement",
      "Effect": "Allow",
      "Action": [
        "redshift:GetClusterCredentials"
      ],
      "Resource": [
        "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecluster/${redshift:DbUser}",
        "arn:aws:redshift:us-west-2:123456789012:dbname:examplecluster/testdb",
        "arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecluster/common_group"
      ],
      "Condition": {
        "StringEquals": {
          "aws:userid": "AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@yourdomain.com"
        }
      }
    }
  ]
}
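The three resource ARNs in the example follow one pattern: the region, account ID, and cluster name are shared, and only the resource type and final segment differ. The following sketch, with a hypothetical function name, assembles them for your own cluster so that the values stay consistent. The sample values are the placeholders from the example.

```python
def redshift_credential_arns(region, account_id, cluster, db_name, db_group):
    """Return the dbuser, dbname, and dbgroup ARNs for GetClusterCredentials."""
    prefix = "arn:aws:redshift:" + region + ":" + account_id
    return [
        # ${redshift:DbUser} is an IAM policy variable and is kept literally.
        prefix + ":dbuser:" + cluster + "/${redshift:DbUser}",
        prefix + ":dbname:" + cluster + "/" + db_name,
        prefix + ":dbgroup:" + cluster + "/" + db_group,
    ]


if __name__ == "__main__":
    for arn in redshift_credential_arns(
        "us-west-2", "123456789012", "examplecluster", "testdb", "common_group"
    ):
        print(arn)
```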
For more information on these permissions, see Required AWS Account Permissions.
Step 7 - Whitelist the IP address range of the Trifacta Service, if necessary
If you are enabling any relational source, including Redshift, you must whitelist the IP address range of the Trifacta service in the relevant security groups. The IP address range of the Trifacta service is:
35.245.35.240/28
For Redshift:
There are two ways to whitelist the IP range, depending on whether you are using EC2-VPC or EC2-Classic (not common).
- EC2-VPC (Security group): Add the IP address range to the inbound rule for the security group associated with the cluster. For more information, see https://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-authorize-cluster-access.html#rs-gsg-how-to-authorize-access-vpc-security-group.
- EC2-Classic: Add the IP address range to the inbound rule for the security group associated with the EC2 instance. For more information, see https://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-authorize-cluster-access.html#rs-gsg-how-to-authorize-access-cluster-security-group.
For details on this process with RDS in general, see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.RDSSecurityGroups.html
For more information, please contact Trifacta Support.
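To see which addresses the /28 range covers, or to check whether an address in a connection log belongs to the Trifacta service, you can use the Python standard library. This is a minimal sketch; the function name is hypothetical.

```python
import ipaddress

# IP address range of the Trifacta service, as stated in this step.
TRIFACTA_RANGE = ipaddress.ip_network("35.245.35.240/28")


def is_trifacta_address(ip: str) -> bool:
    """Return True if the address falls inside the Trifacta service range."""
    return ipaddress.ip_address(ip) in TRIFACTA_RANGE


if __name__ == "__main__":
    print("Addresses in range:", TRIFACTA_RANGE.num_addresses)
    print("First:", TRIFACTA_RANGE[0], "Last:", TRIFACTA_RANGE[-1])
    print(is_trifacta_address("35.245.35.250"))
```

A /28 block contains 16 addresses, so the inbound rule covers 35.245.35.240 through 35.245.35.255.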
Step 8 - Storage configuration for the administrator
- Log in to your workspace.
If this is your first login, a message similar to the following is displayed:
Figure: Missing Storage Settings
- Click the here link.
Define AWS access settings
In the AWS Config page, you can specify the following high-level settings to define your AWS access method. These settings were also listed in Step 2.
AWS Mode:
Mode | Description |
---|---|
Workspace | In Workspace mode, the workspace administrator applies a single set of AWS credentials for all users in the workspace. These credentials are used by each member of the workspace to authenticate with AWS and to gain access to S3 buckets. Tip: This mode requires more up-front setup but is easy to manage. However, all members of the workspace have the same set of access controls. |
Per User | In Per User mode, individual members of the workspace must apply their AWS credentials to their accounts. Tip: This mode is easy to set up but turns responsibility for access controls over to the individual members. If members encounter problems gaining access to S3 assets, the workspace administrator may not be able to troubleshoot them. |
Credential Provider: For workspace or per-user mode, the following provider methods can be used to manage authentication with AWS.
Credential Provider | Description |
---|---|
IAM Role | Trifacta Wrangler Pro can use any IAM roles that have been defined for workspace members to access AWS data sources, such as S3 and Redshift. Tip: This credential provider method is recommended. |
AWS Key and Secret | You can apply key and secret combinations to gate access to AWS data sources. These combinations can be applied in workspace mode or in per-user mode by individual members. |
After you have made your selections for the above settings, you can review the following sections, which contain some common configuration workflows. Please populate the settings according to your needs.
Common configurations for Workspace mode
Using an IAM role to grant access to your S3 bucket (Recommended method)
Setting | Value |
---|---|
Mode | Workspace |
Credential Provider | IAM Role |
- Please retain these two key pieces of information from the screen. These pieces of information must be applied in AWS:
- Account ID
- External ID
- Log into your AWS account and create a new IAM Role:
- When you create the role, at the "Select type of trusted entity" prompt, choose "Another AWS account".
- Enter the Account ID that you acquired from the Trifacta screen.
- Select the "Require external ID" checkbox. Enter the External ID provided to you from the Trifacta screen.
- Proceed to the Permissions page.
- Select the policy that you already created.
- Proceed to the Tags page. Enter tags, if desired.
- Proceed to the Review page. Select a name for your role.
- Finish creating the role.
- You must insert a trust relationship in this IAM role. For more information, see Insert Trust Relationship in AWS IAM Role.
- Select IAM>Roles. Select your new role, and copy the Role ARN.
- In the Trifacta screen:
- Paste this value into the "Available IAM Role ARNs" textbox and press ENTER.
- Enter the name of your default S3 bucket.
NOTE: This bucket should already be granted access through the policy that you created.
- Select the encryption type.
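For reference, choosing "Another AWS account" with "Require external ID" in the role-creation steps above results in a trust relationship of the following general shape. This is a sketch of the standard AWS cross-account trust policy, not the product's documented policy; the function name and sample values are hypothetical. If the Insert Trust Relationship in AWS IAM Role page specifies something different, follow that page.

```python
import json


def build_trust_policy(account_id: str, external_id: str) -> dict:
    """Build a cross-account trust policy that requires an external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The account that is allowed to assume this role.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                # The caller must present the matching external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values; use the Account ID and External ID from the Trifacta screen.
    print(json.dumps(build_trust_policy("123456789012", "example-external-id"), indent=2))
```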
Using AWS Access Key and Secret Key to grant access to your S3 bucket
Setting | Value |
---|---|
Mode | Workspace |
Credential Provider | AWS Key and Secret |
- Log into your AWS account and create a new user.
- Select the "Programmatic access" checkbox.
- Proceed to the Permissions page.
- Select the "Attach existing policies directly" checkbox.
- Select the policy that you have already created.
- Proceed to the Tags page. Enter tags, if desired.
- Copy your Access Key and Secret Key.
- Paste these values into the appropriate textboxes in the Trifacta screen.
Common configurations for User mode
- For per-user mode, the administrator must still select the encryption type.
When each user logs in, the user must configure their storage settings.
NOTE: Depending on the required IAM permissions, non-admin users may not be able to complete this configuration without assistance.
Using an IAM role to grant access to your S3 bucket (Recommended method)
Setting | Value |
---|---|
Mode | Per user |
Credential Provider | IAM Role |
- Please retain these two key pieces of information from the screen. These pieces of information must be applied in AWS:
- Account ID
- External ID
- Log into your AWS account and create a new IAM Role:
- When you create the role, at the "Select type of trusted entity" prompt, choose "Another AWS account".
- Enter the Account ID that you acquired from the Trifacta screen.
- Select the "Require external ID" checkbox. Enter the External ID provided to you from the Trifacta screen.
NOTE: In per-user mode, this value is different for each user.
- Proceed to the Permissions page.
- Select the policy that you already created.
- Proceed to the Tags page. Enter tags, if desired.
- Proceed to the Review page. Select a name for your role.
- Finish creating the role.
- You must insert a trust relationship in this IAM role. For more information, see Insert Trust Relationship in AWS IAM Role.
- Select IAM>Roles. Select your new role, and copy the Role ARN.
- In the Trifacta screen:
- Paste this value into the "Available IAM Role ARNs" textbox and press ENTER.
- Enter the name of your default S3 bucket.
NOTE: This bucket should already be granted access through the policy that you created.
Using AWS Access Key and Secret Key to grant access to your S3 bucket
Setting | Value |
---|---|
Mode | Per user |
Credential Provider | AWS Key and Secret |
- Log into your AWS account and create a new user.
- Select the "Programmatic access" checkbox.
- Proceed to the Permissions page.
- Select the "Attach existing policies directly" checkbox.
- Select the policy that you have already created.
- Proceed to the Tags page. Enter tags, if desired.
- Copy your Access Key and Secret Key.
- Paste these values into the appropriate textboxes in the Trifacta screen.
Step 9 - Access Documentation
At this point, you can access online documentation for the product.
NOTE: Content referenced in the PDF guide is not accessible through the PDF. You must log in to the online documentation to access the referenced pages.
Steps:
- From the left navigation bar, select Help menu > Documentation.
- You are automatically logged in.
- PDF content is located in the following pages:
Initial Configuration
Before you invite members to the workspace, you should review and modify the basic configuration for the workspace. See Workspace Settings Page.
Step 10 - Verify Operations
NOTE: Workspace administrators should complete the following steps to verify that the product is operational end-to-end.
Prepare Your Sample Dataset
To complete this test, you should locate or create a simple dataset. Your dataset should be created in the format that you wish to test.
Tip: The simplest way to test is to create a two-column CSV file with at least 25 non-empty rows of data. This data can be uploaded through the application.
Store Your Dataset
If you are testing an integration, you should store your dataset in the datastore with which the product is integrated.
Tip: Uploading datasets is always available as a means of importing datasets.
Verification Steps
Steps:
- Log in to the application. See Login.
- In the application menu bar, click Library.
Tip: When you log in for the first time, you can immediately import a dataset to begin transforming it.
- If options are presented, select the defaults. See Run Job Page.
Checkpoint: You have verified importing from the selected datastore and transforming a dataset. If your job was successfully executed, you have verified that the product is connected to the job running environment and can write results to the defined output location. Optionally, you may have tested profiling of job results. If all of the above tasks completed, the product is operational end-to-end.
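If you do not have a suitable test file, the two-column CSV suggested in the tip for this step can be generated with a short script. This is a minimal sketch; the column names, values, and file name are hypothetical and carry no special meaning for the product.

```python
import csv


def sample_rows(rows=25):
    """Return a header row followed by the requested number of data rows."""
    data = [["id", "value"]]
    for i in range(1, rows + 1):
        data.append([i, f"item_{i}"])
    return data


def write_sample_csv(path, rows=25):
    """Write the sample rows to a CSV file and return the path."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(sample_rows(rows))
    return path


if __name__ == "__main__":
    print("Wrote", write_sample_csv("verification_sample.csv"))
```

The resulting file can be uploaded through the application, or copied to the datastore that you are testing.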
Step 11 - Invite Members
- You can invite other people to join your workspace.
- When members initially join your workspace, they are assigned a non-admin role. Through the Workspace Members page, you can assign roles.
- For more information, see Workspace Users Page.
- If you have enabled per-user authentication, credentials must be provided for each workspace member account:
- Administrators can apply per-user authentication for individual accounts. See Workspace Users Page.
- If individual members need to apply the credentials, the process is the same as for administrators.
- Please share Step 8 (Common configurations for User mode) with them.
- Similar content is also located online: AWS Config Page.
Example Flows
When a new workspace is created, the first user is provided a set of example flows. These flows are intended to teach by example and illustrate many recommended practices for building your own flows. For more information on example flows, see Workflow Basics.
Getting Started for Workspace Members
This section walks through the process of getting started as a new member of a Trifacta Wrangler Pro workspace.
Steps:
You should have received an email like the following:
Figure: Welcome email
- Click the link. If you see a Missing Storage Settings error message, you must provide your individual user storage credentials and default bucket. To do so, click the here link.
- In your Storage Settings page, you may be required to enter your S3 credentials. For more information, see Common configurations for User mode above. After the credentials have been entered, you can begin using the product.
- Access documentation: To access the full customer documentation, from the left nav bar, select Help menu > Documentation.
The following resources can assist workspace members in getting started with wrangling.
Tip: Check out the product walkthrough available through in-app chat! This tour steps through each phase of ingesting, transforming, and generating results for your data.
- For an overview of the product, see Product Overview.
- Check out the Trifacta Community: https://community.trifacta.com
- Try the free Wrangler certification course. See https://community.trifacta.com/s/academywelcome.
- For a basic summary of each step of the wrangling process, see Workflow Basics.