This section applies to getting started with Trifacta® SaaS, an AWS-native platform for data wrangling. The following product tiers are available:
- Trifacta Premium
- Trifacta Standard
Trifacta® SaaS enables you to rapidly ingest, transform, and deliver clean, actionable data across your entire enterprise. Please review the following sections on how to prepare for and set up your Trifacta SaaS workspace.
NOTE: This section applies to both the free version and the licensed version of Trifacta SaaS. For more information on the differences, see Product Limitations.
This section provides an overview of how to get started using the product.
- Administrators should complete the first section to set up the product for use.
- After setup is complete, individual users should complete the second section to get started using the product.
- Before you begin. You should prepare your AWS S3 buckets and access policies to ensure that Trifacta SaaS can integrate with them.
NOTE: If you do not have these AWS resources, they can be created for you. Details are below.
- Technical setup: Please share the technical setup section with your S3 administrator.
- Register. Complete the simple online workflow to license and create your Trifacta SaaS workspace.
- Workspace setup. Before you invite other users to your workspace, you should complete a few setup steps.
- Invite users. If you intend to share the workspace with other users, you can invite them from within it.
- Wrangle away!
Before You Begin
Hosted on Amazon Web Services, Trifacta SaaS is designed to natively interact with all of your AWS datasources, so that you can rapidly transform your data investments in AWS.
Below are the AWS objects that are required for setup.
Tip: If you do not have immediate access to these assets, some can be created as part of the workflow setup.
To create these objects as part of the setup process, you must have an AWS account. For more information, see https://aws.amazon.com/.
|Valid email address||Y||To validate your registration for a new workspace, you must have a valid email address to which the product can deliver the registration email.|
|Choice: cross-account role access or key-secret access||Y|
To integrate with your existing S3 resources, you must choose a method of authentication. Choices:
An IAM (Identity and Access Management) policy is an AWS resource used to define the low-level permissions for access to a specific resource. During setup, you can use or create a new IAM policy for the product to use for either access method.
For more information, see "Create policy to grant access to S3 bucket" below.
|cross-account role access: IAM role||Y|
An IAM role contains one or more IAM policies that can be used to define the set of available AWS services and the level of access to them for a user. In this case, the user is the Trifacta application.
|key-secret access: AWS key-secret||Y||An older AWS access method, the key-secret combination is essentially a username and password combination to one or more S3 buckets.|
|S3 bucket||Y||S3 (Simple Storage Service) is a cloud-based file storage system hosted in AWS. An S3 bucket contains your data files and their organizing folders.|
|S3 bucket: encryption||N|
For better security, your S3 bucket may be encrypted, which means that the data is stored inside of S3 in a way that is not human-readable.
NOTE: The product can optionally integrate with encrypted S3 buckets. The following S3 encryption methods are supported: sse-s3 and sse-kms.
NOTE: If your bucket is encrypted with sse-kms, additional configuration is required. See "Update policy to accommodate SSE-KMS if necessary" below.
For more information on your bucket's encryption, please contact your S3 administrator.
|S3 bucket: storage location||N|
If needed, you can change the location where results are stored in S3.
NOTE: The product must have write permission to this location. If you are changing the location from the default, please verify with your S3 administrator that the preferred location is enabled for writing through your access method.
|Workspace name||Y||During setup, you must create a unique workspace identifier. This identifier cannot contain spaces or special characters.|
|IAM role: Account ID||N|
The account ID identifies, in the trust policy, the Trifacta AWS account that is permitted to use your IAM role.
Tip: This identifier is provided to you during registration and setup.
|IAM role: External ID||N|
The external ID identifies in the trust policy that Trifacta SaaS can use your IAM role only on your behalf.
Tip: This identifier is provided to you during registration and setup.
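As an illustration of how the account ID and external ID fit together, the following is a minimal sketch of a cross-account trust policy built in Python. It follows the general AWS pattern for third-party role access and is not taken from the Trifacta setup workflow. The function name and the placeholder identifiers are hypothetical; use the values provided to you during registration.

```python
import json


def build_trust_policy(account_id: str, external_id: str) -> dict:
    """Build a trust policy that lets one AWS account assume the role,
    but only when it presents the expected external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The account that is allowed to assume your IAM role.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                # The external ID restricts use of the role to requests
                # made on your behalf.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values only (assumptions for illustration).
    policy = build_trust_policy("123456789012", "example-external-id")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be reviewed by your AWS administrator before it is attached to the role as its trust relationship.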
The following sections should be provided to your AWS administrator for setting up access to these resources, if required.
Create policy to grant access to S3 bucket
To use your own S3 bucket(s) with Trifacta SaaS, create a policy and assign it to either the user or IAM Role selected to grant access to AWS resources. In this section, you create the policy. Later, it will be applied.
- For more information on creating policies, see https://console.aws.amazon.com/iam/home#/policies.
Below is an example policy template. You should use this template to create the policy.
NOTE: You should not simply use one of the predefined AWS policies or an existing policy you have as it will likely give access to more resources than required.
- One of the statements grants access to the trifacta-public-datasets bucket, which contains resources used for the onboarding tour.
- Replace <my_default_S3_bucket> with the name of your default S3 bucket.
- To grant access to multiple buckets within your account, you can extend the resources list to accommodate the additional buckets.
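The notes above can be sketched in Python as a small policy generator. This is not the official policy template: the specific S3 actions listed here are an assumption about what a read/write integration typically needs, and the function name is hypothetical. Only the trifacta-public-datasets bucket name comes from this page. Confirm the final policy against the template provided during setup.

```python
import json
from typing import Sequence


def build_s3_policy(bucket_names: Sequence[str]) -> dict:
    """Build an IAM policy document scoped to the given buckets."""
    # Bucket-level ARNs are used for listing; object-level ARNs for read/write.
    bucket_arns = [f"arn:aws:s3:::{name}" for name in bucket_names]
    object_arns = [f"arn:aws:s3:::{name}/*" for name in bucket_names]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],  # assumed
                "Resource": bucket_arns,
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],  # assumed
                "Resource": object_arns,
            },
            {
                # Read-only access to the onboarding tour bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],  # assumed
                "Resource": [
                    "arn:aws:s3:::trifacta-public-datasets",
                    "arn:aws:s3:::trifacta-public-datasets/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Extend the list to grant access to additional buckets.
    print(json.dumps(build_s3_policy(["my-default-bucket"]), indent=2))
```

Passing several bucket names extends both resource lists, which mirrors the note about granting access to multiple buckets.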
Update policy to accommodate SSE-KMS if necessary
If any accessible bucket is encrypted with SSE-KMS, another policy must be deployed. See https://docs.aws.amazon.com/kms/latest/developerguide/iam-policies.html.
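For orientation, here is a minimal sketch of the kind of statement such a policy usually contains. It assumes the common AWS pattern in which reading SSE-KMS objects requires kms:Decrypt and writing them requires kms:GenerateDataKey. The function name and the example key ARN are hypothetical, and your administrator may need additional actions depending on how the key is configured.

```python
def build_kms_statement(kms_key_arn: str) -> dict:
    """Build a policy statement granting use of one KMS key for S3 reads and writes."""
    if not kms_key_arn.startswith("arn:aws:kms:"):
        raise ValueError(f"not a KMS key ARN: {kms_key_arn!r}")
    return {
        "Effect": "Allow",
        # Decrypt covers reads; GenerateDataKey covers writes of encrypted objects.
        "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
        "Resource": kms_key_arn,
    }


if __name__ == "__main__":
    # Hypothetical key ARN, for illustration only.
    example_arn = (
        "arn:aws:kms:us-east-1:123456789012:"
        "key/11111111-2222-3333-4444-555555555555"
    )
    print(build_kms_statement(example_arn))
```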
Whitelist the IP address range of the Trifacta Service, if necessary
If you are enabling any relational source, including Redshift, you must whitelist the IP address range of the Trifacta service in the relevant security groups. The IP address range of the Trifacta service is:
For Redshift, there are two ways to whitelist the IP range, depending on whether you are using EC2-VPC or EC2-Classic (not common).
- EC2-VPC (Security group): Add the IP address range to the inbound rule for the security group associated with the cluster. For more information, see https://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-authorize-cluster-access.html#rs-gsg-how-to-authorize-access-vpc-security-group.
- EC2-Classic: Add the IP address range to the inbound rule for the security group associated with the EC2 instance. For more information, see https://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-authorize-cluster-access.html#rs-gsg-how-to-authorize-access-cluster-security-group.
For details on this process with RDS in general, see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.RDSSecurityGroups.html
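The EC2-VPC case can be sketched as follows. The helper below validates a CIDR range and builds an inbound rule in the shape accepted by the `IpPermissions` parameter of the EC2 `authorize_security_group_ingress` API. The CIDR shown is a documentation-only placeholder, not the Trifacta service range, and the function name is hypothetical. Port 5439 is the Redshift default; adjust it if your cluster uses another port.

```python
import ipaddress

REDSHIFT_DEFAULT_PORT = 5439


def build_ingress_rule(cidr: str, port: int = REDSHIFT_DEFAULT_PORT) -> dict:
    """Build one inbound TCP rule for a security group."""
    # Raises ValueError if the CIDR is malformed or has host bits set.
    network = ipaddress.IPv4Network(cidr, strict=True)
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": str(network), "Description": "Trifacta service"}],
    }


if __name__ == "__main__":
    # Placeholder range (TEST-NET-3); substitute the range for the Trifacta service.
    rule = build_ingress_rule("203.0.113.0/24")
    print(rule)
    # With boto3 installed and credentials configured, the rule could be applied as:
    #   boto3.client("ec2").authorize_security_group_ingress(
    #       GroupId="<your security group ID>", IpPermissions=[rule])
```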
For more information, please contact Trifacta Support.
Register for Trifacta SaaS
To begin the registration process, please visit https://www.trifacta.com/start-wrangling.
After you have completed registration, please log in to the application.
NOTE: You can now access online documentation through the application. From the left menu bar, select Help menu > Documentation.
Review Workspace Settings
As the first registered user, you are assigned the workspace admin role, which provides control over workspace-level settings. Before you invite members to the workspace, you should review and modify the basic configuration for the workspace. See Workspace Settings Page.
NOTE: Workspace administrators should complete the following steps to verify that the product is operational end-to-end.
Prepare Your Sample Dataset
To complete this test, you should locate or create a simple dataset. Your dataset should be created in the format that you wish to test.
Tip: The simplest way to test is to create a two-column CSV file with at least 25 non-empty rows of data. This data can be uploaded through the application.
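If you do not have a suitable file at hand, a short script can generate one. This is a minimal sketch; the column names, values, and file name are arbitrary choices for illustration.

```python
import csv
from pathlib import Path


def sample_rows(count: int = 25) -> list:
    """Return `count` two-column rows with no empty values."""
    return [[i, f"item_{i}"] for i in range(1, count + 1)]


def write_sample_csv(path: str, count: int = 25) -> Path:
    """Write a header plus `count` data rows to a CSV file."""
    out = Path(path)
    with out.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "value"])  # header row
        writer.writerows(sample_rows(count))
    return out


if __name__ == "__main__":
    print(f"Wrote {write_sample_csv('sample_dataset.csv')}")
```

The resulting file can then be uploaded through the application.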
Log in to the application. See Login.
Click Import and Add to Flow.
If options are presented, select the defaults.
See Run Job Page.
Checkpoint: You have verified importing from the selected datastore and transforming a dataset. If your job was successfully executed, you have verified that the product is connected to the job running environment and can write results to the defined output location. Optionally, you may have tested profiling of job results. If all of the above tasks completed, the product is operational end-to-end.
- You can invite other people to join your workspace.
- When members initially join your workspace, they are assigned a non-admin role. Through the Workspace Members page, you can assign roles.
- For more information, see Workspace Users Page.
- The workspace administrators must provide credentials for each workspace member account. See Workspace Users Page.
Getting Started for Workspace Members
This section walks through the process of getting started as a new member of a Trifacta SaaS workspace.
You should have received an email like the following:
- Click the link. If you see a Missing Storage Settings error message, then you must provide your individual user storage credentials and default bucket. To do so, click the Here link.
- In your Storage Settings page, you may be required to enter your S3 credentials. After the credentials have been entered, you can begin using the product.
- Access documentation: To access the full customer documentation, from the left nav bar, select Help menu > Documentation.
The following resources can assist workspace members in getting started with wrangling.
- If product walkthroughs have been enabled, each new member can step through an onboarding tour of the product after first login.