
Trifacta SaaS




This section applies to getting started with Trifacta®, an AWS-native platform for data wrangling. The following product tiers are available:

  • Trifacta Premium 
  • Trifacta Enterprise Edition
  • Trifacta Professional Edition
  • Trifacta Starter Edition

Trifacta® enables you to rapidly ingest, transform, and deliver clean, actionable data across your entire enterprise. Please review the following sections on how to prepare for and set up your Trifacta workspace.

NOTE: This section applies to both the free version and the licensed version of Trifacta. For more information on the differences, see Product Limitations.

This section provides an overview of how to get started using the product. 

  1. Administrators should complete the first section to set up the product for use. 
  2. After set up is complete, individual users should complete the second section to get started using the product.

Setup Process

Having difficulties? To speak to a support representative, click the icon in the corner and submit your question.

Steps:

  1. Before you begin. If you are using your own AWS S3 buckets, you should prepare them and their access policies to ensure that Trifacta can integrate with them.

    NOTE: If you do not have these AWS resources, they can be created for you. Details are below.

    1. Technical setup: Please share the technical setup section with your S3 administrator.
  2. Register. Complete the simple online workflow to license and create your Trifacta workspace.
  3. Workspace setup. Before you invite other users to your workspace, you should complete a few setup steps.
  4. Invite users. If you intend to share the workspace with other users, you can invite them from within it.
  5. Wrangle away!
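If you plan to use your own S3 buckets (step 1), your S3 administrator will need to attach a bucket policy granting the service read and write access. The following Python sketch assembles a minimal example policy. The bucket name, principal ARN, and the specific actions shown here are illustrative placeholders only; the actual required permissions are listed in the technical setup section.

```python
import json

# Hypothetical values -- substitute your own bucket and IAM principal.
BUCKET = "example-trifacta-bucket"
PRINCIPAL_ARN = "arn:aws:iam::123456789012:role/example-trifacta-access-role"

def make_bucket_policy(bucket: str, principal_arn: str) -> str:
    """Return a minimal S3 bucket policy granting object read/write access."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowObjectReadWrite",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",       # bucket itself (for ListBucket)
                    f"arn:aws:s3:::{bucket}/*",     # objects within the bucket
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)

print(make_bucket_policy(BUCKET, PRINCIPAL_ARN))
```

The generated JSON can be pasted into the bucket's Permissions tab in the AWS console; again, verify the exact action list against the technical setup section before applying it.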

Before You Begin

Hosted on Amazon Web Services, Trifacta is designed to natively interact with AWS data sources, so that you can rapidly transform your data investments in AWS.

When the product is first launched, a default storage environment is automatically created for you as part of this setup process. Trifacta File Storage is backed by AWS S3 buckets hosted by Trifacta and secured by IAM policies.

This default storage environment is managed by Trifacta and is used for storing data assets as well as assets generated by use of the product.

  • If preferred, you can configure the use of S3 as the default storage environment.

    NOTE: When S3 is used as the default storage environment, you must provide the policies, buckets, and other AWS resources required to manage your datasets and generated results.

  • You can choose to enable other storage environments in addition to your default storage environment.

Any of the following storage environment options can be configured after completing sign-up:

Default Storage Environment | Additional Storage Environment
--------------------------- | ------------------------------
TFS                         | none
TFS                         | S3
S3                          | none
S3                          | TFS

NOTE: If you are using your own AWS/S3 resources, you must acquire configuration information before you can connect to your S3 assets. The requirements for these resources are covered later.

Whitelist the IP address range of the Trifacta Service

Feature Availability: This feature is available in the following editions:

  • Trifacta Enterprise Edition
  • Trifacta Professional Edition
  • Trifacta Starter Edition
  • Trifacta Premium

If you are enabling any relational source, including Redshift, you must whitelist the IP address range of the Trifacta Service in the relevant security groups.  

NOTE: The database to which you are connecting must be available from the Trifacta Service over the public Internet.


The IP address range of the Trifacta Service is:

35.245.35.240/28
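When writing a security group rule, you can confirm that a given source address actually falls inside this published range using Python's standard ipaddress module. This is a convenience check only, not part of the product:

```python
from ipaddress import ip_address, ip_network

# The published IP address range of the Trifacta Service.
TRIFACTA_RANGE = ip_network("35.245.35.240/28")

def is_trifacta_address(addr: str) -> bool:
    """Check whether an address falls within the service's CIDR block."""
    return ip_address(addr) in TRIFACTA_RANGE

print(is_trifacta_address("35.245.35.241"))  # True: inside the /28 block
print(is_trifacta_address("35.245.35.239"))  # False: just below the range
print(TRIFACTA_RANGE.num_addresses)          # 16: a /28 spans 16 addresses
```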

For Redshift:

There are two ways to whitelist the IP range, depending on whether you are using EC2-VPC or EC2-Classic (not common).

For details on this process with RDS in general, see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.RDSSecurityGroups.html

For more information, please contact Trifacta Support.

Register for Trifacta

Tip: You can begin using Trifacta pre-configured with a template of interest. Visit the Templates page, select a template, and then click Sign up for Free Trial. For more information, see https://www.trifacta.com/templates.

To begin the registration process, please visit https://www.trifacta.com/start-wrangling.

Login

After you have completed registration, please log in to the application. The Home page is displayed.

NOTE: You can now access online documentation through the application. From the left menu bar, select Help menu > Documentation.

S3 Configuration

For more information on changing the default storage environment or enabling S3 as a storage environment, see Configure Storage Environment.

Workspace Setup

Review Workspace Settings

As the first registered user, you are assigned the workspace admin role, which provides control over workspace-level settings. Before you invite users to the workspace, you should review and modify the basic configuration for the workspace.

Tip: You can also rename the workspace.

For more information, see Workspace Settings Page.

Verify Operations

NOTE: Workspace administrators should complete the following steps to verify that the product is operational end-to-end.

Prepare Your Sample Dataset

To complete this test, you should locate or create a simple dataset. Your dataset should be created in the format that you wish to test.

Tip: The simplest way to test is to create a two-column CSV file with at least 25 non-empty rows of data. This data can be uploaded through the application.

Characteristics:

  • Two or more columns. 
  • If there are specific data types that you would like to test, please be sure to include them in the dataset.
  • A minimum of 25 rows is required for best type inference results.
  • Ideally, your dataset is a single file or sheet. 
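A sample file with these characteristics can be generated with a few lines of Python using the standard csv module. The column names and row count here are arbitrary choices for illustration:

```python
import csv
import io

def make_sample_csv(rows: int = 30) -> str:
    """Build a two-column CSV with enough non-empty rows for type inference."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "amount"])             # header row
    for i in range(1, rows + 1):
        writer.writerow([i, round(i * 1.25, 2)])  # integer and decimal columns
    return buf.getvalue()

sample = make_sample_csv()

# Write the sample to disk, ready for upload through the application.
with open("sample_dataset.csv", "w", newline="") as f:
    f.write(sample)

print(sample.splitlines()[0])    # prints "id,amount"
print(len(sample.splitlines()))  # 31 lines: 1 header + 30 data rows
```

The resulting sample_dataset.csv can be uploaded directly via Import Data in the verification steps below.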


Store Your Dataset

If you are testing an integration, you should store your dataset in the datastore with which the product is integrated.

Tip: Uploading datasets is always available as a means of importing datasets.

 

  • You may need to create a connection between the platform and the datastore.
  • Read and write permissions must be enabled for the connecting user to the datastore.
  • For more information, see Connections Page.

Verification Steps

Steps:

  1. Log in to the application. See Login.

  2. In the application menu bar, click Library.

    Tip: When you log in for the first time, you can immediately import a dataset to begin transforming it.

  3. Click Import Data. See Import Data Page.
    1. Select the connection where the dataset is stored. For datasets stored on your local desktop, click Upload.
    2. Select the dataset.
    3. Click Continue.
  4. The initial sample of the dataset is opened in the Transformer page, where you can edit your recipe to transform the dataset.
    1. In the Transformer page, some steps are automatically added to the recipe for you, so you can run the job immediately.
    2. You can add additional steps if desired. See Transformer Page.
  5. Click Run.
    1. If options are presented, select the defaults.

    2. To generate results in other formats or output locations, click Add Publishing Destination. Configure the output formats and locations. 
    3. To test dataset profiling, click the Profile Results checkbox. Note that profiling runs as a separate job and may take considerably longer. 
    4. See Run Job Page.

  6. When the job completes, you should see a success message under the Jobs tab in the Flow View page. 
    1. Troubleshooting: Either the Transform job or the Profiling job may fail. To isolate the problem, try re-running the job with the failing job type deselected, or running the job on a different running environment (if available). You can also download the log files to try to identify the problem. See Job Details Page.
  7. Click View Results from the context menu for the job listing. In the Job Details page, you can see a visual profile of the generated results. See Job Details Page.
  8. In the Output Destinations tab, click a link to download the results to your local desktop. 
  9. Load these results into a local application to verify that the content looks correct.

Checkpoint: You have verified importing from the selected datastore and transforming a dataset. If your job was successfully executed, you have verified that the product is connected to the job running environment and can write results to the defined output location. Optionally, you may have tested profiling of job results. If all of the above tasks completed, the product is operational end-to-end.

Invite Users

NOTE: First-time users of the product should access it by invitation only. Do not provide direct URLs to first-time users.

  1. You can invite other people to join your workspace. 
    1. When users initially join your workspace, they are assigned a non-admin role. Through the Workspace Users page, you can assign roles.
    2. Select User menu > Admin Console > Users. Then, click Invite Users.
    3. For more information, see Workspace Users Page.
  2. The workspace administrators must provide credentials for each workspace member account. See Workspace Users Page.

Example Flows

When a new workspace is created, the first user is provided a set of example flows. These flows are intended to teach by example and illustrate many recommended practices for building your own flows. For more information on example flows, see Workflow Basics.

Getting Started for Workspace Users

This section walks through the process of getting started as a new member of a Trifacta workspace. 

Steps:

  1. You should have received an email like the following:

    Figure: Welcome email

  2. Click the link. If you see a Missing Storage Settings error message, then you must provide your individual user storage credentials and default bucket. To do so, click the Here link.
  3. In your Storage Settings page, you may be required to enter your S3 credentials. After the credentials have been entered, you can begin using the product. 
  4. Access documentation: To access the full customer documentation, from the left nav bar, select Help menu > Documentation.

The following resources can assist workspace users in getting started with wrangling.

Tip: Check out the product walkthrough available through in-app chat! This tour steps through each phase of ingesting, transforming, and generating results for your data.
