Trifacta Dataprep


A reference dataset is a reference to a recipe's outputs that has been added to a flow other than the one where the recipe is located. When you select a reference dataset, its details are available in the context menu.

NOTE: A reference dataset is a read-only object in the flow where it is referenced. You cannot select or use a reference dataset until a reference has been created from the recipe in the source flow. See View for Recipes above.

To add a reference dataset, you can:

  • From the source flow, select the reference object for a recipe. In the context panel, click Add to Flow....
  • Click Add Datasets from the main Flow View page and select one from a different flow.

Figure: Reference Dataset icons

Menu Options:

  • Add:
    • Recipe: Add a recipe for this dataset. If a recipe already exists for it, this new recipe is created as a branch in the flow.
    • Join: Join this dataset with another recipe or dataset. If this dataset does not have a recipe, a new recipe object is created to store this step. See Join Window.
    • Union: Union this dataset with one or more recipes or datasets. If this dataset does not have a recipe, a new recipe object is created to store this step. See Union Page.
  • View details: Explore details of the dataset. See Dataset Details Page.
  • Go to original reference: Open in Flow View the flow containing the original dataset for this reference. 
  • Remove from Flow: Remove the dataset from the flow.
    • Dependent flows, outputs, and references are not removed from the flow. You can replace the source for these objects as needed.

      NOTE: References to the deleted dataset in other flows remain broken until the dataset is replaced.

Figure: View for referenced dataset in a new flow

NOTE: A reference dataset marked with a red dot no longer has a source dataset in the other flow. These upstream dependencies should be fixed. See Fix Dependency Issues.
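
The reference behavior described above can be pictured with a small conceptual model. This is only an illustrative sketch, not the product's API or implementation: the names Flow, ReferenceDataset, add_reference_dataset, and is_broken are hypothetical, and the model assumes that a reference counts as broken whenever its source recipe, or the reference object on that recipe, no longer exists in the source flow.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Flow:
    """Hypothetical stand-in for a flow; not a real product class."""
    name: str
    # recipe name -> True once a reference object has been created for it
    recipes: Dict[str, bool] = field(default_factory=dict)

    def add_recipe(self, recipe: str) -> None:
        self.recipes[recipe] = False

    def create_reference(self, recipe: str) -> None:
        # Models creating the reference object in the source flow.
        if recipe not in self.recipes:
            raise KeyError(f"no recipe named {recipe!r} in flow {self.name!r}")
        self.recipes[recipe] = True

    def remove_recipe(self, recipe: str) -> None:
        del self.recipes[recipe]


@dataclass(frozen=True)  # frozen: read-only where it is referenced
class ReferenceDataset:
    source_flow: Flow
    recipe: str

    def is_broken(self) -> bool:
        # Assumption: "red dot" state means the source is gone.
        return not self.source_flow.recipes.get(self.recipe, False)


def add_reference_dataset(source: Flow, recipe: str) -> ReferenceDataset:
    """Add a reference to another flow; requires an existing reference object."""
    if not source.recipes.get(recipe, False):
        raise ValueError("create a reference for the recipe in the source flow first")
    return ReferenceDataset(source, recipe)


# Usage: the reference works until its source recipe is removed.
source = Flow("source flow")
source.add_recipe("clean_orders")
source.create_reference("clean_orders")
ref = add_reference_dataset(source, "clean_orders")
source.remove_recipe("clean_orders")
print(ref.is_broken())
```

In this sketch the reference object stays in the downstream flow after its source is removed, which mirrors the note above: the dependency remains visible as broken until it is repaired rather than disappearing silently.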

When you select a reference dataset in Flow View, the following details are available in the right-hand panel.

Key Fields:

Source Flow: Flow that contains the dataset. Click the link to open the Flow View page for that dataset.