Through the Flow View page, you can access and manage all objects in your flow. For each imported dataset, recipe, or other object in your flow, you can perform a variety of actions to effectively manage flow development and job execution through a single page.

If you have enabled deployment management, avoid making changes in Flow View on a Production instance of the platform.

  • Scheduling executions through Flow View in a Prod environment is not supported. Jobs must be executed through the APIs. See API Endpoints.
  • Some Flow View options may not be available in a Prod environment.
  • You should apply changes to your flow in the Dev instance and then re-deploy to Production. For more information, see Overview of Deployment Management.

NOTE: If the displayed flow has been shared with you, some options are not available.

Figure: Flow View page

The imported datasets in the flow and any reference datasets added to the flow are listed on the left side of the screen. Each dataset can have one or more associated recipes, which are used to transform the source data.

NOTE: Objects marked with a red dot indicate a problem with the object's configuration. Please select the object to begin investigating the error. Error information may be displayed in the right panel.

Datasets:

  • To begin working with an imported dataset, select it and click Add new recipe. A new, empty recipe is associated with the dataset. To open in the Transformer page, click the recipe icon and select Edit Recipe. See Transformer Page.
  • When created, these objects are connected together by lines flowing between them, which show the relationships between the objects in the flow.
  • For any object, the objects on which it depends are displayed to its left and are connected to it by the flow lines.

    Tip: When you run a job for a recipe, all of the recipe steps for the preceding datasets are executed as part of the job, and only the results of the terminal dataset are generated.

     

    • In the above example, the POS-01 recipe is dependent on all of the objects in the flow. 

    • The other datasets have been integrated with the POS-01 dataset and have not yet had a recipe created for them. 
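The dependency behavior described in the tip above can be sketched in a few lines of Python. This is an illustration only, not the platform's implementation. The FLOW dictionary, the execution_order function, and the REF-* dataset names are hypothetical. Only the POS-01 names come from the example.

```python
from typing import Dict, List

# Hypothetical flow: each object maps to the objects it depends on.
# The REF-* names are invented for illustration.
FLOW: Dict[str, List[str]] = {
    "POS-01 dataset": [],
    "REF-PROD dataset": [],
    "REF-CAL dataset": [],
    "POS-01 recipe": ["POS-01 dataset", "REF-PROD dataset", "REF-CAL dataset"],
}


def execution_order(flow: Dict[str, List[str]], target: str) -> List[str]:
    """Return the target and everything upstream of it, dependencies first."""
    order: List[str] = []
    seen = set()

    def visit(node: str) -> None:
        if node in seen:
            return
        seen.add(node)
        for dependency in flow.get(node, []):
            visit(dependency)
        # A node is appended only after all of its dependencies.
        order.append(node)

    visit(target)
    return order


plan = execution_order(FLOW, "POS-01 recipe")
# Everything in the plan is processed, but results are generated
# only for the terminal object, which is the last entry.
results_generated_for = plan[-1]
```

In this sketch, running a job for the POS-01 recipe processes all three upstream datasets first, and results are generated only for the recipe itself.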

Recipes:

A recipe is a set of steps to transform source data into the results you desire.

  • A recipe can be created from the following objects:
    • An imported dataset, as above.
    • A reference dataset. A reference dataset is an object that has been pulled into a flow from another flow. See below.
    • Another recipe. You can chain together recipes. For example, you may have a set of steps that you always apply at the beginning of transforming a specific type of feed. This recipe can be added into each flow as the first recipe chained to an imported dataset of that feed type.
  • The following objects can be created off of a recipe:
    • An output object is a set of publishing targets for which you can execute jobs.
    • A reference object is a reference to one of your flow's recipes that can be used in another flow.
      • In the target flow, this object appears as a reference dataset.
      • When a reference dataset is used in a flow, the target flow receives the output of the executed recipe.

For more information on these objects, see Object Overview.

Select an object from your flow to open an object-specific panel on the right side of the screen.

Tip: Right-click any object in Flow View to see the same list of actions that is available in the right panel when you select the object.

Tip: Double-click any recipe to edit it. See Transformer Page.


Actions:

Rename: Select the name of the object to rename it within the platform. This rename does not apply to the source of the object, if it exists elsewhere. 

Add Datasets: Add new datasets to the flow. Details are below.

Add Schedule: To add a scheduled execution of the recipes in your flow:

  1. Define the scheduled time and interval of execution at the flow level. See Add Schedule Dialog.
    1. After the schedule has been created, you can review, edit, or delete the schedule through the Clock icon.
  2. Define the scheduled destinations for each recipe through its output object. These destinations are targets for the scheduled job. See View for Outputs below.

Share: Collaborate with others on the same flow.

You can also send a copy to other users for separate work.

NOTE: When a flow containing one or more connections is shared, its connections are also shared. By default, credentials are included. If the sharing of credentials has been disabled, the new users must provide their own credentials for the shared connection. See Configure Sharing.

See Share Flow Dialog.

When a user is given access to a flow, all of the following actions are available to that user, except for editing details and deleting the flow.

Make a copy: Create a copy of the flow for another user.

 

NOTE: The copied flow is independent of the source flow, but the original source datasets are connected.

Export: (Available to flow owner only) Export the flow for archive or transfer. For more information, see Export Flow.

Edit name and description: (Available to flow owner only) Change the name and description of the flow.

Delete: (Available to flow owner only) Delete the flow.

Deleting a flow removes all recipes that are contained in the flow. If copies of these objects exist in other flows, they are not touched. Imported datasets are not deleted by this action.

Add Datasets to Flow

From the Flow View page, you can add imported or reference datasets to your flow. These datasets are added as independent objects in the flow and can be joined, unioned, or referenced by other datasets in the flow.

Figure: Add datasets to current flow

  1. Search for or select the dataset to add.
    1. Use the page view controls to browse for other datasets, or select the appropriate tab to filter the list to imported or reference datasets.
    2. To import new datasets from external sources, click Import Datasets. See Import Data Page.
  2. When you have made your selections, click Add.
  3. The dataset is added as a new object in Flow View.

View for Imported Datasets

When you select an imported dataset, you can preview the data contained in it, replace the source object, and more from the right-side panel.

Figure: Imported Dataset view

Key Fields:

  • Data Preview: In the Data Preview window, you can see a small section of the data that is contained in the imported dataset. This window can be useful for verifying that you are looking at the proper data.

    Tip: Click the preview to open a larger dialog, where you can select and copy data.

  • Type: Indicates where the data is sourced or the type of file.
  • Location: Path to the location of the imported dataset.

  • File Size: Size of the file. Units may vary.
  • More details: Review details on the flows where the dataset is used.
  • Column Data Type Inference: 
    • enabled - Data types have been applied to the dataset during import.
    • disabled - Data types were not globally applied to the dataset during import. However, some columns may have had overrides applied to them during the import process. See Import Data Page.
    • For more information, see Configure Type Inference.

  • ConnectionName: If the data is accessed through a connection, you can click this link to review connection details in the right-side panel. See View for Connections below.

Actions:

  • Replace: Replace the dataset with a different dataset or reference dataset.
  • Add new Recipe: Add a new recipe for the object. If a recipe already exists for it, this new recipe is created as a branch in the flow.
  • Edit name and description: (Available to flow owner only) Change the name and description for the object.
  • Remove structure: (If applicable) Remove the initial parsing structure. When the structure is removed:
    1. The dataset is converted to a raw dataset. A raw dataset is the source data converted into a flat file format.
    2. All steps to shape the dataset are removed. You must break up columns in manual steps in any recipe created from the object.
    3. See View for Raw Datasets below.
  • Remove from Flow: Remove the dataset from the flow.
    • Dependent flows, outputs, and references are not removed from the flow. You can replace the source for these objects as needed.

      NOTE: References to the deleted dataset in other flows remain broken until the dataset is replaced.

 

View for Datasets with Parameters

Flow View for any flow containing a dataset with parameters has some variations.

Parameters Panel

In addition to the standard view of your flow, the Parameters panel contains information about the parameters that are applied to datasets in the flow.

Figure: Parameters Panel in Flow View

The above information is useful for reviewing parameters and specifying overrides at execution time.

Tip: You can change the default value. Click in the Value column and set a new parameter value. When a sample is executed or a job is scheduled without manual overrides, this new parameter value is applied.
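The relationship between default values and execution-time overrides can be sketched as follows. This is a minimal illustration and not the platform's API: the resolve_parameters function and the parameter names are hypothetical.

```python
from typing import Dict, Optional


def resolve_parameters(
    defaults: Dict[str, str],
    overrides: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return the values applied to a run: overrides win, defaults fill the rest."""
    overrides = overrides or {}
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Assumption of this sketch: only parameters that already exist
        # in the flow can be overridden.
        raise KeyError(f"Unknown parameters: {sorted(unknown)}")
    # A new dictionary is returned, so the defaults are left unchanged.
    return {**defaults, **overrides}


# Hypothetical parameters for a dataset with parameters.
flow_defaults = {"region": "west", "year": "2019"}

scheduled_run = resolve_parameters(flow_defaults)
manual_run = resolve_parameters(flow_defaults, {"year": "2020"})
```

Here scheduled_run uses the defaults as they are, which mirrors a scheduled job without manual overrides, while manual_run replaces only the year value.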

Parameters tab

When you select a dataset with parameters in Flow View, you can review the parameters that have been specified for the selected dataset in the right panel.

Figure: Parameters tab in Flow View

Actions:

  • To edit the parameters for the dataset, select Edit parameters... from the context menu in the right panel.

 

View for Recipes

For each recipe, you can review or edit its steps. You can also create references to the recipe, modify its outputs, and create new recipes from it.

When you select a recipe:

  • You can create an output object.
  • You can create a reference object.
  • The following options are available in the context panel.

Figure: Recipe view

Actions:

  • Edit Recipe: Open the recipe and begin editing. See Transformer Page.
  • Add new Recipe: Add a new recipe from the recipe. This new recipe operates on the outputs of the original recipe.
  • Edit name and description: (Available to flow owner only) Change the name and description for the object.
  • Assign Target to Recipe: Create a target and assign it to this recipe. For more information, see Create Target.

  • Remove Target: Remove the currently assigned target from this recipe.

  • Create Reference Dataset: Create a reference to the output of this recipe. This object can then be added as a reference dataset in another flow. See View for Reference Dataset below.
  • Change input: Change the input dataset associated with the recipe.

    NOTE: This action substitutes only the primary input for a recipe. It does not change any datasets that are integrated through joins, unions, lookups, or other multi-dataset options.

  • Make a copy: Create a copy of the recipe and its related objects. You can create the copy with the same inputs or without inputs at all. The copied recipe is owned by the user who copied it.
  • Move: Move the recipe to a different flow, or create a new flow to contain it.
  • Download Recipe: Download the recipe in Wrangle format to your local desktop.
  • Delete: Delete the recipe.

    This step cannot be undone.

Recipe tab

Preview the first steps in the recipe.

Key Fields:

  • Steps: Total count of the steps in the recipe.

Data tab

Preview the data as reflected by the recipe.

NOTE: To render this data preview, some of the data must be loaded, and all steps in the recipe must be executed to generate the preview. Some delays may be expected.

Key Fields:

  • Size: Total count of columns and data types in the dataset.

 

Target tab

When a target has been assigned for this recipe, you can review its schema information in the Target tab. This tab appears only after a target has been assigned to the recipe.

To remove the current target, select Remove Target from the context menu.

Columns:

  • Position: Left-to-right position of the column in the target.
  • Name: Name of the column in the target.
  • Type:  Trifacta data type of the column in the target.

 

View for Outputs

Associated with each recipe is one or more outputs, which are publishing destinations. Through outputs, you can execute and track jobs for the related recipe.

Destinations tab

The Destinations tab contains all configured destinations associated with the recipe. 

  • Manual destinations are executed when the job is run through the application interface.
  • Scheduled destinations are executed when the job is triggered based on a schedule you have defined.

Figure: Destinations tab

Key Fields:  

  • (Action)-(Format): 
    • Field name describes the output action and the file format in which the results are written.
    • Field value is the location where the results are written.
  • Environment: The running environment where the job is configured to be executed.

  • Profiling: If profiling is enabled for this destination, this value is set to yes.

For more information, see Run Job Page.

Scheduled destinations:

When a scheduled execution of the flow is triggered, these destinations are populated with the results. If any input datasets are missing, the job is not run.

NOTE: Flow collaborators cannot modify publishing destinations.
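The difference between manual and scheduled destinations can be sketched as a small selection function. This is an illustration under stated assumptions, not the platform's implementation: the Destination class, the destinations_to_write function, and the example locations are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Destination:
    location: str
    kind: str  # "manual" or "scheduled"


def destinations_to_write(
    destinations: Sequence[Destination],
    trigger: str,
    missing_inputs: Sequence[str] = (),
) -> List[str]:
    """Return the locations populated by a run started by the given trigger."""
    if trigger not in ("manual", "scheduled"):
        raise ValueError(f"Unknown trigger: {trigger}")
    # The text above states that a scheduled job is not run when input
    # datasets are missing. This sketch applies the check to scheduled
    # runs only, because that is the only case the text describes.
    if trigger == "scheduled" and missing_inputs:
        return []
    return [d.location for d in destinations if d.kind == trigger]


# Hypothetical destinations for one output object.
configured = [
    Destination("/results/manual/pos.csv", "manual"),
    Destination("/results/nightly/pos.csv", "scheduled"),
]

from_run_job = destinations_to_write(configured, "manual")
from_schedule = destinations_to_write(configured, "scheduled")
skipped = destinations_to_write(configured, "scheduled", missing_inputs=["POS-01"])
```

In this sketch, a run started from the application writes only to the manual destination, a scheduled run writes only to the scheduled destination, and a scheduled run with a missing input writes nothing.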

Actions:

 

  • Run Job: Click Run Job to queue a job for immediate execution for the manual destinations. You can track the progress and results of this task through the Jobs tab.

  • Delete Output: Remove this output from the flow. This operation cannot be undone.
    • Removing an output does not remove the jobs associated with the output. You can continue working with those executed jobs. See Jobs Page.

       

  • Edit: Click this link to modify the selected destination's properties.

 

  • Download CLI Script: Select to export the necessary package of files for use by the Trifacta command line interface, which enables command-line automation of your data wrangling operations. See CLI for Jobs.

 

Jobs tab

Figure: Jobs tab

Each entry in the Jobs tab identifies a job that has been queued for execution. You can track the progress, success, or failure of execution. When a job has finished execution, you can review the results by clicking the link to the job. For more information, see Job Results Page.

Actions:

  • View Results: Click to view the results of your completed job.

    For more information, see Job Results Page.

     

     

  • View steps and dependencies: View steps of the recipe being executed and any dependencies referenced in the recipe.
  • Export Results: Click to export or publish the results from your completed job.

    For more information, see Export Results Window.

     

     

  • Cancel job: Select to cancel a job that is currently being executed.
  • Download Logs: Download the logs for the job. If the job is in progress, log information is likely to be incomplete.

 

View for References

When you select a recipe, you can choose to create a reference dataset off of that recipe. A reference dataset is a reference to the output generated from a recipe contained in another flow. Whenever the upstream recipe and its output data change, the changes are automatically inherited through the reference dataset.

NOTE: You cannot select or use a reference dataset until a reference has been created in the source flow from the recipe to use.

To create a reference dataset from a recipe, click the Paper Clip icon. The following options appear in the right panel.

Figure: Reference view

Key Fields:

  • Used In: Indicates the number of flows where the reference appears. If this number is greater than one, click More details to review the flows. See Dataset Details Page.

Actions:

  • Add to Flow: Click to add the reference dataset to a new or existing flow.
  • Edit name and description: (Available to flow owner only) Change the name and description for the object.
  • Delete Reference Dataset: Remove the reference dataset from the flow.

    Deleting a reference dataset in the source flow causes all references to it to be broken in the flows where it is referenced. These broken references should be fixed by swapping in new sources.
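The effect of deleting a reference dataset on downstream flows can be sketched as a simple lookup. This is an illustration only: the broken_references function and the flow and reference names are hypothetical and are not part of the product.

```python
from typing import Dict, List, Set


def broken_references(
    available: Set[str],
    used_by_flow: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Map each flow to the references it uses that no longer exist."""
    broken: Dict[str, List[str]] = {}
    for flow, references in used_by_flow.items():
        missing = sorted(r for r in references if r not in available)
        if missing:
            broken[flow] = missing
    return broken


# Hypothetical state after "ref-sales" was deleted in its source flow.
still_available = {"ref-products"}
usage = {
    "Weekly report": ["ref-sales", "ref-products"],
    "Inventory": ["ref-products"],
}

to_fix = broken_references(still_available, usage)
```

In this sketch, only the Weekly report flow is listed, because it is the only flow that still points to the deleted reference and needs a new source swapped in.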

View for Raw Datasets

A raw dataset is an imported dataset that does not contain any initial parsing steps. All parsing steps must be added through recipes that are applied to the dataset. 

Tip: You can remove initial parsing during import or through the context menu for an imported dataset. See Initial Parsing Steps.

Figure: Raw Dataset view

Key Fields:

  • Data Preview: In the Data Preview window, you can see a small section of the data that is contained in the imported dataset. This window can be useful for verifying that you are looking at the proper data.

    Tip: Click the preview to open a larger dialog, where you can select and copy data.

  • Type: Indicates where the data is sourced or the type of file.
  • File Size: Size of the file. Units may vary.
  • Location: Path to the location of the imported dataset.

Actions:

  • Add new Recipe: Add a new recipe for the object. If a recipe already exists for it, this new recipe is created as a branch in the flow.
  • Edit name and description: (Available to flow owner only) Change the name and description for the object.
  • Remove from Flow: Remove the dataset from the flow. All dependent flows, outputs, and references are removed from the flow as well.

 

View for Connections

For flows that require connections to source data, you can review the details of the connection, whether you created it or it was shared with you.

Figure: Connections view

Key Fields:

  • Connection Type: For more information, see Connection Types.
  • Owner: User that owns the connection. This user can modify connection properties.
  • Server information: You can review information about the source to which the connection links.
  • Shared:
    • Private - Connection is available for use only for specified users of the platform.
    • Public - Connection is available for all users.
    • For more information, see Share Connection Window.

Actions:

Edit Connection: Select to modify the connection.

NOTE: For shared connections, you may only modify the username and password if they were not provided to you. All other fields are read-only.

Share...: Click to share the connection with other users.

NOTE: You can share connections that have been shared with you. You cannot make these connections public or modify their properties.

See Share Connection Window.

 

View for Reference Datasets

A reference dataset is a reference to a recipe's outputs that has been added to a flow other than the one where the recipe is located.

NOTE: A reference dataset is a read-only object in the flow where it is referenced. You cannot select or use a reference dataset until a reference has been created in the source flow from the recipe to use. See View for Recipes above.


To add a reference dataset, you can:

  1. From the source flow, select the reference object for a recipe. In the context panel, click Add to Flow....
  2. Click Add Datasets from the main Flow View page and select one from a different flow.


Figure: View for reference dataset in a new flow

NOTE: Reference datasets marked with a red dot no longer have a source dataset for them in the other flow. These upstream dependencies should be fixed. See Fix Dependency Issues.

When you select a reference dataset in Flow View, the following are available in the right-hand panel.

Key Fields:

  • Source Flow: Flow that contains the dataset. Click the link to open the Flow View page for that dataset.

Actions:

  • Replace: Replace the dataset with a different dataset or reference dataset.
  • Add new Recipe: Add a new recipe for the object. If a recipe already exists for it, this new recipe is created as a branch in the flow.
  • Remove: Remove the reference dataset from the flow. The source dataset in the other flow is untouched.
