Explore the objects that you create and their relationships. You create flows, imported datasets, and recipes to transform your sampled data. After you build your output objects, you can run a job to transform the entire dataset based on your recipe and deliver the results according to your output definitions.

Flow Structure and Objects

Within the platform, the basic unit for organizing your work is the flow. The following diagram illustrates the component objects of a flow and how they are related:

Objects in a Flow

Flow

A flow is a container for holding one or more datasets, associated recipes, and other objects. This container is a means of packaging these objects so that they can be handled together.




Imported Dataset

Data that is imported to the platform is referenced as an imported dataset. An imported dataset is simply a reference to the original data; the data does not exist within the platform. An imported dataset can be a reference to a file, multiple files, database table, or other type of data.

NOTE: An imported dataset is a pointer to a source of data. It cannot be modified or stored within the platform.
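As a rough illustration of the pointer idea in the note above, the following sketch models an imported dataset as an immutable record that holds only source locations and no data. The class name, field names, and the example URI are hypothetical and are not part of the platform.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Tuple


@dataclass(frozen=True)
class ImportedDataset:
    """Hypothetical model: a read-only pointer to source data."""
    name: str
    source_uris: Tuple[str, ...]  # a file, multiple files, a table, and so on


# The URI is a made-up example location.
orders = ImportedDataset("orders", ("s3://example-bucket/orders.csv",))

try:
    # Frozen dataclasses reject attribute assignment.
    orders.source_uris = ("s3://example-bucket/other.csv",)
    was_modified = True
except FrozenInstanceError:
    was_modified = False
```

The record carries only where the data lives, which mirrors the statement that the data itself does not exist within the platform.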



An imported dataset becomes usable once it has been added to a flow. You can add it as part of the import process or later.

Recipe

A recipe is a user-defined sequence of steps that can be applied to transform a dataset.

In a flow, two kinds of objects are associated with each recipe: outputs and references. Each is described below.

Outputs and Publishing Destinations

Outputs contain one or more publishing destinations, which define the output format, location, and other publishing options that are applied to the results generated from a job run on the recipe. 

When you select a recipe's output object in a flow, you can define its publishing destinations and run a job to generate results.

References and Reference Datasets

A reference makes the output of a recipe's steps available for use in another dataset. References are not depicted in the above diagram.

When you select a recipe's reference object, you can add it to another flow. This object is then added as a reference dataset in the target flow. A reference dataset is a read-only version of the output data generated from the execution of a recipe's steps.
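The relationships described so far can be summarized in a small data model. This is a minimal sketch only: every class, field, and method name below is hypothetical and is chosen for illustration, not taken from the platform.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PublishingDestination:
    fmt: str       # output format, for example "csv"
    location: str  # where results are written


@dataclass
class Output:
    destinations: List[PublishingDestination] = field(default_factory=list)


@dataclass
class Recipe:
    name: str
    source: str                      # dataset or recipe that this recipe reads from
    steps: List[str] = field(default_factory=list)
    output: Optional[Output] = None
    has_reference: bool = False      # whether a reference object is exposed


@dataclass
class Flow:
    name: str
    imported_datasets: List[str] = field(default_factory=list)
    recipes: List[Recipe] = field(default_factory=list)
    reference_datasets: List[str] = field(default_factory=list)  # read-only inputs

    def add_reference(self, recipe: Recipe) -> None:
        """Add a recipe from another flow as a read-only reference dataset."""
        if not recipe.has_reference:
            raise ValueError(f"{recipe.name} exposes no reference object")
        self.reference_datasets.append(recipe.name)


clean = Recipe(
    "clean_orders",
    source="orders",
    steps=["drop nulls"],
    output=Output([PublishingDestination("csv", "/out/orders")]),
    has_reference=True,
)
flow_a = Flow("A", imported_datasets=["orders"], recipes=[clean])
flow_b = Flow("B")
flow_b.add_reference(clean)  # flow B now reads the output of clean_orders
```

In this sketch the target flow stores only the name of the referenced recipe, reflecting that a reference dataset is a read-only view of another recipe's output rather than a copy of its steps.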

Flow Example

The following diagram illustrates the flexibility of object relationships within a flow. 

Flow Example


Type: Standard job execution
Datasets: Recipe 1/Job 1
Description: Results of the job are used to create a new imported dataset (I-Dataset 2). See Job Details Page.

Type: Create dataset from generated results
Datasets: Recipe 2/Job 2
Description: Recipe 2 is created off of I-Dataset 2 and then modified. A job has been specified for it, but the results of the job are unused.

Type: Chaining datasets
Datasets: Recipe 3/Job 3
Description: Recipe 3 is chained off of Recipe 2. The results of running jobs off of Recipe 2 include all of the upstream changes as specified in I-Dataset 1/Recipe 1 and I-Dataset 2/Recipe 2.

Type: Reference dataset
Datasets: Recipe 4/Job 4
Description: I-Dataset 4 is created as a reference off of Recipe 3. It can have its own recipe, job, destinations, and results.
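The chaining behavior in the example above, where a downstream recipe inherits upstream changes, can be sketched as a walk up the chain. The recipe names follow the example; the step strings and the function name are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Each recipe maps to (upstream recipe or None, its own steps).
# Recipe 2 reads an imported dataset directly, so it has no upstream recipe.
RECIPES: Dict[str, Tuple[Optional[str], List[str]]] = {
    "Recipe 2": (None, ["step 2a", "step 2b"]),
    "Recipe 3": ("Recipe 2", ["step 3a"]),
}


def effective_steps(recipe: str) -> List[str]:
    """Return all upstream steps followed by the recipe's own steps."""
    upstream, own = RECIPES[recipe]
    inherited = effective_steps(upstream) if upstream else []
    return inherited + own


print(effective_steps("Recipe 3"))  # ['step 2a', 'step 2b', 'step 3a']
```

Changes made in Recipe 1 do not appear as steps here because, in the example, they reach Recipe 2 as already-generated results through I-Dataset 2.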

Flows are created in the Flows page. See Flows Page.

Working with Recipes

Recipes are edited in the Transformer page, which provides multiple methods for quickly selecting and building recipe steps.

Samples: Within the Transformer page, you build the steps of your recipe against a sample of the dataset. 
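The sample-then-run workflow can be sketched as follows. This is a simplified illustration that assumes a random sample and models recipe steps as plain Python functions; the function names and data are hypothetical, and the platform's own sampling methods may differ.

```python
import random
from typing import Callable, Dict, List

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]


def apply_recipe(rows: List[Row], steps: List[Step]) -> List[Row]:
    """Apply each recipe step in order."""
    for step in steps:
        rows = step(rows)
    return rows


def drop_empty_city(rows: List[Row]) -> List[Row]:
    return [r for r in rows if r["city"]]


def upper_city(rows: List[Row]) -> List[Row]:
    return [{**r, "city": r["city"].upper()} for r in rows]


full = [{"city": c} for c in ["paris", "", "lima", "oslo", "", "rome"]]
steps = [drop_empty_city, upper_city]

# Build and preview the recipe against a sample of the dataset.
sample = random.Random(0).sample(full, 3)
preview = apply_recipe(sample, steps)

# Running a job applies the same steps to the entire dataset.
result = apply_recipe(full, steps)
```

The same list of steps is used for both the preview and the full run, which is why a recipe designed on a sample can later be executed across the whole dataset.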


Flow parameters: You can specify flow parameters that can be referenced in your recipes. When invoked in a step, a flow parameter reference is replaced with its associated default string value or with any override value that you have specified for it. See Overview of Parameterization.

Macros: As needed, you can create reusable sequences of steps that can be parameterized for use in other recipes. For more information, see Overview of Macros.
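The substitution rule for flow parameters (an override wins, otherwise the default applies) can be sketched with Python's standard `string.Template`. The `$name` reference syntax here is that of `string.Template` and is an assumption for illustration, as are the function name and the step text.

```python
from string import Template
from typing import Dict, Optional


def resolve_step(step: str, defaults: Dict[str, str],
                 overrides: Optional[Dict[str, str]] = None) -> str:
    """Replace each parameter reference with its override, else its default."""
    values = {**defaults, **(overrides or {})}
    # safe_substitute leaves unknown references untouched instead of raising.
    return Template(step).safe_substitute(values)


defaults = {"region": "EMEA"}
step = "keep rows where region == '$region'"

with_default = resolve_step(step, defaults)
with_override = resolve_step(step, defaults, {"region": "APAC"})

print(with_default)   # keep rows where region == 'EMEA'
print(with_override)  # keep rows where region == 'APAC'
```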


Run Jobs: When you are satisfied with the recipe that you have created in the Transformer page, you can execute a job. A job may be composed of one or more job types.


Connections

A connection is a configuration object that provides a personal or global integration to an external datastore. Reading data from remote sources and writing results are managed through connections.

Flow Schedules

You can associate a schedule with a flow. A schedule is a combination of one or more triggers and the outputs that are generated from them.

NOTE: A flow can have only one schedule associated with it.

Below, you can see the object hierarchy within a schedule.

+ schedule for Flow 1
  + trigger 1
  + trigger 2
  + scheduled destination a
  + scheduled destination b
+ schedule for Flow 2
  + trigger 3
  + scheduled destination c
  + scheduled destination d
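The hierarchy above, together with the one-schedule-per-flow rule, can be modeled in a few lines. This is a sketch under assumed names: `Schedule` and `ScheduleRegistry` are hypothetical and do not correspond to platform objects or APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Schedule:
    """Hypothetical model of a schedule: triggers plus scheduled destinations."""
    triggers: List[str] = field(default_factory=list)
    destinations: List[str] = field(default_factory=list)


class ScheduleRegistry:
    """Hypothetical registry that allows at most one schedule per flow."""

    def __init__(self) -> None:
        self._by_flow: Dict[str, Schedule] = {}

    def attach(self, flow_name: str, schedule: Schedule) -> None:
        if flow_name in self._by_flow:
            raise ValueError(f"{flow_name} already has a schedule")
        self._by_flow[flow_name] = schedule

    def schedule_for(self, flow_name: str) -> Schedule:
        return self._by_flow[flow_name]


registry = ScheduleRegistry()
registry.attach("Flow 1", Schedule(
    ["trigger 1", "trigger 2"],
    ["scheduled destination a", "scheduled destination b"],
))

try:
    registry.attach("Flow 1", Schedule(["trigger 3"]))  # second schedule, same flow
    second_attach_allowed = True
except ValueError:
    second_attach_allowed = False
```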

For more information, see Overview of Automator.

Plans

A plan is a sequence of triggers and tasks that can be executed across multiple flows. A plan is executed on a snapshot of all objects at the time that the plan is triggered.
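The snapshot behavior can be illustrated with a deep copy taken at trigger time. This is only a conceptual sketch: the function name and the dictionary layout are hypothetical, and how the platform actually captures a snapshot is not described here.

```python
import copy
from typing import Dict, List


def trigger_plan(flows: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Capture an independent copy of the flow definitions at trigger time."""
    return copy.deepcopy(flows)


flows = {"Flow 1": ["step a"], "Flow 2": ["step b"]}
snapshot = trigger_plan(flows)

# An edit made after the plan was triggered does not affect the snapshot.
flows["Flow 1"].append("step added after trigger")

print(snapshot["Flow 1"])  # ['step a']
```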

For more information, see Overview of Operationalization.