For more information on the objects available in the platform, see Object Overview.

Release 7.1

Introducing Plans

Beginning in Release 7.1, you can build plans, which are sequences of flow tasks (job executions). When one flow task completes, the next task in the sequence is executed. Plans let you automate a series of job executions. For more information, see Overview of Operationalization.

See Plans Page.
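
As a rough illustration of the sequencing described above, the following Python sketch models a plan as an ordered list of tasks, where each task starts only after the previous one returns. The names `run_plan`, `make_task`, and the task names are hypothetical and are not part of the platform or its API.

```python
from typing import Callable, List

# Hypothetical model: a flow task is any callable that runs a job
# and returns the name of the job it ran.
FlowTask = Callable[[], str]


def make_task(name: str) -> FlowTask:
    """Build a placeholder task that only reports its own name."""
    return lambda: name


def run_plan(tasks: List[FlowTask]) -> List[str]:
    """Run tasks strictly in order; each starts after the previous one returns."""
    completed = []
    for task in tasks:
        completed.append(task())
    return completed


plan = [make_task("load_orders"), make_task("clean_orders"), make_task("publish_orders")]
result = run_plan(plan)
```

Here `result` lists the task names in the order in which they ran, which matches the order in which they were added to the plan.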

Release 6.8

Version 2 of flow definition

This release introduces a new specification for the flow object.

NOTE: This version of the flow object is intended to support export and import across products and across future versions of those products. There is no change to the capabilities or related objects of a flow.

Beginning in Release 6.8:

Release 6.4


This release introduces macros, which are reusable sequences of parameterized steps. These sequences can be saved independently and referenced in other recipes in other flows. See Overview of Macros.
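
The idea of a reusable, parameterized sequence of steps can be sketched in Python as a function that returns a list of steps for a given parameter. This is only an analogy: `trim_and_upper_macro`, `apply_steps`, and the column names are hypothetical and do not reflect how the platform stores or applies macros.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Step = Callable[[Row], Row]


def trim_and_upper_macro(column: str) -> List[Step]:
    """Hypothetical macro: a fixed sequence of steps, parameterized by column name."""
    def trim(row: Row) -> Row:
        return {**row, column: row[column].strip()}

    def upper(row: Row) -> Row:
        return {**row, column: row[column].upper()}

    return [trim, upper]


def apply_steps(row: Row, steps: List[Step]) -> Row:
    """Apply each step in order and return the transformed row."""
    for step in steps:
        row = step(row)
    return row


# The same macro is referenced twice with different parameter values.
out_a = apply_steps({"state": " ca "}, trim_and_upper_macro("state"))
out_b = apply_steps({"country": " us"}, trim_and_upper_macro("country"))
```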

Release 6.0


Release 5.1


Release 5.0

Datasets with parameters

Beginning in Release 5.0, imported datasets can be augmented with parameters, which enable you to operationalize sampling and jobs based on date ranges, wildcards, or variables applied to the input path. For more information, see Overview of Parameterization.
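
To make the three kinds of path parameters more concrete, here is a minimal Python sketch that expands a date range into paths, filters paths with a wildcard, and substitutes a variable. The path templates, file names, and helper functions are assumptions made for illustration; the platform's own parameter syntax is described in Overview of Parameterization.

```python
from datetime import date, timedelta
from fnmatch import fnmatchcase
from typing import List


def paths_for_date_range(template: str, start: date, end: date) -> List[str]:
    """Expand a {date} placeholder into one path per day, inclusive of both ends."""
    days = (end - start).days
    return [
        template.format(date=(start + timedelta(days=i)).strftime("%Y-%m-%d"))
        for i in range(days + 1)
    ]


def match_wildcard(paths: List[str], pattern: str) -> List[str]:
    """Keep only the paths that match a shell-style wildcard pattern."""
    return [p for p in paths if fnmatchcase(p, pattern)]


# Date range parameter (hypothetical path layout).
daily = paths_for_date_range("/data/sales-{date}.csv", date(2018, 1, 30), date(2018, 2, 1))

# Wildcard parameter.
available = ["/data/sales-east.csv", "/data/sales-west.csv", "/data/returns.csv"]
sales_only = match_wildcard(available, "/data/sales-*.csv")

# Variable parameter, supplied when the job runs.
by_region = "/data/{region}/orders.csv".format(region="emea")
```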

Release 4.2

In Release 4.2, the object model has undergone the following revisions to improve flexibility and control over the objects you create in the platform.

Wrangled datasets are removed

In Release 3.2, the object model introduced the concepts of imported datasets, recipes, and wrangled datasets. These objects represented data that you imported, steps that were applied to that data, and data that was modified by those steps. 

In Release 4.2, the wrangled dataset object has been replaced by the objects listed below. All of the functionality associated with a wrangled dataset remains, including the following actions. Next to each action is the new object with which the action is associated.

Wrangled Dataset action      Release 4.2 object
Run or schedule a job        Output object
Preview data                 Recipe object
Reference to the dataset     Reference object

NOTE: At the API level, the wrangledDataset endpoint continues to be in use. In a future release, separate endpoints will be available for recipes, outputs, and references. For more information, see API Reference.

These objects are described below.

Recipes can be reused and chained

Since recipes are no longer tied to a specific wrangled dataset, you can now reuse recipes in your flow. Create a copy with or without inputs and move it to a new flow if needed. Some cleanup may be required.

This flexibility allows you to create, for example, recipes that are applicable to all of your datasets for initial cleanup or other common wrangling tasks.

Additionally, recipes can be created from recipes, which allows you to create chains of recipes. This sequencing allows for more effective management of common steps within a flow.
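
Reuse and chaining can be pictured with a small Python sketch in which a recipe is a list of steps and may optionally be created from an upstream recipe. The `Recipe` class below, its `parent` field, and the sample steps are hypothetical and only illustrate the concept; they are not the platform's object model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Rows = List[str]
Step = Callable[[Rows], Rows]


@dataclass
class Recipe:
    name: str
    steps: List[Step]
    parent: Optional["Recipe"] = None  # upstream recipe, if this one is chained

    def run(self, rows: Rows) -> Rows:
        """Run the upstream recipe first (if any), then this recipe's own steps."""
        if self.parent is not None:
            rows = self.parent.run(rows)
        for step in self.steps:
            rows = step(rows)
        return rows


# A general-purpose cleanup recipe that is not tied to any one dataset.
cleanup = Recipe("cleanup", [
    lambda rows: [r.strip() for r in rows],  # trim whitespace
    lambda rows: [r for r in rows if r],     # drop empty rows
])

# A second recipe created from the first, forming a chain.
dedupe = Recipe("dedupe", [lambda rows: sorted(set(rows))], parent=cleanup)

chained = dedupe.run([" b", "a ", "", "a"])
reused = cleanup.run(["  x  ", ""])
```

The `cleanup` recipe is applied to two different inputs, once on its own and once as the first link of a chain.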

Introducing References

Before Release 4.2, reference datasets existed and were represented in the user interface. However, these objects existed in the downstream flow that consumed the source. If you had adequate permissions to reference a dataset from outside of your flow, you could pull it in as a reference dataset.

In Release 4.2, a reference is a link from a recipe in your flow to other flows. This object allows you to expose your flow's recipe for use outside of the flow. From the source flow, you control whether your recipe is available for use elsewhere.

This object allows you to have finer-grained control over the availability of data in other flows. It is a dependent object of a recipe.

NOTE: For multi-dataset operations such as union or join, you must now explicitly create a reference from the source flow and then union or join to that object. In previous releases, you could directly join or union to any object to which you had access.
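
The requirement in the note above can be sketched as follows: a union against a recipe in another flow succeeds only after the source flow has explicitly created a reference to that recipe. The `Flow` class, its methods, and the use of `PermissionError` are assumptions made for this illustration and do not describe the platform's implementation.

```python
from typing import Dict, List, Set


class Flow:
    """Hypothetical stand-in for a flow that owns recipes and their references."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.recipes: Dict[str, List[str]] = {}  # recipe name -> result rows
        self.references: Set[str] = set()        # recipes exposed to other flows

    def add_recipe(self, recipe_name: str, rows: List[str]) -> None:
        self.recipes[recipe_name] = rows

    def create_reference(self, recipe_name: str) -> None:
        """Expose one of this flow's recipes for use in other flows."""
        if recipe_name not in self.recipes:
            raise KeyError(recipe_name)
        self.references.add(recipe_name)

    def resolve_reference(self, recipe_name: str) -> List[str]:
        """Return the recipe's rows only if the source flow has exposed it."""
        if recipe_name not in self.references:
            raise PermissionError(
                f"{recipe_name!r} is not exposed by flow {self.name!r}"
            )
        return self.recipes[recipe_name]


def union_with_reference(local_rows: List[str], source: Flow, recipe_name: str) -> List[str]:
    """Union local rows with rows obtained through a reference in another flow."""
    return local_rows + source.resolve_reference(recipe_name)


source = Flow("source")
source.add_recipe("customers", ["alice", "bob"])
source.create_reference("customers")
combined = union_with_reference(["carol"], source, "customers")
```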


Introducing Outputs

In Release 4.1, outputs became a configurable object that was part of the wrangled dataset. For each wrangled dataset, you could define one or more publishing actions, each with its own output types, locations, and other parameters. For scheduled executions, you defined a separate set of publishing actions. These publishing actions were attached to the wrangled dataset. 

In Release 4.2, an output is a defined set of scheduled or ad-hoc publishing actions. With the removal of the wrangled dataset object, outputs are now top-level objects attached to recipes. Each output is a dependent object of a recipe.
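
One way to picture the Release 4.2 relationship is the Python sketch below, in which each output groups a set of publishing actions and is attached to a recipe. The class names, field names, and sample values are hypothetical and are not taken from the platform or its API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PublishingAction:
    output_type: str  # hypothetical example value: "csv"
    location: str     # hypothetical example value: "/out/daily.csv"


@dataclass
class Output:
    trigger: str  # "scheduled" or "ad-hoc"
    actions: List[PublishingAction] = field(default_factory=list)


@dataclass
class Recipe:
    name: str
    outputs: List[Output] = field(default_factory=list)  # outputs attached to this recipe

    def outputs_for(self, trigger: str) -> List[Output]:
        """Return the outputs of this recipe that use the given trigger."""
        return [o for o in self.outputs if o.trigger == trigger]


recipe = Recipe("clean_orders")
recipe.outputs.append(Output("ad-hoc", [PublishingAction("csv", "/out/adhoc.csv")]))
recipe.outputs.append(Output("scheduled", [PublishingAction("json", "/out/daily.json")]))
scheduled = recipe.outputs_for("scheduled")
```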

Flow View Differences

Below, you can see the same flow as it appears in Release 4.1 and Release 4.2. In each Flow View:

Release 4.1 Flow View

Release 4.2 Flow View

Flow View differences

Other differences

Connections as a first-class object

In Release 4.1.1 and earlier, connections appeared as objects to be created or explored in the Import Data page. Through the left navigation bar, you could create or edit any connections that you had permission to modify. Connections were also selections in the Run Job page.

In Release 4.2, the Connections Manager enables you to manage your personal connections and (if you're an administrator) global connections. Key features:

NOTE: Beginning in Release 4.2, all connections are initially created as private connections, accessible only to the user who created them. Connections that are available to all users of the platform are called public connections. You can make connections public through the Connections page.

For more information, see Connections Page.
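
The private-by-default behavior described in the note above can be sketched in a few lines of Python. The `Connection` class, its fields, and the helper functions are hypothetical and only model the visibility rule; they are not the platform's API.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    name: str
    owner: str
    is_public: bool = False  # connections start out private


def can_access(conn: Connection, user: str) -> bool:
    """A private connection is visible only to its creator; a public one to everyone."""
    return conn.is_public or conn.owner == user


def make_public(conn: Connection) -> None:
    """Model the act of making a connection public."""
    conn.is_public = True


conn = Connection("warehouse", owner="ana")
owner_access = can_access(conn, "ana")
other_before = can_access(conn, "ben")
make_public(conn)
other_after = can_access(conn, "ben")
```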