For more information on the objects available in the platform, see Object Overview.
This release introduces a new specification for the flow object.
NOTE: This version of the flow object supports export and import across products and across future versions of those products. The capabilities and related objects of a flow are unchanged.
Beginning in Release 6.8:
You can export a flow from one product and import it into another. For example, you can develop a flow in and then import it into , assuming that the product receiving the import is on the same build or a later one.
NOTE: Cloud-based products, such as free are updated on a periodic basis, as often as once a month. These products are likely to be on a version that is later than your installed version of . For compatibility reasons, you should develop your flows in your earliest instance of on Release 6.8 or later.
You can export a flow from Release 6.8 or later of and later import into Release 7.0 after upgrading the platform.
NOTE: You cannot import a pre-Release 6.8 flow into a Release 6.8 or later instance of . You should re-import those flows before you upgrade to Release 6.8 or later.
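The compatibility rules above can be approximated with a small check. This is a minimal sketch, not platform code: the names `parse_release`, `can_import`, and `MIN_PORTABLE_RELEASE` are hypothetical, the check compares release numbers rather than builds, and it conservatively reports any pre-6.8 export as unsupported because imports between two pre-6.8 instances are not covered here.

```python
from typing import Tuple

Version = Tuple[int, int]

# First release whose flow specification supports cross-version import.
MIN_PORTABLE_RELEASE: Version = (6, 8)


def parse_release(text: str) -> Version:
    """Turn a string such as "6.8" into a comparable (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def can_import(exported_from: str, importing_into: str) -> bool:
    """Return True if a flow exported from one release may be imported into another."""
    source = parse_release(exported_from)
    target = parse_release(importing_into)
    if source < MIN_PORTABLE_RELEASE:
        # Assumption: pre-6.8 exports are always treated as not importable here.
        return False
    # The receiving instance must be on the same release or a later one.
    return target >= source
```

For example, `can_import("6.8", "7.0")` is true, while `can_import("6.4", "6.8")` and `can_import("7.0", "6.8")` are both false.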
This release introduces macros, which are reusable sequences of parameterized steps. These sequences can be saved independently and referenced in other recipes in other flows. See Overview of Macros.
Beginning in Release 5.0, imported datasets can be augmented with parameters, which enables operationalizing sampling and jobs based on date ranges, wildcards, or variables applied to the input path. For more information, see Overview of Parameterization.
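To illustrate the idea of a parameterized input path, the sketch below expands a date range into path patterns and then applies a wildcard match. It is only an illustration: the `{date}` placeholder, the function names, and the example paths are hypothetical and do not reflect the platform's actual parameter syntax.

```python
from datetime import date, timedelta
from fnmatch import fnmatchcase
from typing import List


def paths_for_date_range(template: str, start: date, end: date) -> List[str]:
    """Expand a {date} placeholder into one path pattern per day, inclusive."""
    days = (end - start).days
    return [
        template.format(date=(start + timedelta(days=offset)).isoformat())
        for offset in range(days + 1)
    ]


def matching_files(patterns: List[str], available: List[str]) -> List[str]:
    """Return the available paths that match any pattern; * acts as a wildcard."""
    return [
        path
        for path in available
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


# Hypothetical input paths used only for this example.
available = [
    "/data/sales/2020-01-01/a.csv",
    "/data/sales/2020-01-02/b.csv",
    "/data/sales/2020-01-03/c.csv",
    "/data/sales/2020-01-01/notes.txt",
]
patterns = paths_for_date_range(
    "/data/sales/{date}/*.csv", date(2020, 1, 1), date(2020, 1, 2)
)
selected = matching_files(patterns, available)
```

Here `selected` holds only the two CSV files dated within the requested range, so the same dataset definition can feed a job for any window of dates.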
In Release 4.2, the object model has undergone the following revisions to improve flexibility and control over the objects you create in the platform.
In Release 3.2, the object model introduced the concepts of imported datasets, recipes, and wrangled datasets. These objects represented data that you imported, steps that were applied to that data, and data that was modified by those steps.
In Release 4.2, the wrangled dataset object has been removed and replaced by two objects, listed below. All of the functionality associated with a wrangled dataset remains, including the following actions. Next to each action is the new object with which it is associated.
|Wrangled Dataset action|Release 4.2 object|
|Run or schedule a job|Output object|
|Preview data|Recipe object|
|Reference to the dataset|Reference object|
NOTE: At the API level, the
These objects are described below.
Since recipes are no longer tied to a specific wrangled dataset, you can now reuse recipes in your flow. Create a copy with or without inputs and move it to a new flow if needed. Some cleanup may be required.
This flexibility allows you to create, for example, recipes that are applicable to all of your datasets for initial cleanup or other common wrangling tasks.
Additionally, recipes can be created from recipes, which allows you to create chains of recipes. This sequencing allows for more effective management of common steps within a flow.
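A chain of recipes can be pictured as each recipe running the steps of the recipe it was created from before running its own. The following is a minimal sketch under that assumption; the `Recipe` class, its `parent` field, and the two step functions are hypothetical and are not the platform's object model or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]


@dataclass
class Recipe:
    name: str
    steps: List[Step] = field(default_factory=list)
    parent: Optional["Recipe"] = None  # recipe this one was created from

    def run(self, rows: List[Row]) -> List[Row]:
        """Apply the parent's steps first, then this recipe's own steps."""
        if self.parent is not None:
            rows = self.parent.run(rows)
        for step in self.steps:
            rows = step(rows)
        return rows


def drop_blank_names(rows: List[Row]) -> List[Row]:
    """Common cleanup step: discard rows whose name is empty or whitespace."""
    return [row for row in rows if row["name"].strip()]


def uppercase_names(rows: List[Row]) -> List[Row]:
    """Downstream step: return copies of the rows with the name uppercased."""
    return [{**row, "name": row["name"].upper()} for row in rows]


cleanup = Recipe("cleanup", [drop_blank_names])
report = Recipe("report", [uppercase_names], parent=cleanup)
```

Because `report` is built on `cleanup`, any change to the shared cleanup steps is picked up by every recipe chained after it.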
Before Release 4.2, reference datasets existed and were represented in the user interface. However, these objects existed in the downstream flow that consumes the source. If you had adequate permissions to reference a dataset from outside of your flow, you could pull it in as a reference dataset for use.
In Release 4.2, a reference is a link from a recipe in your flow to other flows. This object allows you to expose your flow's recipe for use outside of the flow. So, from the source flow, you can control whether your recipe is available for use.
This object allows you to have finer-grained control over the availability of data in other flows. It is a dependent object of a recipe.
NOTE: For multi-dataset operations such as union or join, you must now explicitly create a reference from the source flow and then union or join to that object. In previous releases, you could directly join or union to any object to which you had access.
In Release 4.1, outputs became a configurable object that was part of the wrangled dataset. For each wrangled dataset, you could define one or more publishing actions, each with its own output types, locations, and other parameters. For scheduled executions, you defined a separate set of publishing actions. These publishing actions were attached to the wrangled dataset.
In Release 4.2, an output is a defined set of scheduled or ad-hoc publishing actions. With the removal of the wrangled dataset object, outputs are now top-level objects attached to recipes. Each output is a dependent object of a recipe.
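The relationship between recipes, references, and outputs described above can be sketched as a small data model. Everything here is an assumption made for illustration: the class and field names are hypothetical, a recipe is given at most one reference, and `join_target` only mimics the rule that a join or union from another flow must go through an explicitly created reference.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Output:
    """A set of publishing actions, either ad-hoc or scheduled."""
    publishing_actions: List[str]
    scheduled: bool = False


@dataclass
class Reference:
    """A link that exposes a recipe for use in other flows."""
    source_flow: str
    recipe_name: str


@dataclass
class Recipe:
    name: str
    flow: str
    outputs: List[Output] = field(default_factory=list)  # dependent objects
    reference: Optional[Reference] = None                # dependent object

    def expose(self) -> Reference:
        """Create the reference from the source flow, if it does not exist yet."""
        if self.reference is None:
            self.reference = Reference(self.flow, self.name)
        return self.reference


def join_target(recipe: Recipe, consuming_flow: str) -> Union[Recipe, Reference]:
    """Return the object that a join or union in consuming_flow may use."""
    if recipe.flow == consuming_flow:
        return recipe
    if recipe.reference is None:
        raise ValueError(
            f"recipe {recipe.name!r} is not exposed outside flow {recipe.flow!r}"
        )
    return recipe.reference
```

In this model the owner of the source flow decides whether a recipe is reachable from elsewhere: until `expose()` is called, `join_target` refuses requests from any other flow.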
Below, you can see the same flow as it appears in Release 4.1 and Release 4.2. In each Flow View:
In Release 4.1.1 and earlier, connections appeared as objects to be created or explored in the Import Data page. Through the left navigation bar, you could create or edit connections for which you had permission. Connections were also selections in the Run Job page.
In Release 4.2, the Connections Manager enables you to manage your personal connections and (if you're an administrator) global connections. Key features:
NOTE: Beginning in Release 4.2, all connections are initially created as private connections, accessible only to the user who created them. Connections that are available to all users of the platform are called public connections. You can make connections public through the Connections page.
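The private-by-default behavior can be summarized in a few lines. This is a sketch only: the `Connection` class and its methods are hypothetical, and who is permitted to make a connection public is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    name: str
    owner: str
    public: bool = False  # every connection starts out private

    def make_public(self) -> None:
        """Make the connection available to all users of the platform."""
        self.public = True

    def is_visible_to(self, user: str) -> bool:
        """Private connections are visible only to the user who created them."""
        return self.public or user == self.owner


warehouse = Connection(name="warehouse", owner="alice")
visible_before = warehouse.is_visible_to("bob")   # False: still private
warehouse.make_public()
visible_after = warehouse.is_visible_to("bob")    # True: now public
```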
For more information, see Connections Page.