
This documentation applies to Trifacta Wrangler. Download this free product.
Registered users of this product or Trifacta Wrangler Enterprise should log in to Product Docs through the application.


Beginning in Release 3.2, changes are being applied to the Trifacta® platform object model. These changes are intended to improve overall operationalization of the platform, enable better reuse of objects, and drive the platform toward a more flexible, workflow-based usage. These changes are to be applied over multiple releases.

These changes may affect how you access features, although most features perform as they did in previous releases. In some cases:

  • Features may behave differently.
  • Features may be temporarily disabled in the current release, in favor of a new and improved implementation in a future release.
  • Features may be removed altogether.

These changes are described in detail below.

Release 4.1

None.

Release 4.0

None.

Release 3.2

Overview

In Release 3.2, the object model has been moved from a dataset-oriented structure to a flow-based structure. Previously, datasets created in the application represented the central data objects. In the new flow-based model, all datasets that have been touched in the application are contained in a new object, called a flow. A flow is essentially a replacement of the project object with a different set of behaviors, including automatic change propagation between datasets. In the future, it will support even greater flexibility and connectivity. 

Similarly, scripts created in the Transformer page are now called recipes, which will become much more flexible and reusable objects in the future. 

In prior releases, datasources were references to source data that existed outside of the application and were controlled by Trifacta administrators. Beginning in Release 3.2, these objects, now called imported datasets, are independent objects and are associated with the dataset that uses them. For an overview diagram, see Object Overview.
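The relationships described above can be sketched in a few lines of Python. This is an illustrative model only, not the platform's actual schema: the class names, fields, file path, and recipe step text are all assumptions chosen to mirror the terms in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImportedDataset:
    """Reference to source data outside the platform (formerly a datasource)."""
    name: str
    paths: List[str]  # hypothetical field: one or more imported files


@dataclass
class Recipe:
    """Ordered transform steps (formerly a script)."""
    steps: List[str] = field(default_factory=list)


@dataclass
class WrangledDataset:
    """A dataset that has been opened and edited in the Transformer page."""
    name: str
    source: ImportedDataset
    recipe: Recipe = field(default_factory=Recipe)


@dataclass
class Flow:
    """Container for imported and Wrangled datasets (formerly a project)."""
    name: str
    imported: List[ImportedDataset] = field(default_factory=list)
    wrangled: List[WrangledDataset] = field(default_factory=list)


# Placeholder data; the step text is plain English, not real recipe syntax.
orders_src = ImportedDataset("orders.csv", ["/data/orders.csv"])
orders = WrangledDataset("orders", source=orders_src)
orders.recipe.steps.append("remove column internal_id")
flow = Flow("sales", imported=[orders_src], wrangled=[orders])

print(flow.wrangled[0].source.name)
```

In this sketch the imported dataset and the Wrangled dataset are separate objects inside the same flow, and the Wrangled dataset holds a reference to its source rather than a copy of it.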

Terminology Changes

Old Term | New Term | Notes
project | flow | A flow is a more generalized container for datasets, which will enable greater reuse of assets. See Flows Page.
script | recipe | A recipe contains all of the transform steps of a script, as well as interfaces for their reuse in other scripts and datasets. See Recipe Panel.
datasource | imported dataset | An imported dataset is one or more files imported from outside of the platform. Functionally identical to the datasource in previous releases. Imported datasets can be associated with your flow.
dataset | Wrangled dataset | To distinguish from an imported dataset, a Wrangled dataset refers to any dataset that has been opened and edited in the Transformer page. A Wrangled dataset is a separate object in your flow.
execution engine | running environment | This is a simple terminology change. When you configure your jobs, you select the appropriate running environment, where your job is executed. See Generate Results Dialog.

For more information on these changes, see Object Overview.

Functional Changes

Old Feature | New Feature | Description
Projects page | Flows page | Projects have been replaced by flows, which will offer much broader functionality and connectivity over the course of several releases. A flow is a storage container for imported and Wrangled datasets. See Flows Page.
Datasets page | Datasets page | The Datasets page is used to import data from an outside source. In Release 3.2 and later, you interact with both imported datasets and Wrangled datasets through the Datasets page. The workflow has changed slightly. See Datasets Page.
Datasources page | REMOVED | Imported datasets (datasources) are now managed through the Datasets page. The Datasources page is no longer available in the application. NOTE: This page was not part of the Trifacta Wrangler application.
N/A | Dependencies Browser | Explore the dependencies in your datasets through the Transformer page. Identify dependency issues in the target dataset and then quickly navigate to the source issue to fix it. See Dataset Navigator.

Changes to System Behavior due to Object Model Changes

Automatic change propagation: Changes in one dataset automatically propagate to dependent datasets.

Imported and Wrangled datasets can be integrated into other datasets at any time. With the object model changes, a change made in one dataset is automatically applied in any dataset that consumes it. This applies to the following:

  • Joins
  • Lookups
  • Unions

    NOTE: This propagation does not apply to:

    1. Datasets that are created from the generated output of the dataset. Since the new dataset is the product of an executed job, it no longer has any connection to the changes in the source dataset. If you wish to propagate those changes, however, you can re-run the job and write out a new dataset.
    2. Copies of the dataset. Dataset copies are independent objects.
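The difference between a live reference and an independent copy can be illustrated with a small Python sketch. The `Dataset` class, its methods, and the sample rows are hypothetical stand-ins, not part of the product. The only point is that a consumer holding a reference sees later edits to its source, while a copy does not.

```python
import copy
from typing import Dict, List, Optional

Row = Dict[str, str]


class Dataset:
    """Hypothetical stand-in for a dataset in a flow."""

    def __init__(self, name: str, rows: List[Row]) -> None:
        self.name = name
        self.rows = rows
        self.lookup_source: Optional["Dataset"] = None

    def add_lookup(self, source: "Dataset") -> None:
        # Keep a reference, not a copy, so later edits to `source` are visible.
        self.lookup_source = source

    def evaluate(self, key: str, value_col: str) -> List[Row]:
        # Resolve the lookup against the source's rows at evaluation time.
        if self.lookup_source is None:
            return [dict(r) for r in self.rows]
        table = {r[key]: r[value_col] for r in self.lookup_source.rows}
        return [{**r, value_col: table.get(r[key], "")} for r in self.rows]


regions = Dataset("regions", [{"code": "US", "region": "Americas"}])
orders = Dataset("orders", [{"id": "1", "code": "US"}, {"id": "2", "code": "FR"}])
orders.add_lookup(regions)

# A copy is an independent object: edits to `regions` never reach it.
regions_copy = copy.deepcopy(regions)
frozen = Dataset("orders_frozen", orders.rows)
frozen.add_lookup(regions_copy)

# Edit the source dataset after both lookups were defined.
regions.rows.append({"code": "FR", "region": "Europe"})

live_result = orders.evaluate("code", "region")
frozen_result = frozen.evaluate("code", "region")
print(live_result[1]["region"])    # value resolved through the live reference
print(frozen_result[1]["region"])  # value resolved through the copy
```

Here `orders` picks up the new `FR` row because it reads from `regions` directly, while `orders_frozen` still resolves against the state captured when the copy was made.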

     

Implications:

In Release 3.1.2 and earlier, multi-dataset operations, such as union, join, and lookup, were executed on a snapshot of the other dataset. For example, if dataset A performed a lookup into dataset B, the application internally took a snapshot of dataset B and used the snapshot to complete the lookup. This snapshot was maintained separately.

When the platform is upgraded to Release 3.2 or later, the snapshotted state is preserved. Instead of maintaining the internal snapshot, the platform migrates the snapshot into a Wrangled dataset of the same name.

NOTE: If your upgraded datasets included multi-dataset operations, you will see additional copies of the dataset that is used in the join or union. This dataset is saved such that the pre-migration snapshot is preserved. This method maintains the pre-upgrade state of the dataset and disables change propagation on the affected dataset.

If desired, you can edit this dataset or switch to the true source dataset to enable automated change propagation.

Additional impacts of automated change propagation of specific multi-dataset operations:

  • Joins: In prior releases, joins were executed on a snapshot of data. With automated change propagation, snapshotting is no longer necessary. The target dataset is automatically updated with any changes to the joined-in dataset. 

    NOTE: Automated change propagation can cause breakages in downstream datasets. For example, if you make changes to a dataset that is used in a join, those changes can break steps in the dataset into which it is joined. You can identify these issues in the Recipe panel, and then navigate to their source and fix them through the new Dependencies browser in the Transformer page. See Dataset Navigator.

     

  • Lookups: Similar to joins, changes in lookup data are automatically propagated. See Lookup Wizard.

  • Unions: In prior releases, when a dataset that was part of a union transform was changed, an alert appeared in the Recipe panel of the target dataset to indicate that there was a change. Beginning in Release 3.2:

    • The data is automatically updated in the target dataset. 

    • If the changes cause breakages, you can see the effects and the source dataset in the Recipe panel for the target dataset. 

    • You can trace back these issues through the Dependencies browser. See Dataset Navigator.
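As a rough illustration of how a change can ripple through joins, lookups, and unions, the following Python sketch walks a dependency map to list every dataset downstream of a change, then checks which columns a consuming step can no longer find. The dataset names, the `consumers` map, and both helper functions are hypothetical. This is not how the Dependencies browser is implemented.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical dependency map: source dataset -> datasets that consume it
# through a join, lookup, or union.
consumers: Dict[str, List[str]] = {
    "regions": ["orders"],
    "orders": ["orders_by_region", "all_orders"],
    "returns": ["all_orders"],
}


def downstream(changed: str, graph: Dict[str, List[str]]) -> List[str]:
    """Return every dataset reachable from `changed`, in breadth-first order."""
    seen: Set[str] = set()
    order: List[str] = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in graph.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


def missing_columns(required: Set[str], available: Set[str]) -> List[str]:
    """Columns a consuming step needs that the source no longer provides."""
    return sorted(required - available)


affected = downstream("regions", consumers)
# Suppose a join needs "code" and "region", but the source now has
# "country_code" where "code" used to be.
broken = missing_columns({"code", "region"}, {"country_code", "region"})
print(affected, broken)
```

In this toy example, a change to `regions` reaches `orders` first and then the two datasets that consume `orders`, and the join step would report `code` as the column it can no longer resolve.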

 

Undo/redo of dataset swap has been removed.

Changes to the object model mean that you cannot use undo/redo controls in the Transformer page to change the dataset. 

Tip: You can still select the previous dataset.
