Where possible, changes made in one dataset propagate to the datasets that consume it. Datasets that join, union, or look up against your dataset are likely to be affected if you delete columns or rows or otherwise change the data. In some cases, the recipes of these dependent datasets can break.
This section describes how to identify these dependency issues and includes general steps for fixing them.
How to Identify
When you edit a dataset, you can check whether your changes might affect other datasets that rely on it. In the Transformer page, click the drop-down next to the current dataset's name to open the Dataset Navigator.
Tip: Any datasets connected to the right of your current dataset in flow view depend on it. After you change the current dataset, use the Dataset Navigator to open each of those wrangled datasets and check its recipe.
See Dataset Navigator.
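The "to the right of it" rule in the tip above amounts to walking a dependency graph. The following is a minimal sketch of that idea, not product code: the FLOW dictionary, the dataset names, and the downstream_of helper are all hypothetical, and the product itself exposes this information through the Dataset Navigator rather than through Python.

```python
from collections import deque

# Hypothetical flow: each key is a dataset, each value lists the datasets
# that consume it directly (its neighbors "to the right" in flow view).
FLOW = {
    "orders": ["orders_by_region", "orders_enriched"],
    "orders_enriched": ["weekly_report"],
    "orders_by_region": [],
    "weekly_report": [],
    "customers": ["orders_enriched"],
}


def downstream_of(flow, dataset):
    """Return every dataset that depends on `dataset`, directly or indirectly."""
    seen = set()
    ordered = []
    queue = deque(flow.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited through another path
        seen.add(current)
        ordered.append(current)
        queue.extend(flow.get(current, []))
    return ordered


# Datasets to re-check after editing "orders".
affected = downstream_of(FLOW, "orders")
print(affected)
```

Every dataset in the returned list is a candidate to open and inspect after the upstream edit.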
Broken data integrations
Some changes to an upstream dataset can break the recipes of downstream datasets, so that you cannot generate satisfactory results. In the downstream dataset, you may see errors in the Recipe panel, such as the following:
In the above, the column Day does not exist in the current dataset, which is causing problems in the last two recipe steps. These types of errors are generated when a column in the upstream dataset has been dropped or renamed.
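The missing-column error can be pictured as a comparison between the columns the upstream dataset still provides and the columns each downstream step reads. The sketch below illustrates that check under stated assumptions: the column names other than Day, the recipe structure, and the find_broken_steps helper are hypothetical and do not reflect how the product stores recipes.

```python
# Columns currently produced by the upstream dataset (hypothetical).
# "Day" was dropped or renamed upstream, so it is absent here.
upstream_columns = {"Date", "Store", "Sales"}

# Hypothetical downstream recipe: each step lists the columns it reads.
downstream_recipe = [
    {"step": 1, "action": "join on Store", "columns": ["Store"]},
    {"step": 2, "action": "derive weekday from Day", "columns": ["Day"]},
    {"step": 3, "action": "aggregate Sales by Day", "columns": ["Sales", "Day"]},
]


def find_broken_steps(recipe, available_columns):
    """Return (step number, missing columns) for each step that cannot run."""
    broken = []
    for step in recipe:
        missing = [c for c in step["columns"] if c not in available_columns]
        if missing:
            broken.append((step["step"], missing))
    return broken


broken_steps = find_broken_steps(downstream_recipe, upstream_columns)
for number, missing in broken_steps:
    print(f"step {number} references missing column(s): {missing}")
```

In this example the last two steps are flagged, which mirrors the situation described above.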
- In the Transformer page, open the Dataset Navigator from the drop-down next to the current dataset name. In the Flow View tab, open the dataset referenced in the error message.
- In the Recipe panel, locate the step where the column was removed.
Tip: In some cases, it may be easier to download the recipe from the panel and search it for the name of the column (
- Fix the issue. Details are below.
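Searching a downloaded recipe for the column name, as the tip in the steps above suggests, can be done with any text search. Here is a small sketch of one way to do it; the recipe text shown is invented for illustration and the real download format may differ, and steps_mentioning is a hypothetical helper.

```python
import re

# Hypothetical recipe text; the real download format may differ.
RECIPE_TEXT = """\
1: split the Date column on "-"
2: rename column Day to day_of_week
3: delete column Day_old
4: delete rows where Sales is missing
"""


def steps_mentioning(recipe_text, column):
    """Return the recipe lines that mention `column` as a whole word."""
    # \b keeps "Day" from matching inside longer names such as "Day_old".
    pattern = re.compile(rf"\b{re.escape(column)}\b")
    return [line for line in recipe_text.splitlines() if pattern.search(line)]


matches = steps_mentioning(RECIPE_TEXT, "Day")
for line in matches:
    print(line)
```

Matching on word boundaries is a deliberate choice: a plain substring search would also return steps that touch similarly named columns.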
If you change specific values in a dataset, recipe steps in downstream datasets can break if they rely on detecting those values. Depending on the usage, the step may not visibly break, but the generated results can be incorrect.
For example, a downstream dataset recipe includes the following step:
If the company_name column is sourced from another dataset and the My Co. value is changed to My Company, the downstream dataset that includes this transform doesn't break in an easily noticeable way. The data is simply not removed from the dataset and any generated results.
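The silent failure described above can be reproduced in a few lines. This sketch assumes the downstream step deletes rows where company_name equals My Co., which is inferred from the surrounding text; the row data and the delete_rows_where helper are hypothetical.

```python
def delete_rows_where(rows, column, value):
    """Keep only the rows whose `column` does not equal `value`."""
    return [row for row in rows if row[column] != value]


# Upstream data before the value was changed (hypothetical rows).
before = [
    {"company_name": "My Co.", "amount": 10},
    {"company_name": "Other Inc.", "amount": 20},
]

# Upstream data after "My Co." was changed to "My Company".
after = [
    {"company_name": "My Company", "amount": 10},
    {"company_name": "Other Inc.", "amount": 20},
]

# The downstream step still looks for the old value.
result_before = delete_rows_where(before, "company_name", "My Co.")
result_after = delete_rows_where(after, "company_name", "My Co.")

print(len(result_before))  # one row remains: the step removed the match
print(len(result_after))   # both rows remain: nothing matched, no error raised

# One possible guard: flag a value-based step that matches nothing.
matched = sum(1 for row in after if row["company_name"] == "My Co.")
if matched == 0:
    print("warning: step matched no rows; check upstream values")
```

The step runs without error in both cases, which is why this kind of dependency is harder to spot than a missing column.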
When you locate a dependency issue in the upstream dataset, you can fix it using one of the following methods:
- Fix the issue in the source dataset.
NOTE: If you fix the issue in the source dataset, verify whether any other downstream datasets are affected by the change.
- Change the input dataset to use a dataset that is not broken.
Tip: If you must freeze the data in the dataset that you are using as an input, you can create a copy of the dataset as a snapshot. See Dataset Details Page.
To use the copy, repair or rebuild the integration using the copied version.
- Fix the issue in the dataset that depends on it. In this case, you must redefine the transformation that brings in the data.
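The snapshot approach mentioned in the tip above works because a copy is isolated from later upstream edits. The sketch below illustrates that property with in-memory rows; the data is hypothetical, and in the product the copy is created through the application rather than in code.

```python
import copy

# Hypothetical upstream rows that a downstream recipe depends on.
upstream = [
    {"Day": "Mon", "Sales": 5},
    {"Day": "Tue", "Sales": 7},
]

# Freeze the current state as an independent snapshot.
snapshot = copy.deepcopy(upstream)

# A later upstream edit drops the Day column.
for row in upstream:
    del row["Day"]

# The snapshot still carries the column the downstream recipe expects.
print("Day" in upstream[0])   # False
print("Day" in snapshot[0])   # True
```

The trade-off is that the snapshot no longer receives upstream changes, so it suits cases where the input data must stay fixed.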