You can create a new column from an existing one by providing examples of the output values that you want for values in the source column. With each successive example, Transformation by Example (TBE) refines the generated output values until the new column contains the desired set of values.

Limitations:

  • Transformation by Example works best for text-based inputs. Non-text inputs are treated as String type by the feature.

    NOTE: Multi-value inputs, such as Object or Array data types, must be converted to String data type prior to transformation by example.

  • In the Transformer page, TBE is applied across the currently displayed sample only. The full dataset may contain outlier values that do not match any of the examples that you have provided.

    Tip: If your column data is quite varied, you should collect additional samples to verify that your TBE is properly matching all values in the column.
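
The sample limitation above can be illustrated outside the product. The following Python sketch is not part of Trifacta Wrangler: it assumes a hypothetical transformation (`last_five_digit_number`) that was derived from a sample, applies it to a hypothetical full dataset, and reports rows whose output is missing or is not a five-digit value. All names and addresses are invented for illustration.

```python
import re

FIVE_DIGITS = re.compile(r"\d{5}")


def last_five_digit_number(text):
    """Hypothetical transformation learned from the sample:
    return the last standalone 5-digit number, or None if there is none."""
    matches = re.findall(r"\b\d{5}\b", text)
    return matches[-1] if matches else None


def find_outliers(rows, transform):
    """Return (row_index, source, output) for rows the transformation
    could not turn into a 5-digit value."""
    outliers = []
    for index, source in enumerate(rows):
        output = transform(source)
        if output is None or not FIVE_DIGITS.fullmatch(output):
            outliers.append((index, source, output))
    return outliers


# Hypothetical full dataset; the third row never appeared in the sample.
full_dataset = [
    "123 Main St, Dubuque, IA 52001",
    "25000 Elm Rd, Redford, MI 48239",
    "PO Box 12, Toronto, ON M5V 2T6",
]

for index, source, output in find_outliers(full_dataset, last_five_digit_number):
    print(f"row {index}: no zip extracted from {source!r} (got {output!r})")
```

Rows reported by a check like this are the kind of values that an additional sample would surface in the Transformer page, where you could then supply further examples.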

For more information, see Overview of TBE.

Steps:

  1. In the Transformer page, locate the column to use as your source. From the column menu, select Create column from examples.
  2. In the Transform Builder, enter the name of the new column.
  3. In the following example, a new column called zip is being created from the address column:


    Figure: Selected column and first value is specified

  4. Double-click an empty cell in the Preview column to populate it with an example. In the figure above, the zip code from the first value has been entered into the Preview column: 52001.

    Tip: You can copy values from the source column and paste them into the Preview column.

  5. While many of the zip code values from other rows have been accurately populated, there are still some values that need fixing. In the preceding image, you can see that the zip code for the third row was not properly extracted. Double-click in the Preview column for the third row and fix the value: 48239:

    Figure: Populating multiple example rows improves the overall quality of transformation across all rows

  6. A quick scroll through the rest of the rows in the sample indicates that you have properly extracted the zip code values for all rows.
  7. Click Add to Recipe.
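
The way additional examples narrow down the transformation in the steps above can be sketched in a few lines of Python. This is a toy illustration of the general by-example idea, not Trifacta's actual algorithm: it assumes a small, hypothetical list of candidate extractors and picks the first one that reproduces every example provided so far. The addresses are invented; only the zip values 52001 and 48239 come from the walkthrough.

```python
import re


def _pick(items, index):
    """Return items[index], or None when the list is empty."""
    return items[index] if items else None


# Hypothetical candidate extractors, tried in order.
# A real by-example engine searches a much larger space of programs.
CANDIDATES = [
    ("first 5-digit number", lambda s: _pick(re.findall(r"\b\d{5}\b", s), 0)),
    ("last 5-digit number", lambda s: _pick(re.findall(r"\b\d{5}\b", s), -1)),
    ("last token", lambda s: _pick(s.split(), -1)),
]


def infer(examples):
    """Return (name, function) for the first candidate that reproduces
    every (source, expected) example, or None if no candidate fits."""
    for name, fn in CANDIDATES:
        if all(fn(source) == expected for source, expected in examples):
            return name, fn
    return None


# Invented sample rows for the address column.
addresses = [
    "123 Main St, Dubuque, IA 52001",
    "4501 Oak Ave Apt 10234, Springfield, IL 62704",
    "25000 Elm Rd, Redford, MI 48239",
]

# One example: the first row maps to 52001.
examples = [(addresses[0], "52001")]
name, fn = infer(examples)
print(name, [fn(a) for a in addresses])

# The third row came out wrong, so a corrected example is added.
examples.append((addresses[2], "48239"))
name, fn = infer(examples)
print(name, [fn(a) for a in addresses])
```

With a single example, several candidates fit and the first one mis-extracts the house number in the third row. Adding the corrected value for that row rules the first candidate out, which mirrors how a second example in the Preview column improves the results across all rows.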

For more information on previewing changes, see Transform Preview.
