Page tree

 

Support | BlogContact Us | 844.332.2821

 

Contents:

This documentation applies to Trifacta Wrangler. Download this free product.
Registered users of this product or Trifacta Wrangler Enterprise should login to Product Docs through the application.

Contents:


The Transform Builder enables you to rapidly assemble complete transform steps through a simple menu-driven interface. After you select the transform to apply, all relevant parameters can be configured through selection or type-ahead fields, so that you can choose from only the elements that are appropriate for the selected transform.

To open the Transform Builder, edit an existing step or create a new one in the Recipe panel.

Figure: Transform Builder

Keyboard shortcuts:

KeyAction
TABAccept the currently selected item and move to the next field.
SHIFT + TABMove to the previous field.

Step 1 - (required) Choose a transformation

Begin typing to see the list of available transforms. Select or type the entire name.

Tip: You can paste the entire text from transform steps into this textbox, and all appropriate fields in the Transform Builder are specified based on the values in the pasted transform. Invalid transforms cannot be pasted.

join and union transforms have separate tools. You cannot paste join or union transforms into the Transform Builder. You can enter join or union as the transformation to open the corresponding tool.

For a list of available transforms and functions, see Language Index.

Step 2 - Specify the column(s), formula, or condition

Depending on the transform that you have selected, you must specify one or more of the following types of parameters.

  • Some transforms support combinations of the following. 
  • Some transforms, like deduplicate, require no parameters. 

Columns

Select or specify the column or columns to which to apply the transform.

Tip: To specify a range of columns, insert a tilde (~) after the first column. The second column you select defines the last column in the range. Some transforms do not support multiple columns or column ranges.

  • Literal values. A literal, or constant, value is a fixed numeric, string, Boolean, or other type of value, which does not change depending on the row under evaluation.
  • Functions. Trifacta® Wrangler supports a wide variety of numerical, statistical, and other function types. For a list of available transforms and functions, see Language Index.
  • Columns. When a column name is used in a formula, the transform uses the value in the named column for the currently evaluated row. 
  • Operators. You can apply logical, numeric, or comparison operators as part of your formula.

Patterns

For some transforms, you can specify patterns to identify conditions or elements of the data on which to take action. These matching patterns can be specified using one of the following types. 

Tip: After you have create a pattern in one transform step, you can apply it in another. In the pattern field in the Transform Builder, click Browse Pattern History.

Pattern TypeDescriptionExample
Literal value An exact string or value.

The following matches on the exact value between the quotes:

  'This is what I want to match.'

Trifacta pattern

Trifacta Wrangler supports a variety of macro-like pattern identifiers, which can be used in place of more complex regular expressions.

The following matches when two digits appear at the beginning of a value:

`{start}{digit} {digit}`

Regular expression pattern

Regular expressions are a standard method of describing matching patterns.

NOTE: The syntax of regular expressions can be complex and can lead to unexpected results if they are improperly specified. Regex is considered a developer-level skill.

The following matches on all numerical values from 0 to 99:

/^\d$|^\d\d$/

For more information on pattern-based matching, see Text Matching.

Pattern Groups

In the Transform Builder, transforms that require patterns are organized into pattern groups, so that you specify only the elements of a pattern that work together. Pattern groups apply to the following transforms:

Pattern groups are listed below. For more information on the underlying syntax for pattern groups, see Pattern Clause Position Matching.

On pattern

Matches one of the following:

  1. Exact pattern
  2. Pattern starting after and ending before sub-patterns
  3. Exact pattern that occurs starting after and ending before sub-patterns (all parameters)

Between two patterns

Matches any values that appear between two sub-patterns. One sub-pattern describes the beginning of the match, and the other describes the end of the match.

Each sub-pattern can either include or exclude the matching value:

Transform Builder optionSub-patternInclude/Exclude
Starting afterBeginningExcludes sub-pattern
Starting fromBeginningIncludes sub-pattern
Ending beforeEndingExcludes sub-pattern
Ending atEndingIncludes sub-pattern

Sequence of patterns

NOTE: This pattern group applies to the split transform only. Split can be applied to a single column. See Split Transform.

You can specify a sequence of patterns in the source column to identify delimiters. For example, the first delimiter might be the first digit in the value (`{digit}`), while the second delimiter might be a vertical pipe character. An individual pattern can be a string literal, Trifacta pattern, or regular expression, and the sequence can contain combinations of these pattern types.

Between two positions

Matches any values that appear between two indexed positions in the column, inclusive. Index values start at 1.

For example, if the value in the column is ABCDEF, then the following configuration matches on the value BCD:

Transform Builder optionValue
Starting from2
Ending at4

Every nth position

NOTE: This pattern group applies to the split transform only. Split can be applied to a single column. See Split Transform.

Splits the source column every N characters, regardless of content. This pattern works well for fixed-width columnar data.

Sequence of positions

NOTE: This pattern group applies to the split transform only. Split can be applied to a single column. See Split Transform.

Split the source column at a sequence of character points that you define. For example, if your sequence of positions is specified as 2,8,12, then the source column is split at the second, eighth, and twelfth characters into three separate columns.

Condition

A condition is an expression that yields a true or false value. A condition may include all of the elements of a formula. This value determines whether the transformation is applied to the evaluated row.

Step 3 - Grouping, Ordering, and Naming

A number of transforms support the following parameters.

Group parameter: For transforms that aggregate data, such as aggregate or window, you can specify the column by which you wish to group the computed aggregations. In the following example, all values in the Sales column are summed up for each value in the State column.

aggregate value:SUM(Sales) group:State

Assuming that there are entries in the State column for each state in the United States, the resulting transform step has 50 rows, each of which contains the total sales for the listed state.

Order parameter: Some transforms support the order parameter, which allows you to specify the column of values that are used to sort the output. In the following example, all aggregates Sales values are ordered by the contract date and grouped by State:

aggregate value:SUM(Sales) group:State order:contractDate

The output can always be ordered using the sort transform. See Sort Transform.

New Column Name parameter: For transforms that generate new columns, such as derive and extract, you can optionally specify the name of the new column, which saves adding a step to rename it. In the following example, the values of colA and colB are summed and written to the new column colC:

derive value:(colA + colB) as:'colC'

Step 3 - Specify other parameters

Depending on the transform, you may be presented with other required or optional parameters to specify. See Transforms.

Step 4 - Add to Recipe

When you have finished your transform step, review the preview in the data grid.

If the results look ok, click Add.

The step is added to your recipe and applied to the data grid.

Edit a transform

After you have added a step, you can modify it as needed. In the Recipe panel, select the Pencil icon next to the recipe step. The step is displayed for editing in the Transform Builder.

 

Your Rating: Results: PatheticBadOKGoodOutstanding! 3 rates

This page has no comments.