Contents:
The cloudbased version of Trifacta Wrangler is now available! Read all about it, and register for your free account.
Contents:
Wrangle is the domainspecific language used to build transformation recipes in Trifacta® Wrangler.
A Wrangle recipe is a sequence of transforms, which are applied to your dataset in order to produce your results.
 A transform is a single action applied to your dataset. For most transforms, you can pass one or more parameters to define the context (columns, rows, or conditions) where the transform is applied to your dataset.
 Within some parameters of a transform, you can specify one or more functions. A function is a computational action performed on one or more columns of data in your dataset.
 These terms are described below.
 Recipes are built in the Transformer Page. See Transformer Page.
When you select suggestions in the Transformer Page, your selection is converted into a Wrangle command and added to your recipe.
Tip: Where possible, you should make selections in the data grid to build transform steps. These selections prompt a series of cards to be displayed at the bottom of the screen. You can select different cards to specify a basic transform for your selected data, choose a variant of that transform, and then modify the underlying Wrangle recipe as necessary. For more information, see Overview of Predictive Transformation.
For more information on the suggestion cards, see Suggestion Cards Panel.
Some complex transforms, such as joins and unions, must be created through dedicated screens. See Transformer Page .
group
parameter may result in nondeterministic ordering in the data grid. When jobs are executed at scale, any transform can potentially induce unwanted reordering of rows due to distributed processing.
To enforce row order, use the sort
transform. For more information, see Sort Transform.
Wrangle Syntax
Wrangle transform steps follow this general syntax:
(transform) param1:(expression) param2:(expression)
Transform Element  Description 

transform  A transform (or verb) is a single keyword that identifies the type of change you are applying to your dataset.
The other elements in each step are contextual parameters for the transform. Some transforms do not require parameters. 
parameter1: , parameter2:  Additional parameters may be optional or required for any transform. NOTE: A parameter is always followed by a colon. A parameter may appear only one time in a transform step. 
Common Parameters
Depending on the transform, one or more of value
, col
, and row
parameters may be used. For example, the set
transform can use all three or just value
and col
.
Transform Element  Description 

value:  When present, the An expression can contain combinations of the following:

col:  When present, the Some transforms may support multiple columns as a list, as a range of columns (e.g., 
row:  When present, the row parameter defines the expression to evaluate to determine the rows on which to perform the transform. If the row expression evaluates to true for a row, the transform is performed on the row. 
Parameter Inputs
The following types of parameter inputs may be referenced in a transform's parameters.
Other Trifacta data types can be referenced as column references. For literal values of these data types, you can insert them into your expressions as strings. Transforms cause the resulting values to be reinferred for their data type.
Column reference
A reference to the values stored in a column in your dataset. Columns can be referenced by the plaintext value for the column name.
Example: value
parameter references the myCol
column.
derive type:single value: myCol as:'myNewCol'
Functions
Some parameters accept functions as inputs. Where values or formulas are calculated, you can reference one of the dozens of functions available in Wrangle.
Example:
derive type:single value:MULTIPLY(3,2) as:'six'
Nested expressions
Individual parameters within a function can be computed expressions themselves. These nested expressions can be calculated using constants, other functions, and column references.
Example: Computes a column whose only value is ten divided by three, rounded to the nearest integer (3):
derive type:single value:ROUND(DIVIDE(10,3),0) as:'three'
Integer
A valid integer value within the accepted range of values for the Integer datatype. For more information, see Supported Data Types.
Example: Generates a column called, my13
which is the sum of the Integer values 5
and 8
:
derive type:single value: (5 + 8) as:'my13'
Decimal
A valid floating point value within the accepted range of values for the Decimal datatype. For more information, see Supported Data Types.
Example: Generates a column of values that computes the approximate circumference of the values in the diameter
column:
derive type:single value: (3.14159 * diameter) as: 'circumference'
Boolean
A true
or false
value.
Example: If the value in the order
column is more than 1,000,000, then the value in the bigOrder
column is true
.
derive type:single value:IF(order > 1000000, true, false) as:'bigOrder'
String
A string literal value is the baseline datatype. String literals must be enclosed in single quotes.
Example: Creates a column called, StringCol
containing the value myString
.
derive type:single value:'myString' as:'StringCol'
Trifacta pattern
Trifacta Wrangler supports a special syntax, which simplifies the generation of matching patterns for string values.
Patterns must be enclosed in accent marks ( `MyPattern`
). For more information, see Text Matching.
Example: Extracts up to 10 values from the MyData
column that match the basic pattern for social security numbers (XXXXXXXXX
):
extract col: MyData on:`%{3}%{2}%{4}` limit:10
Regular expression
Regular expressions are a common standard for defining matching patterns. Regex is a very powerful tool but can be easily misconfigured.
Regular expressions must be enclosed in slashes ( /MyPattern/
).
Example: Deletes all twodigit numbers from the qty
column:
replace col: qty on: /^\d$^\d\d$/ with: '' global: true
Datetime
A valid date or time value that matches the requirements of the Datetime datatype. See Supported Data Types.
Datetime values can be formatted with specific formatting strings. See DATEFORMAT Function.
Example: Generates a new column containing the values from the myDate
column reformatted in yyyymmdd
format:
derive type:single value:DATEFORMAT(myDate, 'yyyymmdd')
Array
A valid array of values matching the Array data type.
Example:
[0,1,2,3,4,5,6,7,8]
See Supported Data Types.
Example: Generates a column with the number of elements in the listed array (7
):
derive type:single value: ARRAYLEN('["red", "orange", "yellow", "green", "blue", "indigo", "violet"]')
Object
A valid set of values matching the Object data type.
Example:
{"brand":"Subaru","model":"Impreza","color","green"}
See Supported Data Types.
Example: Generates separate columns for each of the specified keys in the object ( brand
, model
, color
), containing the corresponding value for each row:
unnest col:myCol keys:'brand','model','color'
Interactions between Wrangle and the Application
 As you build Wrangle steps in the Transform Builder, your syntax is validated for you. You cannot add steps containing invalid syntax.
 Error messages are reported back to the application, so you can make immediate modifications to correct the issue.
 Typeahead support can provide guidance to the supported transforms, functions, and column references.
 For more information, see Transform Builder.
 When you have entered a valid transform step, the results are previewed for you in the data grid.
This preview is generated by applying the transform to the sample in the data grid.
NOTE: The generated output applies only to the values displayed in the data grid. The function is applied across the entire dataset only during job execution.
 If the previewed transform is invalid, the data grid is grayed out.
 For more information, see Transform Preview.
 When you add the transform to your recipe:
 It is applied to the sample in the application, and the data grid is updated to the current state.
 Column histograms are updated with new values and counts.
 Column data types may be reinferred for affected columns.
 Making changes:
 You can edit any transform step in your recipe whenever needed.
 When you edit a transform step in your recipe, the context of the data grid is changed to display the state of your data up to the point of previewing the step you're editing.
 All subsequent steps are still part of the recipe, but they are not applied to the sample yet.
 You can insert recipe steps between existing steps.
 When you delete a recipe step, the state remains at the point where the step was removed.
 You can insert a new step if needed.
 When you complete your edit, select the final step of the recipe, which displays the results of all of your transform steps in the data grid. Your changes may cause some recipe steps to become invalid.
 See Recipe Panel.
 You can edit any transform step in your recipe whenever needed.
Transforms
A transform, or verb, is an action applied to rows or columns of your data. Transforms are the essential set of changes that you can apply to your dataset. For more information, see Transforms.
Function Categories
A function is an action that is applied to a set of values as part of a transform step. Functions can apply to the values in a transform for specific data types, such as strings, or to types of transforms, such as aggregate and window function categories. A function cannot be applied to data without a transform.
Function Category  Description 

These functions are used to perform aggregation calculations on your data, such as sum, mean, and standard deviation.  
Comparison Functions  Comparison functions enable evaluation between two data elements, which are typically nested (Object or Array) elements. 
Math Functions  Perform computations on your data using a variety of math functions and numeric operators. 
Date Functions  Use these functions to extract data from or perform operations on objects of Datetime data type. 
String Functions  Manipulate strings, including finding substrings within a string. 
Nested Functions  These functions are designed specifically to assist in wrangling nested data, such as Objects, Arrays, or JSON elements. 
Type Functions  Use the Type functions to identify valid, missing, mismatched, and null values. 
Window Functions  The Window functions enable you to perform calculations on relative windows of data within your dataset. 
Other Functions  Miscellaneous functions that do not fit into the other categories 
Operator Categories
An operator is a single character that represents an arithmetic function. For example, the Plus sign (+
) represents the add function.
Operator Category  Description 

Logical Operators  and, or, and not operators 
Numeric Operators  Add, subtract, multiply, and divide 
Comparison Operators  Compare two values with greater than, equals, not equals, and less than operators 
Ternary Operators  Use ternary operators to create if/then/else logic in your transforms. 
Documentation
Documentation for Wrangle is also available through Trifacta Wrangler. Select Help menu > Product Docs.
Tip: When searching for examples of transforms and functions, try using the following forms for your search terms within the Product Docs site:
 Transforms:
wrangle_transform_NameOfTransform
 Functions:
wrangle_function_NameOfFunction
All Topics
Topics:
 Transforms
 Case Transform
 Comment Transform
 Countpattern Transform
 Deduplicate Transform
 Delete Transform
 Derive Transform
 Drop Transform
 Extract Transform
 Extractkv Transform
 Extractlist Transform
 Filter Transform
 Flatten Transform
 Header Transform
 Keep Transform
 Merge Transform
 Move Transform
 Nest Transform
 Pivot Transform
 Rename Transform
 Replace Transform
 Set Transform
 Settype Transform
 Sort Transform
 Split Transform
 Splitrows Transform
 Unnest Transform
 Unpivot Transform
 Valuestocols Transform
 Window Transform
 Aggregate Functions
 ANY Function
 ANYIF Function
 AVERAGE Function
 AVERAGEIF Function
 COUNTA Function
 COUNTAIF Function
 COUNTDISTINCT Function
 COUNTDISTINCTIF Function
 COUNT Function
 COUNTIF Function
 KTHLARGEST Function
 KTHLARGESTIF Function
 KTHLARGESTUNIQUE Function
 LIST Function
 LISTIF Function
 MAX Function
 MAXIF Function
 MIN Function
 MINIF Function
 MODE Function
 MODEIF Function
 STDEV Function
 STDEVIF Function
 SUM Function
 SUMIF Function
 VAR Function
 VARIF Function
 LISTUNIQUE Function
 Logical Functions
 Comparison Functions
 Math Functions
 Numeric Operators
 ADD Function
 SUBTRACT Function
 MULTIPLY Function
 DIVIDE Function
 MOD Function
 NEGATE Function
 LCM Function
 NUMFORMAT Function
 ABS Function
 EXP Function
 LOG Function
 POW Function
 CEILING Function
 LN Function
 SQRT Function
 FLOOR Function
 ROUND Function
 TRUNC Function
 RADIANS Function
 DEGREES Function
 SIGN Function
 Date Functions
 String Functions
 CHAR Function
 UNICODE Function
 UPPER Function
 LOWER Function
 PROPER Function
 TRIM Function
 REMOVEWHITESPACE Function
 REMOVESYMBOLS Function
 LEN Function
 FIND Function
 RIGHTFIND Function
 SUBSTRING Function
 SUBSTITUTE Function
 LEFT Function
 RIGHT Function
 MERGE Function
 STARTSWITH Function
 ENDSWITH Function
 REPEAT Function
 EXACT Function
 STRINGGREATERTHAN Function
 STRINGGREATERTHANEQUAL Function
 STRINGLESSTHAN Function
 STRINGLESSTHANEQUAL Function
 PAD Function
 DOUBLEMETAPHONE Function
 DOUBLEMETAPHONEEQUALS Function
 Nested Functions
 Type Functions
 Window Functions
 PREV Function
 NEXT Function
 FILL Function
 ROLLINGAVERAGE Function
 ROLLINGMAX Function
 ROLLINGMIN Function
 ROLLINGSUM Function
 ROLLINGSTDEV Function
 ROLLINGVAR Function
 ROWNUMBER Function
 SESSION Function
 ROLLINGMODE Function
 ROLLINGCOUNTA Function
 ROLLINGLIST Function
 ROLLINGKTHLARGEST Function
 ROLLINGKTHLARGESTUNIQUE Function
 Other Functions
 Other Language Topics
 Language Index
This page has no comments.