Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Published by Scroll Versions from space DEV and version next

...

Example:  value parameter references the myCol column.

d-

...

trans
RawWrangletrue
p03Value'myNewCol'
Typestep
WrangleTextderive type:single value: myCol as:'myNewCol'
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02ValuemyCol
p03NameNew column name
SearchTermNew formula

Column names with spaces or special characters in a transformation must be wrapped by curly braces. 

Example: Below, srcColumn is renamed to src Column, which requires no braces because the new name is captured as a string literal:

d-

...

trans
RawWrangletrue
p03Valuesrc Column
Typestep
WrangleTextrename type: manual mapping: [srcColumn, 'src Column']

After the column has been renamed with a space, it must be referenced in curly braces to be renamed back to its original name:

...

p01NameOption
p01ValueManual rename
p02NameColumn
p02ValuesrcColumn
p03NameNew column name
SearchTermRename columns

Info

NOTE: Current column names that have a space in them must be bracketed in curly braces. The above column name reference is the following: {src Column}.


Functions

Some parameters accept functions as inputs. Where values or formulas are calculated, you can reference one of the dozens of functions available in

D s lang
.

Example:


d-

...

trans
RawWrangletrue
p03Value'six'
Typestep
WrangleTextderive type:single value:MULTIPLY(3,2) as:'six'
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02ValueMULTIPLY(3,2)
p03NameNew column name
SearchTermNew formula

Metadata variables

D s lang
 supports the use of variable references to aspects of the source data or dataset. In the following example, the ABS function is applied to each column in a set of them using the $col reference. 

d-

...

trans
RawWrangletrue
Typestep
WrangleTextset col: val1,val2 value: ABS($col)
p01NameColumns
p01Valueval1,val2
p02NameFormula
p02ValueABS($col)
SearchTermEdit column with formula

$col returns the value of the current row. For more information on these variables, see Source Metadata References.

...

Example: Computes a column whose only value is ten divided by three, rounded to the nearest integer (3):

d-

...

trans
RawWrangletrue
p03Value'three'
Typestep
WrangleTextderive type:single value:ROUND(DIVIDE(10,3),0) as:'three'
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02ValueROUND(DIVIDE(10,3),0)
p03NameNew column name
SearchTermNew formula

Integer

A valid integer value within the accepted range of values for the Integer datatype. For more information, see Supported Data Types.

Example: Generates a column called, my13 which is the sum of the Integer values 5 and 8:

d-

...

trans
RawWrangletrue
p03Value'my13'
Typestep
WrangleTextderive type:single value: (5 + 8) as:'my13'
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02Value(5 + 8)
p03NameNew column name
SearchTermNew formula

Decimal

A valid floating point value within the accepted range of values for the Decimal datatype. For more information, see Supported Data Types.

Example: Generates a column of values that computes the approximate circumference of the values in the diameter column:

d-

...

trans
RawWrangletrue
p03Value'circumference'
Typestep
WrangleTextderive type:single value: (3.14159 * diameter) as: 'circumference'
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02Value(3.14159 * diameter)
p03NameNew column name
SearchTermNew formula

Boolean

A true or false value.

Example: If the value in the order column is more than 1,000,000, then the value in the bigOrder column is true.

d-

...

trans
RawWrangletrue
p03Value'bigOrder'
Typestep
WrangleTextderive type:single value:IF(order > 1000000, true, false) as:'bigOrder'
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02ValueIF(order > 1000000, true, false)
p03NameNew column name
SearchTermNew formula

String

A string literal value is the baseline datatype. String literals must be enclosed in single quotes.

Example: Creates a column called, StringCol containing the value myString.

d-

...

trans
RawWrangletrue
p03Value'StringCol'
Typestep
WrangleTextderive type:single value:'myString' as:'StringCol'
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02Value'myString'
p03NameNew column name
SearchTermNew formula

D s item
pattern
pattern

D s product
supports a special syntax, which simplifies the generation of matching patterns for string values.

...

Example: Extracts up to 10 values from the MyData column that match the basic pattern for social security numbers (XXX-XX-XXXX):

d-

...

trans
RawWrangletrue
p03Value`%{3}-%{2}-%{4}`
Typestep
WrangleTextextract col: MyData on:

...

`%{3}-%{2}-%{4}` limit:10
p01NameColumn to extract from
p01ValueMyData
p02NameOption
p02ValueCustom text or pattern
p03NameText to extract
p04Value10
p04NameNumber of matches to extract
SearchTermExtract text or pattern

Regular expression

Regular expressions are a common standard for defining matching patterns. Regex is a very powerful tool but can be easily misconfigured.

...

Example: Deletes all two-digit numbers from the qty column:

d-

...

trans
RawWrangletrue
p03Value''
Typestep
WrangleTextreplace col: qty on: /^\d$|^\d\d$/ with: '' global: true
p01NameColumn
p01Valueqty
p02NameFind
p02Value/^\d$|^\d\d$/
p03NameReplace with
p04Valuetrue
p04NameMatch all occurrences
SearchTermReplace text or pattern

Datetime

A valid date or time value that matches the requirements of the Datetime datatype. See Supported Data Types.

...

Example: Generates a new column containing the values from the myDate column reformatted in yyyymmdd format:

d-

...

trans
RawWrangletrue
Typestep
WrangleTextderive type:single value:DATEFORMAT(myDate, 'yyyymmdd')
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02ValueDATEFORMAT(myDate, 'yyyymmdd')
SearchTermNew formula

Array

A valid array of values matching the Array data type.

...

Example: Generates a column with the number of elements in the listed array (7):

d-

...

trans
RawWrangletrue
Typestep
WrangleTextderive type:single value: ARRAYLEN('["red", "orange", "yellow", "green", "blue", "indigo", "violet"]')
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02ValueARRAYLEN('["red", "orange", "yellow", "green", "blue", "indigo", "violet"]')
SearchTermNew formula

Object

A valid set of values matching the Object data type.

...

Example: Generates separate columns for each of the specified keys in the object ( brand, model, color), containing the corresponding value for each row:

d-

...

trans
RawWrangletrue
Typestep
WrangleTextunnest col:myCol keys:'brand','model','color'
p01NameColumn
p01ValuemyCol
p02NamePaths to elements
p02Value'brand','model','color'
SearchTermUnnest Objects into columns

Interactions between
D s lang
 and the Application

...

transform, or verb, is an action applied to rows or columns of your data. Transforms are the essential set of changes that you can apply to your dataset.

D s transforms

For more information, see Transforms.

...