Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Published by Scroll Versions from space DEV and version r0682

D toc

The following changes have been applied to 

D s lang
 in this release of
D s product
rtrue
.

Release 6.8

New Functions

This release introduces the sampling method of calculating statistical functions. The following are now available:

Function NameDescription
STDEVSAMP Function

D excerpt include
pageSTDEVSAMP Function
nopaneltrue

VARSAMP Function

D excerpt include
pageVARSAMP Function
nopaneltrue

STDEVSAMPIF Function

D excerpt include
pageSTDEVSAMPIF Function
nopaneltrue

VARSAMPIF Function

D excerpt include
pageVARSAMPIF Function
nopaneltrue

ROLLINGSTDEVSAMP Function

D excerpt include
pageROLLINGSTDEVSAMP Function
nopaneltrue

ROLLINGVARSAMP Function

D excerpt include
pageROLLINGVARSAMP Function
nopaneltrue


Release 6.6

New Functions

Function NameDescription
SINH Function

D excerpt include
pageSINH Function
nopaneltrue

COSH Function

D excerpt include
pageCOSH Function
nopaneltrue

TANH Function

D excerpt include
pageTANH Function
nopaneltrue

ASINH Function

D excerpt include
pageASINH Function
nopaneltrue

ACOSH Function

D excerpt include
pageACOSH Function
nopaneltrue

ATANH Function

D excerpt include
pageATANH Function
nopaneltrue


Release 6.5

New Functions

Function NameDescription
SIN Function

D excerpt include
pageSIN Function
nopaneltrue

COS Function

D excerpt include
pageCOS Function
nopaneltrue

TAN Function

D excerpt include
pageTAN Function
nopaneltrue

Cotangent FunctionSee TAN Function.
Secant FunctionSee COS Function.
Cosecant FunctionSee SIN Function.
ASIN Function

D excerpt include
pageASIN Function
nopaneltrue

ACOS Function

D excerpt include
pageACOS Function
nopaneltrue

ATAN Function

D excerpt include
pageATAN Function
nopaneltrue

Arccotangent FunctionSee ATAN Function.
Arcsecant FunctionSee ACOS Function.
Arccosecant FunctionSee ASIN Function.


Release 6.4

Improvements to metadata references

Broader support for metadata references: For Excel files, $filepath references now return the location of the source Excel file. Sheet names are appended to the end of the reference. See Source Metadata References.

Release 6.3

New Functions

Function NameDescription
PARSEDATE Function

D excerpt include
pagePARSEDATE Function
nopaneltrue

Optional input formats for DateFormat task

The DateFormat task now supports a new parameter: Input Formats. This parameter specifies the date format to use when attempting to parse the input column.

  • If the parameter is specified, then the value of the parameter is used to parse the inputs. 
  • (default) if the parameter is not specified, then the following common formats are used for parsing the input:

    Code Block
    'M/d/yy',
    'MM/dd/yy',
    'MM-dd-yy',
    'M-d-yy',
    'MMM d, yyyy',
    'MMMM d, yyyy',
    'EEEE, MMMM d, yyyy',
    'MMM d yyyy',
    'MMMM d yyyy',
    'MM-dd-yyyy',
    'M-d-yyyy',
    'yyyy-MM-ddXXX',
    'dd/MM/yyyy',
    'd/M/yyyy',
    'MM/dd/yyyy',
    'M/d/yyyy',
    'yyyy/M/d',
    'M/d/yy h:mm a',
    'MM/dd/yy h:mm a',
    'MM-dd-yy h:mm a',
    'MMM dd yyyy HH.MM.SS xxx',
    'M-d-yy h:mm a',
    'MMM d, yyyy h:mm:ss a',
    'EEEE, MMMM d, yyyy h:mm:ss a X',
    'EEE MMM dd HH:mm:ss X yyyy',
    'EEE, d MMM yyyy HH:mm:ss X',
    'd MMM yyyy HH:mm:ss X',
    'MM-dd-yyyy h:mm:ss a',
    'M-d-yyyy h:mm:ss a',
    'yyyy-MM-dd h:mm:ss a',
    'yyyy-M-d h:mm:ss a',
    'yyyy-MM-dd HH:mm:ss.S',
    'dd/MM/yyyy h:mm:ss a',
    'd/M/yyyy h:mm:ss a',
    'MM/dd/yyyy h:mm:ss a',
    'M/d/yyyy h:mm:ss a',
    'MM/dd/yy h:mm:ss a',
    'MM/dd/yy H:mm:ss',
    'M/d/yy H:mm:ss',
    'dd/MM/yyyy h:mm a',
    'd/M/yyyy h:mm a',
    'MM/dd/yyyy h:mm a',
    'M/d/yyyy h:mm a',
    'MM-dd-yy h:mm:ss a',
    'M-d-yy h:mm:ss a',
    'MM-dd-yyyy h:mm a',
    'M-d-yyyy h:mm a',
    'yyyy-MM-dd h:mm a',
    'yyyy-M-d h:mm a',
    'MMM.dd.yyyy',
    'd/MMM/yyyy H:mm:ss X',
    'dd/MMM/yy h:mm a',

    These formats are a subset of the date formatting strings supported by the product. For more information, see Datetime Data Type.

Release 6.2

New Functions

Function NameDescription
RANK Function

D excerpt include
pageRANK Function
nopaneltrue

DENSERANK Function

D excerpt include
pageDENSERANK Function
nopaneltrue

ARRAYELEMENTAT function accepts new inputs

In previous releases, the ARRAYELEMENTAT function accepted a second input parameter to specify the index value of the element to retrieve. This "at" parameter had to be an Integer literal.

Beginning in this release, the function also accepts for this second "at" parameter:

  • Names of columns containing Integer values
  • Functions that return Integer values

For more information, see ARRAYELEMENTAT Function.

Release 6.1

None.

Release 6.0

New Functions

Function NameDescription
ARRAYINDEXOF Function

D excerpt include
pageARRAYINDEXOF Function
nopaneltrue

ARRAYRIGHTINDEXOF Function

D excerpt include
pageARRAYRIGHTINDEXOF Function
nopaneltrue

ARRAYSLICE Function

D excerpt include
pageARRAYSLICE Function
nopaneltrue

ARRAYMERGEELEMENTS Function

D excerpt include
pageARRAYMERGEELEMENTS Function
nopaneltrue

Changes to LIST* inputs

The following LIST-based functions have been changed to narrow the accepted input data types. In previous releases, any data type was accepted for input, which was not valid for most data types. 

In Release 6.0 and later, these functions accept only Array inputs. Inputs can be Array literals, a column of Arrays, or a function returning Arrays.

Info

NOTE: You should references to these functions in your recipes.

Renamed functions

The following functions have been renamed in Release 6.0.

Release 5.9 and earlierRelease 6.0 and later
LISTUNIQUE FunctionUNIQUE Function

FILL Function has new before and after parameters

Prior to Release 6.0, the FILL function replaced empty cells with the most recent non-empty value. 

In Release 6.0, before and after function parameters have been added. These parameters define the window of rows before and after the row being tested to search for non-empty values. Within this window, the most recent non-empty value is used. 

The default values for these parameters are -1 and 0 respectively, which performs a search of an unlimited number of preceding rows for a non-empty value.

Info

NOTE: Upon upgrade, the FILL function retains its preceding behavior, as the default values for the new parameters perform the same unlimited row search for non-empty values.

For more information, see FILL Function.

Release 5.9

New functions

The following functions can now be applied directly to arrays to derive meaningful statistics about them. 

FunctionDescription
ARRAYSORT Function

D excerpt include
pageARRAYSORT Function
nopaneltrue

TRANSLITERATE Function

D excerpt include
pageTRANSLITERATE Function
nopaneltrue

Release 5.8

File lineage information using source metadata references

Beginning in Release 5.8, you can insert the following references into the formulas of your transformations. These source metadata references enable you to continue to track file lineage information from within your datasets as part of your wrangling project.

Info

NOTE: These references apply only to file-based sources. Some additional limitations may apply.

referenceDescription
$filepathReturns the full path and filename of the source of the dataset.
$sourcerownumber

Returns the row number for the current row from the original source of the dataset.

Info

NOTE: This reference is equivalent to the SOURCEROWNUMBER function, which is likely to be deprecated in a future release. You should begin using this reference in your recipes.

 For more information, see Source Metadata References.

New math and statistical functions for arrays

The following functions can now be applied directly to arrays to derive meaningful statistics about them. 

FunctionDescription
LISTSUM Function

D excerpt include
pageLISTSUM Function
nopaneltrue

LISTMAX Function

D excerpt include
pageLISTMAX Function
nopaneltrue

LISTMIN Function

D excerpt include
pageLISTMIN Function
nopaneltrue

LISTAVERAGE Function

D excerpt include
pageLISTAVERAGE Function
nopaneltrue

LISTVAR Function

D excerpt include
pageLISTVAR Function
nopaneltrue

LISTSTDEV Function

D excerpt include
pageLISTSTDEV Function
nopaneltrue

LISTMODE Function

D excerpt include
pageLISTMODE Function
nopaneltrue

 

Release 5.7

WEEKNUM function now behaves consistently across running environments

In Release 5.6 and earlier, the WEEKNUM function treated the first week of the year differently between the 

D s photon
 and Spark running environments:

  • D s photon
     week 1 of the year:
     The week that contains January 1.
  • Spark week 1 of the year: The week that contains at least four days in the specified year.

This issue was caused by Spark following an ISO-8601 standard and relying on the joda datetimeformatter. 

Beginning in Release 5.7, the WEEKNUM function behaves consistently for both 

D s photon
 and Spark:

  • Week 1 of the year: The week that contains January 1.

For more information, see WEEKNUM Function.

Release 5.6

URLPARAMS function returns null values

In Release 5.1 and earlier, the URLPARAMS function returned empty Objects when no answer was computed for the function. 

In Release 5.6 and later, this function returns null values in the above case.

See URLPARAMS Function.

Release 5.1

D s lang
 now supports nested expressions

Beginning in Release 5.1, all 

D s lang
 functions now supported nested expressions, which can be arithmetic calculations, column references, or other function calls.

Info

NOTE: This feature is enabled by default, as this change does not break any steps created in previous versions of the product. It can be disabled if needed. See Miscellaneous Configuration.

Info

NOTE: This capability represents a powerful enhancement to the language, as you can now use dynamic inputs for all functions.

The following expression is a valid transform in 

D s lang
. It locates the substring in myString that begins with the @ sign until the end of the string, inclusive:

D code

derive value: substring(myString, find(myString, '@', true, 0), length(myString)

Nested arithmetic expressions:

Suppose you wanted just the value after the @ sign until the end of the string. Prior to Release 5.1, the following generated a validation error:

D code

derive value: substring(myString, find(myString, '@', true, 0) + 1, length(myString)

In the above, the addition of +1 to the second parameter is a nested expression and was not supported. Instead, you had to use multiple steps to generate the string value. 

Beginning in Release 5.1, the above single-step transform is supported.

Nested column references:

In addition to arithmetic expressions, you can nested column references. In the following example, the previous step has been modified to replace the static +1 with a reference to a column containing the appropriate value (at_sign_offset) : 

D code

derive value: substring(myString, find(myString, '@', true, 0) + at_sign_offset, length(myString)

Nested function references:

Now, you can combine multiple function references into a single computation. The following computes the total volume of a cube of length side and then multiplies that volume by the number of cubes (cube_count) to compute the total cube_volume

D code

derive type: single value: MULTIPLY(POW(cube_side,3),cube_count) as: 'cube_volume'

For more information, see Wrangle Language.

SOURCEROWNUMBER function generates null values consistently

The SOURCEROWNUMBER function returns the row number of the row as it appears in the original dataset. After some operations, such as unions, joins, and aggregations, this row information is no longer available.

In Release 5.0.1 and earlier, the results were confusing. When source row information was not available, the function was simply not available for use.

In Release 5.1 and later, the behavior of the SOURCEROWNUMBER function is more consistent:

  • If the source row information is available, it is returned.
  • If it is not available:
    • The function can still be used.
    • The function returns null values in all cases.

For more information, see SOURCEROWNUMBER Function.

New Functions

Function NameDescription
ARRAYELEMENTAT FunctionReturns element value of input array for the provided index value.
DOUBLEMETAPHONE FunctionReturns primary and secondary phonetic spellings of an input string using the Double Metaphone algorithm.
DOUBLEMETAPHONEEQUALS FunctionReturns true if two strings match phonetic spellings using Double Metaphone algorithm. Tolerance threshold can be adjusted.
UNIQUE FunctionGenerates a new column containing an array of the unique values from a source column.

Release 5.0.1

RAND function generates true random numbers

In Release 5.0 and earlier, the RAND function produced the same set of random numbers within the browser, after browser refresh, and over subsequent runs of a job. 

  • During job execution, a default seed value was inserted as the basis for the function during the execution of the job. 
  • In some cases, this behavior is desired. 

In Release 5.0.1 and later, the RAND function accepts an optional integer as a parameter. When this new seed value is inserted, the function generates deterministic, pseudo-random values.

  • This version matches the behavior of the old function.

    Info

    NOTE: On all upgraded instances of the platform, references to the RAND function have been converted to use a default seed value, so that previous behavior is maintained in the upgraded version.

  • If no seed value is inserted as a parameter, the RAND function generates true random values within the browser, after browser refresh, and over subsequent job runs.

    Info

    NOTE: Be aware that modifying your dataset based on the generated values of RAND() may have unpredictable effects later in your recipe and downstream of it.

For more information, see RAND Function.

Release 5.0

Required type parameter

Prior to Release 5.0, the following was a valid 

D s lang
 step:

D code

derive value:colA + colB as:'colC'

Beginning in Release 5.0, the type parameter is required. This parameter defines whether the transform is a single or multi-row formula. In the Transform Builder, this value must be specified.

The following is valid in Release 5.0:

D code

derive type:single value:colA + colB as:'colC'

See Derive Transform.

See Transform Builder.

Deprecated aggregate transform

In Release 4.2.1 and earlier, the aggregate transform could be used to aggregate your datasets using aggregation functions and groupings. 

In Release 5.0 and later, this transform has been merged into the pivot transform. The aggregate transform has been deprecated and is no longer available. 

Info

NOTE: During upgrade to Release 5.0 and later, recipes that had previously used the aggregate transform are automatically migrated to use the pivot equivalent.

Example 1

Release 4.2.1 and earlier Aggregate:

D code

aggregate value:AVERAGE(Scores)

Release 5.0 and later Pivot:

D code

pivot value: AVERAGE(Score) limit: 1

The limit parameter defines the maximum number of columns that can be generated by the pivot.

Example 2

Aggregate:

D code

aggregate value:AVERAGE(Scores) group:studentId

Pivot:

D code

pivot group: StudentId value: AVERAGE(Score) limit: 1

For more information, see Pivot Transform.

New search terms

In the new Search panel, you can search for terms that can be used to select transformations for quick population of parameters. In the following table, you can see 

D s lang
 how terminology has changed in Release 5.0 for some common transforms from earlier release. 

Tip

Tip: You can paste the Release 5.0 terms in the Search panel to locate the same transformations used in earlier releases.

Release 4.2.1 and earlier transformsRelease 5.0 and later search terms

aggregate

pivot
keepfilter
deletefilter
extract on:extractpatterns
extract at:extractpositions
extract before:extractbetweendelimiters
extract after:extractbetweendelimiters
replace on:replacepatterns
replace at:replacepositions
replace before:replacebetweenpatterns
replace after:replacebetweenpatterns
replace from:replacebetweenpatterns
replace to:replacebetweenpatterns
split on:splitpatterns
split delimiters:splitpositions
split every:splitpositions
split positions:splitpositions
split after:splitpatterns
split before:splitpatterns
split from:splitpatterns
split to:splitpatterns

Support for <> operator

Prior to Release 5.0, the following operator was used to test "not equal" comparisons:

Code Block
!=

Beginning in Release 5.0, the following operators is also supported:

Code Block
<>

Example:

D code

derive value:IF ((col1 <> col2), 'different','equal') as:'testNotEqual'

Tip

Tip: Both of the above operators are supported, although the <> operator is preferred.

For more information, see Comparison Operators.

ROUND function takes optional number of digits

The ROUND function now supports rounding to a specified number of digits. By default, values are rounded to the nearest integer, as before. See ROUND Function.

New Functions

Function NameDescription
DEGREES FunctionGenerates the value in degrees for an input radians value.
EXACT FunctionCompares two strings to see if they are exact matches.
FILTEROBJECT FunctionFilters the keys and values from an Object based on specified keys.
HOST FunctionReturns the host value from a URL.
ISEVEN FunctionReturns true if an Integer, function returning an Integer, or a column contains an even value.
ISODD FunctionReturns true if an Integer, function returning an Integer, or a column contains an odd value.
KTHLARGESTUNIQUE FunctionComputes the kth-ranked unique value in a set of values.
LCM FunctionReturns the least common multiple between two input values.
MODE FunctionComputes the mode (most common) value for a set of values.
MODEIF FunctionComputes the mode based on a conditional test.
PAD FunctionPads the left or right side of a value with a specified character string.
PI FunctionGenerates the value for pi to 15 decimal places.
RADIANS FunctionGenerates the value in radians for an input degrees value.
RANDBETWEEN FunctionGenerates a random Integer in a range between two specified values.
RIGHTFIND Function Locates a substring by searching from the right side of an input value.
ROLLINGCOUNTA FunctionComputes count of non-null values across a rolling window within a column.
ROLLINGKTHLARGEST FunctionComputes the kth largest value across a rolling window within a column.
ROLLINGKTHLARGESTUNIQUE FunctionComputes the kth largest unique value across a rolling window within a column.
ROLLINGLIST FunctionComputes list of all values across a rolling window within a column.
ROLLINGMAX FunctionComputes maximum value across a rolling window within a column.
ROLLINGMIN FunctionComputes minimum value across a rolling window within a column.
ROLLINGMODE FunctionComputes mode (most common) value across a rolling window within a column.
ROLLINGSTDEV FunctionComputes standard deviation across a rolling window within a column.
ROLLINGVAR FunctionComputes variance across a rolling window within a column.
SIGN FunctionComputes the positive or negative sign of an input value.
TRUNC FunctionTruncates a value to the nearest integer or a specified number of digits.
URLPARAMS FunctionExtracts any query parameters from a URL into an Object.
WEEKNUM FunctionCalculates the week that the date appears during the year (1-52).

Release 4.2.1

None.

Release 4.2

New Filter transform

Perform a variety of predefined row filtrations using the new filter transform, or apply your own custom formula to keep or delete rows from your dataset. 

New Case transform

Beginning in Release 4.2, you can use the Transform Builder to simplify the construction of CASE statements. For each case, specify the conditional and resulting expression in separate textboxes.

Rename transform now supports multi-column rename

Use the rename transform to rename multiple columns in a single transform. 

Delete specified columns or delete the others

The drop transform now supports the option of deleting all columns except the ones specified in the transform. See Drop Transform.

New string comparison functions

Compare two strings using Latin collation settings. See below.

NOW function returns 24-hour time values

In Release 4.1.1 and earlier, the NOW function returned time values for the specified time zone in 12-hour time, which was confusing.

In Release 4.2 and later, this function returns values in 24-hour time. 

New Transforms

Transform NameDocumentation
caseCase Transform
filterFilter Transform

New Functions

Function NameDocumentation
STRINGGREATERTHAN

STRINGGREATERTHAN Function

STRINGGREATERTHANEQUAL

STRINGGREATERTHANEQUAL Function

STRINGLESSTHAN

STRINGLESSTHAN Function

STRINGLESSTHANEQUAL

STRINGLESSTHANEQUAL Function

SUBSTITUTESUBSTITUTE Function