Date: Sun, 25 Jul 2021 10:25:55 +0000 (GMT) Message-ID: <1245325934.25828.1627208755702@6a789edf488b> Subject: Exported From Confluence MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_Part_25827_93701703.1627208755702" ------=_Part_25827_93701703.1627208755702 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Location: file:///C:/exported.html ROLLINGVAR Function

# ROLLINGVAR Function

Contents:

=20

=20
=20
=20

=20
=20

Computes the rolling variance of values forward or backward of the c= urrent row within the specified column.
• If an input value is missing or null, it is not factored in the computa= tion. For example, for the first row in the dataset, the rolling variance o= f previous values is undefined.
• The row from which to extract a value is determined by the order in = which the rows are organized based on the `order` parameter.&nbs= p;

• If you are working on a randomly generated sample of your datas= et, the values that you see for this function might not correspond to the v= alues that are generated on the full dataset during job execution.

• The function takes a column name and two optional integer pa= rameters that determine the window backward and forward of the current row.=
• The default integer parameter values are `-1` and=  `0`, which computes the rolling function from the current = row back to the first row of the dataset.
• This function works with the Window transform. See Window Transform.

Terms...
=20

Relevant terms:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
TermDescription
PopulationPopulation statistical functions are computed fr= om all possible values. See https://en.wikip= edia.org/wiki/Statistical_population.
Sample=20
=20

Sample-based statistical functions are computed from a subset or sample = of all values. See https://en.wikipedia.org/w= iki/Sampling_(statistics).

=20

These function names include `SAMP` in their name.

=20
=20

NOTE: Statistical sampling has no relationship to the = samples taken within the product. When statistical functions are computed d= uring job execution, they are applied across the entire dataset. Sample met= hod calculations are computed at that time.

=20
=20
=20

For more information on a non-rolling version of this function, see = ;VAR Function.

Wrangle vs. SQL: This function is part of Wrangle , a proprietary data transformation language.= Wrangle is not SQL. For more information, se= e Wrangle Language.

## Basic Usag= e

Column example:

=20

rollingvar(myCol)

Ou= tput: Returns the rolling variance of all values in the = `myCol` column from the fi= rst row of the dataset to the current one.

Rows before example:

=20

rollingvar(myNumber, 100)

Ou= tput: Returns the rolling variance of the current row and the 100 = previous row values in the ```myNumber<= /code> column.```

``` Rows before and after example:=20 rollingvar(myNumber, 3, 2) Output: Returns the rolling variance of the three previous row values, the curren= t row value, and the two rows after the current one in the myNum= ber column.      S= yntax and Arguments =20 rollingvar(col_ref, rowsBefore_integer, ro= wsAfter_integer) order: order_col [group: group_col] =20 =20 =20 =20 =20 Argument Required? Data Type Description col_ref Y string Name of column whose values are applied to the f= unction rowsBefore_integer N integer Number of rows before the current one to include= in the computation rowsAfter_integer N integer Number of rows after the current one to include = in the computation For more information on the order and = group parameters, see Window Transform. For more information on syntax standards, see Language Documentation Syntax Notes<= /a>. col_ref Name of the column whose values are used to compute the function. <= /p> Multiple columns and wildcards are not supported. Usage Notes: =20 =20 =20 =20 Required? Data Type Example Value Yes String (column reference to Integer or Decimal v= alues) myColumn ro= wsBefore_integer, rowsAfter_integer Integers representing the number of rows before or after the curre= nt one from which to compute the rolling function, including the current ro= w. For example, if the first value is 5, the current row = and the five rows before it are used in the computation. Negative valu= es for rowsAfter_integer compute the rolling functio= n from rows preceding the current one. ```
``` rowBefore=3D0 generates the current row value only. rowBefore=3D-1  uses all rows preceding the cur= rent one. ```
• `If rowsAfter is not specified, then the value 0` is applied.
• If a `group` parameter is applied, then these parameter= values should be no more than the maximum number of rows in the groups.

Usage Notes:

=20 =20 =20 =20
Required? Data Type Example Value
No Integer `4`

## Examples<= /span>

### Ex= ample - Rolling computations for racing splits

This example describes how to use the rolling computational functions:

Source:

In this example, the following data comes from times recorded at regular= intervals during a three-lap race around a track. The source data is in cu= mulative time in seconds (`time_sc`). You can use ROLLING and ot= her windowing functions to break down the data into more meaningful metrics= .

lap quarter time_sc
1 0 0.000
1 1 19.554
1 2 39.785
1 3 60.021
2 0 80.950
2 1 101.785
2 2 121.005
2 3 141.185
3 0 162.008
3 1 181.887
3 2 200.945
3 3 220.856

Transformation:

Primary key: Since the quarter information repeats= every lap, there is no unique identifier for each row. The following steps= create this identifier:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name `Change column data type` `lap,quarter` `String`
=20

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 = =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Single row formula` `MERGE(['l',lap,'q',quarter])` `'splitId'`
=20

Get split times: Use the following transform to break d= own the splits for each quarter of the race:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` `ROUND(time_sc - PREV(time_sc, 1), 3)` `splitId` `'split_time_sc'`
=20

Compute rolling computations: You can use the foll= owing types of computations to provide rolling metrics on the current and t= hree previous splits:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` `ROLLINGAVERAGE(split_time_sc, 3)`=20 `splitId` `'ravg'`
=20

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 = =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` `ROLLINGMAX(split_time_sc, 3)` `splitId` `'rmax'`
=20

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 = =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` `ROLLINGMIN(split_time_sc, 3)` `splitId` `'rmin'`
=20

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` `ROUND(ROLLINGSTDEV(split_time_sc, 3), 3)` `splitId` `'rstdev'`
=20

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` `ROUND(ROLLINGVAR(split_time_sc, 3), 3)` `splitId` `'rvar'`
=20

Compute rolling computations using sample method: These= metrics compute the rolling STDEV and VAR on the current and three previou= s splits using the sample method:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` ```ROUND(ROLLINGSTDEVSAMP(split_time_sc, 3), = 3)``` `splitId` `'rstdev_samp'`
=20

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` ```ROUND(ROLLINGVARSAMP(split_time_sc, 3), 3)= ``` `splitId` `'rvar_samp'`
=20

Results:

When the above transforms have been completed, the results look like the= following:

lap quarter splitId time_sc split_time_sc rvar_samp rstdev_samp rvar rstdev rmin rmax ravg
1 0 l1q0 0

1 1 l1q1 20.096 20.096

0 0 20.096 20.096 20.096
1 2 l1q2 40.53 20.434 0.229 0.479 0.029 0.169 20.096 20.434 20.265
1 3 l1q3 61.031 20.501 0.154 0.392 0.031 0.177 20.096 20.501 20.344
2 0 l2q0 81.087 20.056 0.315 0.561 0.039 0.198 20.056 20.501 20.272
2 1 l2q1 101.383 20.296 0.142 0.376 0.029 0.17 20.056 20.501 20.322
2 2 l2q2 122.092 20.709 0.617 0.786 0.059 0.242 20.056 20.709 20.39
2 3 l2q3 141.886 19.794 0.621 0.788 0.113 0.337 19.794 20.709 20.214
3 0 l3q0 162.581 20.695 0.579 0.761 0.139 0.373 19.794 20.709 20.373
3 1 l3q1 183.018 20.437 0.443 0.666 0.138 0.371 19.794 20.709 20.409
3 2 l3q2 203.493 20.475 0.537 0.733 0.113 0.336 19.794 20.695 20.35
3 3 l3q3 222.893 19.4 0.520 0.721 0.252 0.502 19.4 20.695 20.252

You can reduce the number of steps by applying a ```wind= ow``` transform such as the following:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name ```= Window``` `lap` `rollingaverage(split_time_sc, 0, 3)`= `rollingmax(split_time_sc, 0, 3)`=20 `rollingmin(split_time_sc, 0, 3)`=20 ```round(rollingstdev(split_time_sc, 0, 3), 3= )``` ```round(rollingvar(split_time_sc, 0, 3), 3)<= /code>``` ```round(rollingstdevsamp(split_time_sc, 0, 3= ), 3)``` ```round(rollingvarsamp(split_time_sc, 0, 3),= 3)``` `lap` `lap`
=20

However, you must rename all of the generated `windowX` = columns.

=20
=20

=20