Date: Mon, 25 Oct 2021 00:38:29 +0000 (GMT) Message-ID: <176317928.54930.1635122309584@9c5033e110b2> Subject: Exported From Confluence MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_Part_54929_1001608755.1635122309584" ------=_Part_54929_1001608755.1635122309584 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Location: file:///C:/exported.html ROLLINGKTHLARGESTUNIQUE Function

# ROLLINGKTHLARGESTUNIQUE Function

Contents:

=20

=20
=20
=20

=20
=20

Computes the rolling unique=20 kth largest value forward or backward of the current row.=20 Inputs can be Integer, Decimal, or Datetime.

For purposes of this calculation, two instances of the same value are tr= eated at one value for k. So, if your dataset contains four rows with colum= n values `10``9``9`, an= d `8`, then `KTHLARGESTUNIQUE` re= turns `9` for `k=3D2` and 8 for `k=3D3`.

`ROLLINGKTHLARGESTUNIQUE` computes the ```KTHLAR= GESTUNIQUE``` value across a defined window of values within a col= umn.

• If an input value is missing or null, it is not factored in the computa= tion. For example, for the first row in the dataset, the rolling calculatio= n of previous values is undefined.
• The row from which to extract a value is determined by the order in = which the rows are organized based on the `order` para= meter.

• If you are working on a randomly generated sample of your datas= et, the values that you see for this function might not correspond to the v= alues that are generated on the full dataset during job execution.

• Inputs:
• Required column name
• Required kth value, which is a positive integer
• Two optional integer parameters that determine the window backward= and forward of the current row. The default integer parameter values are&n= bsp;`-1` and `0`, which computes the rollin= g function from the current row back to the first row of the dataset.
• This function works with the following transforms:

For more information on a non-rolling version of this function, see = ;KTHLARGESTUNIQUE Funct= ion.

Wrangle vs. SQL: This function is part of Wrangle , a proprietary data transformation language.= Wrangle is not SQL. For more information, se= e Wrangle Language.

## Basic Usage

Column example:

=20

rollingkthlargestunique(myCol, 2)

Ou= tput: Returns the rolling second largest unique value in the&= nbsp;`myCol` column from t= he first row of the dataset to the current one.

Rows before example:

=20

rollingkthlargestunique(myNumber, 2, 3)

Ou= tput: Returns the rolling second largest unique value of the = current row and the two previous row values in the `myNumber` column.

Rows before and after example:

=20

rollingkthlargestunique(myNumber, 4, 3, 2)=

Ou= tput: Returns the rolling fourth largest unique value of the = two previous row values, the current row value, and the two rows after the = current one in the `myNumber column.     `

``` Syntax and Arguments =20 rollingkthlargestunique(col_ref, rowsBefor= e_integer, rowsAfter_integer) order: order_col [group: group_col] Argument Required? Data Type Description col_ref Y string Name of column whose values are applied to the f= unction k_integer Y integer (positive) The ranking of the unique value to= extract from the source column rowsBefore_integer N integer Number of rows before the current one to include= in the computation rowsAfter_integer N integer Number of rows after the current one to include = in the computation For more information on the order and = group parameters, see Window Transform. For more information on syntax standards, see Language Documentation Syntax Note= s. col_ref Name of the column whose values are used to compute the function. I= nputs must be Integer, Decimal, or Datetime values. NOTE: If the input is in Datetime type, the output is i= n unixtime format. You can wrap these outputs in the DATEFORMAT function to= output the results in the appropriate Datetime format. See DATEFORMAT Function. Multiple columns and wildcards are not supported. Usage Notes: Required? Data Type Example Value Yes String (column reference) myColumn k_integer Integer representing the ranking of the unique value to extract from the= source column. Duplicate values are treated as a single value for purposes= of this function's calculation. NOTE: The value for k must = be an integer between 1 and 1,000 inclusive. ```
``` k=3D1 represents the maximum value in the column.&nbs= p; If k is greater than or equal to the number of values in the column, th= e minimum value is returned. ```
• `Missing and null values are not factored into the ranking of k`.

Usage Notes:

Required?
Data Type
Example Value
Yes Integer (positive) `4 `

### rowsBefore_integer, rowsAfter_integer

Integers representing the number of rows before or after the current one= from which to compute the rolling function, including the current row. For= example, if the first value is `5`, the current row and th= e four rows after it are used in the computation. Negative values for `rowsAfter_integer`&nbs= p;compute the rolling function from rows preceding the current one.<= /p>

• `rowBefore=3D1` generates the current row value only.
• `rowBefore=3D-1`  uses all rows preceding the curr= ent one.
• If `rowsAfter` is not specified, then the value&nb= sp;`0` is applied.
• If a `group` parameter is applied, then these para= meter values should be no more than the maximum number of rows in the group= s.

Usage Notes:

Required? Data Type Example Value
No Integer `4`

## Examples

### Example - ROLLINGKTHLARGEST functions

This example describes how to use the following rolling computational fu= nctions:

• `ROLLINGKTHLARGEST` - computes the kth largest= value from a rolling window of rows before and after the current row.= Duplicate values are treated as having the same k values. Se= e ROLLINGKTHLARGE= ST Function.
• `ROLLINGKTHLARGESTUNIQUE` - computes the unique kth largest value from a rolling window of rows before and= after the current row. Duplicate values are treated as having different&nb= sp;k values. See ROLLINGKTHLARGESTUNIQUE Function.

The following dataset contains daily counts of server restarts across th= ree servers over the preceding week. High server restart counts can indicat= e poor server health. In this example, you are interested in knowing for ea= ch server the rolling highest and second highest count of restarts per serv= er over the previous week.

Source:

Date Server Restarts
2/21/18 s01 4
2/21/18 s02 0
2/21/18 s03 0
2/22/18 s01 4
2/22/18 s02 1
2/22/18 s03 2
2/23/18 s01 2
2/23/18 s02 3
2/23/18 s03 4
2/24/18 s01 1
2/24/18 s02 0
2/24/18 s03 2
2/25/18 s01 5
2/25/18 s02 0
2/25/18 s03 4
2/26/18 s01 1
2/26/18 s02 2
2/26/18 s03 1
2/27/18 s01 1
2/27/18 s02 2
2/27/18 s03 2

Transformation:

First, you want to maintain the row information as a separate column. Si= nce data is ordered already by the `Date` column, you can use th= e following:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Single row formula` `ROWNUMBER()` `'entryId'`
=20

Use the following function to compute the rolling kth largest v= alue of server restarts per server over the previous week. In this case, yo= u can use the `ROLLINGKTHLARGEST` function, setting k=3D1. Uniqueness doesn't matter for calculating the highest value:<= /p>

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` `rollingkthlargest(Restarts, 1, 6, 0)` `Server` `Server` `'rollingkthlargest_1'`
=20

Use the following function to compute the rolling second highest value. = In this case, you can use `ROLLINGKTHLARGESTUNIQUE`: <= /p>

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula `Multiple row formula` ```rollingkthlargestunique(Restarts, 2, 6, 0)= ``` `Server` `Server` `'rollingKthLargestUnique_2'`
=20

Results:

entryId Date Server Restarts rollingKthLargestUnique_2 rollingkthlargest_Restarts
3 2/21/18 s02 0 0 0
6 2/22/18 s02 1 0 1
9 2/23/18 s02 3 1 3
12 2/24/18 s02 0 1 3
15 2/25/18 s02 0 1 3
18 2/26/18 s02 2 2 3
21 2/27/18 s02 2 2 3
4 2/21/18 s03 0 0 0
7 2/22/18 s03 2 0 2
10 2/23/18 s03 4 2 4
13 2/24/18 s03 2 2 4
16 2/25/18 s03 4 2 4
19 2/26/18 s03 1 2 4
22 2/27/18 s03 2 2 4
2 2/21/18 s01 4 4 4
5 2/22/18 s01 4 4 4
8 2/23/18 s01 2 2 4
11 2/24/18 s01 1 2 4
14 2/25/18 s01 5 4 5
17 2/26/18 s01 1 4 5
20 2/27/18 s01 1 4 5

=20