Date: Sun, 17 Oct 2021 13:35:18 +0000 (GMT) Message-ID: <2021982684.47014.1634477718924@9c5033e110b2> Subject: Exported From Confluence MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_Part_47013_713742491.1634477718924" ------=_Part_47013_713742491.1634477718924 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Location: file:///C:/exported.html ROLLINGKTHLARGESTUNIQUE Function

# ROLLINGKTHLARGESTUNIQUE Function

Contents:

=20

=20
=20
=20

=20
=20

Computes the rolling unique=20 kth largest value forward or backward of the current row.

For purposes of this calculation, two instances of the same value are tr= eated at one value for k. So, if your dataset contains four rows with colum= n values `10``9``9`, an= d `8`, then `KTHLARGESTUNIQUE` re= turns `9` for `k=3D2` and 8 for `k=3D3`.

`ROLLINGKTHLARGESTUNIQUE` computes the ```KTHLAR= GESTUNIQUE``` value across a defined window of values within a col= umn.

• If an input value is missing or null, it is not factored in the computa= tion. For example, for the first row in the dataset, the rolling calculatio= n of previous values is undefined.
• The row from which to extract a value is determined by the order in = which the rows are organized based on the `order` para= meter.

If you are working on a randomly generated sample o= f your dataset, the values that you see for this function might not corresp= ond to the values that are generated on the full dataset during job executi= on.

• Inputs:
• Required column name
• Required kth value, which is a positive integer
• Two optional integer parameters that determine the window backward= and forward of the current row. The default integer parameter values are&n= bsp;`-1` and `0`, which computes the rollin= g function from the current row back to the first row of the dataset.
• This function works with the following transforms:

For more information on a non-rolling version of this function, see = ;KTHLARGESTUNIQUE Funct= ion.

## Basic Usage

Column example:

=20

derive type:single value:ROLLINGKTHLARGESTUNIQUE(= myCol, 2)

Output: Generates a new column containing the roll= ing second largest unique value in the `myCol` column = from the first row of the dataset to the current one.

Rows before example:

=20

window value:ROLLINGKTHLARGESTUNIQUE= (myNumber, 2, 3)

Output: Generates the new column, which contains t= he rolling second largest unique value of the current row and the two previ= ous row values in the `myNumber` column.

Rows before and after example:

=20

window value:value:ROLLINGKTHLARGESTUNIQUE<= /span>(myNumber, 4, 3, 2)

Output: Generates the new column, which contains t= he rolling fourth largest unique value of the two previous row values, the = current row value, and the two rows after the current one in the myNumber column.

## Syntax and Arguments

=20

window value:value:ROLLINGKTHLARGESTUNIQUE<= /span>(col_ref, rowsBefore_integer, rowsAfter_integer) order: order_col [gr= oup: group_col]

Argument Required? Data Type Description
col_ref Y string Name of column whose values are applied to the f= unction
k_integer Y integer (positive) The ranking of the unique value to= extract from the source column
rowsBefore_integer N integer Number of rows before the current one to include= in the computation
rowsAfter_integer N integer Number of rows after the current one to include = in the computation

For more information on the `order` and ```= group``` parameters, see Window Transform.

For more information on syntax standards, see Language Documentation Syntax Note= s.

### col_ref

Name of the column whose values are used to compute the function. <= /p>

• Multiple columns and wildcards are not supported.

Usage Notes:

Required? Data Type Example Value
Yes String (column reference to Integer or Decimal v= alues) `myColumn`

### k_integer

Integer representing the ranking of the unique value to extract from the= source column. Duplicate values are treated as a single value for purposes= of this function's calculation.

NOTE: The value for `k` must = be an integer between 1 and 1,000 inclusive.

• `k=3D1` represents the maximum value in the column.&nbs= p;
• If k is greater than or equal to the number of values in the column, th= e minimum value is returned.
• Missing and null values are not factored into the ranking of k.

Usage Notes:

Required?
Data Type
Example Value
Yes Integer (positive) `4 `

### rowsBefore_integer, rowsAfter_integer

Integers representing the number of rows before or after the current one= from which to compute the rolling function, including the current row. For= example, if the first value is `5`, the current row and th= e four rows after it are used in the computation. Negative values for&= nbsp;`k` compute the rolling average from rows preceding th= e current one.

• `rowBefore=3D1` generates the current row value only.
• `rowBefore=3D-1`  uses all rows preceding the curr= ent one.
• If `rowsAfter` is not specified, then the value&nb= sp;`0` is applied.
• If a `group` parameter is applied, then these para= meter values should be no more than the maximum number of rows in the group= s.

Usage Notes:

Required? Data Type Example Value
No Integer `4`

## Examples

### Example - ROLLINGKTHLARGEST functions

This example describes how to use the following rolling computational fu= nctions:

• `ROLLINGKTHLARGEST` - computes the kth largest= value from a rolling window of rows before and after the current row.= Duplicate values are treated as having the same k values. Se= e ROLLINGKTHLARGE= ST Function.
• `ROLLINGKTHLARGESTUNIQUE` - computes the unique kth largest value from a rolling window of rows before and= after the current row. Duplicate values are treated as having different&nb= sp;k values. See ROLLINGKTHLARGESTUNIQUE Function.

The following dataset contains daily counts of server restarts across th= ree servers over the preceding week. High server restart counts can indicat= e poor server health. In this example, you are interested in knowing for ea= ch server the rolling highest and second highest count of restarts per serv= er over the previous week.

Source:

Date Server Restarts
2/21/18 s01 4
2/21/18 s02 0
2/21/18 s03 0
2/22/18 s01 4
2/22/18 s02 1
2/22/18 s03 2
2/23/18 s01 2
2/23/18 s02 3
2/23/18 s03 4
2/24/18 s01 1
2/24/18 s02 0
2/24/18 s03 2
2/25/18 s01 5
2/25/18 s02 0
2/25/18 s03 4
2/26/18 s01 1
2/26/18 s02 2
2/26/18 s03 1
2/27/18 s01 1
2/27/18 s02 2
2/27/18 s03 2

Transform:

First, you want to maintain the row information as a separate column. Si= nce data is ordered already by the `Date` column, you can use th= e following:

=20

derive type:single value:ROWNUMBER() as:'entryId'=

Use the following function to compute the rolling kth largest v= alue of server restarts per server over the previous week. In this case, yo= u can use the `ROLLINGKTHLARGEST` function, setting k=3D1. Uniqueness doesn't matter for calculating the highest value:<= /p>=20

derive type: multiple value: rollingkthlargest(Re= starts, 1, 6, 0) group: Server order: Server as: 'rollingkthlargest_1'

Use the following function to compute the rolling second highest value. = In this case, you can use `ROLLINGKTHLARGESTUNIQUE`: <= /p>=20

derive type: multiple value: rollingkthlargestuni= que(Restarts, 2, 6, 0) group: Server order: Server as: 'rollingKthLargestUn= ique_2'

Results:

entryId Date Server Restarts rollingKthLargestUnique_2 rollingkthlargest_Restarts
3 2/21/18 s02 0 0 0
6 2/22/18 s02 1 0 1
9 2/23/18 s02 3 1 3
12 2/24/18 s02 0 1 3
15 2/25/18 s02 0 1 3
18 2/26/18 s02 2 2 3
21 2/27/18 s02 2 2 3
4 2/21/18 s03 0 0 0
7 2/22/18 s03 2 0 2
10 2/23/18 s03 4 2 4
13 2/24/18 s03 2 2 4
16 2/25/18 s03 4 2 4
19 2/26/18 s03 1 2 4
22 2/27/18 s03 2 2 4
2 2/21/18 s01 4 4 4
5 2/22/18 s01 4 4 4
8 2/23/18 s01 2 2 4
11 2/24/18 s01 1 2 4
14 2/25/18 s01 5 4 5
17 2/26/18 s01 1 4 5
20 2/27/18 s01 1 4 5

=20