Date: Fri, 20 May 2022 19:52:39 +0000 (GMT) Message-ID: <909557525.24176.1653076359083@c7585db71e40> Subject: Exported From Confluence MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_Part_24175_648281266.1653076359083" ------=_Part_24175_648281266.1653076359083 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Location: file:///C:/exported.html ROLLINGKTHLARGEST Function

# ROLLINGKTHLARGEST Function

Contents:

=20

=20
=20
=20

=20
=20

Computes the rolling=20 kth largest value forward or backward of the current row.

`KTHLARGEST` ex= tracts the ranked value from the values in a column, where ```k=3D1= ``` returns the maximum value. The value for `k`&n= bsp;must be between 1 and 1000, inclusive. For purposes o= f this calculation, two instances of the same value are treated as separate= values. So, if your dataset contains three rows with column values `10``9`, and `9`, then `KTHLARGEST`&nbs= p;returns `9` for ```k=3D2= ``` and `k=3D3`.

`ROLLINGKTHLARGEST` computes the ```KTHLARGEST<= /code> value across a defined window of values within a column.``````= ```

``` If an input value is missing or null, it is not factored in the computa= tion. For example, for the first row in the dataset, the rolling calculatio= n of previous values is undefined. The row from which to extract a value is determined by the order in = which the rows are organized based on the order para= meter. If you are working on a randomly generated sample o= f your dataset, the values that you see for this function might not corresp= ond to the values that are generated on the full dataset during job executi= on. Inputs: Required column name Required kth value, which is a positive integer  Two optional integer parameters that determine the window backward= and forward of the current row. The default integer parameter values are&n= bsp;-1 and 0, which computes the rollin= g function from the current row back to the first row of the dataset. This function works with the following transforms: Window Transform Set Transform Derive Transform For more information on a non-rolling version of this function, see KTHLARGEST Function. Bas= ic Usage Column example:=20 derive type:single value:ROLLINGKTHLARGEST(myCol,= 2) Output: Generates a new column containing the roll= ing second largest of all values in the myCol column= from the first row of the dataset to the current one. Rows before example:=20 window value:ROLLINGKTHLARGEST(myNum= ber, 2, 3) Output: Generates the new column, which contains t= he rolling second largest value of the current row and the two previous row= values in the myNumber column. Rows before and after example:=20 window value:value:ROLLINGKTHLARGEST= (myNumber, 4, 3, 2) Output: Generates the new column, which contains t= he rolling fourth largest value of the two previous row values, the current= row value, and the two rows after the current one in the myNumb= er column.      Syntax and Arguments =20 window value:value:ROLLINGKTHLARGEST= (col_ref, rowsBefore_integer, rowsAfter_integer) order: order_col [group: g= roup_col] Argument Required? Data Type Description col_ref Y string Name of column whose values are applied to the f= unction k_integer Y integer (positive) The ranking of the value to extrac= t from the source column rowsBefore_integer N integer Number of rows before the current one to include= in the computation rowsAfter_integer N integer Number of rows after the current one to include = in the computation For more information on the order and = group parameters, see Window Transform. For more information on syntax standards, see Language Documentation Syntax Note= s. col_ref Name of the column whose values are used to compute the function. <= /p> Multiple columns and wildcards are not supported. Usage Notes: Required? Data Type Example Value Yes String (column reference to Integer or Decimal v= alues) myColumn k_integer Integer representing the ranking of the unique value to extract from the= source column. Duplicate values are treated as separate values for purpose= s of this function's calculation. NOTE: The value for k must = be an integer between 1 and 1,000 inclusive. ```
``` k=3D1 represents the maximum value in the column.&nbs= p; If k is greater than or equal to the number of values in the column, th= e minimum value is returned. ```
• `Missing and null values are not factored into the ranking of k`.

Usage Notes:

Required?
Data Type
Example Value
Yes Integer (positive) `4 `

### r= owsBefore_integer, rowsAfter_integer

Integers representing the number of rows before or after the current one= from which to compute the rolling function, including the current row. For= example, if the first value is `5`, the current row and th= e four rows after it are used in the computation. Negative values for&= nbsp;`k` compute the rolling average from rows preceding th= e current one.

• `rowBefore=3D1` generates the current row value only.
• `rowBefore=3D-1`  uses all rows preceding the curr= ent one.
• If `rowsAfter` is not specified, then the value&nb= sp;`0` is applied.
• If a `group` parameter is applied, then these para= meter values should be no more than the maximum number of rows in the group= s.

Usage Notes:

Required? Data Type Example Value
No Integer `4`

## Ex= amples

### Exa= mple - ROLLINGKTHLARGEST functions

This example describes how to use the following rolling computational fu= nctions:

• `ROLLINGKTHLARGEST` - computes the kth largest= value from a rolling window of rows before and after the current row.= Duplicate values are treated as having the same k values. Se= e ROLLINGKTHLARGE= ST Function.
• `ROLLINGKTHLARGESTUNIQUE` - computes the unique kth largest value from a rolling window of rows before and= after the current row. Duplicate values are treated as having different&nb= sp;k values. See ROLLINGKTHLARGESTUNIQUE Function.

The following dataset contains daily counts of server restarts across th= ree servers over the preceding week. High server restart counts can indicat= e poor server health. In this example, you are interested in knowing for ea= ch server the rolling highest and second highest count of restarts per serv= er over the previous week.

Source:

Date Server Restarts
2/21/18 s01 4
2/21/18 s02 0
2/21/18 s03 0
2/22/18 s01 4
2/22/18 s02 1
2/22/18 s03 2
2/23/18 s01 2
2/23/18 s02 3
2/23/18 s03 4
2/24/18 s01 1
2/24/18 s02 0
2/24/18 s03 2
2/25/18 s01 5
2/25/18 s02 0
2/25/18 s03 4
2/26/18 s01 1
2/26/18 s02 2
2/26/18 s03 1
2/27/18 s01 1
2/27/18 s02 2
2/27/18 s03 2

Transform:

First, you want to maintain the row information as a separate column. Si= nce data is ordered already by the `Date` column, you can use th= e following:

=20

derive type:single value:ROWNUMBER() as:'entryId'=

Use the following function to compute the rolling kth largest v= alue of server restarts per server over the previous week. In this case, yo= u can use the `ROLLINGKTHLARGEST` function, setting k=3D1. Uniqueness doesn't matter for calculating the highest value:<= /p>=20

derive type: multiple value: rollingkthlargest(Re= starts, 1, 6, 0) group: Server order: Server as: 'rollingkthlargest_1'

Use the following function to compute the rolling second highest value. = In this case, you can use `ROLLINGKTHLARGESTUNIQUE`: <= /p>=20

derive type: multiple value: rollingkthlargestuni= que(Restarts, 2, 6, 0) group: Server order: Server as: 'rollingKthLargestUn= ique_2'

Results:

entryId Date Server Restarts rollingKthLargestUnique_2 rollingkthlargest_Restarts
3 2/21/18 s02 0 0 0
6 2/22/18 s02 1 0 1
9 2/23/18 s02 3 1 3
12 2/24/18 s02 0 1 3
15 2/25/18 s02 0 1 3
18 2/26/18 s02 2 2 3
21 2/27/18 s02 2 2 3
4 2/21/18 s03 0 0 0
7 2/22/18 s03 2 0 2
10 2/23/18 s03 4 2 4
13 2/24/18 s03 2 2 4
16 2/25/18 s03 4 2 4
19 2/26/18 s03 1 2 4
22 2/27/18 s03 2 2 4
2 2/21/18 s01 4 4 4
5 2/22/18 s01 4 4 4
8 2/23/18 s01 2 2 4
11 2/24/18 s01 1 2 4
14 2/25/18 s01 5 4 5
17 2/26/18 s01 1 4 5
20 2/27/18 s01 1 4 5

=20
=20