Date: Tue, 25 Jan 2022 23:17:58 +0000 (GMT) Message-ID: <132661271.108736.1643152678722@9c5033e110b2> Subject: Exported From Confluence MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_Part_108735_4614468.1643152678721" ------=_Part_108735_4614468.1643152678721 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Location: file:///C:/exported.html ROLLINGKTHLARGESTUNIQUE Function

ROLLINGKTHLARGESTUNIQUE Function

Contents:

=20

=20
=20
=20

=20
=20

Computes the rolling unique=20 kth largest value forward or backward of the current row.

For purposes of this calculation, two instances of the same value are tr= eated at one value for k. So, if your dataset contains four rows with colum= n values 1099, an= d 8, then KTHLARGESTUNIQUE re= turns 9 for k=3D2 and 8 for k=3D3.

ROLLINGKTHLARGESTUNIQUE computes the KTHLAR= GESTUNIQUE value across a defined window of values within a col= umn.

• If an input value is missing or null, it is not factored in the computa= tion. For example, for the first row in the dataset, the rolling calculatio= n of previous values is undefined.
• The row from which to extract a value is determined by the order in = which the rows are organized based on the order para= meter.

• If you are working on a randomly generated sample of your datas= et, the values that you see for this function might not correspond to the v= alues that are generated on the full dataset during job execution.

• Inputs:
• Required column name
• Required kth value, which is a positive integer
• Two optional integer parameters that determine the window backward= and forward of the current row. The default integer parameter values are&n= bsp;-1 and 0, which computes the rollin= g function from the current row back to the first row of the dataset.
• This function works with the following transforms:

For more information on a non-rolling version of this function, see = ;KTHLARGESTUNIQUE Funct= ion.

Basic Usage

Column example:

=20

rollingkthlargestunique(myCol, 2)

Ou= tput: Returns the rolling second largest unique value in the&= nbsp;myCol column from t= he first row of the dataset to the current one.

Rows before example:

=20

rollingkthlargestunique(myNumber, 2, 3)

Ou= tput: Returns the rolling second largest unique value of the = current row and the two previous row values in the myNumber column.

Rows before and after example:

=20

rollingkthlargestunique(myNumber, 4, 3, 2)=

Ou= tput: Returns the rolling fourth largest unique value of the = two previous row values, the current row value, and the two rows after the = current one in the myNumber column.

Syntax and Arguments

=20

rollingkthlargestunique(col_ref, rowsBefor= e_integer, rowsAfter_integer) order: order_col [group: group_col]

Argument Required? Data Type Description
col_ref Y string Name of column whose values are applied to the f= unction
k_integer Y integer (positive) The ranking of the unique value to= extract from the source column
rowsBefore_integer N integer Number of rows before the current one to include= in the computation
rowsAfter_integer N integer Number of rows after the current one to include = in the computation

For more information on the order and = group parameters, see Window Transform.

For more information on syntax standards, see Language Documentation Syntax Note= s.

col_ref

Name of the column whose values are used to compute the function. <= /p>

• Multiple columns and wildcards are not supported.

Usage Notes:

Required? Data Type Example Value
Yes String (column reference to Integer or Decimal v= alues) myColumn

k_integer

Integer representing the ranking of the unique value to extract from the= source column. Duplicate values are treated as a single value for purposes= of this function's calculation.

NOTE: The value for k must = be an integer between 1 and 1,000 inclusive.

• k=3D1 represents the maximum value in the column.&nbs= p;
• If k is greater than or equal to the number of values in the column, th= e minimum value is returned.
• Missing and null values are not factored into the ranking of k.

Usage Notes:

Required?
Data Type
Example Value
Yes Integer (positive) 4

rowsBefore_integer, rowsAfter_integer

Integers representing the number of rows before or after the current one= from which to compute the rolling function, including the current row. For= example, if the first value is 5, the current row and th= e four rows after it are used in the computation. Negative values for rowsAfter_integer&nbs= p;compute the rolling function from rows preceding the current one.<= /p>

• rowBefore=3D1 generates the current row value only.
• rowBefore=3D-1  uses all rows preceding the curr= ent one.
• If rowsAfter is not specified, then the value&nb= sp;0 is applied.
• If a group parameter is applied, then these para= meter values should be no more than the maximum number of rows in the group= s.

Usage Notes:

Required? Data Type Example Value
No Integer 4

Examples

Example - ROLLINGKTHLARGEST functions

This example describes how to use the following rolling computational fu= nctions:

• ROLLINGKTHLARGEST - computes the kth largest= value from a rolling window of rows before and after the current row.= Duplicate values are treated as having the same k values. Se= e ROLLINGKTHLARGE= ST Function.
• ROLLINGKTHLARGESTUNIQUE - computes the unique kth largest value from a rolling window of rows before and= after the current row. Duplicate values are treated as having different&nb= sp;k values. See ROLLINGKTHLARGESTUNIQUE Function.

The following dataset contains daily counts of server restarts across th= ree servers over the preceding week. High server restart counts can indicat= e poor server health. In this example, you are interested in knowing for ea= ch server the rolling highest and second highest count of restarts per serv= er over the previous week.

Source:

Date Server Restarts
2/21/18 s01 4
2/21/18 s02 0
2/21/18 s03 0
2/22/18 s01 4
2/22/18 s02 1
2/22/18 s03 2
2/23/18 s01 2
2/23/18 s02 3
2/23/18 s03 4
2/24/18 s01 1
2/24/18 s02 0
2/24/18 s03 2
2/25/18 s01 5
2/25/18 s02 0
2/25/18 s03 4
2/26/18 s01 1
2/26/18 s02 2
2/26/18 s03 1
2/27/18 s01 1
2/27/18 s02 2
2/27/18 s03 2

Transformation:

First, you want to maintain the row information as a separate column. Si= nce data is ordered already by the Date column, you can use th= e following:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula ROWNUMBER() 'entryId'
=20

Use the following function to compute the rolling kth largest v= alue of server restarts per server over the previous week. In this case, yo= u can use the ROLLINGKTHLARGEST function, setting k=3D1. Uniqueness doesn't matter for calculating the highest value:<= /p>

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Multiple row formula rollingkthlargest(Restarts, 1, 6, 0) Server Server 'rollingkthlargest_1'
=20

Use the following function to compute the rolling second highest value. = In this case, you can use ROLLINGKTHLARGESTUNIQUE: <= /p>

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Multiple row formula rollingkthlargestunique(Restarts, 2, 6, 0)= Server Server 'rollingKthLargestUnique_2'
=20

Results:

entryId Date Server Restarts rollingKthLargestUnique_2 rollingkthlargest_Restarts
3 2/21/18 s02 0 0 0
6 2/22/18 s02 1 0 1
9 2/23/18 s02 3 1 3
12 2/24/18 s02 0 1 3
15 2/25/18 s02 0 1 3
18 2/26/18 s02 2 2 3
21 2/27/18 s02 2 2 3
4 2/21/18 s03 0 0 0
7 2/22/18 s03 2 0 2
10 2/23/18 s03 4 2 4
13 2/24/18 s03 2 2 4
16 2/25/18 s03 4 2 4
19 2/26/18 s03 1 2 4
22 2/27/18 s03 2 2 4
2 2/21/18 s01 4 4 4
5 2/22/18 s01 4 4 4
8 2/23/18 s01 2 2 4
11 2/24/18 s01 1 2 4
14 2/25/18 s01 5 4 5
17 2/26/18 s01 1 4 5
20 2/27/18 s01 1 4 5

=20
=20