Extracts the ranked value from the values in a column, where k=1 returns the maximum value. The value for k must be between 1 and 1000, inclusive.
For purposes of this calculation, two instances of the same value are treated as separate values. So, if your dataset contains three rows with column values 10, 9, and 9, then KTHLARGEST returns 9 for k=2 and k=3.
When used in a pivot transform, the function is computed for each instance of the value specified in the group parameter. See Pivot Transform.
The input column can be of Integer or Decimal type. Non-numeric data in the column is ignored. If a row contains a missing or null value, it is not factored into the calculation.
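The ranking rules above can be illustrated outside the product. The following Python sketch is not the product's implementation; the function name kth_largest and the list-based input are assumptions made for illustration. It drops missing and non-numeric entries, keeps duplicates as separate values, and returns the k-th entry of the descending sort. Returning None for a column with no numeric values is also an assumption.

```python
from numbers import Real


def kth_largest(values, k):
    """Illustrative stand-in for KTHLARGEST; not the product's implementation."""
    if not isinstance(k, int) or not 1 <= k <= 1000:
        raise ValueError("k must be an integer between 1 and 1000, inclusive")
    # Keep only numeric entries; None, strings, and NaN are ignored.
    numeric = [
        v for v in values
        if isinstance(v, Real) and not isinstance(v, bool) and v == v
    ]
    if not numeric:
        return None  # assumption: no numeric input yields no result
    # Duplicates stay in the list, so each instance occupies its own rank.
    ranked = sorted(numeric, reverse=True)
    # Per the k_integer notes on this page, a k at or past the number of
    # values returns the minimum.
    return ranked[min(k, len(ranked)) - 1]


print(kth_largest([10, 9, 9], 2))
print(kth_largest([10, 9, 9], 3))
print(kth_largest([10, None, "n/a", 9, 9], 1))
```

With the values 10, 9, and 9 from the description, both k=2 and k=3 resolve to 9 because the two 9s occupy separate ranks.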
Basic Usage
pivot value:KTHLARGEST(myRating, 2) group:postal_code limit:1
Output: Generates a two-column table containing the unique values in the postal_code column and the second highest value from the myRating column for that postal_code value. The limit parameter defines the maximum number of output columns.
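As a rough illustration of the per-group behavior, the Python sketch below groups rows by postal_code and takes the second highest myRating within each group. The helper name kth_largest_by_group and the sample rows are hypothetical, and the sketch does not model the limit parameter. A group whose values are all missing is simply omitted here, which is an assumption rather than documented behavior.

```python
from collections import defaultdict


def kth_largest_by_group(rows, group_key, value_key, k):
    """Hypothetical helper: k-th largest value of value_key per group_key."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(value_key)
        # Skip missing and non-numeric values, as the function description states.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            groups[row[group_key]].append(value)
    result = {}
    for group, values in groups.items():
        ranked = sorted(values, reverse=True)
        # A k at or past the group's size falls back to the group's minimum.
        result[group] = ranked[min(k, len(ranked)) - 1]
    return result


# Hypothetical sample data for illustration only.
rows = [
    {"postal_code": "94105", "myRating": 5},
    {"postal_code": "94105", "myRating": 3},
    {"postal_code": "94105", "myRating": 4},
    {"postal_code": "10001", "myRating": 2},
    {"postal_code": "10001", "myRating": None},
    {"postal_code": "10001", "myRating": 1},
]
second_highest = kth_largest_by_group(rows, "postal_code", "myRating", 2)
print(second_highest)
```

Each key of the result corresponds to one row of the two-column output: the postal code and its second highest rating.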
Syntax
pivot value:KTHLARGEST(function_col_ref, k_integer) [group:group_col_ref] [limit:limit_count]
Argument  Required?  Data Type  Description 

function_col_ref  Y  string  Name of column to which to apply the function 
k_integer  Y  integer (positive)  The ranking of the value to extract from the source column 
For more information on the group and limit parameters, see Pivot Transform.
For more information on syntax standards, see Language Documentation Syntax Notes.
function_col_ref
Name of the column from which you want to extract the ranked value. Column must contain Integer or Decimal values.
 Literal values are not supported as inputs.
 Multiple columns and wildcards are not supported.
Usage Notes:
Required?  Data Type  Example Value 

Yes  String (column reference)  myValues 
k_integer
Integer representing the ranking of the value to extract from the source column.
NOTE: The value for k must be an integer between 1 and 1,000 inclusive.

 k=1 represents the maximum value in the column.
 If k is greater than or equal to the number of values in the column, the minimum value is returned.
 Missing and null values are not factored into the ranking of k.
Usage Notes:
Required?  Data Type  Example Value 

Yes  Integer (positive)  4 
Examples
This example explores how you can use the KTHLARGEST and KTHLARGESTUNIQUE aggregation functions to extract ranked values from a column.
Source:
You have a set of student test scores:
Student  Score 

Anna  84 
Ben  71 
Caleb  76 
Danielle  87 
Evan  85 
Faith  92 
Gabe  87 
Hannah  99 
Ian  73 
Jane  68 
Transform:
You can use the following transforms to extract the 1st through 4th-ranked scores on the test:
derive type:single value:KTHLARGEST(Score, 1) as: '1st'
derive type:single value:KTHLARGEST(Score, 2) as: '2nd'
derive type:single value:KTHLARGEST(Score, 3) as: '3rd'
derive type:single value:KTHLARGEST(Score, 4) as: '4th'
derive type:single value:KTHLARGESTUNIQUE(Score, 3) as: '3rdUnique'
derive type:single value:KTHLARGESTUNIQUE(Score, 4) as: '4thUnique'
Results:
When you reorganize the columns, the dataset might look like the following:
Student  Score  1st  2nd  3rd  4th  3rdUnique  4thUnique 

Anna  84  99  92  87  87  87  85 
Ben  71  99  92  87  87  87  85 
Caleb  76  99  92  87  87  87  85 
Danielle  87  99  92  87  87  87  85 
Evan  85  99  92  87  87  87  85 
Faith  92  99  92  87  87  87  85 
Gabe  87  99  92  87  87  87  85 
Hannah  99  99  92  87  87  87  85 
Ian  73  99  92  87  87  87  85 
Jane  68  99  92  87  87  87  85 
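The values in the results table can be cross-checked with a short Python sketch. The two helper functions are illustrative stand-ins, not the product's implementation, and they assume that KTHLARGESTUNIQUE ranks distinct values only and that k does not exceed the number of values available.

```python
scores = {
    "Anna": 84, "Ben": 71, "Caleb": 76, "Danielle": 87, "Evan": 85,
    "Faith": 92, "Gabe": 87, "Hannah": 99, "Ian": 73, "Jane": 68,
}


def kth_largest(values, k):
    # Duplicates are ranked separately, so 87 fills two consecutive ranks.
    return sorted(values, reverse=True)[k - 1]


def kth_largest_unique(values, k):
    # Assumption: duplicates collapse to one value before ranking.
    return sorted(set(values), reverse=True)[k - 1]


values = list(scores.values())
summary = {
    "1st": kth_largest(values, 1),
    "2nd": kth_largest(values, 2),
    "3rd": kth_largest(values, 3),
    "4th": kth_largest(values, 4),
    "3rdUnique": kth_largest_unique(values, 3),
    "4thUnique": kth_largest_unique(values, 4),
}
print(summary)
```

Because each derive computes a single value over the whole column, every row of the results table carries the same six values.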
Notes:
 The value 87 is both the third and fourth highest score. For the KTHLARGEST function, it is the output for the third and fourth rankings. For the KTHLARGESTUNIQUE function, it is the output for the third ranking only.
 For the