Release 7.6.2


Extracts the ranked unique value from the values in a column, where k=1 returns the maximum value. The value for k must be between 1 and 1000, inclusive. Inputs can be Integer, Decimal, or Datetime.

For purposes of this calculation, two instances of the same value are treated as the same value of k. So, if your dataset contains four rows with column values 10, 9, 9, and 8, then KTHLARGESTUNIQUE returns 9 for k=2 and 8 for k=3.
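
The duplicate handling described above can be sketched in Python. This is only an illustration of the ranking rule, not the product's implementation; the helper name `kth_largest_unique` is hypothetical.

```python
def kth_largest_unique(values, k):
    """Return the k-th largest distinct value, where k=1 is the maximum."""
    # Collapsing to a set makes duplicate values share a single rank.
    distinct = sorted(set(values), reverse=True)
    return distinct[k - 1]


column = [10, 9, 9, 8]
print(kth_largest_unique(column, 2))  # 9
print(kth_largest_unique(column, 3))  # 8
```

Because the two 9s occupy one rank, the value 8 moves up to k=3.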

When used in a pivot transform, the function is computed for each instance of the value specified in the group parameter. See Pivot Transform.
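
As a rough Python sketch of the per-group behavior, the ranking can be computed separately for each distinct value of the grouping column. The helper name, the column names `class` and `score`, and the sample rows are all made up for illustration; this is not how the product evaluates a pivot.

```python
from collections import defaultdict


def kth_largest_unique_by_group(rows, group_col, value_col, k):
    """Compute the k-th largest distinct value separately for each group."""
    distinct_by_group = defaultdict(set)
    for row in rows:
        distinct_by_group[row[group_col]].add(row[value_col])
    # This sketch assumes every group has at least k distinct values.
    return {
        group: sorted(values, reverse=True)[k - 1]
        for group, values in distinct_by_group.items()
    }


# Hypothetical sample data.
rows = [
    {"class": "A", "score": 90},
    {"class": "A", "score": 90},
    {"class": "A", "score": 85},
    {"class": "B", "score": 70},
    {"class": "B", "score": 65},
    {"class": "B", "score": 65},
]
print(kth_largest_unique_by_group(rows, "class", "score", 2))
# {'A': 85, 'B': 65}
```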

Input column can be of Integer, Decimal, or Datetime type. Values of other types are ignored. If a row contains a missing or null value, it is not factored into the calculation.

Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.


Basic Usage

kthlargestunique(myRating, 3)

Output: Returns the third highest unique value from the myRating column.

Syntax and Arguments

kthlargestunique(function_col_ref, k_integer) [group:group_col_ref] [limit:limit_count]


| Argument | Required? | Data Type | Description |
| --- | --- | --- | --- |
| function_col_ref | Y | string | Name of column to which to apply the function |
| k_integer | Y | integer (positive) | The ranking of the unique value to extract from the source column |

For more information on the group and limit parameters, see Pivot Transform.

For more information on syntax standards, see Language Documentation Syntax Notes.

function_col_ref

Name of the column from which you want to extract the ranked unique value. Inputs must be Integer, Decimal, or Datetime values.

NOTE: If the input is of Datetime type, the output is in unixtime format. You can wrap these outputs in the DATEFORMAT function to output the results in the appropriate Datetime format. See DATEFORMAT Function.

  • Literal values are not supported as inputs.
  • Multiple columns and wildcards are not supported.
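
To illustrate the Datetime note above, the following Python sketch turns a unixtime result back into a readable date. It assumes the unixtime value is expressed in milliseconds since the Unix epoch in UTC, which this page does not state; check the unit in your environment. The helper name `format_unixtime_ms` is hypothetical and merely stands in for the role that DATEFORMAT plays in Wrangle.

```python
from datetime import datetime, timezone


def format_unixtime_ms(unixtime_ms, fmt="%Y-%m-%d"):
    """Format an epoch value given in milliseconds (assumed unit) as a UTC date string."""
    moment = datetime.fromtimestamp(unixtime_ms / 1000, tz=timezone.utc)
    return moment.strftime(fmt)


# 1609459200000 ms corresponds to 2021-01-01 00:00:00 UTC.
print(format_unixtime_ms(1609459200000))  # 2021-01-01
```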

Usage Notes:

| Required? | Data Type | Example Value |
| --- | --- | --- |
| Yes | String (column reference) | myValues |

k_integer

Integer representing the ranking of the unique value to extract from the source column. Duplicate values are treated as a single value for purposes of this function's calculation.

NOTE: The value for k must be an integer between 1 and 1,000 inclusive.

  • k=1 represents the maximum value in the column.
  • If k is greater than or equal to the number of values in the column, the minimum value is returned.
  • Missing and null values are not factored into the ranking of k.

Usage Notes:

| Required? | Data Type | Example Value |
| --- | --- | --- |
| Yes | Integer (positive) | 4 |


Examples


Tip: For additional examples, see Common Tasks.

This example shows how you can use aggregation functions to calculate the rank of values in a column using the KTHLARGEST and KTHLARGESTUNIQUE functions.

Source:

You have a set of student test scores:

| Student | Score |
| --- | --- |
| Anna | 84 |
| Ben | 71 |
| Caleb | 76 |
| Danielle | 87 |
| Evan | 85 |
| Faith | 92 |
| Gabe | 87 |
| Hannah | 99 |
| Ian | 73 |
| Jane | 68 |

Transformation:

You can use the following transformations to extract the 1st through 4th-ranked scores on the test:

| Transformation Name | New formula |
| --- | --- |
| Parameter: Formula type | Single row formula |
| Parameter: Formula | KTHLARGEST(Score, 1) |
| Parameter: New column name | '1st' |

| Transformation Name | New formula |
| --- | --- |
| Parameter: Formula type | Single row formula |
| Parameter: Formula | KTHLARGEST(Score, 2) |
| Parameter: New column name | '2nd' |

| Transformation Name | New formula |
| --- | --- |
| Parameter: Formula type | Single row formula |
| Parameter: Formula | KTHLARGEST(Score, 3) |
| Parameter: New column name | '3rd' |

| Transformation Name | New formula |
| --- | --- |
| Parameter: Formula type | Single row formula |
| Parameter: Formula | KTHLARGEST(Score, 4) |
| Parameter: New column name | '4th' |

| Transformation Name | New formula |
| --- | --- |
| Parameter: Formula type | Single row formula |
| Parameter: Formula | KTHLARGESTUNIQUE(Score, 3) |
| Parameter: New column name | '3rdUnique' |

| Transformation Name | New formula |
| --- | --- |
| Parameter: Formula type | Single row formula |
| Parameter: Formula | KTHLARGESTUNIQUE(Score, 4) |
| Parameter: New column name | '4thUnique' |

Results:

When you reorganize the columns, the dataset might look like the following:

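| Student | Score | 1st | 2nd | 3rd | 4th | 3rdUnique | 4thUnique |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Anna | 84 | 99 | 92 | 87 | 87 | 87 | 85 |
| Ben | 71 | 99 | 92 | 87 | 87 | 87 | 85 |
| Caleb | 76 | 99 | 92 | 87 | 87 | 87 | 85 |
| Danielle | 87 | 99 | 92 | 87 | 87 | 87 | 85 |
| Evan | 85 | 99 | 92 | 87 | 87 | 87 | 85 |
| Faith | 92 | 99 | 92 | 87 | 87 | 87 | 85 |
| Gabe | 87 | 99 | 92 | 87 | 87 | 87 | 85 |
| Hannah | 99 | 99 | 92 | 87 | 87 | 87 | 85 |
| Ian | 73 | 99 | 92 | 87 | 87 | 87 | 85 |
| Jane | 68 | 99 | 92 | 87 | 87 | 87 | 85 |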

Notes:

  • The value 87 is both the third- and fourth-ranked score, because two students scored 87.
    • For the KTHLARGEST function, it is the output for the third and fourth ranking.
    • For the KTHLARGESTUNIQUE function, it is the output for the third ranking only.
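
The difference between the two functions on this dataset can be reproduced with a short Python sketch. The helper names `kth_largest` and `kth_largest_unique` are hypothetical stand-ins that model the ranking rules; they are not the product's implementation.

```python
scores = [84, 71, 76, 87, 85, 92, 87, 99, 73, 68]


def kth_largest(values, k):
    """Rank every row, so duplicate values occupy consecutive ranks."""
    return sorted(values, reverse=True)[k - 1]


def kth_largest_unique(values, k):
    """Rank distinct values only, so duplicates share one rank."""
    return sorted(set(values), reverse=True)[k - 1]


print([kth_largest(scores, k) for k in (1, 2, 3, 4)])  # [99, 92, 87, 87]
print([kth_largest_unique(scores, k) for k in (3, 4)])  # [87, 85]
```

With duplicates collapsed, 85 becomes the fourth-ranked unique score.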
