Page tree

 

Contents:


Computes a specified percentile across all row values in a column or group. Input column can be of Integer or Decimal.
  • If a row contains a missing or null value, it is not factored into the calculation. If the entire column contains no values, the function returns a null value.
  • When used in a pivot transform, the function is computed for each instance of the value specified in the group parameter. See Pivot Transform.

Wrangle vs. SQL: This function is part of Wrangle , a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.

Basic Usage

percentile(myScores, 25, linear)

Output: Computes the value that is at the 25th percentile across all values in the myScores column.

Syntax and Arguments

percentile(function_col_ref,num_percentile,interpolation_method) [group:group_col_ref] [limit:limit_count]


ArgumentRequired?Data TypeDescription
function_col_refYstringName of column to which to apply the function
num_percentileYintegerInteger value between 1-100 of the percentile to compute
interpolation_methodYenumMethod by which to interpolate values between two row values. See below.

For more information on the group and limit parameters, see Pivot Transform.

For more information on syntax standards, see Language Documentation Syntax Notes.

function_col_ref

Name of the column the values of which you want to calculate the percentile. Column must contain Integer or Decimal values.

  • Literal values are not supported as inputs.
  • Multiple columns and wildcards are not supported.

Usage Notes:

Required?Data TypeExample Value
YesString (column reference)precipitationIn

num_percentile

Name of the column the values of which you want to calculate the percentile. Column must contain Integer or Decimal values.

  • Literal values are not supported as inputs.
  • Multiple columns and wildcards are not supported.

Usage Notes:

Required?Data TypeExample Value
YesInteger55

interpolation_method

Method of interpolation between each discrete value. The list of support methods is the following:

Interpolation methodDescription
linearPercentiles are calculated between two discrete values in a linear fashion.
exclusiveExcludes 0 and 1 from calculation of percentiles.
inclusiveIncludes 0 and 1 from calculation of percentiles.
lowerUse the lower value when the computed value falls between two discrete values.
upperUse the upper value when the computed value falls between two discrete values.
midpointUse the midpoint value when the computed value falls between two discrete values.

Usage Notes:

Required?Data TypeExample Value
YesEnumlinear

Examples

Tip: For additional examples, see Common Tasks.

Example - Percentile functions

This example illustrates how you can apply the following percentile-related functions to your transformations:
  • MEDIAN - Calculate the median value from a column of values. See MEDIAN Function.
  • PERCENTILE - Calculate a specified percentile for a column of values. See PERCENTILE Function.
  • QUARTILE - Calculate a specified quartile for a column of values. See QUARTILE Function.

Source:

The following table lists each student's height in inches:

StudentHeight
164
265
363
464
562
666
766
865
969
1066
1173
1269
1369
1461
1564
1661
1771
1867
1973
2066

Transformation:

Use the following transformations to calculate the median height in inches, a specified percentile and the first quartile.

Median: This transformation calculates the median value, which corresponds to the 50th percentile.

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula median(heightIn)
Parameter: New column name 'medianIn'

Percentile: This transformation calculates the 68th percentile.

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula percentile(heightIn, 68, linear)
Parameter: New column name 'percentile68In'

Quartile:This transformation calculates the first quartile, which corresponds to the 25th percentile.

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula quartile(heightIn, 1, linear)
Parameter: New column name 'percentile25In'

Results:

StudentHeightpercentile25Inpercentile68InmedianIn
16465.7568.9267.5
26565.7568.9267.5
36365.7568.9267.5
46465.7568.9267.5
56265.7568.9267.5
66665.7568.9267.5
76665.7568.9267.5
86565.7568.9267.5
96965.7568.9267.5
106665.7568.9267.5
117365.7568.9267.5
126965.7568.9267.5
136965.7568.9267.5
146165.7568.9267.5
156465.7568.9267.5
166165.7568.9267.5
177165.7568.9267.5
186765.7568.9267.5
197365.7568.9267.5
206665.7568.9267.5


This page has no comments.