This example illustrates how you can apply the following percentile-related functions to your transformations:
MEDIAN
- Calculate the median value from a column of values. See MEDIAN Function.PERCENTILE
- Calculate a specified percentile for a column of values. See PERCENTILE Function.QUARTILE
- Calculate a specified quartile for a column of values. See QUARTILE Function.
The following functions use an approximation technique for calculating median, percentile, and quartiles. In some cases, these calculations can be computed faster across large datasets.
Source:
The following table lists each student's height in inches:
Student | Height |
---|
1 | 64 |
2 | 65 |
3 | 63 |
4 | 64 |
5 | 62 |
6 | 66 |
7 | 66 |
8 | 65 |
9 | 69 |
10 | 66 |
11 | 73 |
12 | 69 |
13 | 69 |
14 | 61 |
15 | 64 |
16 | 61 |
17 | 71 |
18 | 67 |
19 | 73 |
20 | 66 |
Transformation:
Use the following transformations to calculate the median height in inches, a specified percentile and the first quartile.
- The first function uses a precise algorithm which can be slow to execute across large datasets.
- The second function uses an appropriate approximation algorithm, which is much faster to execute across large datasets.
- These approximate functions can use an error boundary parameter, which is set to
0.4
(0.4%) across all functions.
Median: This transformation calculates the median value, which corresponds to the 50th percentile.
D trans |
---|
RawWrangle | true |
---|
p03Value | 'medianIn' |
---|
Type | step |
---|
WrangleText | derive type: single value: median(heightIn) as: 'medianIn' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | median(heightIn) |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
D trans |
---|
RawWrangle | true |
---|
p03Value | 'approxMedianIn' |
---|
Type | step |
---|
WrangleText | derive type: single value: approxmedian(heightIn, 0.4) as: 'approxMedianIn' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | approximatemedian(heightIn, 0.4) |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Percentile: This transformation calculates the 68th percentile.
D trans |
---|
RawWrangle | true |
---|
p03Value | 'percentile68In' |
---|
Type | step |
---|
WrangleText | derive type: single value: percentile(heightIn, 68, linear) as: 'percentile68In' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | percentile(heightIn, 68, linear) |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
D trans |
---|
RawWrangle | true |
---|
p03Value | 'approxPercentile68In' |
---|
Type | step |
---|
WrangleText | derive type: single value: approximatepercentile(heightIn, 68, 0.4) as: 'approxpercentile68In' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | approximatepercentile(heightIn, 68, 0.4) |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Quartile: This transformation calculates the first quartile, which corresponds to the 25th percentile.
D trans |
---|
RawWrangle | true |
---|
p03Value | 'percentile25In' |
---|
Type | step |
---|
WrangleText | derive type: single value: quartile(heightIn, 1, linear) as: 'percentile25In' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | quartile(heightIn, 1, linear) |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
D trans |
---|
RawWrangle | true |
---|
p03Value | 'approxPercentile25In' |
---|
Type | step |
---|
WrangleText | derive type: single value: approximatequartile(heightIn, 1, 0.4) as: 'approxPercentile25In' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | approximatequartile(heightIn, 1, 0.4) |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Results:
studentId | heightIn | approxPercentile25In | percentile25In | approxPercentile68In | percentile68In | approxMedianIn | medianIn |
---|
1 | 64 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
2 | 65 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
3 | 63 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
4 | 64 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
5 | 62 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
6 | 66 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
7 | 66 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
8 | 65 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
9 | 69 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
10 | 66 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
11 | 73 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
12 | 69 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
13 | 69 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
14 | 61 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
15 | 64 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
16 | 61 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
17 | 71 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
18 | 67 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
19 | 73 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |
20 | 66 | 64 | 64 | 67.1 | 66.92 | 66 | 66 |