Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

This example illustrates how to use the conditional ranking function KTHLARGESTIF in your functions KTHLARGESTIF and KTHLARGESTUNIQUEIF in your recipes.

Source:

Here is some example weather data:

datecityrain_cmtemp_Cwind_mph
1/23/17Valleyville0.0012.868.78
1/23/17Center Town0.319.45.3
1/23/17Magic Mountain0.000.07.3
1/24/17Valleyville0.2517.23.3
1/24/17Center Town0.541.17.6
1/24/17Magic Mountain0.325.08.8
1/25/17Valleyville0.023.36.8
1/25/17Center Town0.833.35.1
1/25/17Magic Mountain0.59-1.76.4
1/26/17Valleyville1.0815.04.2
1/26/17Center Town0.966.17.6
1/26/17Magic Mountain0.77-3.93.0
1/27/17Valleyville1.007.22.8
1/27/17Center Town1.3220.00.2
1/27/17Magic Mountain0.775.65.2
1/28/17Valleyville0.12-6.15.1
1/28/17Center Town0.145.04.9
1/28/17Magic Mountain1.501.10.4
1/29/17Valleyville0.3613.37.3
1/29/17Center Town0.756.19.0
1/29/17Magic Mountain0.603.36.0

...

D trans
RawWrangletrue
Typestep
WrangleTextpivot value:KTHLARGESTIF(temp_C,2,city == 'Center Town') limit:1
p01NameValues
p01ValueKTHLARGESTIF(temp_C,2,city == 'Center Town')
p02NameMax number of columns to create
p02Value1
SearchTermPivot columns

The value is 20.

For wind, you modify it to be the following, capturing the third-ranked value:

D trans
RawWrangletrue
Typestep
WrangleTextpivot value:KTHLARGESTIF(wind_mph,23,city == 'Center Town') limit:1
p01NameValues
p01ValueKTHLARGESTIF(wind_mph,3,city == 'Center Town')
p02NameMax number of columns to create
p02Value1
SearchTermPivot columns

In the results, you notice that there are two values for 8.8. So you change the function to use the KTHLARGESTUNIQUEIF function instead:

D trans
RawWrangletrue
Typestep
WrangleTextpivot value:KTHLARGESTUNIQUEIF(wind_mph,3,city == 'Center Town') limit:1
p01NameValues
p01ValueKTHLARGESTUNIQUEIF(wind_mph,23,city == 'Center Town')
p02NameMax number of columns to create
p02Value1
SearchTermPivot columns

The result value is 7.6. Note that this value appears twice, so if you change the rank parameter in the above transformation to 4, the results would return a different unique ranked value (7.3). 

Results:

You can choose to add any of these steps to generate an aggregated result. As an alternative, you can use a derive transform to insert these calculated results into new columns.