Computes the mode (most frequent value) of the values in a column, according to their grouping, counting only the rows for which a specified test expression evaluates to true. The input column can be of Integer, Decimal, or Datetime type.

  • If a row contains a missing or null value, it is not factored into the calculation. If the entire column contains no values, the function returns a null value.
  • If two or more values tie for the most occurrences, the lowest value of the evaluated set is returned.
  • When used in a pivot transform, the function is computed for each instance of the value specified in the group parameter. See Pivot Transform.
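The rules above can be sketched in Python for a single group. This is a minimal illustration of the described behavior, not the product's implementation: the function name `modeif` is reused for readability, missing values are modeled as `None`, the test expression is modeled as a parallel list of booleans, and a tie is resolved to the lowest of the tied values, which is one reading of the tie rule.

```python
from collections import Counter

def modeif(values, conditions):
    """Mode of the values whose paired condition is True.

    Assumptions for this sketch: None stands in for a missing/null value,
    and ties go to the lowest of the tied values.
    """
    # Keep only non-null values from rows that pass the test expression.
    kept = [v for v, ok in zip(values, conditions) if ok and v is not None]
    if not kept:
        return None  # nothing to evaluate, so the result is null
    counts = Counter(kept)
    top = max(counts.values())
    # Among the values sharing the highest count, return the lowest.
    return min(v for v, n in counts.items() if n == top)
```

For example, `modeif([3, 1, 1, 3, None, 7], [True, True, True, True, True, False])` returns `1`: the values 1 and 3 each occur twice, the null is skipped, and 7 is excluded by its condition.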

For a non-conditional version of this function, see MODE Function.

For a version of this function computed over a rolling window of rows, see ROLLINGMODE Function.

Basic Usage

modeif(count_visits, health_status == 'sick')

As a pivot step:

pivot value:modeif(count_visits, health_status == 'sick') group:postal_code limit:1

Output: Returns the mode of the values in the count_visits column, computed only over rows in which health_status is sick.

Syntax and Arguments

modeif(function_col_ref, test_expression) [group:group_col_ref] [limit:limit_count]

As a pivot step:

pivot value:modeif(function_col_ref, test_expression) [group:group_col_ref] [limit:limit_count]

| Argument | Required? | Data Type | Description |
| --- | --- | --- | --- |
| function_col_ref | Y | string | Name of column to which to apply the function |
| test_expression | Y | string | Expression that is evaluated. Must resolve to true or false |

For more information on the group and limit parameters, see Pivot Transform.

function_col_ref

Name of the column whose values are used to compute the function. The column must contain Integer, Decimal, or Datetime values.

NOTE: If the input is of Datetime type, the output is in unixtime format. You can wrap these outputs in the DATEFORMAT function to generate the results in the appropriate Datetime format. See DATEFORMAT Function.

  • Literal values are not supported as inputs.
  • Multiple columns and wildcards are not supported.
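To illustrate the Datetime note above, the following Python sketch turns a unixtime result back into readable text, which is roughly the role that wrapping the output in DATEFORMAT plays. The helper name `format_unixtime` is hypothetical, and the unit is an assumption: the source does not say whether the unixtime value is in seconds or milliseconds, so the sketch makes it a parameter and defaults to milliseconds.

```python
from datetime import datetime, timezone

def format_unixtime(value, fmt="%Y-%m-%d", milliseconds=True):
    """Format a unixtime value as text in UTC.

    Assumption: the unit (milliseconds vs. seconds) is not stated in the
    source, so it is exposed as a parameter here.
    """
    seconds = value / 1000 if milliseconds else value
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(fmt)
```

Under the milliseconds assumption, `format_unixtime(1483660800000)` gives `'2017-01-06'`.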

Usage Notes:

| Required? | Data Type | Example Value |
| --- | --- | --- |
| Yes | String (column reference) | myValues |

test_expression

This parameter contains the expression to evaluate. This expression must resolve to a Boolean (true or false) value.

Usage Notes:

| Required? | Data Type | Example Value |
| --- | --- | --- |
| Yes | String expression that evaluates to true or false | (LastName == 'Mouse' && FirstName == 'Mickey') |


Examples

Example - MODEIF function

The following data contains a list of weekly orders for 2017 across two regions (r01 and r02). You are interested in calculating the most common order count for the second half of the year, by region.

Source:

NOTE: For simplicity, only the first few rows are displayed.

| Date | Region | OrderCount |
| --- | --- | --- |
| 1/6/2017 | r01 | 78 |
| 1/6/2017 | r02 | 97 |
| 1/13/2017 | r01 | 92 |
| 1/13/2017 | r02 | 90 |
| 1/20/2017 | r01 | 97 |
| 1/20/2017 | r02 | 84 |

Transformation: 

To assist, you can first calculate the week number for each row:

derive type: single value: weeknum(Date) as: 'weekNumber'

| Transformation Name | New formula |
| --- | --- |
| Parameter: Formula type | Single row formula |
| Parameter: Formula | weeknum(Date) |
| Parameter: New column name | 'weekNumber' |

Then, you can use the following aggregation to determine the most common order value for each region during the second half of the year:

pivot group: Region value: modeif(OrderCount, weekNumber > 26) limit: 50

| Transformation Name | Pivot columns |
| --- | --- |
| Parameter: Row labels | Region |
| Parameter: Values | modeif(OrderCount, weekNumber > 26) |
| Parameter: Max number of columns to create | 50 |

Results:

| Region | modeif_OrderCount |
| --- | --- |
| r01 | 85 |
| r02 | 100 |
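The two steps of this example can be approximated in Python as a rough cross-check of the logic. Several things here are assumptions rather than facts from the source: the ISO week number stands in for weeknum(Date), whose exact numbering rule may differ; the function names are hypothetical; and the July rows are invented for illustration, because the source displays only the first few January rows. The invented rows were chosen to echo the results table and are not the actual dataset.

```python
from collections import Counter
from datetime import datetime

ROWS = [
    # (Date, Region, OrderCount): rows shown in the Source table
    ("1/6/2017", "r01", 78), ("1/6/2017", "r02", 97),
    ("1/13/2017", "r01", 92), ("1/13/2017", "r02", 90),
    ("1/20/2017", "r01", 97), ("1/20/2017", "r02", 84),
    # HYPOTHETICAL second-half rows, invented for this sketch only
    ("7/7/2017", "r01", 85), ("7/7/2017", "r02", 100),
    ("7/14/2017", "r01", 85), ("7/14/2017", "r02", 95),
    ("7/21/2017", "r01", 90), ("7/21/2017", "r02", 100),
]

def week_number(date_text):
    """ISO week number, an assumed stand-in for weeknum(Date)."""
    return datetime.strptime(date_text, "%m/%d/%Y").isocalendar()[1]

def modeif_order_count_by_region(rows, min_week=26):
    """Per region, the mode of OrderCount over rows with week number > min_week."""
    kept = {}
    for date_text, region, order_count in rows:
        kept.setdefault(region, [])  # every region gets an output row
        if order_count is not None and week_number(date_text) > min_week:
            kept[region].append(order_count)
    result = {}
    for region, values in kept.items():
        if not values:
            result[region] = None  # no qualifying rows: null result
            continue
        counts = Counter(values)
        top = max(counts.values())
        # Assumed tie rule: lowest of the values sharing the top count.
        result[region] = min(v for v, n in counts.items() if n == top)
    return result
```

Calling `modeif_order_count_by_region(ROWS)` on these rows returns `{'r01': 85, 'r02': 100}`. Passing only the six January rows returns `None` for both regions, since none of those rows falls after week 26.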
