Computes the mode (most frequent value) of the values in a column, according to their grouping, for rows that meet a specified condition. The input column can be of Integer, Decimal, or Datetime type.
- If a row contains a missing or null value, it is not factored into the calculation. If the entire column contains no values, the function returns a null value.
- If multiple values tie for the most occurrences, the lowest value in the evaluated set is returned.
- When used in a pivot transform, the function is computed for each instance of the value specified in the group parameter. See Pivot Transform.
For a non-conditional version of this function, see MODE Function.
For a version of this function computed over a rolling window of rows, see ROLLINGMODE Function.
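The rules above (skip missing values, apply the condition, break ties toward the lowest value) can be illustrated outside the product with a minimal Python sketch. This is not how the function is implemented; the helper name `modeif`, the row dictionaries, and the sample values are hypothetical. Returning `None` when no row passes the condition is also an assumption, since the text above only describes the case of a column with no values.

```python
from collections import Counter

def modeif(rows, value_col, predicate):
    """Return the mode of value_col over rows where predicate(row) is true.

    Missing (None) values are skipped, ties go to the lowest value, and
    None is returned when no value qualifies (an assumption of this sketch).
    """
    values = [
        row[value_col]
        for row in rows
        if row.get(value_col) is not None and predicate(row)
    ]
    if not values:
        return None
    counts = Counter(values)
    top = max(counts.values())
    # Tie-break: among the most frequent values, return the lowest one.
    return min(value for value, n in counts.items() if n == top)

# Hypothetical sample data, invented for illustration.
rows = [
    {"count_visits": 3, "health_status": "sick"},
    {"count_visits": 5, "health_status": "sick"},
    {"count_visits": 5, "health_status": "sick"},
    {"count_visits": 3, "health_status": "sick"},
    {"count_visits": None, "health_status": "sick"},  # missing value, ignored
    {"count_visits": 9, "health_status": "well"},     # fails the condition
    {"count_visits": 9, "health_status": "well"},
    {"count_visits": 9, "health_status": "well"},
]

result = modeif(rows, "count_visits", lambda r: r["health_status"] == "sick")
print(result)  # 3 and 5 each occur twice among "sick" rows, so the lowest (3) wins
```

Although 9 is the most frequent value overall, it is excluded because its rows do not satisfy the condition.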
modeif(count_visits, health_status == 'sick')
Output: Returns the mode of the values in the count_visits column for rows in which health_status is set to sick.
modeif(function_col_ref, test_expression) [group:group_col_ref] [limit:limit_count]
Argument | Required? | Data Type | Description |
---|---|---|---|
function_col_ref | Y | string | Name of column to which to apply the function |
test_expression | Y | string | Expression that is evaluated. Must resolve to true or false. |
For more information on the group and limit parameters, see Pivot Transform.
function_col_ref
Name of the column whose values are used to compute the function. The column must contain Integer, Decimal, or Datetime values.
NOTE: If the input is of Datetime type, the output is in unixtime format. You can wrap the output in the DATEFORMAT function to generate results in the appropriate Datetime format. See DATEFORMAT Function.
- Literal values are not supported as inputs.
- Multiple columns and wildcards are not supported.
Required? | Data Type | Example Value |
---|---|---|
Yes | String (column reference) | myValues |
test_expression
This parameter contains the expression to evaluate. This expression must resolve to a Boolean (true or false) value.
Required? | Data Type | Example Value |
---|---|---|
Yes | String expression that evaluates to true or false | (LastName == 'Mouse' && FirstName == 'Mickey') |
Example - MODEIF function
The following data contains a list of weekly orders for 2017 across two regions (r01 and r02). You are interested in calculating the most common order count for the second half of the year, by region.
Source:
NOTE: For simplicity, only the first few rows are displayed.
Date | Region | OrderCount |
---|---|---|
1/6/2017 | r01 | 78 |
1/6/2017 | r02 | 97 |
1/13/2017 | r01 | 92 |
1/13/2017 | r02 | 90 |
1/20/2017 | r01 | 97 |
1/20/2017 | r02 | 84 |
Transformation:
To assist, you can first calculate the week number for each row:
derive type: single value: weeknum(Date) as: 'weekNumber'

Parameter | Value |
---|---|
Search term | New formula |
Formula type | Single row formula |
Formula | weeknum(Date) |
New column name | 'weekNumber' |
Then, you can use the following aggregation to determine the most common order value for each region during the second half of the year:
pivot group: Region value: modeif(OrderCount, weekNumber > 26) limit: 50

Parameter | Value |
---|---|
Search term | Pivot columns |
Row labels | Region |
Values | modeif(OrderCount, weekNumber > 26) |
Max number of columns to create | 50 |
Results:
Region | modeif_OrderCount |
---|---|
r01 | 85 |
r02 | 100 |
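To check the logic of the two steps outside the product, the same flow can be sketched in Python: derive a week number, then compute the conditional mode per region. Several things here are assumptions. The helpers `mode_lowest` and `pivot_modeif` are hypothetical names, the ISO week number is used as a stand-in for weeknum(Date), and the July rows are invented because the source table shows only first-half rows. The output therefore does not reproduce the Results table above.

```python
from collections import Counter, defaultdict
from datetime import datetime

def mode_lowest(values):
    """Mode of a list; ties go to the lowest value, empty input gives None."""
    if not values:
        return None
    counts = Counter(values)
    top = max(counts.values())
    return min(value for value, n in counts.items() if n == top)

def pivot_modeif(rows, group_col, value_col, predicate):
    """Group rows by group_col, then take the conditional mode per group."""
    groups = defaultdict(list)
    for row in rows:
        # Touching the key creates the group even if no row qualifies
        # (an assumption about how empty groups are reported).
        bucket = groups[row[group_col]]
        if row[value_col] is not None and predicate(row):
            bucket.append(row[value_col])
    return {group: mode_lowest(values) for group, values in groups.items()}

raw = [
    # Rows shown in the Source table (all in the first half of the year).
    ("1/6/2017", "r01", 78), ("1/6/2017", "r02", 97),
    ("1/13/2017", "r01", 92), ("1/13/2017", "r02", 90),
    ("1/20/2017", "r01", 97), ("1/20/2017", "r02", 84),
    # Hypothetical second-half rows, invented for this sketch only.
    ("7/7/2017", "r01", 80), ("7/7/2017", "r02", 88),
    ("7/14/2017", "r01", 80), ("7/14/2017", "r02", 95),
    ("7/21/2017", "r01", 91), ("7/21/2017", "r02", 88),
]

rows = []
for date_text, region, order_count in raw:
    parsed = datetime.strptime(date_text, "%m/%d/%Y")
    rows.append({
        "Region": region,
        "OrderCount": order_count,
        # Assumption: ISO week number stands in for weeknum(Date).
        "weekNumber": parsed.isocalendar()[1],
    })

pivot_result = pivot_modeif(
    rows, "Region", "OrderCount", lambda r: r["weekNumber"] > 26
)
print(pivot_result)
```

Filtering inside the aggregation, rather than deleting first-half rows beforehand, keeps every region in the output even when none of its rows pass the condition.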