NOTE:  Trifacta Wrangler is a free product with limitations on its features. Some features in the documentation do not apply to this product edition. See Product Limitations.

   

Computes the mode (most frequent value) from all row values in a column, according to their grouping. Input column can be of Integer or Decimal type. 
  • If a row contains a missing or null value, it is not factored into the calculation. If the entire column contains no values, the function returns a null value.
  • If two or more values tie for the most occurrences, the function returns no value.
  • When used in a pivot transform, the function is computed for each instance of the value specified in the group parameter. See Pivot Transform.
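The null-handling and tie rules above can be sketched outside the product. The following Python is a minimal illustration of those documented rules, not the product's implementation; the function name `mode_or_none` is hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence, Union

Number = Union[int, float]


def mode_or_none(values: Sequence[Optional[Number]]) -> Optional[Number]:
    """Return the most frequent non-null value, or None if empty or tied."""
    # Missing or null values are not factored into the calculation.
    present = [v for v in values if v is not None]
    if not present:
        return None  # the column contains no values
    # Look at the two most frequent values to detect a tie for first place.
    ranked = Counter(present).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie for the most occurrences: no value is returned
    return ranked[0][0]
```

For example, `mode_or_none([1, 2, 2, None, 3])` returns `2`, while `mode_or_none([1, 1, 2, 2])` returns `None` because of the tie.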

For a version of this function computed over a rolling window of rows, see ROLLINGMODE Function.

Basic Usage

pivot value:MODE(count_visits) group:postal_code limit:1

Output: Generates a two-column table containing the unique values from the postal_code column and, for each postal_code value, the mode of the values in the count_visits column. The limit parameter defines the maximum number of output columns.
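As a rough illustration of the grouped output described above, the following Python sketch computes one mode per postal_code value. The helper name `mode_by_group` and the sample rows are hypothetical, the limit parameter is not modeled, and the null and tie handling simply follows the rules stated on this page.

```python
from collections import Counter, defaultdict


def mode_by_group(rows, group_key, value_key):
    """Return {group value: mode of value_key}, with None for empty or tied groups."""
    grouped = defaultdict(list)
    for row in rows:
        bucket = grouped[row[group_key]]  # creates the group even if its value is null
        if row.get(value_key) is not None:
            bucket.append(row[value_key])
    result = {}
    for group, values in grouped.items():
        ranked = Counter(values).most_common(2)
        tied = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        result[group] = None if (not ranked or tied) else ranked[0][0]
    return result


# Hypothetical sample data, for illustration only.
visits = [
    {"postal_code": "94105", "count_visits": 3},
    {"postal_code": "94105", "count_visits": 3},
    {"postal_code": "94105", "count_visits": 1},
    {"postal_code": "10001", "count_visits": 2},
    {"postal_code": "10001", "count_visits": 4},
]
modes = mode_by_group(visits, "postal_code", "count_visits")
```

Here `modes` maps `"94105"` to `3` and `"10001"` to `None`, since the two values in the second group tie.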

Syntax and Arguments

pivot value:MODE(function_col_ref) [group:group_col_ref] [limit:limit_count]

Argument | Required? | Data Type | Description
-------- | --------- | --------- | -----------
function_col_ref | Y | string | Name of column to which to apply the function

For more information on the group and limit parameters, see Pivot Transform.

For more information on syntax standards, see Language Documentation Syntax Notes.

function_col_ref

Name of the column whose values you want to use in the calculation. Column must contain Integer or Decimal values.

  • Literal values are not supported as inputs.
  • Multiple columns and wildcards are not supported.

Usage Notes:

Required? | Data Type | Example Value
--------- | --------- | -------------
Yes | String (column reference) | myValues

Examples

Tip: For additional examples, see Common Tasks.

Example - Statistics on Test Scores

This example illustrates how you can apply statistical functions to your dataset. Calculations include average (mean), max, min, standard deviation, variance, and mode.

Functions:

Item | Description
---- | -----------
AVERAGE Function | Computes the average (mean) from all row values in a column or group. Input column can be of Integer or Decimal type.
MIN Function | Computes the minimum value found in all row values in a column. Input column can be of Integer, Decimal, or Datetime type.
MAX Function | Computes the maximum value found in all row values in a column. Input column can be of Integer, Decimal, or Datetime type.
VAR Function | Computes the variance among all values in a column. Input column can be of Integer or Decimal type. If no numeric values are detected in the input column, the function returns 0.
STDEV Function | Computes the standard deviation across all column values of Integer or Decimal type.
NUMFORMAT Function | Formats a numeric set of values according to the specified number formatting. Source values can be a literal numeric value, a function returning a numeric value, or a reference to a column containing Integer or Decimal values.
MODE Function | Computes the mode (most frequent value) from all row values in a column, according to their grouping. Input column can be of Integer or Decimal type.

Source:

Students took a test and recorded the following scores. You want to perform some statistical analysis on them:

Student | Score
------- | -----
Anna | 84
Ben | 71
Caleb | 76
Danielle | 87
Evan | 85
Faith | 92
Gabe | 85
Hannah | 99
Ian | 73
Jane | 68

Transformation:

You can use the following transformations to calculate the average (mean), minimum, and maximum scores:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula AVERAGE(Score)
Parameter: New column name 'avgScore'

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula MIN(Score)
Parameter: New column name 'minScore'

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula MAX(Score)
Parameter: New column name 'maxScore'

To apply statistical functions to your data, you can use the VAR and STDEV functions, which can be used as the basis for other statistical calculations.

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula VAR(Score)
Parameter: New column name var_Score

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula STDEV(Score)
Parameter: New column name stdev_Score
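To cross-check the variance and standard deviation outside the product, a minimal Python sketch follows. It assumes the population formulas (dividing by N), an assumption chosen because it is consistent with the values in this example's results table; the variable names are illustrative.

```python
import statistics

# Scores from the Source table in this example.
scores = [84, 71, 76, 87, 85, 92, 85, 99, 73, 68]

# Population variance and standard deviation (divide by N, not N - 1).
var_score = statistics.pvariance(scores)   # approximately 87
stdev_score = statistics.pstdev(scores)    # approximately 9.33
```

The sample formulas (`statistics.variance` and `statistics.stdev`) would give larger values for the same data.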

For each score, you can now calculate how many standard deviations it lies from the average, using the following:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula ((Score - avgScore) / stdev_Score)
Parameter: New column name 'stDevs'
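The same per-row calculation can be sketched in Python. This is an illustration under the assumption that the standard deviation is the population standard deviation; the dictionary `st_devs` is a hypothetical stand-in for the stDevs column.

```python
import statistics

# Scores from the Source table in this example.
scores = {
    "Anna": 84, "Ben": 71, "Caleb": 76, "Danielle": 87, "Evan": 85,
    "Faith": 92, "Gabe": 85, "Hannah": 99, "Ian": 73, "Jane": 68,
}

avg_score = statistics.mean(scores.values())
stdev_score = statistics.pstdev(scores.values())  # population standard deviation

# Number of standard deviations each score lies from the average.
st_devs = {name: (score - avg_score) / stdev_score for name, score in scores.items()}
```

Rounded to two decimal places, the entries for Anna, Ben, and Hannah are 0.21, -1.18, and 1.82.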

Now, you want to apply grades based on a formula:

Grade | Standard deviations from avg (stDevs)
----- | -------------------------------------
A | stDevs > 1
B | stDevs > 0.5
C | -1 <= stDevs <= 0.5
D | stDevs < -1
F | stDevs < -2

You can build the following transformation using the IF function to calculate grades.

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula IF((stDevs > 1),'A',IF((stDevs < -2),'F',IF((stDevs < -1),'D',IF((stDevs > 0.5),'B','C'))))
Parameter: New column name 'Grade'
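The order of the nested IF tests matters: the F test must run before the D test, because any value below -2 is also below -1. A Python sketch of the same decision chain (the function name `grade` is illustrative):

```python
def grade(st_devs: float) -> str:
    """Map a standard-deviation offset to a letter grade, mirroring the nested IF."""
    if st_devs > 1:
        return "A"
    if st_devs < -2:
        return "F"  # checked before D, since values below -2 are also below -1
    if st_devs < -1:
        return "D"
    if st_devs > 0.5:
        return "B"
    return "C"  # everything from -1 through 0.5
```

For instance, `grade(0.54)` returns `"B"` and `grade(-0.96)` returns `"C"`.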

To clean up the content, you might want to apply some formatting to the score columns. The following reformats the stdev_Score and stDevs columns to display two decimal places:

Transformation Name Edit column with formula
Parameter: Columns stdev_Score
Parameter: Formula NUMFORMAT(stdev_Score, '##.00')

Transformation Name Edit column with formula
Parameter: Columns stDevs
Parameter: Formula NUMFORMAT(stDevs, '##.00')

Finally, you can calculate the mode (most frequent value) of the scores:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula MODE(Score)
Parameter: New column name 'modeScore'

Results:

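Student | Score | modeScore | avgScore | minScore | maxScore | var_Score | stdev_Score | stDevs | Grade
------- | ----- | --------- | -------- | -------- | -------- | --------- | ----------- | ------ | -----
Anna | 84 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.21 | C
Ben | 71 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -1.18 | D
Caleb | 76 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -0.64 | C
Danielle | 87 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.54 | B
Evan | 85 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.32 | C
Faith | 92 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 1.07 | A
Gabe | 85 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.32 | C
Hannah | 99 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 1.82 | A
Ian | 73 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -0.96 | C
Jane | 68 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -1.50 | D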
