
Release 8.7.1



Generates the count of distinct values in a specified column, optionally counted by group. Generated value is of Integer type. 

NOTE: Empty string values are counted. Null values are not counted.
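The counting rule in the note above can be illustrated outside of Wrangle. The following is a minimal Python sketch, not the product's implementation; the function name count_distinct and the sample values are hypothetical, and a null is assumed to be represented as None.

```python
from typing import Iterable, Optional


def count_distinct(values: Iterable[Optional[str]]) -> int:
    """Count distinct non-null values; the empty string counts as a value."""
    # None stands in for a null value and is skipped.
    # "" is a real (empty) value and is kept.
    return len({v for v in values if v is not None})


# Hypothetical sample: two nulls, one empty string, two distinct names.
sample = ["a", "a", "", None, "b", None]
print(count_distinct(sample))  # 3: the distinct values are "a", "" and "b"
```

A set is used because it discards duplicates automatically, so its length is the distinct count.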

NOTE: When added to a transformation, the function calculates the number of distinct values in the specified column, as displayed in the current sample. Counts are not applied to the entire dataset until you run the job. If you change your sample or run the job, the computed values for this function are updated. Transformations that change the number of rows in subsequent recipe steps do not affect the value for the already computed instance of COUNTDISTINCT.

Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.

Basic Usage

countdistinct(name)

Output: Returns the count of distinct values in the name column.

Syntax and Arguments

countdistinct(function_col_ref) [group:group_col_ref] [limit:limit_count]


Argument          Required?  Data Type  Description
function_col_ref  Y          string     Name of column to which to apply the function


For more information on the group and limit parameters, see Pivot Transform.
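To show what counting "by group" means, here is a rough Python sketch of a grouped distinct count. It illustrates the idea only and is not Wrangle code: the function name count_distinct_by_group, the column roles, and the sample rows are hypothetical. Two further assumptions are made: a group that contains only nulls reports 0, and the limit parameter is not modeled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple


def count_distinct_by_group(
    rows: Iterable[Tuple[str, Optional[str]]],
) -> Dict[str, int]:
    """Return the number of distinct non-null values for each group key."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for group, value in rows:
        bucket = seen[group]  # creates the group even if all its values are null
        if value is not None:  # nulls are skipped, empty strings are kept
            bucket.add(value)
    return {group: len(values) for group, values in seen.items()}


# Hypothetical (group, value) pairs.
rows = [
    ("east", "ann"),
    ("east", "ann"),
    ("east", ""),
    ("west", "bob"),
    ("west", None),
    ("north", None),
]
print(count_distinct_by_group(rows))  # {'east': 2, 'west': 1, 'north': 0}
```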

function_col_ref

Name of the column whose distinct values are counted. If a group is specified, the count is computed per group.

  • Literal values are not supported as inputs. 
  • Multiple columns and wildcards are not supported.

Usage Notes:

Required?  Data Type                  Example Value
Yes        String (column reference)  myValues

Examples


Tip: For additional examples, see Common Tasks.

Example - Simple row count

This example shows how to use the COUNTA and COUNTDISTINCT functions:
  • COUNTA - Count the number of non-null values within a group. See COUNTA Function.
  • COUNTDISTINCT - Count the number of distinct non-null values within a group. See COUNTDISTINCT Function.

Source:

In the following example, the seventh row is an empty string, and the eighth row is a null value.

rowId  Val
r001   val1
r002   val1
r003   val1
r004   val2
r005   val2
r006   val3
r007   (empty)
r008   (null)

Transformation:

Apply a COUNTA function on the source column:

Transformation Name         New formula
Parameter: Formula type     Single row formula
Parameter: Formula          COUNTA(Val)
Parameter: New column name  'fctnCounta'

Apply a COUNTDISTINCT function on the source:

Transformation Name         New formula
Parameter: Formula type     Single row formula
Parameter: Formula          COUNTDISTINCT(Val)
Parameter: New column name  'fctnCountdistinct'

Results:

Below, both functions count the number of values in the column, with COUNTDISTINCT counting distinct values only. The empty value for r007 is counted by both functions. The null value for r008 is counted by neither.

rowId  Val      fctnCountdistinct  fctnCounta
r001   val1     4                  7
r002   val1     4                  7
r003   val1     4                  7
r004   val2     4                  7
r005   val2     4                  7
r006   val3     4                  7
r007   (empty)  4                  7
r008   (null)   4                  7
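The results for this example can be cross-checked with a short Python sketch. It mirrors the source data, with "" standing in for the empty value and None for the null; this representation and the variable names are assumptions for illustration, and the code is not how Wrangle computes the values.

```python
# Source rows from the example: (rowId, Val).
rows = [
    ("r001", "val1"),
    ("r002", "val1"),
    ("r003", "val1"),
    ("r004", "val2"),
    ("r005", "val2"),
    ("r006", "val3"),
    ("r007", ""),    # empty string: counted by both functions
    ("r008", None),  # null: counted by neither function
]

values = [val for _, val in rows]

# COUNTA-style count: every non-null value, duplicates included.
fctn_counta = sum(1 for v in values if v is not None)

# COUNTDISTINCT-style count: each non-null value counted once.
fctn_countdistinct = len({v for v in values if v is not None})

# With no grouping, the same two counts are written to every row.
result = [(row_id, val, fctn_countdistinct, fctn_counta) for row_id, val in rows]

print(fctn_countdistinct, fctn_counta)  # 4 7
```

The four distinct values are val1, val2, val3, and the empty string, and the seven counted values are all rows except r008.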

 

 
