Page tree

Trifacta Dataprep


Contents:

On April 28, 2021, Google is changing the required permissions for attaching IAM roles to service accounts. If you are using IAM roles for your Google service accounts, please see Changes to User Management.

   

This example shows how you can use the following functions to perform some analysis on Datetime columns.

  • MINDATE - Calculates the earliest (minimum) date from a column of Datetime column values. See MINDATE Function.
  • MAXDATE - Calculates the latest (maximum) date from a column of Datetime column values. See MAXDATE Function.
  • MODEDATE - Calculates the most frequent (mode) date from a column of Datetime column values. See MODEDATE Function.

Source:

The following dataset contains a set of three available dates for a set of classes:

classIdDate1Date2Date3
c0012020-03-092020-03-132020-03-17
c0022020-03-092020-03-062020-03-21
c0032020-03-092020-03-162020-03-23
c0042020-03-092020-03-232020-04-06
c0052020-03-092020-04-092020-05-09
c0062020-03-092020-08-092021-01-09

Transformation: 

To compare dates across multiple columns, you must consolidate the values into a single column. You can use the following transformation to do so:

Transformation Name Unpivot columns
Parameter: Columns Date1,Date2,Date3
Parameter: Group size 1

The dataset is now contained in three columns, with descriptions listed below:

classIdkeyvalue
Same as previous.DateX column identifierCorresponding value from the DateX column.

You can use the following to rename the value column to eventDates:

Transformation Name Rename columns
Parameter: Option Manual rename
Parameter: Column value
Parameter: New column name eventDates

Using the following transformations, you can create new columns containing the min, max, and mode values for the Datetime values in eventDates:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula MINDATE(eventDates)
Parameter: New column name earliestDate

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula MAXDATE(eventDates)
Parameter: New column name latestDate

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula MODEDATE(eventDates)
Parameter: New column name mostFrequentDate

Results:

classIdkeyeventDatesmostFrequentDatelatestDateearliestDate
c001Date12020-03-092020-03-092021-01-092020-03-06
c001Date22020-03-132020-03-092021-01-092020-03-06
c001Date32020-03-172020-03-092021-01-092020-03-06
c002Date12020-03-092020-03-092021-01-092020-03-06
c002Date22020-03-062020-03-092021-01-092020-03-06
c002Date32020-03-212020-03-092021-01-092020-03-06
c003Date12020-03-092020-03-092021-01-092020-03-06
c003Date22020-03-162020-03-092021-01-092020-03-06
c003Date32020-03-232020-03-092021-01-092020-03-06
c004Date12020-03-092020-03-092021-01-092020-03-06
c004Date22020-03-232020-03-092021-01-092020-03-06
c004Date32020-04-062020-03-092021-01-092020-03-06
c005Date12020-03-092020-03-092021-01-092020-03-06
c005Date22020-04-092020-03-092021-01-092020-03-06
c005Date32020-05-092020-03-092021-01-092020-03-06
c006Date12020-03-092020-03-092021-01-092020-03-06
c006Date22020-08-092020-03-092021-01-092020-03-06
c006Date32021-01-092020-03-092021-01-092020-03-06

This page has no comments.