

Dataprep by Trifacta supports a variety of Datetime formats, each of which has additional variations.

Date Range

Supported Date Ranges:  

  • Earliest: January 1, 1400

    NOTE: Two-digit year values that would otherwise fall more than 80 years before the current year are forward-ported into the future. For example, in a job run on December 31, 2021, the date 01/01/41 is interpreted as 01/01/1941. However, if the job is run the next day (January 1, 2022), the same value is interpreted as 01/01/2041. As a workaround, you can use the DATEFORMAT function to reformat these dates with four-digit years (01/01/1941). See DATEFORMAT Function.

  • Latest: December 31, 2599

You can use dates in the Gregorian calendar system only. Dates in the Julian calendar are not supported.

Data Validation

When values are validated against the Datetime data type, the Dataprep by Trifacta application does not compare them to an underlying calendar system. Instead, the application validates the values using regular expressions. This method checks the general form of a Datetime value and is fast to evaluate.

However, some values may match the validation pattern without being real calendar dates. For example, February 29 is a valid date only in leap years. When February 29 of a non-leap year is validated against the Datetime data type, it may be detected as a valid value, while the application changes the date to a nearby valid date, such as March 1 in this example.
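The difference between pattern-based validation and calendar validation can be illustrated with a short Python sketch. The pattern and the helper names below are hypothetical and are not the expressions that the application uses. They only show how a value can pass a shape check while failing a calendar check.

```python
import re
from datetime import datetime

# Hypothetical shape check (an assumption for illustration):
# two digits, slash, two digits, slash, four digits.
SHAPE = re.compile(r"\d{2}/\d{2}/\d{4}")


def matches_shape(value: str) -> bool:
    """Return True if the value looks like MM/dd/yyyy, without checking a calendar."""
    return SHAPE.fullmatch(value) is not None


def is_calendar_date(value: str) -> bool:
    """Return True only if the value is a real date in the Gregorian calendar."""
    try:
        datetime.strptime(value, "%m/%d/%Y")
        return True
    except ValueError:
        return False


for value in ("02/29/2020", "02/29/2021"):
    print(value, matches_shape(value), is_calendar_date(value))
```

Both values pass the shape check, but only 02/29/2020 (a leap year) passes the calendar check.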

Formatting Tokens

You can use the following tokens to change the format of a column of dates:

| Letter | Date or Time Component | Presentation | Examples |
| --- | --- | --- | --- |
| M | Month in year | Number | 1 |
| MM | Month in year | Number | 01 |
| MMMM | Month in year | Month | January |
| MMM | Month in year | Month | Jan |
| yy | Year | Number | 16 |
| yyyy | Year | Number | 2016 |
| D | Day in year | Number | 352 |
| d | Day in month | Number | 9 |
| dd | Day in month | Number | 09 |
| EEE | Day in week (three-letter abbreviation) | Text | Wed |
| EEEE | Day in week | Text | Wednesday |
| h | Hour in am/pm (1-12). NOTE: Requires an AM/PM indicator (a). | Number | 2 |
| hh | Hour in am/pm (01-12). NOTE: Requires an AM/PM indicator (a). | Number | 02 |
| H | Hour in day (0-23) | Number | 2 |
| HH | Hour in day (0-23) | Number | 20 |
| m | Minute in an hour | Number | 9 |
| mm | Minute in an hour | Number | 09 |
| s | Second in a minute | Number | 3 |
| ss | Second in a minute | Number | 03 |
| SSS | Millisecond | Number | 218 |
| X | Time zone | ISO 8601 time zone | -08:00 |
| a | AM/PM indicator | String | AM |
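If you need to reproduce one of these formats outside the application, the zero-padded tokens have rough equivalents in Python's strftime directives. The mapping below is an illustrative assumption, not part of the product. It covers only a subset of the tokens, omitting the unpadded forms and D, SSS, and X, and it assumes that the pattern contains only tokens and separators.

```python
import re
from datetime import datetime

# Assumed mapping from formatting tokens to Python strftime directives.
TOKEN_TO_STRFTIME = {
    "yyyy": "%Y", "yy": "%y",
    "MMMM": "%B", "MMM": "%b", "MM": "%m",
    "dd": "%d",
    "EEEE": "%A", "EEE": "%a",
    "HH": "%H", "hh": "%I",
    "mm": "%M", "ss": "%S",
    "a": "%p",
}

# Try longer tokens first so that "yyyy" is not consumed as two "yy" tokens.
_TOKEN_RE = re.compile(
    "|".join(sorted(TOKEN_TO_STRFTIME, key=len, reverse=True))
)


def to_strftime(pattern: str) -> str:
    """Translate a token pattern such as yyyy-MM-dd into a strftime format."""
    return _TOKEN_RE.sub(lambda m: TOKEN_TO_STRFTIME[m.group(0)], pattern)


stamp = datetime(2016, 12, 17, 20, 9, 3)
print(stamp.strftime(to_strftime("yyyy-MM-dd HH:mm:ss")))
```

Month and weekday names produced by %B, %b, %A, and %a depend on the locale of the Python process.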


NOTE: When publishing to relational targets, Datetime values are written as date/time values in newly created tables. If you are appending to a relational table column that is in timestamp format, Datetime values can be written as timestamps.



Tip: If your Datetime column contains data in multiple formats, change the format of the column to one of those formats, and then add a transformation to convert the matching data to the other format. When all of your source date values have been converted to a single format, the application should infer the appropriate date and time format.

Supported Separators:

  • Date separators: blank space, comma, single hyphen, or forward slash
  • Time separators: blank space, comma, single hyphen, colon, t or T
  • Non-delimited Datetime values are supported. For example, yyyyMMdd, yyyyMMddTHHmmssX.

ISO 8601 Time Zone Notes:

  • Time zone offsets from UTC are supported in the forms +hh:mm, +hhmm, or +hh. For example, the date '2013-11-18 11:55-04:00' is recognized as a Datetime value.

  • Datetime part functions (for example, Hour) truncate time zones and return local time.
  • If you have a column with multiple time zones, you can convert the column to Unixtime so you can perform Date/Time operations with a standardized time zone. If you want to work with local times, you can truncate the time zone or use other Datetime functions. See UNIXTIME Function.
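The difference between local time parts and a standardized epoch value can be sketched in Python. This illustrates the concept only and is not the application's implementation. The second sample value and the helper name are hypothetical.

```python
from datetime import datetime


def parse_with_offset(value: str) -> datetime:
    """Parse 'yyyy-MM-dd HH:mm' followed by a UTC offset such as -04:00."""
    # %z accepts offsets written as -04:00 or -0400 (Python 3.7+).
    return datetime.strptime(value, "%Y-%m-%d %H:%M%z")


first = parse_with_offset("2013-11-18 11:55-04:00")
second = parse_with_offset("2013-11-18 16:55+01:00")

# The hour part ignores the offset and reflects the local wall-clock time.
print(first.hour, second.hour)

# Converting to epoch seconds puts both values on one standardized timeline.
print(first.timestamp() == second.timestamp())
```

The two values have different local hours but represent the same instant, which is why converting a mixed-zone column to Unixtime makes the values directly comparable.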

Two-digit year values

Depending on the system, a two-digit value for year in a Datetime value is subject to different interpretations. In Dataprep by Trifacta, two-digit year values that would otherwise fall more than 80 years before the current year are forward-ported into the future. For example, in a job run on December 31, 2021, the date 01/01/41 is interpreted as 01/01/1941. However, if the job is run the next day (January 1, 2022), the same value is interpreted as 01/01/2041.
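The rule can be sketched as a 100-year window that starts 80 years before the current year. The function below is a hypothetical illustration whose boundaries are inferred from the example dates above. It is not the application's code.

```python
def expand_two_digit_year(yy: int, current_year: int) -> int:
    """Expand a two-digit year using a window of [current_year - 80, current_year + 19].

    The window boundaries are an assumption inferred from the documented example.
    """
    if not 0 <= yy <= 99:
        raise ValueError("yy must be between 0 and 99")
    oldest = current_year - 80
    # Start with the century that contains the oldest allowed year.
    candidate = oldest - (oldest % 100) + yy
    # Years older than the window are moved forward by one century.
    if candidate < oldest:
        candidate += 100
    return candidate


print(expand_two_digit_year(41, 2021))
print(expand_two_digit_year(41, 2022))
```

Under these assumptions, the year 41 expands to 1941 for a job run in 2021 and to 2041 for a job run in 2022, matching the example.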

Other systems use different limits for backward versus forward porting of year values. As a result, it can be a challenge to manage these system-dependent two-digit years in a consistent manner.

Tip: For best results, you should format year values as four-digit values before the data is ingested into Dataprep by Trifacta. Four-digit years are consistently represented across all systems.

If the above is not possible, you can create replacement steps in your recipe to convert two-digit years to four-digit values. In the following example, 40-99 is interpreted as a 19XX year, while 00-39 is interpreted as a 20XX year:

Transformation Name Replace text or pattern
Parameter: Column myDateColumn
Parameter: Find /\b([456789][0-9])\b$/
Parameter: Replace with 19$1

and

Transformation Name Replace text or pattern
Parameter: Column myDateColumn
Parameter: Find /\b([0123][0-9])\b$/
Parameter: Replace with 20$1
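To check the two Find patterns before adding them to a recipe, you can reproduce them with Python's re module. This is a sketch for testing the expressions outside the application. The function name is hypothetical, and Python writes the capture-group reference as \g<1> instead of $1.

```python
import re


def expand_trailing_year(value: str) -> str:
    """Expand a trailing two-digit year: 40-99 becomes 19XX, 00-39 becomes 20XX."""
    # Equivalent of Find /\b([456789][0-9])\b$/ with Replace 19$1.
    value = re.sub(r"\b([4-9][0-9])\b$", r"19\g<1>", value)
    # Equivalent of Find /\b([0123][0-9])\b$/ with Replace 20$1.
    value = re.sub(r"\b([0-3][0-9])\b$", r"20\g<1>", value)
    return value


for sample in ("01/01/41", "01/01/05", "01/01/1941"):
    print(sample, "->", expand_trailing_year(sample))
```

Because both patterns require a word boundary before the two digits, values that already end in a four-digit year are left unchanged.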

Supported Datetime Formats

For more information on the available formats and examples of each one, see Datetime Formats (PDF).

For more information on supported date formatting strings, see DATEFORMAT Function.

Supported Time Zones

For more information, see Supported Time Zone Values.

Job Execution

Datetime data typing involves the basic type definition, plus any supported formatting options. Depending on where the job is executed, there may be variation in how the Datetime data type is interpreted. 

  • Some running environments may perform additional inference on the typing.

    NOTE: During job execution on Spark, inputs of Datetime data type may result in row values being inferred for data type individually. For example, the String value 01/10/2020 may be inferred by date transformations as 1st Oct, 2020 or 10th Jan, 2020. Resulting outputs of Datetime values may not be deterministic in this scenario.

  • Some formatting options may not be supported. 
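The ambiguity described in the note above can be reproduced in Python by parsing the same string with two different formats. This sketch only shows why per-row inference can give different results. It does not model how Spark performs the inference.

```python
from datetime import datetime

value = "01/10/2020"

# Month-first reading: January 10, 2020.
month_first = datetime.strptime(value, "%m/%d/%Y").date()

# Day-first reading: October 1, 2020.
day_first = datetime.strptime(value, "%d/%m/%Y").date()

print(month_first.isoformat(), day_first.isoformat())
```

Both readings are valid calendar dates, so nothing in the value itself identifies the intended format. Converting such values to an unambiguous format before the job runs avoids the issue.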
