NOTE: Transforms are a part of the underlying language, which is not directly accessible to users. This content is maintained for reference purposes only. For more information on the user-accessible equivalent to transforms, see Transformation Reference.
Type is specified as a string literal or comma-separated set of literals. For more information on valid string literals, see Valid Data Type Strings.
Tips:
Tip: You can use the settype transform to override the data type inferred for a column. However, if a new transformation step is added, the column data type is re-inferred, which may override your specific typing. Consider applying settype transforms as late as possible in your recipes.
- When a column is set to a data type, all values in the column are validated against the new type, which might change the number of mismatched values. Some cleanup might be required. Some operations might cause the data type to be re-validated automatically.
- It might be easier to set the type using the column's drop-down. Selections of data type from the column drop-down are turned into recipe steps using the settype transform.
- If you encounter a significant number of mismatches after you change the data type, you might find it helpful to change or revert the type to String. All data can be interpreted as a String or a list of string values. The transforms and functions for manipulating String data might be easier to use to clean up mismatched data before changing the data type to the preferred one.
- Row values that do not match the new data type might be turned to null values during job execution.
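The validation behavior described in these tips can be sketched in Python. This is an illustration only, not the product's implementation: the integer rule, the function names, and the sample values are all assumptions made for the example.

```python
import re

# Hypothetical approximation of an "Integer" check; the product's
# actual validation rules may differ.
_INTEGER = re.compile(r"[+-]?\d+")


def is_valid_integer(value):
    """Return True if the string looks like a whole number."""
    return _INTEGER.fullmatch(value.strip()) is not None


def apply_integer_type(values):
    """Validate every value against the new type.

    Returns (typed_values, mismatch_count). Mismatched values become
    None, mirroring how mismatches might be turned into nulls.
    """
    typed = []
    mismatches = 0
    for value in values:
        if is_valid_integer(value):
            typed.append(int(value))
        else:
            typed.append(None)
            mismatches += 1
    return typed, mismatches


# Sample column values (invented for this sketch).
scores = ["87", "92", "n/a", " 75 ", "8.5"]
typed_scores, mismatch_count = apply_integer_type(scores)
print(typed_scores, mismatch_count)
```

Keeping the column as String until values such as `n/a` are cleaned up avoids losing them to nulls when the type is finally changed.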
Basic Usage
Single-column example:
settype col: Score type: 'Integer'
Output: Changes the data type for the Score column to Integer.
Multi-column example:
settype col: Score,studentId type: 'Integer'
Output: Changes the data type for the Score and studentId columns to Integer.
Syntax and Parameters
settype col:col1,col2 type:'string_literal'
Token | Required? | Data Type | Description |
---|---|---|---|
settype | Y | transform | Name of the transform |
col | Y | string | Comma-separated list of columns to which to apply the specified type. |
type | Y | string | String literal identifying the data type to apply to the column(s). See Valid Data Type Strings. |
For more information on syntax standards, see Language Documentation Syntax Notes.
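To make the token structure concrete, here is a minimal Python sketch that splits a settype step into its column list and type literals. The `parse_settype` helper is hypothetical and exists only for illustration; it is not part of the product and handles only the simple forms shown on this page.

```python
import re

# Hypothetical pattern for the simple form:
#   settype col:<comma-separated columns> type:<quoted literal(s)>
_STEP = re.compile(r"\s*settype\s+col:\s*(?P<cols>.+?)\s+type:\s*(?P<types>.+?)\s*")


def parse_settype(step):
    """Return (columns, type_literals) for a simple settype step."""
    match = _STEP.fullmatch(step)
    if match is None:
        raise ValueError("not a settype step: " + repr(step))
    # The col token is a comma-separated list of column names.
    columns = [name.strip() for name in match.group("cols").split(",")]
    # The type token is one or more single-quoted string literals.
    type_literals = re.findall(r"'([^']*)'", match.group("types"))
    return columns, type_literals


columns, type_literals = parse_settype("settype col: Score,studentId type: 'Integer'")
print(columns, type_literals)
```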
col
Identifies the column(s) to which to apply the transform. You can specify one or more columns.
Usage Notes:
Required? | Data Type |
---|---|
Yes | Comma-separated strings (column name or names) |
type
NOTE: When specifying a data type by name, you must use the String value listed below. The Data Type value is the display name for the type.
For a list of valid strings, see Valid Data Type Strings.
settype col: zips type:'Zipcode'
Output: Changes the data type of the zips column to the Zip Code data type. All values are validated as U.S. Zip codes.
Usage Notes:
Required? | Data Type |
---|---|
Yes | String value |
Tip: For additional examples, see Common Tasks.
Examples
Example - Simple settype with date values
Source:
The following activities are listed by date. Note the variation in date values, including one that is clearly invalid. Here is the source data:
myDate, myAction
4/4/2016,Woke up at 6:30
4-4-2016,Got ready
9-9-9999,Drove kids to school
4-4-2016, Commuted to work
Transformation:
When this data is imported into the Transformer page, there are a couple of immediate issues: no column headings and blank rows at the bottom. These two transformations fix that:
Transformation Name | Rename column with row(s) |
---|---|
Parameter: Option | Use row(s) as column names |
Parameter: Type | Use a single row to name columns |
Parameter: Row number | 1 |
Transformation Name | Filter rows |
---|---|
Parameter: Condition | Custom formula |
Parameter: Type of formula | Custom single |
Parameter: Condition | ismissing([myDate]) |
Parameter: Action | Delete matching rows |
For the invalid date, you can infer from the rows around it that it should be from the same date. You can make the following change to fix it:
Transformation Name | Replace text or pattern |
---|---|
Parameter: Column | myDate |
Parameter: Find | `9-9-9999` |
Parameter: Replace with | '4-4-2016' |
Parameter: Match all occurrences | true |
Now that the dates look fairly consistent, you can set the data type of the column to a matching Datetime format:
Transformation Name | Change column data type |
---|---|
Parameter: Columns | myDate |
Parameter: New type | Custom or Date/Time |
Parameter: Specify type | 'mm-dd-yy','mm*dd*yyyy' |
Note the syntax above for specifying Datetime types. In addition to the Datetime keyword, you must specify the format type, followed by the variation of that format.
Tip: A set of supported formats and variations for Datetime are available through the column data type selector. When you select your desired Datetime format, the settype transform is added to your recipe.
Results:
myDate | myAction |
---|---|
4/4/2016 | Woke up at 6:30 |
4-4-2016 | Got ready |
4-4-2016 | Drove kids to school |
4-4-2016 | Commuted to work |
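The cleanup and type check in this example can be approximated in Python. This is a rough sketch under stated assumptions, not how the product evaluates Datetime formats: it assumes a month, day, and four-digit year separated by either `-` or `/`, and the `parse_date` helper is hypothetical.

```python
import re
from datetime import date

# Source rows after the header row is promoted and blank rows are removed.
rows = [
    ("4/4/2016", "Woke up at 6:30"),
    ("4-4-2016", "Got ready"),
    ("9-9-9999", "Drove kids to school"),
    ("4-4-2016", "Commuted to work"),
]

# Assumed format: month, day, four-digit year, separated by '-' or '/'.
_DATE = re.compile(r"(\d{1,2})[-/](\d{1,2})[-/](\d{4})")


def parse_date(text):
    """Return a date if the text matches the assumed format, else None."""
    match = _DATE.fullmatch(text.strip())
    if match is None:
        return None
    month, day, year = (int(group) for group in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # For example, month 13 or day 32.
        return None


# Equivalent of the replace step: swap the placeholder date for the real one.
fixed = [(d.replace("9-9-9999", "4-4-2016"), action) for d, action in rows]

# Equivalent of the type change: every value should now validate as a date.
parsed = [parse_date(d) for d, _ in fixed]
print(parsed)
```

Note that `9-9-9999` is a syntactically valid date in this sketch, so a type check alone would not flag it. That is why the example corrects it with an explicit replacement before changing the type.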
Example - Use merge and settype to clean up numeric data that should be treated as other data types