This section provides information on improvements to the type system.
If you have upgraded from Release 3.0 or earlier to Release 3.1 or later, you should review this page, as some type-related behaviors have changed in the platform.
Where there are mismatches between inputs and the expected input data type, the following values are generated for the mismatches:
|Source data type||Output if mismatched|
|Primitive data types||null value, if mismatched|
|Datetime||null value, if mismatched|
|Other non-primitive data types||Converted to string values, if mismatched|
|String||Anything can be a String value.|
State values and custom data types are converted to string values, if they are mismatched.
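The mismatch rules above can be modeled in a few lines of Python. This is only a sketch of the behavior described in the table, not the platform's implementation: the function names, the `null_on_mismatch` flag, and the single date format checked by `is_datetime` are assumptions made for illustration.

```python
from datetime import datetime

def is_datetime(text):
    """Hypothetical validator: accepts only yyyy-MM-dd strings."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False

def resolve_mismatch(value, matches_type, null_on_mismatch):
    """Return the value if it matches its type; otherwise apply the table's rule."""
    if matches_type(value):
        return value
    # Primitive and Datetime types: mismatches become null (None).
    # Other non-primitive types: mismatches are converted to strings.
    return None if null_on_mismatch else str(value)

good_date = resolve_mismatch("2016-02-03", is_datetime, True)
bad_date = resolve_mismatch("not a date", is_datetime, True)
bad_map = resolve_mismatch(42, lambda v: isinstance(v, dict), False)
```

Here `bad_date` is `None`, while `bad_map` is the string `"42"`.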
The running environment has been augmented to use three-value logic for null values.
When values are compared, the result is true or false in most cases.
If a null value was compared to a null value in the running environment:
This change aligns the behavior of the running environment with that of SQL and Hadoop Pig.
Assume that the column nuller contains null values and that you have the following transform:
derive value:(nuller >= 0)
Prior to Release 3.1, the above transform generated a column of
In Release 3.1 and later, the transform generates a column of null values.
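As a rough illustration of the Release 3.1 behavior, the comparison can be modeled in Python with `None` standing in for a null value. The helper `ge` and the sample contents of `nuller` are assumptions for this sketch; the non-null entries are included only to show that ordinary comparisons are unaffected.

```python
def ge(left, right):
    """Three-valued >=: a null (None) operand makes the result null."""
    if left is None or right is None:
        return None
    return left >= right

nuller = [None, 5, -2, None]          # hypothetical column contents
value = [ge(v, 0) for v in nuller]    # models: derive value:(nuller >= 0)
print(value)  # [None, True, False, None]
```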
In the following example, a_null_expression always evaluates to a null value.
derive value: (a_null_expression ? 'a' : 'b')
In Release 3.0, this expression generated b for all inputs on the running environment and a null value on Hadoop Pig.
In Release 3.1 and later, this expression generates a null value for all inputs on both running environments.
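The difference between the two releases can be sketched as two Python functions. Both are hypothetical models written for this page, with `None` representing a null condition; they are not taken from the product.

```python
def ternary_3_0(condition, if_true, if_false):
    """Release 3.0 model: a null condition is treated like false."""
    return if_true if condition else if_false

def ternary_3_1(condition, if_true, if_false):
    """Release 3.1 model: a null condition yields a null result."""
    if condition is None:
        return None
    return if_true if condition else if_false

a_null_expression = None
old_result = ternary_3_0(a_null_expression, 'a', 'b')
new_result = ternary_3_1(a_null_expression, 'a', 'b')
print(old_result, new_result)  # b None
```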
Tip: Beginning in Release 3.1, you can use the
For example, you have the following dataset:
|You can't break this.|
|Not broken yet.|
You test each row for the presence of the string can't:
derive value: if(find(MyStringCol, 'can\'t',true,0) > -1, true, false) as:'MyFindResults'
The above transform results in the following:
|You can't break this.||true|
|Not broken yet.|
In this case, the value false is not written to the second row, since the find function returns a null value when the string is not found. This null value, in turn, nullifies the entire expression, resulting in a null value written in the new column.
You can use the following to locate the null values:
derive value:isnull(MyFindResults) as:'nullInMyFindResults'
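The null propagation in this example can be traced with a small Python model. The functions below are stand-ins written for illustration, assuming (as the text states) that find returns a null value when the string is absent; `None` represents null.

```python
def find(text, pattern):
    """Model of find: position of the pattern, or None (null) when absent."""
    position = text.find(pattern)
    return None if position < 0 else position

def my_find_results(text):
    """Models: if(find(MyStringCol, 'can\\'t', ...) > -1, true, false)."""
    position = find(text, "can't")
    if position is None:
        # The null operand nullifies the comparison and then the whole if().
        return None
    return position > -1

rows = ["You can't break this.", "Not broken yet."]
results = [my_find_results(r) for r in rows]
null_in_results = [r is None for r in results]   # models isnull(MyFindResults)
print(results)          # [True, None]
print(null_in_results)  # [False, True]
```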
NOTE: Upgraded recipes continue to function properly. However, if you edit the recipe step in an upgraded system, you are forced to fix the formatting issue before saving the change.
Before this release, you could create a transform like the following:
This transform generated a column of map values, like the following:
Beginning in this release, the above command is invalid, as the date values must be properly formatted prior to display. The following works:
This transform generates a column of Datetime values in the following format:
Before this release:
Prior release output:
derive value:dateformat(time(11,34,58), 'HH-mm-ss')
This release's output:
Beginning in this release, the unixtimeformat and dateformat functions require an AM/PM indicator (a) if the date formatting string uses a 12-hour time indicator (for example, hh).
Valid for earlier releases:
derive value: unixtimeformat(myDate, 'yyyy-MM-dd hh:mm:ss') as:'myUnixDate'
Valid for this release and later:
derive value: unixtimeformat(myDate, 'yyyy-MM-dd hh:mm:ss a') as:'myUnixDate'
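To find formatting strings that need this fix before upgrading, a simple check along the following lines may help. It is a sketch only: it assumes the pattern letters behave as in the examples on this page (lowercase h for 12-hour time, uppercase H for 24-hour time, a for AM/PM) and that literal text is wrapped in single quotes. The function name is hypothetical.

```python
import re

def needs_am_pm_indicator(fmt):
    """True when fmt uses a 12-hour field (h) without an AM/PM field (a)."""
    # Drop quoted literal text so letters inside '...' are not counted.
    pattern_only = re.sub(r"'[^']*'", "", fmt)
    return "h" in pattern_only and "a" not in pattern_only

old_style = needs_am_pm_indicator("yyyy-MM-dd hh:mm:ss")
new_style = needs_am_pm_indicator("yyyy-MM-dd hh:mm:ss a")
twenty_four_hour = needs_am_pm_indicator("HH-mm-ss")
print(old_style, new_style, twenty_four_hour)  # True False False
```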
These references in recipes fail to validate in this release or later and must be fixed.
If a formatting string is not a datetime format recognized by the platform, the output is generated as a string value.
This change was made to provide clarity to some ambiguous conditions.
Beginning in this release, the colon (:) is no longer supported as a delimiter for date values. It is still supported for time values.
|02:03:16||Recognized as a time value|
When data such as the above is imported, it may not be initially recognized by the platform as Datetime type.
To fix, you might apply the following transform:
replace col:myDateValue with:'-' on:`:` global:true
The new column values are more likely to be inferred as Datetime values. If not, you can choose the appropriate Datetime format from the data type drop-down for the column. See Data Grid Panel.
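The effect of replacing every colon with a dash can be checked outside the platform with a short Python sketch. The sample value and the yyyy-MM-dd layout are hypothetical, chosen only to show a colon-delimited date becoming parseable.

```python
from datetime import datetime

def colons_to_dashes(value):
    """Replace every colon with a dash, like a global replace on the column."""
    return value.replace(":", "-")

raw = "2016:02:03"                      # hypothetical colon-delimited date
fixed = colons_to_dashes(raw)
parsed = datetime.strptime(fixed, "%Y-%m-%d")
print(fixed)  # 2016-02-03
```

Apply this kind of replacement only to date columns, since colons remain valid delimiters in time values.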