In general terms, a null value is a definition that points to nothing. A container for a value, such as a row-column combination or a variable, exists, but the container points to no actual value.
NOTE: In the platform, null values are a subset of the missing values category. For technical reasons, however, the platform displays null values as missing values and treats the two as visually identical. Internally, they are understood to be different values.
ISMISSING returns true for null and missing values.
ISNULL returns true for a null value and false for a missing value. See below.
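The difference between the two checks can be illustrated outside the platform with a short Python sketch. It is only an analogy: it assumes a null is modeled as `None` and a non-null missing value as an empty string, and `is_null` and `is_missing` are hypothetical helpers, not platform functions.

```python
from typing import Optional


def is_null(value: Optional[str]) -> bool:
    """Return True only when the cell holds no value at all (assumed: None)."""
    return value is None


def is_missing(value: Optional[str]) -> bool:
    """Return True for nulls and for empty cells (assumed: empty string)."""
    return value is None or value == ""


# One null cell, one empty (missing) cell, and one populated cell.
cells = [None, "", "yes"]

null_flags = [is_null(c) for c in cells]
missing_flags = [is_missing(c) for c in cells]

print(null_flags)
print(missing_flags)
```

In this sketch, the null cell is flagged by both checks, while the empty cell is flagged only by the missing check, which matches the subset relationship described in the note above.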
For example, the following transform generates a column of null values, which are represented as missing values in the data quality bar.
derive type:single value:NULL()
NOTE: When a recipe containing a user-defined function is applied to text data, any null characters cause records to be truncated by the running environment during job execution. In these cases, please execute the job on Hadoop.
Null values are displayed with missing values in the Missing values category of the data quality bar (in black).
You can use the following transform to distinguish between null and missing values. This transform generates a new column of values, which are set to true if the value in isActive is a null value:
Tip: You can use this transform and a subsequent sorting step on the generated column to filter for null values.
derive type:single value:ISNULL(isActive)
On import, if a column has a high enough percentage of null values, the platform may retype the column as a String column, which may yield mismatched values in addition to the missing values that were imported from null values.
See Find Missing Data.
If needed, you can write a null value to a set of data. In the following example, all missing values in a column are replaced by nulls, using the set transform:
set col: My_Column value: NULL() row: ISMISSING([My_Column])
The above transform writes null values, but these values are converted to missing values on export.
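A similar loss of distinction on export can be seen in plain Python, shown here only as an analogy and not as the platform's export logic. The sketch assumes nulls are modeled as `None` and missing values as empty strings; the `id` column and the sample rows are hypothetical. Python's `csv` module writes `None` as an empty field, so nulls and empty values cannot be told apart after a round trip.

```python
import csv
import io

# Hypothetical sample data: an empty string stands in for a missing value.
rows = [
    {"id": "1", "My_Column": ""},
    {"id": "2", "My_Column": "abc"},
    {"id": "3", "My_Column": ""},
]

# Replace every missing (empty) value with a null, modeled here as None.
with_nulls = [
    {**row, "My_Column": None if row["My_Column"] == "" else row["My_Column"]}
    for row in rows
]

# Export to CSV text. The csv module writes None as an empty field.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "My_Column"], lineterminator="\n")
writer.writeheader()
writer.writerows(with_nulls)
exported = buffer.getvalue()

# Re-import: the former nulls come back as empty strings, not as None.
reimported = list(csv.DictReader(io.StringIO(exported)))
print(exported)
print([row["My_Column"] for row in reimported])
```

After the round trip, the values that were written as nulls read back as empty strings, which mirrors how nulls become missing values on export.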