Contents:
Missing values can be:
Empty values, such as an empty string:
''
- Null values, in which no data is present at all.
In the Designer Cloud® application, these different types of values are represented as simply missing values. You can use the visual tools in the Transformer page to standardize missing values to either empty or null values, depending on your downstream requirements.
Locate Missing Values
At the top of each column, you can see the column histogram:
Figure: Example column histogram
Color bar | Description |
---|---|
green | Valid values for the current data type of the column |
red | Invalid values for the current data type of the column |
black | Missing values could be empty or null. |
Manage Missing Values
Steps:
- In the Transformer page, click the black bar in the column histogram to select missing values in a column.
- A set of predictive suggestions is displayed in the right panel. Review the suggested transformations. You can click them to preview the results.
- For example, if the column is a primary key identifier for rows in the dataset, an empty or null value could be an indicator of bad data in the row. You may choose the suggestion to Delete rows.
- To make modifications to the suggested transformation before you add, click Edit.
To add the selected suggestion, click Add. The sample of the data in transformed.
Tip: You can also use the Recipe panel and add the required functions to manage the missing data function.
Manage Missing and Null Values
When you edit the above transformation, it should look like the following:
Transformation Name | Filter rows |
---|---|
Parameter: Condition | Custom formula |
Parameter: Type of formula | Custom single |
Parameter: Condition | ismissing([Primary_Website_or_URL]) |
Parameter: Action | Delete matching rows |
In the above, the operative test is contained in the Formula textbox. The formula tests if the value is missing:
ismissing([Primary_Website_or_URL])
However, a missing value may be treated differently from a null value, so we can add to this formula a test to see if the value is null:
or(ismissing([Primary_Website_or_URL]),isnull([Primary_Website_or_URL]))
The above test is a bit more complicated. It tests if the value is missing or is null. If either of those conditions is met, then the row is deleted.
Tip: Functions can be nested.
The edited transformation should look like the following:
Transformation Name | Filter rows |
---|---|
Parameter: Condition | Custom formula |
Parameter: Type of formula | Custom single |
Parameter: Condition | or(ismissing([Primary_Website_or_URL]),ISNULL([Primary_Website_or_URL])) |
Parameter: Action | Delete matching rows |
Learn More
Find Missing Data: | Dataprep by Trifacta | Designer Cloud | Designer Cloud Enterprise Edition |
Manage Null Values: | Dataprep by Trifacta | Designer Cloud | Designer Cloud Enterprise Edition |
This page has no comments.