D migrated |
---|
page | Find and Fix Bad Data |
---|
|
D migrated |
---|
page | Clean and Enhance Your Data |
---|
|
|
Excerpt |
---|
You might encounter problems with how data has been structured or formatted that you must fix before you provide the content to your target system. You can use the methods in this section to locate problems with the content or data typing of your data. |
...
In the
, it is very easy to identify where there are errors in your data. What is truly innovative is how you correct them:
- Identify missing or mismatched data by color-coded bars in column data.
- Select a bar.
- Suggestions are offered in a set of cards on the right panel.
- Click a suggestion, and immediately see the effects of the suggested transformation previewed in the data grid.
- If the transformation needs tweaking, you can edit the transformation as needed.
- If the transformation is not the correct one, click another suggestion.
- When satisfied, you add the transformation, and your sample of data is transformed.
Image Added
D caption |
---|
Select errors in your data, and review AI-driven suggestions for how to correct them. Make the change on the spot. |
By repeating this cycle of spotting, selecting, and correcting issues in your sampled data, you can address mismatched data, missing data, non-standard values, outlier values, and much more to improve the overall consistency and quality of your data.
Find bad data
In the Transformer page, above each column of data is a data quality bar and histogram.
The top bar is the data quality bar. The data quality bar segments the values found in the column into three color-coded bands:
Color bar | Description |
---|---|
green | Valid values for the current data type of the column |
red | Invalid values for the current data type of the column |
black | Missing values, which can be empty or null |
Image Added
D caption |
---|
Mismatched values in a column are indicated in red |
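The three bands of the data quality bar can be illustrated outside the product with a short sketch. The following Python is not the product's implementation; it assumes a Decimal-typed column and uses hypothetical helper names (`classify`, `quality_bar`) to count values as valid (green), invalid (red), or missing (black).

```python
from collections import Counter
from decimal import Decimal, InvalidOperation


def classify(value):
    """Return 'missing', 'valid', or 'invalid' for a Decimal-typed column.

    Assumption: None and blank strings count as missing, and any text that
    Python's Decimal cannot parse as a finite number counts as invalid.
    """
    if value is None or str(value).strip() == "":
        return "missing"
    try:
        number = Decimal(str(value).strip())
    except InvalidOperation:
        return "invalid"
    # Decimal accepts "NaN" and "Infinity"; treat those as invalid here.
    return "valid" if number.is_finite() else "invalid"


def quality_bar(values):
    """Count the values in each band of a hypothetical quality bar."""
    counts = Counter(classify(v) for v in values)
    return {band: counts.get(band, 0) for band in ("valid", "invalid", "missing")}


sample = ["12.50", "7", "", None, "N/A", "3.1e2"]
bar = quality_bar(sample)
print(bar)
```

In this sample, `"N/A"` falls in the invalid band, while the empty string and `None` fall in the missing band.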
Change column data type
In the image above, you can identify the data type of the column based on the icon to the left of the column name (POS_Sales). In this case, the data type is Decimal.
In some cases, invalid data can be fixed by simply changing the column data type. You can click the current data type indicator to review and select a more appropriate data type.
Tip |
---|
Tip: You can change the data type of the column by clicking the data type icon for the column. |
Tip |
---|
Tip: No value is invalid for the String data type. |
Image Added
D caption |
---|
Change column data type |
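To see why changing the data type can clear invalid values, it can help to count how many values would be invalid under each candidate type. The sketch below is illustrative only: the type names mirror the ones mentioned above, but the validation rules and the helper `invalid_counts` are assumptions, not the product's actual type checks.

```python
def is_integer(text):
    """Assumed rule: valid if Python's int() can parse the text."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def is_decimal(text):
    """Assumed rule: valid if Python's float() can parse the text."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# Hypothetical validators per data type. String accepts every value.
VALIDATORS = {
    "Integer": is_integer,
    "Decimal": is_decimal,
    "String": lambda text: True,
}


def invalid_counts(values):
    """Count invalid values per candidate type, ignoring missing values."""
    present = [v for v in values if v not in (None, "")]
    return {
        name: sum(1 for v in present if not check(v))
        for name, check in VALIDATORS.items()
    }


column = ["10", "12.5", "", "unknown", "8"]
counts = invalid_counts(column)
print(counts)
```

Here the column has two invalid values as Integer, one as Decimal, and none as String, which matches the tip that no value is invalid for the String data type.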
Find outlier data
You can explore the details of a column of data to review statistical metrics on the data and to locate outlier values. In the column menu, select Column Details.
Image Added
Tip |
---|
Tip: When you click or SHIFT-click these bars, the selected values are used to prompt suggestions for how to transform them. |
Tip |
---|
Tip: You can explore the patterns in the data in the Patterns tab, where you can also use these patterns to standardize the formatting of your data. |
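As a rough illustration of what locating outlier values can mean, the following sketch flags values that fall far outside the middle of the distribution. It uses the common interquartile range (IQR) rule, which is an assumption made for this example; the source does not state how Column Details identifies outliers, and `find_outliers` is a hypothetical helper.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule with k=1.5, a common convention and not
    necessarily the rule used by the product.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sales figures with one suspicious entry.
sales = [102.0, 98.5, 101.2, 99.9, 100.4, 97.8, 540.0]
outliers = find_outliers(sales)
print(outliers)
```

For this sample, only `540.0` falls outside the computed bounds.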
Fix Mismatched Values
When
evaluates a dataset sample, it interprets the values in a column against its expectations for the values. Based on the column's specified data type and internal pattern matching, values are categorized as valid, mismatched, or missing. These value categories are represented in a slender bar at the top of each column.
...