Data quality rules provide an automated way to identify inaccurate data and highlight exceptions, so that you can monitor and track data cleanliness over time. They enable you to continuously assess quality dimensions such as accuracy, completeness, uniqueness, and validity.
NOTE: Data quality rules are not transformation steps. They assess the current state of the sampled data, and you can use their results when constructing transformation steps to improve data quality.
In addition, you can use calculated metrics as a source of data quality inputs to create a metric-based data quality rule. For example, you can create a data quality rule where the average Price has to be greater than 50.00.
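The average Price example can be pictured outside the product as a minimal Python sketch. This is not how Designer Cloud implements the rule: the sample values and the function name `average_price_rule` are hypothetical and serve only to show a rule that tests a calculated metric rather than individual values.

```python
from statistics import mean

# Hypothetical sample of the Price column; the values are illustrative only.
prices = [42.50, 61.00, 55.25, 48.75]

def average_price_rule(values, threshold=50.00):
    """Return (passed, metric); the rule passes when the average exceeds the threshold."""
    metric = mean(values)
    return metric > threshold, metric

passed, avg_price = average_price_rule(prices)
print(f"average Price = {avg_price}, rule passed: {passed}")
```

Note that the rule yields a single pass or fail result for the whole column, because the test applies to the metric and not to each row.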
From the Transformer page, click the Data quality rules icon. If you have not created any rules, the panel is empty. To create a new rule, click Add rule.
Tip: To review a set of suggested data quality rules based on your dataset, click View suggestions. Designer Cloud can automatically suggest a series of rules to validate various data quality aspects.
Figure: Data quality rule suggestions
Tip: You can hover over the color bars to view the failed values and passed values. You can also select the Show only affected checkbox to view only the passed or failed columns.
Rule Categories
Data quality rules evaluate the values in one or more columns against test criteria that you define. Designer Cloud has a set of pre-defined data quality rule types. You can select the required rule type to monitor data quality during the import, transformation, and export of your datasets.
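As a rough illustration of evaluating column values against test criteria, the following Python sketch counts passing and failing rows for three simple checks. The sample rows, the column names `sku` and `qty`, and the helper `evaluate_rule` are assumptions made for this example and do not reflect the product's pre-defined rule types or internals.

```python
from collections import Counter

# Hypothetical sampled rows; None stands for a missing value.
rows = [
    {"sku": "A-100", "qty": 12},
    {"sku": "A-101", "qty": None},
    {"sku": "A-100", "qty": 7},
    {"sku": "", "qty": -3},
]

def evaluate_rule(rows, test):
    """Count the rows that pass and fail a boolean test."""
    passed = sum(1 for row in rows if test(row))
    return {"passed": passed, "failed": len(rows) - passed}

# Completeness: qty must be present.
completeness = evaluate_rule(rows, lambda r: r["qty"] is not None)

# Validity: qty, when present, must not be negative.
validity = evaluate_rule(rows, lambda r: r["qty"] is None or r["qty"] >= 0)

# Uniqueness: a row passes only if its sku occurs exactly once in the sample.
sku_counts = Counter(r["sku"] for r in rows)
uniqueness = evaluate_rule(rows, lambda r: sku_counts[r["sku"]] == 1)

print(completeness, validity, uniqueness)
```

The passed and failed counts correspond loosely to what the color bars in the panel summarize for each rule.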
Custom-based rules
You can create custom-based rules using formulas containing Wrangle functions.
Metric-based rules
You can use custom metrics to assess data quality. A calculated metric (derived metric) can serve as the data quality input type for a metric-based data quality rule. For example, you can create a constraint that the inventory quantity should be within a specific range.
Metric input types are supported for the following rules:
In Range
Greater Than
Less Than
Equals
Not Equals
In Set
Not In Set
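The comparisons listed above can be sketched in Python as a lookup from rule name to predicate. This is an illustration under assumptions, not the product's implementation: the names `METRIC_RULES` and `check_metric`, the inclusive bounds for In Range, and the inventory values are all hypothetical.

```python
# Hypothetical mapping from the rule names listed above to Python predicates.
METRIC_RULES = {
    # Assumption: both bounds are inclusive; the product may treat them differently.
    "In Range": lambda m, low, high: low <= m <= high,
    "Greater Than": lambda m, x: m > x,
    "Less Than": lambda m, x: m < x,
    "Equals": lambda m, x: m == x,
    "Not Equals": lambda m, x: m != x,
    "In Set": lambda m, allowed: m in allowed,
    "Not In Set": lambda m, excluded: m not in excluded,
}

def check_metric(rule, metric, *args):
    """Apply the named rule to a single calculated metric value."""
    return METRIC_RULES[rule](metric, *args)

# Example: the total inventory quantity should be within a specific range.
inventory = [120, 95, 143, 110]
total_quantity = sum(inventory)
in_range = check_metric("In Range", total_quantity, 400, 500)
print(f"total quantity = {total_quantity}, in range: {in_range}")
```

Keeping the predicates in a dictionary means each rule type takes the metric first and its own parameters afterwards, so the same call shape covers single-value, range, and set comparisons.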
Create Rules
For more information on creating rules, see Build Data Quality Rules.
Data Quality in Job Details
After you have successfully run your job, you can review the results of your data quality rules applied across the entire dataset in the Rules tab on the Job Details page.
NOTE: To display data quality results in your job details, visual profiling must be enabled for job execution.
Figure: Data quality job details
Learn More
Overview of Data Quality Rules: | Dataprep by Trifacta | Designer Cloud | Designer Cloud Enterprise Edition |
Data Quality Rules Reference: | Dataprep by Trifacta | Designer Cloud | Designer Cloud Enterprise Edition |