
Data Quality Rules Reference


This feature may not be available in all product editions. For more information on available features, see Compare Editions.

This section contains reference information on the data quality rule types and input types that are available in Dataprep by Trifacta.

  • Data quality rules can be applied to your dataset through the Transformer page.

  • Input types identify the calculated metric types that can be used as inputs for a data quality rule.

Rule Types




Column values must be unique.


Source column values imply the values of a target column. For each unique source value, there should be exactly one implied target value.
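
The dependency check described above can be sketched in a few lines of Python. This is an illustration only, not the product's implementation; the function name `dependency_violations` and the sample `zip` and `city` columns are hypothetical.

```python
from collections import defaultdict

def dependency_violations(rows, source, target):
    """Return each source value that implies more than one target value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[source]].add(row[target])
    # A source value passes only if it maps to exactly one target value.
    return {s: sorted(t) for s, t in seen.items() if len(t) > 1}

# Hypothetical sample data: zip should imply city.
rows = [
    {"zip": "94105", "city": "San Francisco"},
    {"zip": "94105", "city": "San Francisco"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
]
violations = dependency_violations(rows, "zip", "city")
print(violations)  # {'10001': ['New York', 'Newark']}
```

An empty result means every source value implies a single target value.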

Not Missing

Column values must not be missing. Null values and empty strings are not allowed.

Not Null

Column values must not be null. Empty strings are allowed.
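
The difference between Not Missing and Not Null comes down to how empty strings are treated. A minimal sketch, assuming nulls are represented as Python `None`; how the product treats whitespace-only strings is not covered here, and the helper names are hypothetical.

```python
def is_not_null(value):
    """Not Null: only null values fail; empty strings pass."""
    return value is not None

def is_not_missing(value):
    """Not Missing: null values and empty strings both fail."""
    return value is not None and value != ""

values = ["a", "", None]
not_null = [is_not_null(v) for v in values]        # [True, True, False]
not_missing = [is_not_missing(v) for v in values]  # [True, False, False]
```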


Column values must be valid instances of a data type.


Column values must match a pattern.

Not Match

Column values must not match a pattern.

Starts With

Column values must start with a pattern.

Ends With

Column values must end with a pattern.
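
The pattern rules above can be approximated with regular expressions. This sketch assumes the pattern is a Python regular expression and that a match means the whole value matches; the product's own pattern syntax and matching semantics may differ, and the helper names are hypothetical.

```python
import re

def matches(value, pattern):
    """The whole value must match the pattern (assumed semantics)."""
    return re.fullmatch(pattern, value) is not None

def not_matches(value, pattern):
    return not matches(value, pattern)

def starts_with(value, pattern):
    """The pattern must match at the beginning of the value."""
    return re.match(pattern, value) is not None

def ends_with(value, pattern):
    """The pattern must match at the very end of the value."""
    return re.search(rf"(?:{pattern})\Z", value) is not None

print(matches("INV-2024", r"INV-\d{4}"))    # True
print(matches("INV-2024-A", r"INV-\d{4}"))  # False
print(starts_with("INV-2024-A", r"INV-"))   # True
print(ends_with("report.csv", r"\.csv"))    # True
```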


Equals

Column values must equal a provided value.

Not Equal

Column values must not equal a provided value.

In Range

Column values must lie between provided minimum and maximum values.

Greater Than

Column values must be greater than a minimum value.

Less Than

Column values must be less than a maximum value.

In Set

Column values must be one of a set of acceptable values.

Not In Set

Column values must not be one of a set of unacceptable values.
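
The comparison and set rules above reduce to simple predicates. The sketch below is illustrative only; the source does not say whether In Range includes its bounds, so that choice is exposed as a hypothetical `inclusive` flag.

```python
def in_range(value, minimum, maximum, inclusive=True):
    """In Range: whether the bounds are included is an assumption here."""
    if inclusive:
        return minimum <= value <= maximum
    return minimum < value < maximum

def greater_than(value, minimum):
    return value > minimum

def less_than(value, maximum):
    return value < maximum

def in_set(value, acceptable):
    return value in acceptable

def not_in_set(value, unacceptable):
    return value not in unacceptable

ages = [17, 18, 42, 65, 66]
in_range_flags = [in_range(a, 18, 65) for a in ages]
print(in_range_flags)  # [False, True, True, True, False]
```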


Apply a custom data quality rule formula.

Metric Input Types

The following metric input types can be selected as the source of a data quality rule.


These input types are available for selection from the Column drop-down.

Metric input types are supported for the following rules:

  • In Range

  • Greater Than

  • Less Than

  • Equals

  • Not Equals

  • In Set

  • Not In Set
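
When a metric is used as the input, the rule is evaluated against a single computed value rather than against each row. A rough sketch of that idea, using a hypothetical `prices` column and hypothetical thresholds:

```python
from statistics import mean

prices = [9.5, 10.0, 10.5, 12.0]

# Metric inputs: one value computed over the whole column.
average_price = mean(prices)         # 10.5
distinct_prices = len(set(prices))   # 4

# The rule is then applied to the metric, not to individual rows.
average_in_range = 10 <= average_price <= 11
distinct_greater_than = distinct_prices > 3

print(average_in_range, distinct_greater_than)  # True True
```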




The average column value.

Count Distinct

The number of unique column values.


The maximum column value.


The minimum column value.


The sum of column values.

Standard Deviation

The sample standard deviation of column values.


The sample variance of column values.
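
Sample statistics divide by n - 1 rather than n, so they differ from the population versions on small datasets. The sketch below uses Python's standard `statistics` module to show the difference on hypothetical data; it does not reflect how the product computes these metrics internally.

```python
from statistics import pstdev, pvariance, stdev, variance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32

sample_var = variance(values)       # 32 / 7, divides by n - 1
sample_std = stdev(values)          # square root of the sample variance
population_var = pvariance(values)  # 32 / 8 = 4, divides by n
population_std = pstdev(values)     # 2.0

print(round(sample_var, 3), round(sample_std, 3))  # 4.571 2.138
```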


The number of rows.


The Pearson correlation coefficient between two columns.
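
The Pearson coefficient is the covariance of the two columns divided by the product of their standard deviations, and it ranges from -1 to 1. A minimal sketch with a hypothetical `pearson` helper; null handling and other product-specific behavior are not modeled.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (spread_x * spread_y)

r_positive = pearson([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly correlated
r_negative = pearson([1, 2, 3, 4], [8, 6, 4, 2])  # perfectly anti-correlated
```

Here `r_positive` is approximately 1 and `r_negative` is approximately -1.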


The distance from the mean, in units of standard deviations.
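
As a worked illustration, the value described above is (x - mean) / standard deviation. The sketch assumes the sample standard deviation and a signed result; whether the product uses the sample or population form, or reports an absolute distance, is not stated in this reference. The helper name is hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    """Signed distance of each value from the mean, in standard deviations."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (assumption)
    return [(v - mu) / sigma for v in values]

values = [2, 4, 4, 4, 5, 5, 7, 9]
scores = z_scores(values)
print(round(scores[0], 3), round(scores[-1], 3))  # -1.403 1.871
```

Values equal to the mean score 0. A rule such as Less Than could then flag values whose score exceeds a chosen threshold.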