This page describes the statistical information available for individual columns of data.

The sections below cover the general counts and statistics maintained for all data types, followed by breakdowns of statistics for each specific type of data.

General Column Counts

For any selection of values in a column, the following counts are generally available.

Count Name: Description
Valid Values: Count of values that are valid for the column's data type.
Unique Values: Count of unique values. Duplicates of a value are not counted again.
Outlier Values: Count of values that qualify as outliers. A value is an outlier if it is either:

  • Less than (25th percentile) - (2 * IQR)
  • Greater than (75th percentile) + (2 * IQR)

  The IQR (interquartile range) is the difference between the 75th and 25th percentiles, which spans the two middle quarters of the values. Because both thresholds are offset from the quartiles by a multiple of the IQR, outliers fall at the extremes of the entire range.
Mismatched Values: Count of values that do not conform to the column's data type. For example, an Integer column with a value of "MISSING" results in a mismatched value.
Missing Values: Count of values that are not populated.
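
The counts above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the function names, the treatment of empty strings and None as missing, the choice to count unique values among valid values only, and the quartile method (median of the lower and upper halves, leaving out the middle value when the count is odd) are all assumptions.

```python
import statistics

def quartiles(ordered):
    """Return (q1, q3) as medians of the lower and upper halves (assumed method)."""
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # For an odd count, the middle value is left out of both halves (assumption).
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return statistics.median(lower_half), statistics.median(upper_half)

def column_counts(raw_values):
    """Compute the general counts for a column treated as Integer."""
    # Assumption: None and blank strings are "not populated".
    missing = [v for v in raw_values if v is None or v.strip() == ""]
    populated = [v for v in raw_values if v is not None and v.strip() != ""]

    valid, mismatched = [], []
    for v in populated:
        try:
            valid.append(int(v))
        except ValueError:
            mismatched.append(v)  # e.g. "MISSING" in an Integer column

    outliers = []
    if len(valid) >= 2:  # quartiles need at least two values
        q1, q3 = quartiles(sorted(valid))
        iqr = q3 - q1
        low_fence = q1 - 2 * iqr
        high_fence = q3 + 2 * iqr
        outliers = [v for v in valid if v < low_fence or v > high_fence]

    return {
        "valid": len(valid),
        "unique": len(set(valid)),  # assumption: uniqueness among valid values
        "outlier": len(outliers),
        "mismatched": len(mismatched),
        "missing": len(missing),
    }

sample = ["1", "2", "2", "3", "4", "5", "100", "MISSING", "", None]
counts = column_counts(sample)
print(counts)
```

In this sample, the quartiles of the valid values are 2 and 5, so the IQR is 3 and the outlier thresholds are -4 and 11. Only the value 100 falls outside them.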

General Column Statistics

These statistics are available for most types of data through the Column Browser.

Statistic Name: Description
Minimum: Lowest value in the column.
Lower Quartile: The median of the lower half of values (25th percentile).
Median: The middle value of the selected set. For example, in a set of 21 values, the median value is the 11th value in ascending order.

  • For datasets with an even number of values, the median is the mean of the two middle values.
Upper Quartile: The median of the upper half of values (75th percentile).
Maximum: Highest value in the column.
Average: Mean of the values in the column.
Standard Deviation: The computed standard deviation for the selected values.
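
As a rough illustration of these statistics, the following Python sketch computes them for a list of numbers. It is not the product's implementation. The function name is hypothetical, the quartiles leave out the middle value when the count is odd, and the standard deviation is computed as a population standard deviation; this page does not specify either choice, so both are assumptions.

```python
import statistics

def column_statistics(values):
    """Compute the general statistics for a list of numeric values."""
    if len(values) < 2:
        raise ValueError("at least two values are needed for quartiles")

    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # For an odd count, the middle value is left out of both halves (assumption).
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]

    return {
        "minimum": ordered[0],
        "lower_quartile": statistics.median(lower_half),
        # statistics.median averages the two middle values for even counts.
        "median": statistics.median(ordered),
        "upper_quartile": statistics.median(upper_half),
        "maximum": ordered[-1],
        "average": statistics.fmean(ordered),
        # Population standard deviation is an assumption; use
        # statistics.stdev for the sample version instead.
        "standard_deviation": statistics.pstdev(ordered),
    }

# 21 values: the median is the 11th value in ascending order.
stats = column_statistics(list(range(1, 22)))
print(stats["median"])          # 11
print(stats["lower_quartile"])  # 5.5
print(stats["upper_quartile"])  # 16.5
```

For an even-sized set such as [1, 2, 3, 4], the same function returns a median of 2.5, the mean of the two middle values.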