Comment: Published by Scroll Versions from space DEV and version r079


In the Column Details panel, you can review additional details about a column of your dataset.

...

Select Column Details

...

 from any column menu or the Action menu in the column browser.


Tip: Use the Column Details panel to explore values in an individual column when the context of each value is not important for your current exploration. For example, you can identify outlier values for the column or compare the number of unique values to the number of rows to determine whether the column could serve as a key.
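The key-value check described in the tip can be reproduced outside the panel. The following is a minimal Python sketch, not the platform's implementation; the function name `could_be_key` and the sample columns are hypothetical and used only for illustration.

```python
def could_be_key(values):
    """Return True when no value is blank and no value repeats.

    Hypothetical helper: a column is treated as a key candidate only if
    its count of unique, non-blank values equals its row count.
    """
    non_blank = [v for v in values if v not in (None, "")]
    return len(non_blank) == len(values) and len(set(non_blank)) == len(values)


# Illustrative sample columns (made up for this sketch).
customer_ids = ["C001", "C002", "C003", "C004"]
states = ["CA", "NY", "CA", "TX"]

print(could_be_key(customer_ids))  # every value is unique
print(could_be_key(states))        # "CA" repeats
```

A column that passes this check is only a candidate: uniqueness in a sample does not guarantee uniqueness in the full dataset.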

...

In the Patterns tab, you can review patterns identified by the platform in the selected column's data and then create steps based on patterns that you select. Pattern profiling automatically finds and groups clusters of the column's values based on similarities in format and structure, such as differently formatted phone numbers, addresses, log entries, and name fields. For example, if some of your dataset's address values include apartment numbers, you can create a Split transformation based on a pattern that includes the apartment numbers.
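To illustrate the idea of grouping values by format and structure, here is a simplified Python sketch. It is an assumption-laden stand-in, not the platform's profiling algorithm or pattern syntax: the `<digits>` and `<letters>` placeholders and the function names `shape_of` and `group_by_shape` are hypothetical.

```python
import re
from collections import Counter

# Runs of ASCII digits or ASCII letters; everything else stays literal.
_RUN = re.compile(r"[0-9]+|[A-Za-z]+")


def shape_of(value):
    """Replace each digit run and letter run with a placeholder token."""
    def repl(match):
        return "<digits>" if match.group()[0].isdigit() else "<letters>"
    return _RUN.sub(repl, value)


def group_by_shape(values):
    """Count how many non-blank values share each shape."""
    return Counter(shape_of(v) for v in values if v)


# Illustrative phone numbers in different formats (made up for this sketch).
phones = ["555-123-4567", "(555) 123-4567", "555-987-6543", "555.123.4567"]
for shape, count in group_by_shape(phones).most_common():
    print(count, shape)
```

In this sketch, the two hyphen-separated numbers fall into one group, while the parenthesized and dot-separated formats each form their own group.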

...

NOTE: Wide columns, such as Arrays, Objects, or freeform text, might take a while to profile.

Tip: You can see data in the data grid while exploring patterns through the context panel. See Pattern Details Panel.

...

  • Each non-blank value in the column is represented by one of the displayed patterns. Patterns are specified as a combination of literal values and platform patterns. For more information on these patterns, see Text Matching.
  • Patterns might be more generalized than the constraints of the column's data type.
  • Token values are platform patterns without braces.

Figure: Column Details panel - Patterns tab

...

In the above example, all values that have been identified as matching the url pattern are contained in the first category.

  • Select a pattern to trigger a set of suggestion cards to apply to the represented data.
    • When you select values from a pattern's histogram, all suggestions match the pattern. You cannot select the values that do not match the pattern from the histogram.
    • For more information, see Explore Suggestions.
  • Select a token within a pattern or a highlighted block of text among the example values to trigger suggestion cards that apply the token within the pattern.
  • You can modify the selected suggestion in the Transform Builder. See Transform Builder.
    • When you apply the transformation to your recipe, the Patterns tab is updated automatically.

      Tip: When you see a pattern that you wish to reuse, select the pattern and one of its suggestion cards and then modify the step.

  • Expand the caret next to any pattern to explore its sub-patterns, which identify subsets of values within the broader pattern.

    NOTE: The Other pattern is a special category that contains the values and counts not recognized by the currently selected pattern or sub-pattern. For example, when you select the url pattern, the Other pattern captures the non-URL values. When you explore a sub-pattern of URLs, the Other category captures the values not recognized within the sub-pattern.
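The relationship between a selected pattern, a sub-pattern, and the Other category can be sketched as follows. This is a rough Python illustration under stated assumptions: the regular expressions `URL_LIKE` and `HTTPS_ONLY` and the function `split_by_pattern` are hypothetical and do not reflect how the platform recognizes URLs.

```python
import re

# Hypothetical stand-ins for a selected pattern and one of its sub-patterns.
URL_LIKE = re.compile(r"https?://\S+")
HTTPS_ONLY = re.compile(r"https://\S+")


def split_by_pattern(values, pattern):
    """Return (matched, other): values the pattern fully matches, and the rest."""
    matched = [v for v in values if pattern.fullmatch(v)]
    other = [v for v in values if not pattern.fullmatch(v)]
    return matched, other


# Illustrative column values (made up for this sketch).
values = ["http://example.com", "https://example.org/a", "n/a", "example"]

# Top level: Other holds the non-URL values.
urls, other = split_by_pattern(values, URL_LIKE)
print(len(urls), len(other))

# Sub-pattern level: Other holds the URLs that the sub-pattern does not recognize.
https_urls, other_urls = split_by_pattern(urls, HTTPS_ONLY)
print(len(https_urls), len(other_urls))
```

At each level, the matched and Other counts add up to the number of values being examined at that level, which mirrors the way the Other category changes as you move from a pattern to its sub-patterns.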

...

After patterns have been selected, they can be reused through the Transform Builder. See Pattern History Panel. Column patterns can also be reviewed in the context panel. See Pattern Details Panel.