Comment: Published by Scroll Versions from space DEV and version next

...

Excerpt

When you transform your data in the Transformer page, you are performing these transformations on a sample of the total dataset. As needed, you can generate new samples using a variety of algorithms to acquire other slices of your data.

The initial data sample is always collected from the first rows of the dataset. Whenever you create a recipe and open the dataset in the Transformer page, the web application automatically generates the initial sample by loading data according to the following rules:

  • By default, the initial sample is the first 10 MB of your dataset.
    • The size of the sample can be modified by an administrator.
    • For file-based sources, the initial sample is taken from a limited number of files. 
      • By default, this limit is set to 10 files.
      • The maximum number of files from which a sample can be generated can be defined by an administrator.
  • If your dataset is less than 10 MB, then the entire dataset may be loaded as the initial sample.
  • For datasets larger than 10 MB, the first 10 MB of rows are loaded into the Transformer page.
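
The limits above can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name initial_sample, the constants, the interpretation of 10 MB as 10 × 1024 × 1024 bytes, and the choice to stop at the last whole row that fits are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed defaults, mirroring the limits described above.
SAMPLE_BYTES = 10 * 1024 * 1024  # 10 MB (binary interpretation is an assumption)
MAX_FILES = 10                   # default file limit for file-based sources


def initial_sample(paths: Iterable[Path],
                   sample_bytes: int = SAMPLE_BYTES,
                   max_files: int = MAX_FILES) -> List[bytes]:
    """Hypothetical helper: read whole rows from the first files until the
    byte budget would be exceeded or the file limit is reached."""
    rows: List[bytes] = []
    used = 0
    for index, path in enumerate(paths):
        if index >= max_files:
            break  # never read past the configured number of files
        with open(path, "rb") as handle:
            for line in handle:
                if used + len(line) > sample_bytes:
                    return rows  # budget spent: keep only complete rows
                rows.append(line)
                used += len(line)
    return rows
```

With this sketch, a dataset smaller than the byte budget is returned in full, while a larger one is cut off at the last complete row that fits. Both limits are parameters because the text above notes that an administrator can change them.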
Tip: On the Transformer page, this first sample is listed as Initial Data. For more information on how this special sampling type is generated, see Overview of Sampling.

When to Take a New Sample

...

  • Advanced sampling options are available only with a full scan of the dataset.
  • Undo and redo operations do not change the sample state, even if the sample becomes invalid.
  • When a new sample is generated, sort transformations are not preserved for some types of outputs. Sort transformations must be reapplied. See Sort Rows.
  • When executed on the Photon running environment, samples taken from a dataset with parameters are limited to a maximum of 50 files.

Collect a New Sample

You can use the existing loaded sample, or you can collect a new sample to use.

...