
...

  • The default sample is the initial sample.
  • By default, each sample is 10 MB in size, or the entire dataset if the dataset is smaller than 10 MB.
  • If your data source is a directory containing multiple files, the initial sample for the combined dataset is generated from the first set of rows in the first file listed in the directory.
  • If you are wrangling a dataset with parameters, the initial sample that is loaded in the Transformer page is taken from the first matching dataset.
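
The default behavior described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name initial_sample, the use of alphabetical order to decide which file is "first listed", and the choice to keep only whole rows are all assumptions made for illustration.

```python
import os
from typing import List

# 10 MB default sample size, as stated in the list above.
SAMPLE_LIMIT_BYTES = 10 * 1024 * 1024


def initial_sample(directory: str, limit: int = SAMPLE_LIMIT_BYTES) -> List[str]:
    """Return the first rows of the first listed file, up to `limit` bytes.

    Hypothetical helper: assumes "first listed" means first in
    alphabetical order and that only complete rows are kept.
    """
    names = sorted(
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )
    if not names:
        return []

    rows: List[str] = []
    used = 0
    with open(os.path.join(directory, names[0]), "rb") as handle:
        for raw in handle:
            # Stop before the row that would push the sample past the limit.
            if used + len(raw) > limit:
                break
            rows.append(raw.decode("utf-8").rstrip("\r\n"))
            used += len(raw)
    return rows
```

If the first file is smaller than the limit, the loop reads it to the end, so the sample is the entire file. Later files in the directory are never opened, which is why the initial sample may not be representative of the combined dataset.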

...

NOTE: When a flow is shared, its samples are shared with other users. However, if those users do not have access to the underlying files that back a sample, they do not have access to the sample and must create their own.

Tip: If you have added an expensive transformation step, such as a complex union or join, you can improve performance of the Transformer page by generating and using a new sample.

For more information on creating samples, see Samples Panel.

...

  • Parameters: Subsequent samples generated from the Transformer page are sampled across all datasets matched by parameter values.
  • Variables: You can apply override values to the defaults for your dataset's variables at sample execution time. In this manner, you can draw your samples from specific source files within your dataset with parameters.
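
As a rough sketch of how a variable override narrows the files that a sample is drawn from, consider the Python example below. The path template, the variable name region, and the function resolve_sources are hypothetical and do not reflect the product's actual syntax or API; the sketch only shows the idea of defaults being replaced at sample execution time.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional


def resolve_sources(
    template: str,
    defaults: Dict[str, str],
    available: List[str],
    overrides: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return the paths matched after applying variable overrides.

    Hypothetical helper: `template` uses {name} placeholders for
    variables and shell-style wildcards for pattern parameters.
    """
    # Override values take precedence over the variable defaults.
    values = {**defaults, **(overrides or {})}
    pattern = template.format(**values)
    return [path for path in available if fnmatchcase(path, pattern)]


files = ["orders-east-01.csv", "orders-east-02.csv", "orders-west-01.csv"]
template = "orders-{region}-*.csv"
defaults = {"region": "east"}

# With the default value, the sample spans both "east" files.
default_sources = resolve_sources(template, defaults, files)

# With an override, the sample is drawn only from the "west" file.
override_sources = resolve_sources(template, defaults, files, {"region": "west"})
```

The wildcard portion of the template plays the role of a parameter, so a sample covers every matching file, while the override swaps the variable value for a single sampling run without changing the dataset's defaults.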

Choosing Samples

After you have collected multiple samples of different types from your dataset, you can choose the appropriate sample for your current task, based on:

...

  • Some advanced sampling options are available only with execution across a scan of the full dataset.
  • Undo and redo actions do not change the sample state, even if the sample becomes invalid.
  • When a new sample is generated, any Sort transformations that have been applied previously must be re-applied. Depending on the type of output, sort order may not be preserved. For more information, see Sort Transform.

  • Samples taken from a dataset with parameters are limited to a maximum of 50 files when executed on the Photon running environment. You can modify parameters as they apply to sampling jobs. See Samples Panel.

...