...
When you transform your data in the Transformer page, you are performing these transformations on a sample of the total dataset. As needed, you can generate new samples using a variety of algorithms to acquire other slices of your data.
The initial data sample is always collected from the initial rows of the dataset. Whenever you create a recipe and open the dataset in the Transformer page, this initial sample is loaded as follows:
- By default, the initial sample is the first 10 MB of your dataset.
- The size of the sample can be modified by an administrator.
- For file-based sources, the initial sample is taken from a limited number of files.
- By default, this limit is set to 10 files.
- The maximum number of files from which a sample can be generated can be defined by an administrator.
- If your dataset is smaller than 10 MB, then the entire dataset may be loaded as the initial sample.
- For datasets larger than 10 MB, the first 10 MB of rows are loaded into the Transformer page.
Tip: On the Transformer page, this first sample is listed as Initial Data. For more information on how this special sampling type is generated, see Overview of Sampling.
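The first-rows behavior described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the product's implementation: the function name `initial_sample`, its parameters, the use of in-memory rows instead of real files, and the reading of 10 MB as 10 * 1024 * 1024 bytes are all hypothetical choices made for the example.

```python
from typing import Iterable, List

# Assumed defaults, mirroring the limits described in the text.
DEFAULT_MAX_BYTES = 10 * 1024 * 1024  # "10 MB", interpreted here as 10 MiB
DEFAULT_MAX_FILES = 10


def initial_sample(
    files: Iterable[Iterable[bytes]],
    max_bytes: int = DEFAULT_MAX_BYTES,
    max_files: int = DEFAULT_MAX_FILES,
) -> List[bytes]:
    """Collect leading rows until the byte budget or the file limit is hit.

    Each element of `files` stands in for one source file and yields
    that file's rows in order, as bytes.
    """
    rows: List[bytes] = []
    used = 0
    for index, file_rows in enumerate(files):
        if index >= max_files:
            break  # file limit reached; later files are ignored
        for row in file_rows:
            if used + len(row) > max_bytes:
                return rows  # byte budget exhausted; keep whole rows only
            rows.append(row)
            used += len(row)
    return rows


# Three tiny "files" of 4-byte rows.
example_files = [[b"a,1\n", b"b,2\n"], [b"c,3\n"], [b"d,4\n"]]
small_sample = initial_sample(example_files, max_bytes=9, max_files=10)
```

With a 9-byte budget, `small_sample` holds only the first two rows, because adding a third 4-byte row would exceed the budget. A dataset smaller than the budget and within the file limit is returned whole, which matches the case where the entire dataset becomes the initial sample.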
When to Take a New Sample
...
- Advanced sampling options are available only with a full scan of the dataset.
- Undo/redo do not change the sample state, even if the sample becomes invalid.
- When a new sample is generated, sort transformations are not preserved for some types of outputs. Sort transformations must be reapplied. See Sort Rows.
- When executed on the Photon running environment, samples taken from a dataset with parameters are limited to a maximum of 50 files.
Collect a New Sample
You can use the existing loaded sample, or you can collect a new sample to use.
...