- on a specified set of rows (first rows)
- on a quick scan across the dataset
- on a full scan of the entire dataset
NOTE: When a flow is shared, its samples are shared with other users. However, if those users do not have access to the underlying files that back a sample, they do not have access to the sample and must create their own.
Tip: If you have added an expensive transformation step, such as a complex union or join, you can improve performance of the Transformer page by generating and using a new sample.
For more information on creating samples, see Samples Panel.
- Some advanced sampling options are available only with execution across a scan of the full dataset.
- Undo and redo actions do not change the sample state, even if the sample becomes invalid.
When a new sample is generated, any Sort transformations that have been applied previously must be re-applied. Depending on the type of output, sort order may not be preserved.
Samples taken from a dataset with parameters are limited to a maximum of 50 files when executed on the
running environment. You can modify parameters as they apply to sampling jobs. See Samples Panel.
NOTE: The First rows sampling technique requires the
Random selection of a subset of rows in the dataset. These samples are comparatively fast to generate. You can apply quick scan or full scan to determine the scope of the sample.
Find specific values in one or more columns. For the matching set of values, a random sample is generated.
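The random and filter-based techniques described above can be illustrated with a short Python sketch. This is not the product's implementation. It assumes rows arrive as an iterable stream, uses reservoir sampling as a stand-in for "random selection across a scan", and all names (`reservoir_sample`, `filter_based_sample`, the `state` column) are hypothetical.

```python
import random
from typing import Callable, Iterable, List, Optional, TypeVar

Row = TypeVar("Row")


def reservoir_sample(rows: Iterable[Row], k: int,
                     seed: Optional[int] = None) -> List[Row]:
    """Pick up to k rows uniformly at random in a single pass.

    Hypothetical helper: the scan scope (quick or full) is modeled
    simply by how many rows the caller passes in.
    """
    rng = random.Random(seed)
    sample: List[Row] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = row
    return sample


def filter_based_sample(rows: Iterable[Row],
                        predicate: Callable[[Row], bool],
                        k: int,
                        seed: Optional[int] = None) -> List[Row]:
    """Keep only rows matching the predicate, then sample them randomly."""
    matching = (row for row in rows if predicate(row))
    return reservoir_sample(matching, k, seed)


# Illustrative data: 1000 rows, one in four with state == "CA".
rows = [{"id": i, "state": "CA" if i % 4 == 0 else "NY"} for i in range(1000)]

random_rows = reservoir_sample(rows, k=10, seed=42)
ca_rows = filter_based_sample(rows, lambda r: r["state"] == "CA", k=10, seed=42)
```

In this sketch, passing only a leading slice of the data approximates a narrower scan, while passing the whole dataset approximates a full scan. The filter-based variant returns only rows that match the condition, so its sample may be smaller than `k` if few rows match.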