- The default sample is the initial sample.
- By default, each sample is 10 MB in size, or the entire dataset if the dataset is smaller than 10 MB.
- If your source of data is a directory containing multiple files, the initial sample for the combined dataset is generated from the first set of rows in the first file listed in the directory.
If you are wrangling a dataset with parameters, the initial sample that is loaded in the Transformer page is taken from the first matching dataset.
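The default rules above can be illustrated with a minimal sketch. This is not the product's implementation. The function names are hypothetical, "10 MB" is assumed to mean 10 × 1024 × 1024 bytes, and the directory listing is assumed to arrive already in the order the application shows it.

```python
# Illustrative only: hypothetical helpers, not part of any product API.

# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes.
DEFAULT_SAMPLE_LIMIT_BYTES = 10 * 1024 * 1024


def default_sample_size(dataset_bytes, limit_bytes=DEFAULT_SAMPLE_LIMIT_BYTES):
    """Return the sample size: the limit, or the whole dataset if it is smaller."""
    if dataset_bytes < 0:
        raise ValueError("dataset_bytes must be non-negative")
    return min(dataset_bytes, limit_bytes)


def initial_sample_source(listed_files):
    """Return the file the initial sample is drawn from: the first one listed.

    Assumption: listed_files is already in the order the directory is listed.
    """
    if not listed_files:
        raise ValueError("directory listing is empty")
    return listed_files[0]


if __name__ == "__main__":
    # A 3 MB dataset is smaller than the limit, so it is sampled in full.
    print(default_sample_size(3_000_000))
    # A 50 MiB dataset is capped at the default limit.
    print(default_sample_size(50 * 1024 * 1024))
    # The initial sample comes from the first listed file.
    print(initial_sample_source(["orders-01.csv", "orders-02.csv"]))
```

For a dataset with parameters, the same first-item rule applies to the first matching dataset rather than the first file.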
NOTE: When a flow is shared, its samples are shared with other users. However, if those users do not have access to the underlying files that back a sample, they do not have access to the sample and must create their own.
Tip: If you have added an expensive transformation step, such as a complex union or join, you can improve performance of the Transformer page by generating and using a new sample.
For more information on creating samples, see Samples Panel.
- Parameters: Subsequent samples generated from the Transformer page are sampled across all datasets matched by parameter values.
- Variables: You can apply override values to the defaults for your dataset's variables at sample execution time. In this manner, you can draw your samples from specific source files within your dataset with parameters.
After you have collected multiple samples of multiple types on your dataset, you can choose the proper sample to use for your current task, based on: