In the example above, you can see that the current sample is a Random sample.
Initial Data: The sample is taken from the first set of rows in the first file or table that is part of the dataset.
In some cases, the Initial Data sample is the entire dataset.
Tip: So that data loads quickly, the initial data sample is generated and displayed first. For a better representation of the entire dataset, you should create a new sample.
- In other cases, the Initial Data sample is generated from a collection of files.
- If the recipe is a child recipe, then the Initial Data sample indicates the selected sample of the parent recipe.
- For more information on this special sampling type, see Overview of Sampling.
To create a new sample, click Collect a new sample.
To review all samples that you have created, see Sample Jobs Page.
At the top of the panel, you can review the currently loaded sample. Each user has their own active sample on a dataset.
NOTE: When a new sample is generated, any Sort transformations that have been applied previously must be re-applied. Depending on the type of output, sort order may not be preserved.
Initial Data: By default, the application loads the first N rows of the dataset as the initial data sample when the Transformer page is opened. The number of rows depends on column count, data density, and other factors. If the dataset is small enough, the full dataset is used.
NOTE: By default, samples may be up to 10 MB in size or may be limited based on the maximum number of files that can be scanned. For datasets smaller than this limit, the entire dataset is loaded. See Overview of Sampling.
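The size rule in the note above can be illustrated with a minimal sketch. This is not the application's implementation: the function name `initial_data_sample`, the line-based input, and the byte-counting approach are all assumptions made for illustration, and the sketch ignores the file-count limit. Only the 10 MB default comes from the note.

```python
from typing import Iterable, List, Tuple

# 10 MB default size limit, as stated in the note above.
DEFAULT_LIMIT_BYTES = 10 * 1024 * 1024


def initial_data_sample(
    lines: Iterable[str], limit_bytes: int = DEFAULT_LIMIT_BYTES
) -> Tuple[List[str], bool]:
    """Hypothetical helper: take rows from the start of the data until the
    size limit would be exceeded.

    Returns the sampled rows and a flag that is True when the whole
    dataset fit inside the limit.
    """
    sample: List[str] = []
    used = 0
    for line in lines:
        size = len(line.encode("utf-8"))
        if used + size > limit_bytes:
            # The limit is reached: the sample is only the first rows.
            return sample, False
        sample.append(line)
        used += size
    # Every row fit, so the sample is the entire dataset.
    return sample, True
```

With a small input, the flag comes back True, which corresponds to the case where the entire dataset is loaded as the initial sample.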
Click the link in the current sample card to see the list of all available samples.
Tip: To change the name of a sample, click its card in the list of all available samples. Then, click the Edit icon.
Below the current sample, you can review the available options for creating new samples. Each type of sample reflects a different method of collection.
To collect a new sample, click the appropriate sample card. See below.
Tip: A sample execution is a type of job. Any issues related to the execution of a sample job can be reviewed through the job logs.
- To cancel a sample collection, click the X next to the progress bar. The interrupted sample is listed as unavailable. You can download the logs from the unfinished sample collection.
When a new sample is collected, it is gathered based on the current location in the recipe at the time of collection. So, if the recipe contains steps that join in other datasets, those joins are performed to bring together the data from which the sample is collected.
Collect new sample panel
NOTE: Except for the initial data sample, all samples are generated based on the steps leading up to the location of the cursor in the recipe. If earlier steps are deleted or modified, the collected sample can be invalidated.
- Anomaly type: (Anomaly-based) Select the type of anomalous values to include in your sample: invalid, missing, or both types.
- Variable overrides: If one or more variables are associated with your dataset, you can define the value overrides to be applied when the sample is executed.
- You can use these overrides to sample data from different source files in your dataset with parameters.
- A variable can have an empty value.
- For more information, see Overview of Parameterization.
- To begin collecting the sample, click Collect.
- You can continue working while the sample is collected. When the sample is available, a status message is displayed in the Transformer page.
- You can click Load Sample in the Samples panel to begin using it.
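As a rough illustration of how variable overrides can point a sample at different source files, the sketch below resolves a parameterized path from default values plus overrides. The function `resolve_source_path`, the `{name}` template syntax, and the example variable names are hypothetical and are not taken from the product; they only mirror the behavior described in the list above, including the fact that a variable can have an empty value.

```python
from typing import Dict, Optional


def resolve_source_path(
    template: str,
    defaults: Dict[str, str],
    overrides: Optional[Dict[str, str]] = None,
) -> str:
    """Hypothetical helper: fill a parameterized path with variable values.

    Overrides replace the default value of a variable for this one
    sample run only; the defaults themselves are left unchanged.
    """
    values = dict(defaults)          # copy so the defaults are not mutated
    values.update(overrides or {})   # an override may be an empty string
    return template.format(**values)


# Assumed example values, for illustration only.
template = "sales/{region}/orders{suffix}.csv"
defaults = {"region": "us", "suffix": "-2023"}

default_path = resolve_source_path(template, defaults)
eu_path = resolve_source_path(template, defaults, {"region": "eu"})
no_suffix_path = resolve_source_path(template, defaults, {"suffix": ""})
```

Here `eu_path` would read from a different source file than `default_path`, which is the effect an override has when the sample is executed.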
To use one of the available samples, select its card and click Load. The sample is loaded in the data grid. For more information, see Generate a Sample.
NOTE: If you add recipe steps that change the number of rows in your dataset (or a few other edge case steps), some of your existing samples may no longer be valid. When you execute a join, union, or delete action or edit steps before this action, you may be prompted with the Change Recipe dialog, which includes the following message:
Your change will invalidate some of the currently available samples for this source. The invalid samples will be deactivated.
For more information on the types of transformations that can invalidate samples, see Reshaping Steps.
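The relationship between recipe position and sample validity can be sketched as follows. This is a simplified model under stated assumptions, not the application's logic: the `Sample` class, the `steps_applied` field, and the rule that any change to an earlier step invalidates a sample are all hypothetical. In the application, only certain step types (such as those that change the number of rows) trigger invalidation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    name: str
    steps_applied: int        # recipe steps executed before the sample was collected
    is_initial: bool = False  # the initial data sample does not depend on recipe steps


def samples_invalidated_by(samples: List[Sample], changed_step: int) -> List[str]:
    """Hypothetical helper: names of samples that depend on a changed step.

    changed_step is the 0-based index of the step that was edited,
    inserted, or deleted. A sample depends on it only when the step sits
    before the recipe location where the sample was collected.
    """
    return [
        s.name
        for s in samples
        if not s.is_initial and changed_step < s.steps_applied
    ]


# Assumed example data, for illustration only.
samples = [
    Sample("Initial Data", 0, is_initial=True),
    Sample("Random A", 3),
    Sample("Filter-based B", 6),
]
```

In this model, changing the step at index 4 affects only the sample collected after six steps, while changing the step at index 1 affects both collected samples.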
To cancel a sample job, locate the in-progress job in the Samples panel and click the X.
- You can also review and cancel sample jobs through a page in the application. For more information, see Sample Jobs Page.