Comment: Published by Scroll Versions from space DEV and version next

...

  • In some cases, the Initial Data sample is the entire dataset. 

    Data from the rest of the first file or table or from other files or tables is not included in the data grid.

    Tip: To load the data quickly, the initial data sample is generated and displayed first. For a better representation of the entire dataset, create a new sample.

  • In other cases, the Initial Data sample is generated from a collection of files. For more information on this special sampling type, see Overview of Sampling.

To create a new sample, click Collect a new sample.

...

To review all samples that you have created, see Sample Jobs Page.

Figure: Samples Panel

...

  • Initial Data: By default, the application loads the first N rows of the dataset as the initial data sample when the Transformer page is opened. The number of rows depends on column count, data density, and other factors. If the dataset is small enough, the full dataset is used. 

    NOTE: By default, a sample may be up to 10 MB in size, or it may be limited by the maximum number of files that can be scanned. For datasets smaller than this limit, the entire dataset is loaded. See Overview of Sampling.

  • Click the link in the current sample card to see the list of all available samples.

    Tip: To change the name of a sample, click its card in the list of all available samples. Then, click the Edit icon.
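
The Initial Data behavior described above (load the leading rows until a size cap is reached, or the whole dataset if it is small enough) can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `initial_sample` and the sample rows are hypothetical, and the sketch models only the 10 MB size cap, not the column-count, data-density, or file-count factors.

```python
def initial_sample(rows, max_bytes=10 * 1024 * 1024):
    """Return the leading rows whose combined UTF-8 size fits in max_bytes.

    Assumption: each row is a text line; the 10 MB default mirrors the
    documented default sample size limit.
    """
    sample = []
    used = 0
    for row in rows:
        size = len(row.encode("utf-8"))
        if used + size > max_bytes:
            break  # cap reached: later rows are left out of the sample
        sample.append(row)
        used += size
    return sample


# Hypothetical four-line dataset
rows = ["id,name\n", "1,alpha\n", "2,beta\n", "3,gamma\n"]
small = initial_sample(rows)                 # dataset is under the cap: all rows
capped = initial_sample(rows, max_bytes=20)  # tiny cap: only the leading rows
```

With the default cap, `small` holds every row, which corresponds to the case where the full dataset is used. With the artificially small cap, `capped` holds only the rows from the start of the data, which is why a newly collected sample can represent the dataset better.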

...

  • To collect a new sample, click the appropriate sample card. See below.

    Tip: A sample execution is a type of job. You can review any issues with the execution of a sampling job through the job logs.


  • After a sample is created, you can load it at any time, as long as it is still valid. Next to a collected sample, click Load sample.
  • For more information on sampling methods, see Overview of Sampling.

...

When a new sample is collected, it is generated from the current location in the recipe. So, if the recipe contains steps that join in other datasets, those joins are performed to bring together the data from which the sample is drawn.

Figure: Collect new sample panel
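
To illustrate why the recipe location matters, the sketch below runs the recipe steps up to the current location and only then draws the sample, so a join that precedes that location is already reflected in the sampled rows. All names here (`run_recipe`, `collect_sample`, the step functions, and the data) are hypothetical, and random selection stands in for whichever sampling method is chosen.

```python
import random


def run_recipe(rows, steps, upto):
    """Apply the first `upto` recipe steps to the rows."""
    for step in steps[:upto]:
        rows = step(rows)
    return rows


def collect_sample(rows, steps, current_step, size, seed=0):
    """Sample from the data as it exists at the current recipe location."""
    prepared = run_recipe(rows, steps, current_step)
    if len(prepared) <= size:
        return list(prepared)
    return random.Random(seed).sample(prepared, size)


# Hypothetical source data and lookup table for a join step
orders = [
    {"order": 1, "cust": "a"},
    {"order": 2, "cust": "b"},
    {"order": 3, "cust": "a"},
]
customers = {"a": "Ada", "b": "Bob"}


def join_customers(rows):
    # Join step: add the customer name to each order row
    return [dict(r, name=customers[r["cust"]]) for r in rows]


def keep_ada(rows):
    # Later step: filter rows; not applied when sampling at step 1
    return [r for r in rows if r["name"] == "Ada"]


steps = [join_customers, keep_ada]
# Current location is just after the join, before the filter
sample = collect_sample(orders, steps, current_step=1, size=2)
```

Each row in `sample` carries the joined `name` column, but the later filter step has not been applied, because it lies beyond the location at which the sample was collected.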

...

  • Anomaly type: (Anomaly-based) Select the type of anomalous values to include in your sample: invalid, missing, or both types.
  • Variable overrides: If one or more variables are associated with your dataset, you can define the value overrides to be applied when the sample is executed.
    • You can use these overrides to sample data from different source files in your dataset with parameters. 
    • A variable can have an empty value.
    • For more information, see Overview of Parameterization.
  • To begin collecting the sample, click Collect.
  • You can continue working while the sample is collected. When the sample is available, a status message is displayed in the Transformer page.
  • You can click Load Sample in the Samples panel to begin using it.
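
The effect of variable overrides on a dataset with parameters can be pictured as filling a path template: default values are used unless an override is supplied for the sample, and an override may be empty. The sketch below is only an illustration under that assumption. The function `resolve_source`, the path template, and the variable names are hypothetical and do not reflect the application's parameter syntax.

```python
def resolve_source(template, defaults, overrides=None):
    """Fill a path template, letting per-sample overrides replace defaults."""
    values = dict(defaults)
    values.update(overrides or {})  # an override may be an empty string
    return template.format(**values)


# Hypothetical parameterized source path with two variables
template = "sales/{region}/orders{suffix}.csv"
defaults = {"region": "us", "suffix": "_2023"}

default_path = resolve_source(template, defaults)
override_path = resolve_source(template, defaults, {"region": "eu", "suffix": ""})
```

Here `default_path` points at the default source file, while `override_path` points the sample at a different file, with the empty `suffix` value simply dropping that part of the path.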

...

To use one of the available samples, click Load. The sample is loaded in the data grid. For more information, see Generate a Sample.

NOTE: If you add recipe steps that change the number of rows in your dataset (or, in a few edge cases, other kinds of steps), some of your existing samples may no longer be valid. When you add a join, union, or delete step, or edit steps that precede one, you may be prompted with the Change Recipe dialog, which includes the following message:

Your change will invalidate some of the currently available samples for this source. The invalid samples will be deactivated.

For more information on the types of transformations that can invalidate samples, see Reshaping Steps.
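
As a rough mental model of sample invalidation, the sketch below deactivates any sample that was collected later in the recipe than a newly added row-changing step. The exact rule the application applies is not stated here, so the rule, the `Sample` class, the `ROW_CHANGING` set, and the function name are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumption: step types treated as changing the row count
ROW_CHANGING = {"join", "union", "delete"}


@dataclass
class Sample:
    name: str
    step_index: int    # recipe location at which the sample was collected
    valid: bool = True


def apply_recipe_change(samples, step_type, position):
    """Deactivate samples collected after a row-changing edit at `position`.

    Returns the names of the samples that were invalidated.
    """
    if step_type not in ROW_CHANGING:
        return []
    invalidated = []
    for s in samples:
        if s.valid and position < s.step_index:
            s.valid = False
            invalidated.append(s.name)
    return invalidated


# Hypothetical samples collected at different recipe locations
samples = [Sample("first-rows", 0), Sample("random-1", 3)]
invalidated = apply_recipe_change(samples, "join", position=1)
unaffected = apply_recipe_change(samples, "rename", position=0)
```

In this model, the sample collected at the start of the recipe stays valid, the sample collected after the inserted join is deactivated, and a step that does not change the row count leaves all samples untouched.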

