...

  • The default sample is the initial sample.
  • If your source of data is a directory containing multiple files, the initial sample for the combined dataset is generated from the first set of rows in the first filename listed in the directory.
    • For performance reasons, a parameter limits the maximum number of files in a directory that can be read for the initial sample.

...

    • If you are wrangling a dataset with parameters, the initial sample loaded in the Transformer page is taken from the first matching dataset.

  • If the matching file is a multi-sheet Excel file, the sample is taken from the first sheet in the file.

  • By default, each initial sample is one of the following:
    • 10 MB in size
    • Limited by the maximum number of files
    • The entire dataset, if it is smaller than 10 MB
  • If the source data is larger than 10 MB in size, a random sample is automatically generated for you when the recipe is first loaded in the Transformer page.
    • The initial sample is selected by default. When the automatic random sample has finished generating, you can manually select it for display.

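The selection rules above can be illustrated with a minimal sketch. This is not the product's implementation: the function name `initial_sample`, the constant `SAMPLE_LIMIT_BYTES`, the reading of "10 MB" as 10 × 1024 × 1024 bytes, and the use of sorted order for "first filename listed" are all assumptions made for illustration.

```python
import os
import tempfile

# Assumption: "10 MB" is read here as 10 * 1024 * 1024 bytes.
SAMPLE_LIMIT_BYTES = 10 * 1024 * 1024


def initial_sample(path, limit=SAMPLE_LIMIT_BYTES):
    """Return the leading rows of a source, capped at `limit` bytes.

    Hypothetical helper: if `path` is a directory, only the first file
    (in sorted order, an assumption) contributes rows.
    """
    if os.path.isdir(path):
        names = sorted(
            n for n in os.listdir(path)
            if os.path.isfile(os.path.join(path, n))
        )
        if not names:
            return []
        path = os.path.join(path, names[0])

    rows, used = [], 0
    with open(path, "rb") as handle:
        for line in handle:
            # Stop before the row that would push the sample past the cap.
            if used + len(line) > limit:
                break
            rows.append(line)
            used += len(line)
    return rows


if __name__ == "__main__":
    # Small demonstration with a temporary directory of two files.
    demo_dir = tempfile.mkdtemp()
    with open(os.path.join(demo_dir, "a.csv"), "wb") as f:
        f.write(b"id\n1\n2\n")
    with open(os.path.join(demo_dir, "b.csv"), "wb") as f:
        f.write(b"id\n3\n")
    print(len(initial_sample(demo_dir)), "rows taken from the first file")
```

When the source is smaller than the cap, the loop reads to the end of the file, so the sample is the entire dataset; otherwise it stops at the last whole row that fits.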
Generating samples

Additional samples can be generated from the context panel on the right side of the Transformer page. Sample jobs are independent job executions. When a sample job succeeds or fails, a notification is displayed for you.

...