Comment: Published by Scroll Versions from space DEV and version next

D toc

Excerpt

Through the Import Data page, you can upload datasets or select datasets from sources that are stored on connected datastores. From the Library page, click Import Data.


Figure: Import Data page

...

General Limitations

...

Info

NOTE: For file-based sources,

D s product
rtrue
expects that each row of data in the import file is terminated with a consistent newline character, including the last one in the file.

  • For single files lacking this final newline character, the final record may be dropped.

  • For multi-file imports lacking a newline in the final record of a file, this final record may be merged with the first one in the next file and then dropped depending on your running environment.
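The newline requirement above can be checked before a file is imported. The following is a minimal sketch, not part of the product: the function names `newline_report` and `ensure_trailing_newline` are hypothetical, and the sketch assumes the file is small enough to read into memory.

```python
from pathlib import Path


def newline_report(path):
    """Summarize the line terminators in a file and whether the last row is terminated."""
    data = Path(path).read_bytes()
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # bare LF only
    cr = data.count(b"\r") - crlf   # bare CR only
    styles = [name for name, n in (("CRLF", crlf), ("LF", lf), ("CR", cr)) if n]
    return {
        "styles": styles,
        "consistent": len(styles) <= 1,
        "ends_with_newline": data.endswith((b"\n", b"\r")),
    }


def ensure_trailing_newline(path):
    """Append a terminator to the last row if it is missing; return True if the file changed."""
    report = newline_report(path)
    if Path(path).stat().st_size == 0 or report["ends_with_newline"]:
        return False
    # Reuse the first terminator style found in the file; default to LF.
    style = report["styles"][0] if report["styles"] else "LF"
    terminator = {"CRLF": b"\r\n", "CR": b"\r"}.get(style, b"\n")
    with open(path, "ab") as handle:
        handle.write(terminator)
    return True
```

Running `ensure_trailing_newline` on each file of a multi-file import before uploading is one way to avoid the merged or dropped final record described above.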

 

...

  • in the
    D s photon
    running environment.



D s minrows

File and path limitations:

  • The colon character (:) cannot appear in a filename or a file path.
  • Filenames cannot begin with special characters such as a dot (.) or an underscore (_).
  • Input file or table paths can have a maximum length of 1024 characters. 
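The file and path limitations above can be sketched as a pre-import check. This is an illustration only: `path_problems` is a hypothetical helper, it assumes forward-slash paths, and it tests only the two leading characters named above, although other special characters may also be rejected.

```python
import posixpath

MAX_PATH_LENGTH = 1024            # maximum path length stated above
FORBIDDEN_PREFIXES = (".", "_")   # only the two examples named above


def path_problems(path):
    """Return a list of the limitations above that the given path violates."""
    problems = []
    if ":" in path:
        problems.append("contains a colon")
    filename = posixpath.basename(path.rstrip("/"))
    if filename.startswith(FORBIDDEN_PREFIXES):
        problems.append("filename begins with a dot or underscore")
    if len(path) > MAX_PATH_LENGTH:
        problems.append("path is longer than 1024 characters")
    return problems
```

An empty list means that none of the three limitations was detected; it does not guarantee that the import succeeds.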

Basic Workflow

1. Connect to sources

During import, the

D s webapp
identifies file formats based on the extension of the filename.

  • Compressed files are recognized and can be imported based on their file extensions.

...

  • Filenames that do not have an extension are treated as TXT files.
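Extension-based detection of this kind can be sketched as follows. This is not the application's implementation: `describe_file` is a hypothetical function, and the set of compression extensions is an assumption made for illustration. See Supported File Formats for the formats that are actually recognized.

```python
import posixpath

# Hypothetical list for illustration; not the product's actual list.
COMPRESSION_EXTENSIONS = {"gz", "bz2"}


def describe_file(filename):
    """Return (format, compression) inferred from the filename's extensions."""
    name = posixpath.basename(filename)
    extensions = [part.lower() for part in name.split(".")[1:] if part]
    compression = None
    if extensions and extensions[-1] in COMPRESSION_EXTENSIONS:
        compression = extensions.pop()
    # A filename without an extension is treated as a TXT file.
    file_format = extensions[-1].upper() if extensions else "TXT"
    return file_format, compression
```

For example, `describe_file("sales.csv.gz")` returns `("CSV", "gz")`, and `describe_file("README")` returns `("TXT", None)`.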

Upload:

...

D s product
rtrue

...

can also load files from your local file system.

Tip

Tip: You can drag and drop files from your desktop to upload them.


Info

NOTE: You can upload a file up to 1 GB in size.


Info

NOTE: When you upload an updated version of a previously uploaded file, the new file is stored as a separate upload altogether. In your flow, you must swap out the old dataset to point to the new one.








For more information on the supported input formats, see Supported File Formats.

2. Add datasets

...

When you have found your source directory or file:

  • You can hover over the name of a file to preview its contents.

    Info

    NOTE: Preview may not be available for some sources, such as Parquet.

  • Click the Plus icon next to the directory or filename to add it as a dataset.

...

  • Tip

    Tip: You can import multiple datasets at the same time. See below.

  • Excel files: Click the Plus icon next to the parent workbook to add all of the worksheets as a single dataset, or add individual sheets as individual datasets.

...

Tip

Tip: If you experience issues uploading XLS/XLSX files that are larger than 35 MB, you can convert the files to CSV files and then upload them.

...


3. Configure selections

When a dataset has been selected, the following fields appear on the right side of the screen. Modify as needed:

  • Dataset Name: This name appears in the interface.

...

  • Dataset Description: You may add an optional description that provides additional detail about the dataset. This information is visible in some areas of the interface.
Tip

Tip: Click the Eye icon to inspect the contents of the dataset prior to importing.

Tip

Tip: You can select a single dataset or multiple datasets for import.

...


Edit settings 

You can edit any additional or optional settings for an individual dataset. Perform the following:

Steps:

  1. From the card for an individual dataset in the right panel, click Edit Settings. The dialog box is displayed.
    NOTE: In some cases, row counts in the previewed data may differ from those in the data grid after the dataset has been imported, because row counts are rounded in the preview.
  2. Per-file encoding: By default, 

    D s product
     attempts to interpret the encoding used in the file. If the encodings do not match, the data preview panel may contain garbled data. In the Data Preview dialog, you can select a different encoding for the file. When the correct encoding is selected, the preview displays the data as expected.

  3. Detect structure: By default,
    D s product
    attempts to interpret the structure of your data during import. This structuring attempts to apply an initial tabular structure to the dataset.
  4. Unless you have specific problems with the initial structure, leave the Detect structure setting enabled. Recipes created from these imported datasets automatically include the structuring as the first, hidden steps. These steps are not available for editing, although you can remove them through the Recipe panel. See Recipe Panel.
  5. When Detect structure is disabled, imported datasets whose schema has not been detected are labeled as raw datasets. When recipes are created for these raw datasets, the structuring steps are added into the recipe and can be edited as needed.
  6. For more information, see Initial Parsing Steps.
  7. In the dialog box, select the required options and modify the settings.
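The garbled preview described under Per-file encoding is what happens when bytes written in one encoding are decoded with another. The sketch below illustrates the effect and a simple way to pick a candidate encoding outside the product. It does not describe how the product detects encodings, and both the function names and the candidate list are assumptions.

```python
# Illustrative candidate list; an assumption, not the product's list.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252")


def preview(raw, encoding):
    """Decode bytes for display, replacing undecodable bytes instead of failing."""
    return raw.decode(encoding, errors="replace")


def first_working_encoding(raw, candidates=CANDIDATE_ENCODINGS):
    """Return the first candidate that decodes the bytes without error, else None."""
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None


utf8_bytes = "café".encode("utf-8")
print(preview(utf8_bytes, "cp1252"))  # mismatched encoding: prints "cafÃ©"
print(preview(utf8_bytes, "utf-8"))   # matching encoding: prints "café"
```

A successful decode does not prove that the encoding is correct, because some encodings accept almost any byte sequence. Inspecting the preview remains the reliable check.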

4. Import selections

Single dataset

If you have selected a single dataset for import:

...

Tip

Tip: If present, you can click the Add to new flow checkbox, which adds the imported datasets to an Untitled flow.

  • Click Continue. The dataset is imported. 
  • A recipe is created for it, added to a new flow, and loaded in the Transformer page for wrangling.

...

...

  1. To import the selected datasets, click Import Datasets. The imported datasets are created. You can begin working with these imported datasets now or at a later time. 
  2. To import the selected datasets and add them to a flow:
    1. Click the Add Dataset to a Flow checkbox. 
    2. Click the textbox to see the available flows, or start typing a new name. 
    3. Click Import & Add to Flow.
    4. The datasets are imported, and the associated recipes are created. These datasets and recipes are added to the selected flow. 
    5. For any dataset that has been added to a flow, you can review and perform actions on it. See Flow View Page.

...

Import Multiple Datasets

...

Multiple datasets

You can import multiple datasets from multiple sources at the same time. In the Import Data page, continue selecting sources, and additional dataset cards are added to the right panel.

Info

NOTE: If you are importing from multiple files at the same time, the files are not necessarily read in a regular or predictable order. Avoid using functions such as SOURCEROWNUMBER, which rely on original row numbers. See SOURCEROWNUMBER Function.


Info

NOTE: When you import a dataset with parameters from multiple files, only the first matching file is displayed in the right panel.

In the right panel, you can see a preview of each dataset and make changes as needed.

...

Figure: Import Multiple Datasets

 

...


If you have selected multiple datasets for import:

Tip

Tip: If present, you can click the Add to new flow checkbox, which adds the imported datasets to an Untitled flow. For more information, see Flow View Page.

  • To import the selected datasets, click Continue.

    • To begin transforming one of these datasets in Flow View, select it. From its context menu, select Add new recipe. Select the recipe. In the context panel on the right, select Edit Recipe. See Transformer Page.

  • To remove a dataset from import, click the X in the dataset card.
