Through the Import Data page, you can upload datasets or select datasets from sources that are stored on connected datastores. From the Library for Data page, click Import Data.
Import Data page
NOTE: For file-based sources, the
File and path limitations:
- The colon character (:) cannot appear in a filename or a file path.
- Filenames cannot begin with special characters like dot (.) or underscore (_).
- Input file or table paths can have a maximum length of 1024 characters.
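The limitations above can be checked before you attempt an import. The following is a minimal sketch, not part of the product: the function name `check_import_path` is hypothetical, it assumes forward-slash path separators, and it checks only the two leading characters named above (dot and underscore).

```python
from typing import List

MAX_PATH_LENGTH = 1024  # maximum path length stated in the limitations above


def check_import_path(path: str) -> List[str]:
    """Return the limitations that the path violates; an empty list means it passes."""
    problems = []
    # A colon is not allowed anywhere in the filename or path.
    if ":" in path:
        problems.append("contains a colon")
    # Assumption: the filename is the final component of a forward-slash path.
    filename = path.rstrip("/").rsplit("/", 1)[-1]
    # Assumption: only dot and underscore are checked as leading special characters.
    if filename.startswith((".", "_")):
        problems.append("filename begins with a dot or underscore")
    if len(path) > MAX_PATH_LENGTH:
        problems.append("path is longer than 1024 characters")
    return problems
```

For example, `check_import_path("/data/_tmp.csv")` reports the leading underscore, while `check_import_path("/data/sales.csv")` returns an empty list.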
1. Connect to sources
During import, the application identifies file formats based on the extension of the filename.
- Compressed files are recognized and can be imported based on their file extensions.
- Filenames that do not have an extension are treated as TXT files.
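As an illustration of extension-based detection, the following sketch shows how a filename might map to a format. It is not the product's implementation: `guess_format` and the `COMPRESSION_EXTENSIONS` set are hypothetical, and treating a compressed file with no inner extension as TXT is an assumption.

```python
from pathlib import PurePosixPath

# Hypothetical set of compression extensions, for illustration only.
# See the supported file formats documentation for the actual list.
COMPRESSION_EXTENSIONS = {".gz", ".bz2", ".zip"}


def guess_format(filename: str):
    """Return (format, compression) inferred from the extensions of the filename."""
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    compression = None
    # A trailing compression extension is peeled off first.
    if suffixes and suffixes[-1] in COMPRESSION_EXTENSIONS:
        compression = suffixes.pop().lstrip(".")
    # A filename with no remaining extension is treated as TXT.
    file_format = suffixes[-1].lstrip(".").upper() if suffixes else "TXT"
    return file_format, compression
```

With these assumptions, `guess_format("sales.csv.gz")` yields `("CSV", "gz")`, and `guess_format("README")` yields `("TXT", None)`.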
Tip: You can drag and drop files from your desktop to upload them.
NOTE: The maximum size of an uploaded file is 100 MB or 1 GB, depending on your product edition.
NOTE: When you upload an updated version of a previously uploaded file, the new file is stored as a separate upload altogether. Where the imported dataset based on the previous version is used, you must swap out the old dataset to point to the new one.
For more information on the supported input formats, see Supported File Formats.
You can hover over the name of a file to preview its contents.
NOTE: Preview may not be available for some sources, such as Parquet.
Click the Plus icon next to the directory or filename to add it as a dataset.
Tip: You can import multiple datasets at the same time. See below.
Excel files: Click the Plus icon next to the parent workbook to add all of the worksheets as a single dataset, or add individual sheets as individual datasets. See Import Excel Data.
To show hidden files or folders, select Show hidden.
NOTE: Hidden folder names begin with a dot (.).
Tip: If you have run a job with profiling enabled, you can import your profile files as datasets and then publish them to other datastores, such as BigQuery, for additional analysis. These files are stored in the
3. Configure selections
When a dataset has been selected, the following fields appear on the right side of the screen. Modify as needed:
- Dataset Name: This name appears in the interface.
- Dataset Description: You may add an optional description that provides additional detail about the dataset. This information is visible in some areas of the interface.
Tip: Click the Eye icon to inspect the contents of the dataset prior to importing.
Tip: You can select a single dataset or multiple datasets for import.
You can modify the settings used during import for individual files. To edit additional or optional settings for an individual dataset, click Edit Settings in its card.
NOTE: In some cases, there may be discrepancies between row counts in the previewed data versus the data grid after the dataset has been imported, due to rounding in row counts performed in the preview.
- Per-file encoding: By default, the application attempts to interpret the encoding used in the file. In some cases, the data preview panel may contain garbled data due to a mismatch in encodings. In the Data Preview dialog, you can select a different encoding for the file. When the correct encoding is selected, the preview displays the data as expected.
- Detect structure: By default, the application attempts to interpret the structure of your data during import. This structuring attempts to apply an initial tabular structure to the dataset.
- Unless you have specific problems with the initial structure, you should leave the Detect structure setting enabled. Recipes created from these imported datasets automatically include the structuring as the first, hidden steps. These steps are not available for editing, although you can remove them through the Recipe panel. See Recipe Panel.
- When Detect structure is disabled, imported datasets whose schema has not been detected are labeled as unstructured datasets. When recipes are created for these unstructured datasets, the structuring steps are added into the recipe and can be edited as needed.
- For more information, see Initial Parsing Steps.
- Remove special characters from column names: When selected, characters that are not alphanumeric or underscores are stripped, and space characters are converted to underscores.
For more information, see Sanitize Column Names.
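The column name rule described above can be approximated as follows. This is a sketch under stated assumptions, not the product's code: `sanitize_column_name` is a hypothetical name, "alphanumeric" is assumed to mean ASCII letters and digits, and spaces are assumed to be converted before other characters are stripped.

```python
import re


def sanitize_column_name(name: str) -> str:
    """Convert spaces to underscores, then strip other special characters."""
    # Convert spaces first so that they survive the stripping step.
    with_underscores = name.replace(" ", "_")
    # Assumption: only ASCII letters, digits, and underscores are kept.
    return re.sub(r"[^A-Za-z0-9_]", "", with_underscores)
```

Under these assumptions, `sanitize_column_name("Order ID (#)")` returns `"Order_ID_"`, and a name such as `"total_2024"` is returned unchanged.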
4. Import selections
If you have selected a single dataset for import:
- To immediately wrangle it, click Import & Wrangle. The dataset is imported. A recipe is created for it, added to a flow, and loaded in the Transformer page for wrangling. See Transformer Page.
- To import the dataset, click Import. The imported dataset is created. You can add it to a flow and create a recipe for it later. See Library Page.
You can import multiple datasets from multiple sources at the same time. In the Import Data page, continue selecting sources. To modify the settings for a selected dataset, perform the following:
- Click Edit Settings from the card for an individual dataset in the right panel. The dialog box is displayed.
- In the dialog box, select the required options and modify the settings.
- File Import Settings: For more information, see File Import Settings.
You can import one or more datasets. Continue selecting sources, and additional dataset cards are added to the right panel.
In the right panel, you can see a preview of each dataset and make changes as needed.
Importing Multiple Datasets
- To import the selected datasets, click Import Datasets. The imported datasets are created. You can begin working with these imported datasets now or at a later time. If you are not wrangling the datasets immediately, the datasets you just imported are listed at the top of the Library page. See Library Page.
- To import the selected datasets and add them to a flow:
- Click the Add Dataset to a Flow checkbox.
- Click the textbox to see the available flows, or start typing a new name.
- Click Import & Add to Flow.
Create or select the flow to which to add them:
Add imported datasets to a flow
- Filter by Type: Click one of the tabs in the dialog to display only the applicable flows.
- Search: Start typing letters to filter the list of flows.
- Create new flow: Enter a name and description for the new flow to which to add the datasets.
- To add the datasets to the selected flow, click Add.
- The datasets are imported, and the associated recipes are created. These datasets and recipes are added to the selected flow.
- For any dataset that has been added to a flow, you can review and perform actions on it. See Flow View Page.
- To remove a dataset from import, click the X in the dataset card.