In the Library page, you can review your imported and reference datasets and any macros that you may have created. You can also import new data from this page.
NOTE: You can see only the imported datasets to which you have access in your currently selected project or workspace. If the data underlying an imported dataset is not available, the imported dataset is still listed in the Library page, because it is only a reference to the data.
Figure: Library Page
To create a new imported dataset, click Import Data. See Import Data Page.
For large relational or Parquet datasets, you can monitor the import process through the Library page.
For more information, see Overview of Job Monitoring.
Click one of the pre-defined filters to show datasets of the following types:
All Data: All imported datasets or references available to the current user.
Imported Datasets: Datasets that you have imported.
References: Objects that you have created from your recipes that can be referenced in another flow as a dataset.
Macros: Sequences of steps that can be reused in other recipes. See Macros Page.
Filter by ownership:
For the selected object type, you can filter based on the ownership of the object.
Actions:
Object Actions:
Hover over an object to reveal these actions on the right side of the screen.
Preview: Inspect a preview of the dataset.
NOTE: Preview is not available for binary format sources.
Make a copy: Create a copy of the imported dataset. This option is not available for reference datasets.
Edit data settings: If the source of the imported dataset required conversion to an internally supported format, you can modify settings related to that conversion process. For more information, see File Import Settings.
Tip: This setting applies primarily to binary file formats, such as PDF and Excel, or file formats that may require additional steps to convert into tabular data, such as JSON.
Delete Dataset: Delete the dataset.
Deleting a dataset cannot be undone.
Settings: Modify settings for the imported dataset.
Refresh Dataset: If available, this option refreshes the dataset's metadata with the latest source schema.
NOTE: If you attempt to refresh the schema of a parameterized dataset based on a set of files, only the schema for the first file is checked for changes. If changes are detected, the other files are assumed to contain those changes as well. As a result, changes may be wrongly assumed or go undetected in later files, which can corrupt data in the flow.
For more information, see Overview of Schema Management.
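Because a schema refresh on a parameterized dataset checks only the first file, it can help to compare the headers of all the files yourself before refreshing. The following is a minimal sketch of such a check, run outside the product. It assumes the files are CSV with a header row. The function names and the sample file names are hypothetical and are not part of any product API.

```python
import csv
import io


def read_header(text):
    """Return the column names from the first row of CSV text."""
    return next(csv.reader(io.StringIO(text)), [])


def find_schema_drift(files):
    """Compare every file's header to the first file's header.

    `files` maps a file name to its CSV text (hypothetical input shape).
    Returns a dict of file name -> header for each file whose header
    differs from the first file's header.
    """
    names = list(files)
    if not names:
        return {}
    baseline = read_header(files[names[0]])
    return {
        name: read_header(files[name])
        for name in names[1:]
        if read_header(files[name]) != baseline
    }


# Hypothetical sample data standing in for the files behind a parameterized dataset.
files = {
    "sales_01.csv": "id,amount,region\n1,10,EU\n",
    "sales_02.csv": "id,amount,region\n2,20,US\n",
    "sales_03.csv": "id,amount\n3,30\n",
}

drift = find_schema_drift(files)
print(drift)  # {'sales_03.csv': ['id', 'amount']}
```

In this sample, the first file's schema is unchanged, so a check limited to the first file would report no changes even though the third file is missing a column.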