In addition to CSV and other formats, can directly import Microsoft® Excel® workbooks and folders containing workbooks. The worksheets of a workbook can be imported as:
A dataset with parameters
NOTE: When importing as a parameterized dataset, all selected worksheets are imported into a single dataset.
Source Excel files with cells bracketed by single double quotes may not be properly ingested if any terminating quotes are missing.
Tip: You can check the data quality bars for mismatched values or, for strings, the data histogram bars for anomalous values to see if the above issue is present. If so, deselect Detect Structure on import. Then, use a Split rows transformation applied to the affected column to break up the column as needed.
If loading your Excel-based dataset in the Transformer page results in a blank screen, please take a new sample. The file requires conversion again with each generated sampling.
NOTE: When you share a flow that contains a dataset sourced from Microsoft Excel, the user with whom the flow is shared may receive a
When Excel data is imported into , each sheet in an imported file must be converted to a CSV and then ingested for use. The product's conversion service supports two methods:
|Apache POI (Java)||(default) This new method improves performance, scalability, and reliability of the Excel conversion process.|
|Python||The Python method has been available in the product for some time.|
Tip: The Java-based Apache POI conversion method is recommended. However, if you encounter issues while using this method, you may choose to use the following settings to enable the Python method.
Administrators can modify the following settings to enable either conversion method.
|Enable Apache POI based converter for Excel data conversion|
Set this value to
Set this value to
|Detect maximum column count in XLSX sheet|
When the above setting is enabled, set this value to
This setting does not apply to files in XLS format.
In the Library page, click Import Data. Select the connection to use. See Import Data Page.
Import Excel workbook
Tip: If you experience issues uploading large XLS/XLSX files, you can convert the files to CSV files and then upload them.
By default, all worksheets in the workbook are imported as individual datasets. To change how the data is imported, click Edit in the right panel.
Import settings for Excel datasets
Selected sheets into 1 dataset: All selected sheets in the workbook are combined and imported as a single dataset.
NOTE: The schemas of each dataset must match. Columns must be listed in the same order in each dataset. The column headers are taken from the first selected dataset.
All and future sheets into 1 dataset: If the workbook is updated periodically with new sheets that you would like to add in the future, select this option. After initial selection of sheets, all sheets that are added to the workbook in the future are automatically added as part of the imported dataset.
Tip: Use this option to capture future additional sheets or changes to the names of the current sheets.
NOTE: When an imported dataset based on this option is first loaded into the Transformer page, the data grid displays an initial sample taken from rows in the first sheet only. When you take another sample from the Samples panel, data is collected from other sheets. For more information, see Samples Panel.
NOTE: This option is available only if you are connected to a backend file storage system.
You can select the sheets to import.
NOTE: If you are importing a folder of Excel files, data preview and initial sampling are executed against the first file found in the folder.
To preview the data of an individual sheet, mouse over a dataset and click Jump to.