...

The product can directly import Adobe® Acrobat® PDF files containing one or more tables. The tables of a PDF can be imported as:

...

  1. In the menu bar, click Library
  2. In the Library page, click Import Data. Select the connection to use. See Import Data Page.

    Figure: Import PDF file containing multiple pages
  3. After you select the file, it is uploaded, converted into an individual CSV file for each page in the PDF file, and then stored by the platform. Depending on the size of the file, this process may take a while.
  4. By default, all pages in the PDF are imported as individual datasets. To change how the data is imported, click Edit in the right panel.

    Figure: Import settings for PDF datasets
  5. Dataset creation:
    1. 1 dataset per table: (Default) Each selected table in the PDF is imported as a separate dataset. 
      Specify the base name of the datasets that you are creating. If you are creating a single dataset, the name of the PDF file is used. 
    2. Selected tables into 1 dataset: All selected tables in the PDF are combined and imported as a single dataset.

      NOTE: The schemas of the selected tables must match. Columns must be listed in the same order in each table. The column headers are taken from the first selected table.

    3. All and future tables into 1 dataset: If the PDF is updated periodically with new tables that you want to include, select this option. After you initially select the tables to include, all pages that are later added to the PDF file are automatically added to the imported dataset.

      NOTE: This option is available only if you are connected to a backend file storage system.

      NOTE: When an imported dataset based on this option is first loaded into the Transformer page, the data grid displays an initial sample taken from rows in the first table only. When you take another sample from the Samples panel, data is collected from other tables. For more information, see Samples Panel.

  6. Selected tables: 
    1. You can select the tables to import. A table can occupy a single page, or it can be one of several tables on a page.

      NOTE: If you are importing a folder of PDF files, data preview and initial sampling are executed against the first file found in the folder.

    2. To preview the data of an individual table, hover over a dataset and click Jump to.

  7. Remove special characters from column names: Select this option to remove any special characters from the inferred column headers during import.
  8. You can also choose how to detect column headers from each imported table. 
  9. To save changes, click Save.
  10. After your datasets have been added, you can edit the name and description of each in the right navigation panel.
  11. Optionally, you can assign the new dataset(s) to an existing flow or create a new one to contain them.
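
The schema requirement for the "Selected tables into 1 dataset" option can be illustrated with a short Python sketch. This is not the platform's implementation. It assumes each selected table is already available as CSV text, and the function name combine_tables and the sample data are hypothetical. It shows only the rule stated in the note: the header comes from the first selected table, and every other table must list the same columns in the same order.

```python
import csv
import io


def combine_tables(tables):
    """Combine per-table CSV texts into one list of rows.

    `tables` is a list of CSV strings, one per selected table
    (hypothetical input format for this sketch). The header of the
    first non-empty table is used for the combined result; every
    later table must have the same header in the same order.
    """
    combined = []
    expected_header = None
    for index, text in enumerate(tables):
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            continue  # skip tables with no rows at all
        header, body = rows[0], rows[1:]
        if expected_header is None:
            # The first selected table defines the column headers.
            expected_header = header
            combined.append(header)
        elif header != expected_header:
            # Same names in a different order also count as a mismatch.
            raise ValueError(
                f"table {index} header {header} does not match {expected_header}"
            )
        combined.extend(body)
    return combined


# Hypothetical sample data: two tables with matching schemas.
page_1 = "id,name\n1,Ann\n2,Bob\n"
page_2 = "id,name\n3,Cy\n"
print(combine_tables([page_1, page_2]))
```

In this sketch a table whose columns are reordered, for example "name,id", raises an error instead of being merged silently, which is one reasonable way to surface a schema mismatch before the combined dataset is used.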

...