File Import Settings

When you edit settings on a selected file in the Import Data page, the following settings are displayed.

You can edit any additional or optional settings for an individual dataset. Perform the following:

  1. Click Edit Settings from the card for an individual dataset in the right panel. The dialog box is displayed.

  2. In the dialog box, select the required options and modify the settings.

Per-file encoding

By default, the Dataprep by Trifacta platform applies a specified encoding type to the imported file. In some cases, the data preview panel may contain garbled data due to a mismatch in encodings. In the Data Preview dialog, you can select a different encoding for the file. When the correct encoding is selected, the preview displays the data as expected.

Note

Detecting a file's encoding by parsing its contents is not a reliable method. Instead, the Dataprep by Trifacta platform assumes that the file uses the default encoding. If it does not, you should change the encoding type for the file.

Note

In some cases, imported files are not properly parsed due to issues with encryption types or encryption keys in the source datastore. For more information, please contact your datastore administrator.
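The encoding mismatch described above can be reproduced outside the product. The following minimal Python sketch assumes, purely for illustration, a file written as UTF-8 and read back as Latin-1; the sample text and variable names are hypothetical and say nothing about which default encoding the platform uses.

```python
# Illustration only: the same bytes read with two different encodings.
original = "Café Zürich"

# Bytes as they would be stored in a UTF-8 encoded file.
raw = original.encode("utf-8")

# Reading those bytes with the wrong encoding produces garbled text.
garbled = raw.decode("latin-1")

# Reading them with the matching encoding restores the text.
correct = raw.decode("utf-8")

print(garbled)  # CafÃ© ZÃ¼rich
print(correct)  # Café Zürich
```

Garbled accented characters of this kind in the preview usually indicate that a different encoding should be selected for the file.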

Detect structure

By default, the Dataprep by Trifacta platform attempts to interpret the structure of your data during import. This structuring attempts to apply an initial tabular structure to the dataset.

  • Unless you have specific problems with the initial structure, you should leave the Detect structure setting enabled.

    Tip

    When Detect structure is enabled for an imported dataset, the structure of the file can be checked for changes during job execution.

  • When Detect structure is disabled, imported datasets whose schema has not been detected are labeled as unstructured datasets.

  • When you create a recipe using the imported dataset:

    • When Detect structure is enabled, an initial set of parsing steps is applied to the data as the first steps of your recipe. These steps are hidden from view.

    • When Detect structure is disabled, the initial parsing steps are inserted as the first steps of your recipe, where you can modify them to fit the structure of the data.

    • For more information, see Recipe Panel.

  • For more information, see Initial Parsing Steps.

Remove special characters from column names

When selected, characters that are not alphanumeric or underscores are stripped from column names, and space characters are converted to underscores.
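The rule above can be sketched in a few lines of Python. This is an approximation under stated assumptions, not the platform's implementation: the function name `sanitize_column_name` is hypothetical, spaces are converted before other characters are stripped, and "alphanumeric" is taken to mean ASCII letters and digits only.

```python
import re


def sanitize_column_name(name: str) -> str:
    """Approximate the documented rule for cleaning a column name.

    Assumptions (not confirmed by the documentation):
    - spaces are converted to underscores before stripping;
    - only ASCII letters, digits, and underscores are kept.
    """
    with_underscores = name.replace(" ", "_")
    return re.sub(r"[^A-Za-z0-9_]", "", with_underscores)


print(sanitize_column_name("Order ID (USD)"))  # Order_ID_USD
print(sanitize_column_name("unit-price%"))     # unitprice
```

Because stripping can make two different names identical (for example, `a-b` and `ab`), it is worth checking the resulting column names for collisions after import.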

Selecting column headers

You can apply the column headers to your datasets during import. Select the required option from the drop-down list:

  • Infer header: (default) When selected, the Trifacta Application infers the header based on the data in the import.

  • Use first row as header: When selected, the first row is used as the column headers.

  • No header: When selected, header inference is skipped, no row is used as a header, and columns are assigned generic names.
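The three options above can be compared with a small Python sketch using the standard `csv` module. The function `read_with_header_mode`, its mode names, and the generic `column1`, `column2` naming are hypothetical. The `csv.Sniffer().has_header` heuristic is only a stand-in for inference; how the application actually infers headers is not described here.

```python
import csv
import io


def read_with_header_mode(text: str, mode: str = "infer"):
    """Return (headers, rows) for CSV text under one of three header modes."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], []

    if mode == "infer":
        # Stand-in heuristic from the standard library, not the product's logic.
        use_first_row = csv.Sniffer().has_header(text)
    elif mode == "first_row":
        use_first_row = True
    elif mode == "none":
        use_first_row = False
    else:
        raise ValueError(f"unknown header mode: {mode}")

    if use_first_row:
        return rows[0], rows[1:]

    # Hypothetical generic names; the product's naming scheme may differ.
    generic = [f"column{i + 1}" for i in range(len(rows[0]))]
    return generic, rows


sample = "id,name\n1,Ada\n2,Grace\n"
print(read_with_header_mode(sample, "first_row"))
print(read_with_header_mode(sample, "none"))
```

With "first_row", the first line becomes the headers and two data rows remain. With "none", all three lines are kept as data under the generic names.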

If replacing a file:

  • If you replace a dataset and select the Use first row as header option, then the existing header row labels are updated with the new headers.

  • Pre-existing transformation steps may be broken if the headers are changed by a replaced file.
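One way to anticipate broken steps before replacing a file is to compare the column names your recipe relies on with the headers of the replacement file. The sketch below assumes you keep your own list of referenced column names; the function `missing_columns` and the sample names are hypothetical and not part of the product.

```python
def missing_columns(referenced, new_headers):
    """Return referenced column names that are absent from the new headers."""
    available = set(new_headers)
    return [name for name in referenced if name not in available]


# Columns that existing recipe steps refer to (hypothetical example).
referenced = ["order_id", "amount"]

# Headers found in the first row of the replacement file (hypothetical example).
new_headers = ["order_id", "total"]

print(missing_columns(referenced, new_headers))  # ['amount']
```

Any name returned here points to a step that would likely need to be updated after the replacement.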