...

You can create a dataset from one or more files stored in ADLS.

Wildcards:

You can parameterize your input paths to import multiple source files as part of the same imported dataset. For more information, see Overview of Parameterization.
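As a rough illustration of the idea, the sketch below shows how one wildcard pattern can gather several files into one set. It uses Python's standard fnmatch module as a stand-in. The platform's actual parameter syntax is not described here and may differ, and the function name and paths are hypothetical.

```python
from fnmatch import fnmatchcase


def match_paths(paths, pattern):
    """Return the paths that match a shell-style wildcard pattern.

    Assumption: shell-style wildcards are used only for illustration;
    the platform's own parameterization syntax may differ.
    Note that '*' in fnmatch also matches '/' characters.
    """
    return [p for p in paths if fnmatchcase(p, pattern)]


# Hypothetical file listing in an ADLS folder.
files = [
    "/data/sales/2020-01.csv",
    "/data/sales/2020-02.csv",
    "/data/sales/2019-12.csv",
]

matched = match_paths(files, "/data/sales/2020-*.csv")
```

Here, matched contains the two 2020 files, which would then be treated as parts of a single imported dataset.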

Folder selection:

When you select a folder in ADLS to create your dataset, all files in the folder are included. Notes:

  • This option selects all files in all sub-folders. If your sub-folders contain separate datasets, you should be more specific in your folder selection.
  • All files used in a single dataset must be of the same format and have the same structure. For example, you cannot mix and match CSV and JSON files if you are reading from a single directory. 
  • When a folder is selected from ADLS, the following files are ignored:
    • *_SUCCESS and *_FAILED files, which may be present if the folder has been populated by HDI.
    • Files whose names begin with an underscore (_). These files cannot be read during batch transformation. To use them, rename them through ADLS so that they do not begin with an underscore.
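The folder selection rules above can be sketched as a small filter. This is a minimal illustration under stated assumptions, not the platform's implementation: the function names are hypothetical, paths are treated as POSIX-style strings, and the file extension stands in for the file format.

```python
from pathlib import PurePosixPath


def select_folder_files(paths, folder):
    """Return the files under `folder`, including sub-folders, that are kept."""
    root = PurePosixPath(folder)
    kept = []
    for p in paths:
        path = PurePosixPath(p)
        # Files in the folder and in all of its sub-folders are selected.
        if root not in path.parents:
            continue
        # Names beginning with an underscore are ignored.
        if path.name.startswith("_"):
            continue
        # *_SUCCESS and *_FAILED marker files are ignored.
        if path.name.endswith(("_SUCCESS", "_FAILED")):
            continue
        kept.append(p)
    return kept


def check_single_format(paths):
    """Raise ValueError if the files mix formats (assumed: judged by extension)."""
    formats = {PurePosixPath(p).suffix.lower() for p in paths}
    if len(formats) > 1:
        raise ValueError(f"Mixed file formats in one dataset: {sorted(formats)}")


# Hypothetical listing of an ADLS location.
listing = [
    "/data/sales/jan.csv",
    "/data/sales/2020/feb.csv",
    "/data/sales/_SUCCESS",
    "/data/sales/_tmp.csv",
    "/data/sales/part_FAILED",
    "/data/other/x.csv",
]

kept = select_folder_files(listing, "/data/sales")
check_single_format(kept)
```

In this listing, only jan.csv and 2020/feb.csv are kept. Adding a JSON file to the same folder would make check_single_format raise an error, which mirrors the rule that a single dataset cannot mix formats.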

...

Warning

If your deployment is using ADLS, do not use the trifacta/uploads directory. This directory is used for storing uploads and metadata, which may be used by multiple users. Manipulating files outside of the web application can destroy other users' data. Please use the tools provided through the interface for managing uploads from ADLS.

Info

Users can specify a default output home directory and, during job execution, an output directory for the current job.

Access to results:

Depending on how the platform is integrated with ADLS, other users may or may not be able to access your job results.

  • If user impersonation mode is enabled, results are written to ADLS through the ADLS account configured for your use. Depending on the permissions of your ADLS account, you may be the only person who can access these results.

  • If user impersonation mode is not enabled, then each user writes results to ADLS using a shared account. Depending on the permissions of that account, your results may be visible to all platform users.

...