...

  1. Create a new file: Enter the filename to create. A filename extension is automatically added for you, so you should omit the extension from the filename.
  2. Output directory: Read-only value for the current directory.
    1. To change it, navigate to the proper directory.
  3. Data Storage Format: Select the output format you want to generate for the job.
    1. Avro: This format is used to support data serialization within a Hadoop environment.
    2. CSV and JSON: These formats are supported for all types of imported datasets and all running environments.

      NOTE: JSON-formatted files that are generated by the product are rendered in JSON Lines format, a variant of JSON in which each record occupies a single line. For more information, see http://jsonlines.org.

    3. Parquet: This format is a columnar storage format.
    4. TDE: Choose TDE (Tableau Data Extract) to generate results that can be imported into Tableau.

      If you have created a Tableau Server connection, you can publish the results directly into Tableau Server after they have been generated.

      NOTE: If you encounter errors generating results in TDE format, additional configuration may be required. See Supported File Formats.

    5. For more information, see Supported File Formats.
  4. Publishing action: Select one of the following:

    NOTE: If multiple jobs attempt to publish to the same filename, a numeric suffix (_N) is appended to subsequent filenames (e.g. filename_1.csv).

    1. Create new file every run: For each job run with the selected publishing destination, a new file is created with the same base name and the job number appended to it (e.g. myOutput_2.csv, myOutput_3.csv, and so on).
    2. Append to this file every run: For each job run with the selected publishing destination, the results are appended to the same file, which means that the file grows until it is purged or trimmed.

      NOTE: When publishing single files to S3 or WASB, the append action is not supported.

      NOTE: When appending data into a Hive table, the columns displayed in the Transformer page must match the order and data type of the columns in the Hive table.

      NOTE: This option is not available for outputs in TDE format.

      NOTE: Compression of published files is not supported for an append action.

    3. Replace this file every run: For each job run with the selected publishing destination, the existing file is overwritten by the contents of the new results.
  5. More Options:

    1. Include headers as first row on creation: For CSV outputs, you can choose to include the column headers as the first row in the output. For other formats, these headers are included automatically.

      NOTE: Headers cannot be applied to compressed outputs.

    2. Include quotes: For CSV outputs, you can choose to include double quote marks around all values, including headers.

    3. Delimiter: For CSV outputs, you can enter the delimiter that is used to separate fields in the output. The default value is the global delimiter, which you can override on a per-job basis in this field.

      Tip: If needed for your job, you can enter Unicode characters in the following format: \uXXXX.

      NOTE: The Spark running environment does not support use of multi-character delimiters for CSV outputs. You can switch your job to a different running environment or use single-character delimiters. For more information on this issue, see https://issues.apache.org/jira/browse/SPARK-24540.

    4. Single File: Output is written to a single file.

    5. Multiple Files: Output is written to multiple files.
  6. Compression: For text-based outputs, compression can be applied to significantly reduce the size of the output. Select a preferred compression format for each output format that you want to compress.

    NOTE: If you encounter errors generating results using Snappy, additional configuration may be required. See Supported File Formats.

  7. To save the publishing action, click Add.
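
The note under CSV and JSON says that JSON output is rendered in JSON Lines format, with one record per line. As a minimal sketch of what that means for anything consuming the file, the following Python uses only the standard library; the sample records and the helper names write_json_lines and read_json_lines are illustrative assumptions, not part of the product.

```python
import io
import json

# Hypothetical sample records, standing in for job results.
records = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
]


def write_json_lines(rows, handle):
    """Write each record as one compact JSON object on its own line."""
    for row in rows:
        handle.write(json.dumps(row) + "\n")


def read_json_lines(handle):
    """Parse every non-empty line as an independent JSON document."""
    return [json.loads(line) for line in handle if line.strip()]


# Round-trip through an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_json_lines(records, buffer)
buffer.seek(0)
round_tripped = read_json_lines(buffer)
```

Because each line is a complete JSON document, a reader should parse the file line by line rather than passing the whole file to a single json.loads call.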
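
The publishing action note describes a numeric suffix (_N) being added when several jobs publish to the same filename. The sketch below shows one way such a naming rule could behave. The function next_available_name and the idea of checking a set of existing names are assumptions for illustration; the source only gives the example filename_1.csv and does not specify how the product picks the number.

```python
from pathlib import PurePosixPath


def next_available_name(filename, existing):
    """Return filename, or filename with the lowest unused _N suffix.

    Hypothetical helper: 'existing' is the set of names already present
    in the output directory.
    """
    if filename not in existing:
        return filename
    path = PurePosixPath(filename)
    n = 1
    # Insert the suffix before the extension: filename_1.csv, filename_2.csv, ...
    while f"{path.stem}_{n}{path.suffix}" in existing:
        n += 1
    return f"{path.stem}_{n}{path.suffix}"


first = next_available_name("filename.csv", set())
second = next_available_name("filename.csv", {"filename.csv"})
third = next_available_name("filename.csv", {"filename.csv", "filename_1.csv"})
```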
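
The CSV settings under More Options (headers as first row, quotes around all values, and a delimiter that may be entered as \uXXXX) can be pictured with Python's standard csv module. This is a sketch of the resulting file layout only, not the product's writer; the names parse_delimiter and render_csv and the sample data are assumptions.

```python
import csv
import io


def parse_delimiter(text):
    """Convert a typed \\uXXXX escape into the character it names."""
    if len(text) == 6 and text.startswith("\\u"):
        return chr(int(text[2:], 16))
    return text


def render_csv(header, rows, delimiter=",", include_headers=True, include_quotes=False):
    """Render rows as CSV text using the options described above."""
    quoting = csv.QUOTE_ALL if include_quotes else csv.QUOTE_MINIMAL
    out = io.StringIO()
    writer = csv.writer(
        out,
        delimiter=parse_delimiter(delimiter),  # csv.writer needs a single character
        quoting=quoting,
        lineterminator="\n",
    )
    if include_headers:
        writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


# \u007C is the pipe character; quotes are applied to headers and values alike.
sample = render_csv(
    ["id", "city"],
    [[1, "Oslo"], [2, "Lima"]],
    delimiter="\\u007C",
    include_quotes=True,
)
```

Like the Spark running environment mentioned in the delimiter note, Python's csv.writer accepts only a single-character delimiter, so this sketch does not cover multi-character delimiters.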

...