
...

See Job Results Page.

Publishing Actions

 

You can add, remove, or edit the outputs generated from this job. By default, the list of destinations includes a CSV output written to your home directory on the selected datastore; this default can be removed if needed. You must include at least one output destination.
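To illustrate these rules, here is a minimal sketch only; the structure, paths, and field names are hypothetical and are not the product's configuration format. It mirrors the behavior described above: a default CSV destination in the user's home directory, and at least one destination required.

```python
# Hypothetical sketch only: this structure is illustrative and is not the
# product's configuration format.
destinations = [
    # Default CSV output in the user's home directory (placeholder path).
    {"format": "csv", "path": "/user/<your-user>/", "action": "create"},
]

def validate(destinations):
    # The default CSV output can be removed, but only if at least one
    # destination remains.
    if not destinations:
        raise ValueError("You must include at least one output destination.")
    return destinations

validate(destinations)
```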

...

From the available datastores in the left column, select the target for your publication. 

Figure: Add Publishing Action

NOTE: Do not create separate publishing actions that apply to the same file or database table.

New/Edit: You can create new connections or modify existing ones. See Create Connection Window.

 

Steps:

  1. Select the publishing target by clicking an icon in the left column.
    1. If Hive publishing is enabled, you must select or specify a database table to which to publish.

      Depending on the running environment, results are generated in Avro or Parquet format. See below for details on specifying the action and the target table.

      If you are publishing a wide dataset to Hive, you should generate results using Parquet (see the sketch after these steps).

      For more information on how data is written to Hive, see Hive Data Type Conversions.

  2. Locate a publishing destination: Do one of the following.

    1. Explore: 


      NOTE: The publishing location must already exist before you can publish to it. The publishing user must have write permissions to the location.


      NOTE: If your HDFS environment is encrypted, the default output home directory for your user and the output directory where you choose to generate results must be in the same encryption zone. Otherwise, writing the job results fails with a Publish Job Failed error. For more information on your default output home directory, see User Profile Page.

       

      1. To sort the listings in the current directory, click the carets next to any column name.
      2. For larger directories, browse using the paging controls.
      3. Use the breadcrumb trail to explore the target datastore. Navigate folders as needed.
    2. Search: Use the search bar to search for specific locations in the current folder only.
    3. Manual entry: Click the Edit icon to manually edit or paste in a destination.
  3. Choose an existing file or folder: When the location is found, select the file to overwrite or the folder into which to write the results.


    NOTE: You must have write permissions to the folder or file that you select.

    1. To write to a new file, click Create a new file. See below.

  4. Create Folder: Depending on the storage destination, you can click Create Folder to create a new folder for the job inside the currently selected one. Do not include spaces in your folder name.
  5. To save the publishing destination, click Save Settings.
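To make the Hive format choice in step 1 more concrete, the following is a minimal sketch, not the product's behavior: the columns, filenames, and HDFS location are hypothetical. It writes a small result set as Parquet with pyarrow and shows the style of external-table DDL Hive commonly uses over a published directory, which is why Parquet's columnar layout suits wide datasets.

```python
# Hypothetical illustration only: the columns, filenames, and HDFS path are
# placeholders, not values produced by the product.
import pyarrow as pa
import pyarrow.parquet as pq

# A tiny result set standing in for generated job results.
results = pa.table({
    "order_id": [1001, 1002, 1003],
    "total": [25.0, 13.5, 99.9],
})

# Parquet is columnar, so wide results (many columns) are read and written
# more efficiently than with row-oriented Avro.
pq.write_table(results, "job_results.parquet", compression="snappy")

# The kind of DDL Hive uses to expose a directory of published Parquet files
# as a table (placeholder path and schema):
ddl = """
CREATE EXTERNAL TABLE IF NOT EXISTS job_results (
    order_id BIGINT,
    total    DOUBLE
)
STORED AS PARQUET
LOCATION '/user/<your-user>/job_results/';
"""
print(ddl)
```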

...

When you generate file-based results, you can configure the filename, storage format, compression, number of files, and the updating actions in the right-hand panel.

Figure: Output File Settings
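As a rough illustration of how these settings interact, here is a minimal sketch (not the product's implementation) using hypothetical data and filenames: the same results written with different storage formats, compression codecs, and numbers of files.

```python
# Hypothetical illustration of the settings above; the data, filenames, and
# compression choices are placeholders, not product defaults.
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

# Storage format and compression: the same results as gzip-compressed CSV...
df.to_csv("results.csv.gz", index=False, compression="gzip")

# ...or as Snappy-compressed Parquet (columnar; requires pyarrow or fastparquet).
df.to_parquet("results.parquet", compression="snappy")

# Number of files: large results are often split into multiple parts so they
# can be written and read in parallel; a single-file setting forces one file.
chunk_size = 2
for i in range(0, len(df), chunk_size):
    df.iloc[i:i + chunk_size].to_csv(
        f"results_part_{i // chunk_size:04d}.csv", index=False
    )
```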

...

  1. Select location: Navigate the Redshift browser to select the schema and table to which to publish.
    1. To create a new table, click Create a new table.
  2. Select table options:
    1. Table name:
      1. New table: Enter a name for it. You may use a pre-existing table name, in which case schema checks are performed against the existing table.
      2. Existing table: You cannot modify the name.
    2. Output database: To change the database to which you are publishing, click the Redshift icon in the sidebar. Select a different database.

    3. Publish actions: Select one of the following. (A sketch of the equivalent SQL appears after these steps.)
      1. Create new table every run: Each run generates a new table with a timestamp appended to the name.
      2. Append to this table every run: Each run adds any new results to the end of the table.
      3. Truncate the table every run: With each run, all data in the table is truncated and replaced with any new results.
      4. Drop the table every run: With each run, the table is dropped and all of its data is deleted. A new table with the same name is created, and any new results are added to it.
  3. To save the publishing action, click Save Settings.
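The four publish actions correspond to familiar SQL behaviors. The sketch below is only an illustration under assumed names and a hypothetical cluster, not how the product itself publishes; it uses psycopg2, a widely used driver for PostgreSQL-compatible databases such as Redshift, to show roughly what each option implies for the target table.

```python
# Hypothetical sketch of the four publish actions; the cluster, table, and
# columns are placeholders, not values generated by the product.
import os
from datetime import datetime

import psycopg2  # driver for PostgreSQL-compatible databases such as Redshift

conn = psycopg2.connect(
    host="example-cluster.redshift.amazonaws.com",
    port=5439,
    dbname="dev",
    user="publisher",
    password=os.environ["REDSHIFT_PASSWORD"],
)
cur = conn.cursor()

new_rows = [(1001, 25.0), (1002, 13.5)]   # stand-in for the new job results
action = "truncate"                       # one of: create_new, append, truncate, drop

if action == "create_new":
    # Create new table every run: a fresh table with a timestamp appended.
    table = f"results_{datetime.now():%Y%m%d_%H%M%S}"
    cur.execute(f"CREATE TABLE {table} (order_id BIGINT, total DOUBLE PRECISION)")
elif action == "append":
    # Append to this table every run: existing rows are kept.
    table = "results"
elif action == "truncate":
    # Truncate the table every run: rows are removed, the table definition stays.
    table = "results"
    cur.execute(f"TRUNCATE TABLE {table}")
elif action == "drop":
    # Drop the table every run: table and data are deleted, then recreated.
    table = "results"
    cur.execute(f"DROP TABLE IF EXISTS {table}")
    cur.execute(f"CREATE TABLE {table} (order_id BIGINT, total DOUBLE PRECISION)")

cur.executemany(f"INSERT INTO {table} (order_id, total) VALUES (%s, %s)", new_rows)
conn.commit()
```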

 

Run Job

To execute the job as configured, click Run Job. The job is queued for execution.

...