...

Property    Description
wrangledDataset
(required) Internal identifier for the object whose results you wish to generate. The recipes of all preceding datasets on which this dataset depends are executed as part of the job.
overrides.execution

(required, if first time running the job) Indicates the running environment on which the job is executed. Accepted values:

  • photon
  • spark - Spark job on the integrated Hadoop cluster
  • databricksSpark - Spark implementation on Azure Databricks

For more information, see Running Environment Options.

overrides.profiler

(required, if first time running the job) When set to true, a visual profile of the job is generated as specified by the profiling options for the platform. See Profiling Options.

overrides.writesettings

(required, if first time running the job) These settings define the publishing options for the job. See below.

ranfrom

(optional) If this value is set to null, then the job does not show up in the Job Results page.

If set to cli, the job appears as a CLI job.

See Job Results Page.
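
As a rough illustration of how the properties above fit together, the following Python sketch assembles a job definition and prints it as JSON. Several things here are assumptions, not taken from this page: the nesting is inferred from the dotted property names (overrides.execution becomes an execution key inside an overrides object), the dataset identifier is wrapped in an object with an id key, writesettings is treated as a list, and the helper name build_job_spec, the identifier 7, and the output path are hypothetical.

```python
import json

# Sentinel so that "ranfrom omitted" can be told apart from "ranfrom set to null".
_UNSET = object()

# Accepted values for overrides.execution, as listed in the table above.
EXECUTION_VALUES = {"photon", "spark", "databricksSpark"}


def build_job_spec(dataset_id, execution, profiler, writesettings, ranfrom=_UNSET):
    """Assemble a job definition dict (hypothetical helper, assumed layout)."""
    if execution not in EXECUTION_VALUES:
        raise ValueError(f"unsupported overrides.execution: {execution!r}")
    spec = {
        # Assumption: the identifier is passed as {"id": ...}.
        "wrangledDataset": {"id": dataset_id},
        "overrides": {
            "execution": execution,
            "profiler": bool(profiler),
            # Assumption: one writesettings object per output format, in a list.
            "writesettings": list(writesettings),
        },
    }
    # ranfrom is optional; None serializes to JSON null.
    if ranfrom is not _UNSET:
        spec["ranfrom"] = ranfrom
    return spec


example = build_job_spec(
    dataset_id=7,  # hypothetical identifier
    execution="spark",
    profiler=True,
    writesettings=[
        {"path": "/hypothetical/output/results.csv", "action": "create", "format": "csv"}
    ],
    ranfrom="cli",
)
print(json.dumps(example, indent=2))
```

Passing ranfrom=None writes an explicit null into the JSON, which per the table keeps the job off the Job Results page. Leaving the argument out omits the property entirely.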

...

NOTE: Parquet format requires execution on a Hadoop running environment (overrides.execution must be spark).

Property    Description
path

(required) The fully qualified path to the output location where the results are written.

action

(required) If the output file or directory exists, you can specify one of the following actions:

  • create - Create a new, parallel location, preserving the old results.
  • append - Add the new results to the old results.
  • overwrite - Replace the old results with the new results.
format

(required) Output format for the results. Specify one of the following values:

  • csv
  • json
  • avro
  • pqt
     

NOTE: To specify multiple output formats, create an additional writesettings object for each output format.

compression

(optional) For csv and json results, you can optionally compress them using bzip2 or gzip compression. Default is none.

header

(optional) For csv results with action set to create or append, this value determines if a header row with column names is inserted at the top of the results. Default is false.

asSingleFile

(optional) For csv and json results, this value determines if the results are concatenated into a single file or stored as multiple files. Default is false.
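
The constraints described for writesettings can be checked before a job is submitted. The Python sketch below is one possible way to do that, not part of the product. The helper names make_writesetting and check_execution, the literal compression strings "bzip2" and "gzip", and the output paths are assumptions made for illustration. It builds one writesettings object per output format and enforces the rules stated above: compression and asSingleFile apply only to csv and json, header applies only to csv with action create or append, and pqt requires overrides.execution to be spark.

```python
ACTIONS = {"create", "append", "overwrite"}
FORMATS = {"csv", "json", "avro", "pqt"}
TEXT_FORMATS = {"csv", "json"}          # formats that accept compression/asSingleFile
COMPRESSIONS = {"bzip2", "gzip"}        # assumed literal values
OPTIONAL_KEYS = {"compression", "header", "asSingleFile"}


def make_writesetting(path, action, fmt, **optional):
    """Build one writesettings object (hypothetical helper)."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    unknown = set(optional) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    if "compression" in optional:
        if fmt not in TEXT_FORMATS:
            raise ValueError("compression applies only to csv and json results")
        if optional["compression"] not in COMPRESSIONS:
            raise ValueError("compression must be bzip2 or gzip")
    if "header" in optional and not (fmt == "csv" and action in {"create", "append"}):
        raise ValueError("header applies only to csv with action create or append")
    if "asSingleFile" in optional and fmt not in TEXT_FORMATS:
        raise ValueError("asSingleFile applies only to csv and json results")
    setting = {"path": path, "action": action, "format": fmt}
    setting.update(optional)
    return setting


def check_execution(execution, writesettings):
    """Reject Parquet output unless the job runs on spark."""
    for ws in writesettings:
        if ws["format"] == "pqt" and execution != "spark":
            raise ValueError("pqt output requires overrides.execution to be spark")


# One object per output format; both paths are hypothetical.
settings = [
    make_writesetting("/hypothetical/output/results.csv", "create", "csv",
                      compression="gzip", header=True, asSingleFile=True),
    make_writesetting("/hypothetical/output/results.pqt", "overwrite", "pqt"),
]
check_execution("spark", settings)
```

Optional properties that are not passed are left out of the object, so the documented defaults (no compression, no header, multiple files) apply.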