Create a jobGroup, which launches the specified job as the authenticated user.

The request specification depends on one of the following conditions:

  • Dataset has already had a job run against it and just needs to be re-run.
  • Dataset has not had a job run, or the job definition needs to be re-specified.

NOTE: In this release, you cannot execute jobs sourced from datasets in Redshift or SQL DW or publish to these locations via the API. This known issue will be fixed in a future release.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See API Authentication.

Request

Request Type: POST

Endpoint:

/v3/jobGroups

Request Body - Re-run job:

If you are re-running a job that has already executed and do not need to modify any job settings, you can use the following simplified body to launch it:

{
  "wrangledDataset": {
    "id": 7
  }
}
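
As a rough illustration, the re-run request could be assembled in Python with only the standard library. The base URL, the credentials, and the use of HTTP Basic authentication are assumptions made for this sketch and are not specified on this page; see API Authentication for the supported schemes.

```python
import base64
import json
import urllib.request

# Hypothetical base URL for the platform; replace with your own deployment.
BASE_URL = "http://localhost:3005"


def build_rerun_request(dataset_id, username, password, base_url=BASE_URL):
    """Build (but do not send) a POST /v3/jobGroups request that re-runs a job."""
    body = {"wrangledDataset": {"id": dataset_id}}
    # Assumption: HTTP Basic authentication is accepted by the platform.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/v3/jobGroups",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Placeholder credentials, for illustration only.
request = build_rerun_request(7, "user@example.com", "secret")
```

Passing the request to urllib.request.urlopen(request) would send it; that step is left out here so the sketch can be inspected without a running platform.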

Request Body - Specify job:

If you are specifying a new job or must re-run a job with new settings, you must include a version of the following request body. Required parameters are listed below:

{
  "wrangledDataset": {
    "id": 1
  },
  "overrides": {
    "execution": "photon",
    "profiler": false,
    "writesettings": [
      {
        "path": "hdfs://hadoop:50070/trifacta/queryResults/admin@trifacta.local/cdr_txt.csv",
        "action": "create",
        "format": "csv",
        "compression": "none",
        "header": false,
        "asSingleFile": false
      }
    ]
  },
  "ranfrom": "cli"
}
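
The full request body can also be generated programmatically. The following is a minimal sketch; the helper name build_job_spec and the _OMIT sentinel are hypothetical and exist only for this example. The sentinel separates leaving ranfrom out of the body from sending it explicitly as null.

```python
import json

_OMIT = object()  # sentinel: leave "ranfrom" out of the body entirely


def build_job_spec(dataset_id, execution, profiler, writesettings, ranfrom=_OMIT):
    """Return the request body for POST /v3/jobGroups as a Python dict."""
    body = {
        "wrangledDataset": {"id": dataset_id},
        "overrides": {
            "execution": execution,
            "profiler": profiler,
            "writesettings": list(writesettings),
        },
    }
    if ranfrom is not _OMIT:
        body["ranfrom"] = ranfrom  # None is serialized as JSON null
    return body


example = build_job_spec(
    dataset_id=1,
    execution="photon",
    profiler=False,
    writesettings=[
        {
            "path": "hdfs://hadoop:50070/trifacta/queryResults/admin@trifacta.local/cdr_txt.csv",
            "action": "create",
            "format": "csv",
            "compression": "none",
            "header": False,
            "asSingleFile": False,
        }
    ],
    ranfrom="cli",
)
print(json.dumps(example, indent=2))
```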

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
  "jobgroupId": 3,
  "jobIds": [
    5,
    6
  ],
  "reason": "JobStarted",
  "sessionId": "9c2c6220-ef2d-11e6-b644-6dbff703bdfc"
}
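
A client would typically read the jobGroup and job identifiers from this response. The sketch below parses the example body shown above; the helper name summarize_job_group is hypothetical.

```python
import json

# The example response body from this page, as a string.
response_text = """
{
  "jobgroupId": 3,
  "jobIds": [5, 6],
  "reason": "JobStarted",
  "sessionId": "9c2c6220-ef2d-11e6-b644-6dbff703bdfc"
}
"""


def summarize_job_group(payload):
    """Extract the jobGroup id, the job ids, and the reason from a response body."""
    data = json.loads(payload)
    return data["jobgroupId"], list(data["jobIds"]), data["reason"]


job_group_id, job_ids, reason = summarize_job_group(response_text)
```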

Reference

Request Reference:

Property: Description
wrangledDataset
(required) Internal identifier for the object whose results you wish to generate. The recipes of all preceding datasets on which this dataset depends are executed as part of the job.
overrides.execution

(required, if first time running the job) Indicates the running environment on which the job is executed. Accepted values:

  • photon
  • spark - Spark job on the integrated Hadoop cluster
  • databricksSpark - Spark implementation on Azure Databricks

For more information, see Running Environment Options.

overrides.profiler

(required, if first time running the job) When set to true, a visual profile of the job is generated as specified by the profiling options for the platform. See Profiling Options.

overrides.writesettings

(required, if first time running the job) These settings define the publishing options for the job. See below.

ranfrom

(optional) If this value is set to null, then the job does not show up in the Job Results page.

If set to cli, the job appears as a CLI job.

See Job Results Page.
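
The requirements in this table can be checked on the client before a first-run request is sent. This is only a sketch of such a pre-check: the function name check_first_run_body is hypothetical, and the platform's own validation and error messages may differ.

```python
# Accepted values for overrides.execution, as listed in the table above.
EXECUTION_VALUES = {"photon", "spark", "databricksSpark"}
# Overrides that are required the first time a job is run.
FIRST_RUN_OVERRIDES = ("execution", "profiler", "writesettings")


def check_first_run_body(body):
    """Return a list of problems found in a first-run request body."""
    problems = []
    if "id" not in body.get("wrangledDataset", {}):
        problems.append("wrangledDataset.id is required")
    overrides = body.get("overrides", {})
    for name in FIRST_RUN_OVERRIDES:
        if name not in overrides:
            problems.append(f"overrides.{name} is required for a first run")
    execution = overrides.get("execution")
    if execution is not None and execution not in EXECUTION_VALUES:
        problems.append(f"unknown execution value: {execution}")
    return problems
```

An empty list means the body satisfies the rules in the table; it does not guarantee that the platform accepts the request.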

writesettings Reference:

The writesettings values allow you to specify aspects of the publication of results to the specified path location.

NOTE: writesettings values are required if you are running this specified job for the dataset for the first time.

NOTE: To specify multiple outputs, you can include additional writesettings objects in the request. For example, if you want to generate output to csv and json, you can duplicate the writesettings object for csv and change the format value in the second one to json.
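
The duplication described in the note above might look like the following sketch. The helper name add_output_format is hypothetical. Whether the second output also needs a different path is not stated here, so the optional new_path parameter is a convenience of this example only.

```python
import copy


def add_output_format(writesettings, new_format, new_path=None):
    """Return a new list with the first writesettings object duplicated for new_format."""
    duplicate = copy.deepcopy(writesettings[0])
    duplicate["format"] = new_format
    if new_path is not None:
        duplicate["path"] = new_path
    return list(writesettings) + [duplicate]


csv_only = [
    {
        "path": "hdfs://hadoop:50070/trifacta/queryResults/admin@trifacta.local/cdr_txt.csv",
        "action": "create",
        "format": "csv",
    }
]
csv_and_json = add_output_format(csv_only, "json")
```

The original list is left unchanged, so the same base settings can be reused for further formats.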

These settings correspond to values that you can apply through the UI or through the command line interface.

Property: Description
path
(required) The fully qualified path to the output location where the results are written.
action

(required) If the output file or directory exists, you can specify one of the following actions:

  • create - Create a new, parallel location, preserving the old results.
  • append - Add the new results to the old results.
  • overwrite - Replace the old results with the new results.
format

(required) Output format for the results. Specify one of the following values:

  • csv
  • json
  • avro
  • pqt

NOTE: To specify multiple output formats, create an additional writesettings object for each output format.

compression
(optional) For csv and json results, you can compress the output using bzip2 or gzip compression. Default is none.
header
(optional) For csv results with action set to create or append, this value determines if a header row with column names is inserted at the top of the results. Default is false.
asSingleFile
(optional) For csv and json results, this value determines if the results are concatenated into a single file or stored as multiple files. Default is false.
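
The rules in this table can be expressed as a client-side check on a single writesettings object. This is a sketch under assumptions: the function name check_writesetting is hypothetical, the literal compression strings "bzip2" and "gzip" are assumed from the description above, and the platform may or may not reject the combinations flagged here.

```python
ACTIONS = {"create", "append", "overwrite"}
FORMATS = {"csv", "json", "avro", "pqt"}
COMPRESSIONS = {"none", "bzip2", "gzip"}  # assumed literal values


def check_writesetting(ws):
    """Return a list of problems found in one writesettings object."""
    problems = []
    for key in ("path", "action", "format"):
        if key not in ws:
            problems.append(f"missing required property: {key}")
    action = ws.get("action")
    fmt = ws.get("format")
    if action is not None and action not in ACTIONS:
        problems.append(f"unknown action: {action}")
    if fmt is not None and fmt not in FORMATS:
        problems.append(f"unknown format: {fmt}")
    compression = ws.get("compression", "none")
    if compression not in COMPRESSIONS:
        problems.append(f"unknown compression: {compression}")
    elif compression != "none" and fmt not in ("csv", "json"):
        problems.append("compression applies only to csv and json results")
    if ws.get("header") and not (fmt == "csv" and action in ("create", "append")):
        problems.append("header applies only to csv results with action create or append")
    if ws.get("asSingleFile") and fmt not in ("csv", "json"):
        problems.append("asSingleFile applies only to csv and json results")
    return problems
```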
