NOTE: If you need to make changes for purposes of a specific job run, you can add overrides to the request for the job. These overrides apply only for the current job. For more information, see API JobGroups Create v4.
Create the outputobject for the recipe.
To begin, you need the internal identifier for the recipe.
NOTE: In the APIs, a recipe is identified by its internal name, wrangled dataset (wrangledDataset).
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/wrangleddatasets |
---|---|
Authentication | Required |
Method | GET |
Request Body | None. |
Response:
Status Code | 200 - OK | |
---|---|---|
Response Body |
|
cURL example:
curl -X GET \
  http://www.wrangle-dev.example.com:3005/v4/wrangleddatasets \
  -H 'authorization: Basic <auth_token>' \
  -H 'cache-control: no-cache'
Checkpoint: In the above, let's assume that the recipe identifier of interest is 11, which is the flowNodeId value used in the requests that follow.
For more information, see API Connections Get v4.
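The lookup above can be sketched in Python using only the standard library. This is a minimal sketch, not the product's client: the response shape (a top-level "data" list whose records carry "id" and "name" fields) is an assumption, because the response body is not shown here, and the helper names are hypothetical.

```python
import json
import urllib.request

# Host and port are taken from the Endpoint row above.
BASE_URL = "http://www.wrangle-dev.example.com:3005"


def build_list_request(auth_token):
    """Build (but do not send) the GET request that lists wrangled datasets."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v4/wrangleddatasets",
        method="GET",
        headers={
            "Authorization": f"Basic {auth_token}",
            "Cache-Control": "no-cache",
        },
    )


def find_recipe_ids(response_body, name):
    """Return the ids of all wrangled datasets with the given name.

    ASSUMPTION: records are wrapped in a top-level "data" list and each
    record has "id" and "name" fields.
    """
    return [
        item["id"]
        for item in response_body.get("data", [])
        if item.get("name") == name
    ]


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To use it, call `send(build_list_request("<auth_token>"))` and pass the result to `find_recipe_ids` with the recipe name you are looking for. Keeping request construction separate from sending makes the lookup easy to check without a live server.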
Create the outputobject and associate it with the recipe identifier. In the following request, the wrangledDataset identifier that you retrieved in the previous call is applied as the flowNodeId value.
The following example includes an embedded writesettings object, which generates an Avro file output. You can remove this embedded object if desired, but you must create a writesettings object before you can generate an output.
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/outputobjects | |
---|---|---|
Authentication | Required | |
Method | POST | |
Request Body |
|
Response:
Status Code | 201 - Created | |
---|---|---|
Response Body |
|
cURL example:
curl -X POST \
  http://www.wrangle-dev.example.com:3005/v4/outputobjects \
  -H 'authorization: Basic <auth_token>' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
    "execution": "photon",
    "profiler": true,
    "isAdhoc": true,
    "writeSettings": {
      "data": [
        {
          "delim": ",",
          "path": "hdfs://hadoop:50070/trifacta/queryResults/admin@example.com/POS_01.avro",
          "action": "create",
          "format": "avro",
          "compression": "none",
          "header": false,
          "asSingleFile": false,
          "prefix": null,
          "suffix": "_increment",
          "hasQuotes": false
        }
      ]
    },
    "flowNode": { "id": 11 }
  }'
Checkpoint: You've created an outputobject (id=4) with an embedded writesettings object and associated it with the recipe (flowNodeId=11).
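The same request can be assembled programmatically. The sketch below mirrors the JSON body from the cURL example; the function names `output_object_payload` and `build_post` are hypothetical helpers introduced for illustration, and the request is only built, not sent.

```python
import json
import urllib.request

# Host and port are taken from the Endpoint row above.
BASE_URL = "http://www.wrangle-dev.example.com:3005"


def output_object_payload(flow_node_id, path, file_format):
    """Return an outputobject body with one embedded writesettings entry."""
    write_setting = {
        "delim": ",",
        "path": path,
        "action": "create",
        "format": file_format,
        "compression": "none",
        "header": False,
        "asSingleFile": False,
        "prefix": None,
        "suffix": "_increment",
        "hasQuotes": False,
    }
    return {
        "execution": "photon",
        "profiler": True,
        "isAdhoc": True,
        "writeSettings": {"data": [write_setting]},
        "flowNode": {"id": flow_node_id},
    }


def build_post(endpoint, payload, auth_token):
    """Build (but do not send) a JSON POST request for the given endpoint."""
    return urllib.request.Request(
        url=f"{BASE_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {auth_token}",
            "Content-Type": "application/json",
        },
    )


request = build_post(
    "/v4/outputobjects",
    output_object_payload(
        11,
        "hdfs://hadoop:50070/trifacta/queryResults/admin@example.com/POS_01.avro",
        "avro",
    ),
    "<auth_token>",
)
```

Passing `request` to `urllib.request.urlopen` would submit it. Building the body from a function keeps the flowNode identifier and the output path in one place if you create outputs for several recipes.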
Now that outputs have been defined for the recipe, you can execute a job on the specified recipe (flowNodeId=11):
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/jobGroups | |
---|---|---|
Authentication | Required | |
Method | POST | |
Request Body |
|
Response:
Status Code | 201 - Created | |
---|---|---|
Response Body |
|
NOTE: To re-run the job against its currently specified outputs, writesettings, and publications, you only need the recipe ID. If you need to make changes for purposes of a specific job run, you can add overrides to the request for the job. These overrides apply only for the current job. For more information, see API JobGroups Create v4.
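Because the request body is not shown above, the following sketch illustrates what a minimal job request might look like. It assumes that the recipe is referenced through a "wrangledDataset" object carrying its id, and that per-run changes are passed under an "overrides" key; both key names are assumptions to verify against API JobGroups Create v4, and `build_job_group_body` is a hypothetical helper.

```python
import json


def build_job_group_body(recipe_id, overrides=None):
    """Return a jobGroups request body for the given recipe.

    ASSUMPTION: the recipe is referenced as {"wrangledDataset": {"id": ...}}
    and per-run changes go under an "overrides" key.
    """
    body = {"wrangledDataset": {"id": recipe_id}}
    if overrides:
        body["overrides"] = overrides
    return body


# Re-run against the currently defined outputs: only the recipe ID is needed.
print(json.dumps(build_job_group_body(11)))
```

The body would be sent as a POST to the jobGroups endpoint listed above, with the same authorization and content-type headers as the other requests.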
To track the status of the job, check the status field by querying the specific job. For more information, see API JobGroups Get v4.
Checkpoint: You've run a job, generating one output in Avro format.
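Checking the status field repeatedly can be sketched as a small polling loop. The status names treated as terminal below ("Complete", "Failed", "Canceled") are assumptions, as is the idea that a caller supplies a `fetch_status` function that queries the job and returns its status string; consult API JobGroups Get v4 for the actual values.

```python
import time

# ASSUMPTION: these are the status values at which a job stops changing.
TERMINAL_STATUSES = {"Complete", "Failed", "Canceled"}


def wait_for_job(fetch_status, job_group_id, interval_seconds=5.0, max_attempts=60):
    """Poll a job until it reaches a terminal status, then return that status.

    fetch_status is any callable that takes the job id and returns the
    current value of the job's status field.
    """
    for _ in range(max_attempts):
        status = fetch_status(job_group_id)
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval_seconds)
    raise TimeoutError(
        f"job {job_group_id} did not finish after {max_attempts} checks"
    )
```

Injecting `fetch_status` keeps the loop independent of how the HTTP call is made, so it can be exercised with a stub before pointing it at a real server.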
Suppose you want to create another file-based output for this outputobject. You can create a second writesettings object, which publishes the results of the job run on the recipe to the specified location.
The following example creates settings for generating a Parquet-based output.
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/writesettings/ | |
---|---|---|
Authentication | Required | |
Method | POST | |
Request Body |
|
Response:
Status Code | 201 - Created | |
---|---|---|
Response Body |
|
cURL example:
curl -X POST \
  http://www.wrangle-dev.example.com:3005/v4/writesettings \
  -H 'authorization: Basic <auth_token>' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
    "delim": ",",
    "path": "hdfs://hadoop:50070/trifacta/queryResults/admin@example.com/POS_r03.pqt",
    "action": "create",
    "format": "pqt",
    "compression": "none",
    "header": false,
    "asSingleFile": false,
    "prefix": null,
    "suffix": "_increment",
    "hasQuotes": false,
    "outputObject": { "id": 4 }
  }'
Checkpoint: You've added a new writesettings object and associated it with your outputobject (id=4).
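A standalone writesettings body differs from the embedded one mainly in that it names its parent outputobject. The sketch below reproduces the fields from the cURL example; `build_write_settings` is a hypothetical helper name.

```python
import json


def build_write_settings(output_object_id, path, file_format):
    """Return a writesettings body attached to an existing outputobject."""
    return {
        "delim": ",",
        "path": path,
        "action": "create",
        "format": file_format,
        "compression": "none",
        "header": False,
        "asSingleFile": False,
        "prefix": None,
        "suffix": "_increment",
        "hasQuotes": False,
        "outputObject": {"id": output_object_id},
    }


parquet_settings = build_write_settings(
    4,
    "hdfs://hadoop:50070/trifacta/queryResults/admin@example.com/POS_r03.pqt",
    "pqt",
)
print(json.dumps(parquet_settings, indent=2))
```

The printed JSON is what would be posted to the writesettings endpoint listed above.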
To generate a publication, you must identify the connection through which you are publishing the results.
Below, the request returns a single connection to Hive (id=1).
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/connections |
---|---|
Authentication | Required |
Method | GET |
Request Body | None. |
Response:
Status Code | 200 - OK | |
---|---|---|
Response Body |
|
cURL example:
curl -X GET \
  http://www.wrangle-dev.example.com:3005/v4/connections \
  -H 'authorization: Basic <auth_token>' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json'
For more information, see API Connections Get List v4.
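When the response lists more than one connection, the publishing connection has to be selected explicitly. The sketch below picks a connection by vendor and refuses to guess when the match is missing or ambiguous. The response shape (a top-level "data" list) and the "vendor" field name are assumptions, since the response body is not shown here; `pick_connection` is a hypothetical helper.

```python
def pick_connection(response_body, vendor):
    """Return the id of the single connection for the given vendor.

    ASSUMPTION: connections are wrapped in a top-level "data" list and
    each record has "id" and "vendor" fields.
    """
    matches = [
        item
        for item in response_body.get("data", [])
        if item.get("vendor") == vendor
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one {vendor!r} connection, found {len(matches)}"
        )
    return matches[0]["id"]
```

Raising an error instead of returning the first match avoids publishing through the wrong connection when several exist for the same vendor.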
You can create publications that publish table-based outputs through specified connections. In the following, a Hive table is written out to the default database through connectionId=1. This publication is associated with the outputObject id=4.
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/publications | |
---|---|---|
Authentication | Required | |
Method | POST | |
Request Body |
|
Response:
Status Code | 201 - Created | |
---|---|---|
Response Body |
|
cURL example:
curl -X POST \
  http://www.wrangle-dev.example.com:3005/v4/publications \
  -H 'authorization: Basic <auth_token>' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
    "path": [ "default" ],
    "tableName": "myPublishedHiveTable",
    "targetType": "hive",
    "action": "create",
    "outputObject": { "id": 4 },
    "connection": { "id": 1 }
  }'
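The publication body ties together the two identifiers gathered earlier: the outputobject and the connection. A minimal sketch that mirrors the fields in the cURL example follows; `build_publication` and its default argument values are hypothetical conveniences.

```python
import json


def build_publication(output_object_id, connection_id, table_name,
                      database="default", target_type="hive", action="create"):
    """Return a publication body for a table-based output."""
    return {
        "path": [database],
        "tableName": table_name,
        "targetType": target_type,
        "action": action,
        "outputObject": {"id": output_object_id},
        "connection": {"id": connection_id},
    }


publication = build_publication(4, 1, "myPublishedHiveTable")
print(json.dumps(publication, indent=2))
```

The printed JSON is what would be posted to the publications endpoint listed above.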
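Checkpoint: You're done.
You have done the following: created an outputobject for the recipe with an embedded writesettings object that generates an Avro output, added a second writesettings object that generates a Parquet output, and added a publication that writes a Hive table through connection id=1.
You can now generate results for these three different outputs whenever you run a job (create a jobgroup) for the associated recipe.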