After you have run a job to generate results, you can publish those results to different targets as needed. This section describes how to automate those publishing steps through the APIs.
NOTE: This workflow applies to re-publishing job results after you have already generated them.
NOTE: After you have generated results and written them to one target, you cannot publish to the same target again. You must configure the outputs to specify a different format and location and then run a new job.
In the application, you can publish after generating results. See Publishing Dialog.
For each target, you must have the required access to create a connection to it. After a connection has been created, it can be reused, so you may find it easier to create these connections through the application.
NOTE: Connections created through the application must be created through the Connections page, which is used for creating read/write connections. Do not create these connections through the Import Data page. See Connections Page.
Before you publish results to a different datastore, you must generate results and store them in HDFS.
NOTE: To produce some output formats, you must run the job on the Spark running environment.
In the examples below, the following values are assumed:
Identifier | Value |
---|---|
jobId | 2 |
flowId | 3 |
wrangledDatasetId (also flowNodeId) | 10 |
For more information on running a job, see API JobGroups Create v4.
For more information on the publishing endpoint, see API JobGroups Put Publish v4.
The following example publishes the Avro results from the specified job (jobId = 2) to the test_table table in the default Hive schema through connectionId=1.
NOTE: To publish to Hive, the target database is predefined in the connection object.
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/jobGroups/2/publish |
---|---|
Authentication | Required |
Method | PUT |
Request Body | See the sketch below. |
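A minimal sketch of what the request body for this Hive publish might look like. The field names (connection.id, path, table, action, inputFormat, flowNodeId) and the action value create are assumptions here; confirm the exact schema in API JobGroups Put Publish v4.

```json
{
  "connection": {
    "id": 1
  },
  "path": ["default"],
  "table": "test_table",
  "action": "create",
  "inputFormat": "avro",
  "flowNodeId": 10
}
```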
Response:
Status Code | 200 - OK |
---|---|
Response Body | |
The following example publishes the TDE results from the specified job (jobId = 2) to the test_table3 table in the default Tableau Server database through connectionId=3.
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/jobGroups/2/publish |
---|---|
Authentication | Required |
Method | PUT |
Request Body | See the sketch below. |
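A minimal sketch of what the request body for this Tableau Server publish might look like, under the same assumed field names; the inputFormat value tde is also an assumption.

```json
{
  "connection": {
    "id": 3
  },
  "path": ["default"],
  "table": "test_table3",
  "action": "create",
  "inputFormat": "tde",
  "flowNodeId": 10
}
```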
Response:
Status Code | 200 - OK |
---|---|
Response Body | |
The following example publishes the Avro results from the specified job (jobId = 2) to the test_table2 table in the public Redshift schema through connectionId=2.
NOTE: To publish to Redshift, the target database is predefined in the connection object.
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/jobGroups/2/publish |
---|---|
Authentication | Required |
Method | PUT |
Request Body | See the sketch below. |
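A minimal sketch of what the request body for this Redshift publish might look like, under the same assumed field names, with the public schema supplied as the path.

```json
{
  "connection": {
    "id": 2
  },
  "path": ["public"],
  "table": "test_table2",
  "action": "create",
  "inputFormat": "avro",
  "flowNodeId": 10
}
```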
Response:
Status Code | 200 - OK |
---|---|
Response Body | |
The following example publishes the Parquet results from the specified job (jobId = 2) to the test_table4 table in the dbo SQL DW database through connectionId=4.
Request:
Endpoint | http://www.wrangle-dev.example.com:3005/v4/jobGroups/2/publish |
---|---|
Authentication | Required |
Method | PUT |
Request Body | See the sketch below. |
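A minimal sketch of what the request body for this SQL DW publish might look like, under the same assumed field names; the inputFormat value for Parquet results (shown here as pqt) is an assumption as well.

```json
{
  "connection": {
    "id": 4
  },
  "path": ["dbo"],
  "table": "test_table4",
  "action": "create",
  "inputFormat": "pqt",
  "flowNodeId": 10
}
```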
Response:
Status Code | 200 - OK |
---|---|
Response Body | |
When you are publishing results to a relational source, you can apply overrides to the job to redirect the output or change the action applied to the target table. For more information, see API Workflow - Run Job.
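As a rough illustration only, a publishing override in a run-job request might look like the following sketch; the overrides.publications structure and every field name in it are assumptions here, so confirm the actual schema in API Workflow - Run Job.

```json
{
  "wrangledDataset": {
    "id": 10
  },
  "overrides": {
    "publications": [
      {
        "path": ["default"],
        "tableName": "test_table",
        "targetType": "hive",
        "action": "createAndLoad",
        "connectionId": 1
      }
    ]
  }
}
```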