On April 28, 2021, Google is changing the required permissions for attaching IAM roles to service accounts. If you are using IAM roles for your Google service accounts, please see Changes to User Management.

   

For the latest updates on available API endpoints and documentation, see api.trifacta.com.

Feature Availability: This feature is available in Cloud Dataprep Premium by TRIFACTA® INC.


This section describes how to run a plan using the APIs available in Cloud Dataprep by TRIFACTA® INC.

  • A plan is a scheduled sequence of tasks based on a trigger that you define. 
    • When a plan is executed via API, the request is the trigger, and the plan is executed immediately. 
  • Plans can be designed in the Trifacta application. For more information, see Plans Page.
  • For more information on plans in general, see Overview of Operationalization.

A note about API URLs:

In the listed examples, URLs are referenced in the following manner:

<protocol>://<platform_base_url>/

In your product, these references map to the following:

<http or https>://<hostname>:<port_number>/

For more information, see API Reference.
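
As a minimal sketch of the mapping above, the helper below assembles the base URL from its parts. The function name `build_base_url` is hypothetical, and treating the port as optional is an assumption for deployments that use the default port.

```python
from typing import Optional


def build_base_url(protocol: str, hostname: str, port: Optional[int] = None) -> str:
    """Compose <protocol>://<hostname>:<port_number>/ from its parts."""
    if protocol not in ("http", "https"):
        raise ValueError("protocol must be 'http' or 'https'")
    # Assumption: the port may be omitted when the default port is in use.
    netloc = hostname if port is None else f"{hostname}:{port}"
    return f"{protocol}://{netloc}/"


# example.com is a placeholder hostname, not a real platform address.
print(build_base_url("https", "example.com", 443))
```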

Pre-requisites

Before you begin, you should verify the following:

  1. Get authentication credentials. As part of each request, you must pass in authentication credentials to the platform. 

    Tip: The recommended method is to use an access token, which can be generated from the Trifacta application. For more information, see Access Tokens Page.

    For more information, see  Cloud Dataprep by TRIFACTA INC. API Reference docs: Premium | Standard

  2. Verify plan and its flows and outputs:
    1. You must create a plan first. See Plan View Page.
    2. As part of creating that plan, you must verify that all referenced flows and output objects are properly defined and can be executed independently. 

      NOTE: In a flow, all recipes that you wish to have executed by the corresponding task must have a defined output object. For each output object, you must create at least one write settings or publication object. During plan runs, these objects are not validated, and tasks fail without them.

    3. Any applicable parameters are applied to the tasks at the time of execution. To override parameter values for an individual run, see Step - Run Plan with Overrides below.
    4. See Flow View Page. 
  3. Verify plan execution. Run the desired plan through the Trifacta application and verify that the output objects are properly generated. See Plan View Page.
  4. Acquire plan identifier. In Plan View, acquire the numeric value for the plan from the URL. In the following, the plan Id is 1234:

    http://<platform_base_url>/plans/1234
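
If you script this step, the plan identifier can be pulled out of a copied Plan View URL. The following is a sketch only; `plan_id_from_url` is a hypothetical helper, and it assumes the URL path contains `/plans/<number>` as in the example above.

```python
import re
from urllib.parse import urlparse


def plan_id_from_url(plan_view_url: str) -> int:
    """Return the numeric plan Id found after /plans/ in a Plan View URL."""
    path = urlparse(plan_view_url).path
    match = re.search(r"/plans/(\d+)", path)
    if match is None:
        raise ValueError(f"no plan Id found in URL path: {path!r}")
    return int(match.group(1))


# example.com stands in for <platform_base_url>.
print(plan_id_from_url("http://example.com/plans/1234"))
```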

Step - Run Plan

Through the APIs, you can run a plan. Construct a request like the following, where:

  • <id> is the plan identifier that you already extracted from the Plan View URL.
Endpoint: <protocol>://<platform_base_url>/v4/plans/<id>/run
Authentication: Required
Method: POST
Request Body: None.
Response Code: 201 - Created
Response Body:
{
    "validationStatus": "Valid",
    "planSnapshotRunId": 2
}

If the 201 response code is returned, then the plan has been queued for execution. 

Tip: Retain the planSnapshotRunId value in the response. In the above, 2 is the internal identifier for the plan run, which is referenced via the generated snapshot of the corresponding flows in the plan's tasks. You will need this value to check on your plan run status.

For more information, see  Cloud Dataprep by TRIFACTA INC. API Reference docs: Premium | Standard


Checkpoint: You have queued your plan for execution.
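
The request above might be built in Python roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and it assumes the access token is passed as a Bearer token in the Authorization header. Confirm the authentication scheme against the API Reference for your edition.

```python
import json
import urllib.request


def build_run_plan_request(base_url: str, plan_id: int, access_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that queues a plan run."""
    url = f"{base_url.rstrip('/')}/v4/plans/{plan_id}/run"
    return urllib.request.Request(
        url,
        method="POST",
        # Assumption: access tokens are sent as Bearer tokens.
        headers={"Authorization": f"Bearer {access_token}"},
    )


def plan_snapshot_run_id(response_body: str) -> int:
    """Extract planSnapshotRunId from the JSON body of a 201 response."""
    return json.loads(response_body)["planSnapshotRunId"]


request = build_run_plan_request("https://example.com", 1234, "<access_token>")
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen`, check that the response status is 201, and hand the decoded body to `plan_snapshot_run_id`. Keeping request construction separate from sending makes the request easy to inspect before anything is queued.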

Step - Run Plan with Overrides

When you run your plan, you can apply overrides to any of the parameters that are sourced in flow tasks within the plan. Overrides are applied in the request body when submitting to the plan run API endpoint.

Endpoint: <protocol>://<platform_base_url>/v4/plans/<id>/run
Authentication: Required
Method: POST
Request Body:
{
  "planNodeOverrides": [
    {
      "handle": "ax",
      "overrideKey": "region",
      "value": {
        "variable": {
          "value": "02"
        }
      }
    },
    {
      "handle": "cq",
      "overrideKey": "state",
      "value": "CA"
    }
  ]
}
Response Code: 201 - Created
Response Body:
{
    "validationStatus": "Valid",
    "planSnapshotRunId": 2
}
Request Body Attributes:

handle: The identifier for the task node in Plan View. In the Trifacta application, tasks are labeled in the following format:

<task_type>-<handleId>

where:

  • <task_type> is a string literal:
    • flowtask denotes a flow task.
    • http denotes an HTTP task.
  • <handleId> is a lowercase identifier for the task. The value must consist of at least two lowercase letters and must be unique among the tasks of the plan. Use this value as the handle value.

Tip: To retrieve this value, select the task in Plan View. The label is listed at the top of the task icon.

overrideKey: The name of the parameter to override.

value: The override value to apply to the parameter. This value can be specified as a String value or as a JSON object. Both forms appear in the request body above.
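
To illustrate the task label format, the sketch below extracts the handle from a label such as `flowtask-ax`. The helper name and the sample label are hypothetical; the pattern encodes only the rules stated in this section (a task type of flowtask or http, followed by at least two lowercase letters).

```python
import re

# <task_type>-<handleId>, where the handle is two or more lowercase letters.
TASK_LABEL = re.compile(r"(flowtask|http)-([a-z]{2,})")


def handle_from_task_label(label: str) -> str:
    """Return the handle portion of a Plan View task label."""
    match = TASK_LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not a valid task label: {label!r}")
    return match.group(2)


print(handle_from_task_label("flowtask-ax"))
```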

If the 201 response code is returned, then the plan has been queued for execution. 

Tip: Retain the planSnapshotRunId value in the response. In the above, 2 is the internal identifier for the plan run, which is referenced via the generated snapshot of the corresponding flows in the plan's tasks. You will need this value to check on your plan run status.

For more information, see  Cloud Dataprep by TRIFACTA INC. API Reference docs: Premium | Standard
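
One way to assemble the override request in Python is sketched below. The helper names are hypothetical, and the Bearer Authorization header and the application/json Content-Type header are assumptions to verify against the API Reference. The sample values reproduce the request body shown earlier in this step.

```python
import json
import urllib.request


def node_override(handle: str, override_key: str, value) -> dict:
    """One entry of planNodeOverrides; value may be a string or a JSON object."""
    return {"handle": handle, "overrideKey": override_key, "value": value}


def build_override_request(base_url: str, plan_id: int, access_token: str, overrides) -> urllib.request.Request:
    """Build (but do not send) a plan run request that carries overrides."""
    body = json.dumps({"planNodeOverrides": list(overrides)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v4/plans/{plan_id}/run",
        data=body,
        method="POST",
        headers={
            # Assumptions: Bearer token authentication and a JSON content type.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


overrides = [
    node_override("ax", "region", {"variable": {"value": "02"}}),
    node_override("cq", "state", "CA"),
]
request = build_override_request("https://example.com", 1234, "<access_token>", overrides)
print(request.data.decode("utf-8"))
```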

Step - Monitoring Your Plan Run

You can monitor the status of your plan run through the following endpoint, where:

  • <id> is the plan snapshot identifier for your run that you retained from the previous step.
Endpoint: <protocol>://<platform_base_url>/v4/planSnapshotRuns/<id>
Authentication: Required
Method: GET
Request Body: None.
Response Code: 200 - Ok
Response Body:
{
    "id": 2,
    "status": "InProgress",
    "scheduleHistoryId": null,
    "startedAt": "2020-04-23T17:53:33.466Z",
    "finishedAt": null,
    "submittedAt": "2020-04-23T17:53:32.993Z",
    "executionId": null,
    "createdAt": "2020-04-23T17:53:33.312Z",
    "updatedAt": "2020-04-23T17:53:33.499Z",
    "plan": {
        "id": 1
    }
}

When the plan run has successfully completed, the returned status message includes the following:

"status": "Complete",

For more information, see Cloud Dataprep by TRIFACTA INC. API Reference docs: Premium | Standard

You can also review your plan runs through the Trifacta application at the following URL:

<protocol>://<platform_base_url>/plans/<planId>/runs/<planSnapshotRunId>


Tip: You have executed the plan run. Results have been delivered to the designated output locations.
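
Monitoring can be automated with a simple polling loop, sketched below. Only the InProgress and Complete statuses come from this page. The "Failed" value in `failure_statuses`, the polling interval, the helper names, and the Bearer header are assumptions to adjust for your environment.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_plan_run(base_url: str, run_id: int, access_token: str) -> dict:
    """GET /v4/planSnapshotRuns/<id> and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/v4/planSnapshotRuns/{run_id}",
        method="GET",
        headers={"Authorization": f"Bearer {access_token}"},  # assumed scheme
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_plan_run(
    fetch: Callable[[], dict],
    poll_seconds: float = 10.0,
    max_polls: int = 60,
    failure_statuses: tuple = ("Failed",),  # assumption: not listed on this page
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the run reports Complete; raise on failure or timeout."""
    for _ in range(max_polls):
        run = fetch()
        status = run["status"]
        if status == "Complete":
            return run
        if status in failure_statuses:
            raise RuntimeError(f"plan run {run.get('id')} ended with status {status}")
        sleep(poll_seconds)
    raise TimeoutError(f"plan run did not complete after {max_polls} polls")
```

A typical call wraps the fetch in a lambda, for example `wait_for_plan_run(lambda: fetch_plan_run(base_url, 2, token))`. Passing the fetch function in as a parameter keeps the loop testable without a live platform.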

Step - Add Flow Messages

Feature Availability: This feature is available in the following editions:

  • Cloud Dataprep Standard by TRIFACTA INC.
  • Cloud Dataprep Premium by TRIFACTA INC.

You can configure flow webhooks and email notifications to deliver to stakeholders through the individual flows that are referenced in your plans. 

NOTE: These features may require enablement and configuration in your environment.

For more information on these messaging types, see Overview of Operationalization.

A flow webhook is a REST API-based message that is triggered on the success or failure of generating an output from a flow. When the output referenced in a plan is generated, any webhook messages for the output are also triggered.

NOTE: You can define the equivalent of a webhook in your plan. HTTP tasks execute similar requests to a flow webhook and are an integrated part of plans. For more information, see Create HTTP Task.

Some uses:

  • You can configure webhooks to deliver messages for each output referenced in the flow. Based on the schedule for your flow, you can review these messages to determine if the flow executed properly. 
  • You can configure a final output in the final task that is executed after the upstream recipes in the same flow. 
    • All of the upstream recipes in the flow feed into a final recipe, which generates an unused output. 
  • When you create a flow webhook based on this final output, you can send a message that the final task has been executed.
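
On the receiving side, a flow webhook is an HTTP request delivered to an endpoint that you host. The sketch below is a minimal receiver built on the Python standard library. The payload schema is not described on this page, so the body is treated as opaque JSON. The handler name, port, and 204 response are all assumptions.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def decode_webhook_body(raw: bytes) -> dict:
    """Parse a webhook body as JSON; wrap anything else so nothing is lost."""
    try:
        payload = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return {"raw": raw.decode("utf-8", errors="replace")}
    return payload if isinstance(payload, dict) else {"payload": payload}


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes that the sender declared.
        length = int(self.headers.get("Content-Length", 0))
        message = decode_webhook_body(self.rfile.read(length))
        print("webhook received:", message)
        self.send_response(204)  # acknowledge with an empty response
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Call `serve()` to start listening, then point the flow webhook at the host and port where this process runs.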

An email notification is an email that is sent through the configured SMTP server to stakeholders based on the successful or failed execution of an output. You can define email notifications for your individual flows, and these messages get delivered as part of the flow execution that is part of the plan. 

Tip: When an email notification is sent as part of task execution, the internal plan identifier is included as part of the message.

For more information on email notification, see Manage Flow Notifications Dialog.
