API Task - Deploy a Flow

Overview

In this task, you learn how to deploy a flow from a development instance to a production instance of the platform. After you have created and finished a flow in a Development (Dev) instance, you can deploy it to a Production (Prod) instance, an environment designed primarily for production execution of jobs for finished flows. For more information on managing these deployments, see Overview of Deployment Manager.

Prerequisites

Finished flow: This example assumes that you have finished development of a flow with the following characteristics:

  • Single dataset imported from a table through a Redshift connection

  • Single JSON output

Separate Dev and Prod instances: Although it is possible to deploy flows to the same instance in which they are developed, this example assumes that you are deploying from a Dev instance to a completely separate Prod instance. The following implications apply:

  • Separate user accounts to access Dev (User1) and Prod (Admin2) instances.

    Tip

    You should do all of your recipe development and testing in Dev/Test. Avoid making changes in a Prod environment.

    Note

    Although these are separate user accounts, the assumption is that the same admin-level user is using these accounts through the APIs.

  • New connections must be created in the Prod instance to access the production version of the database table.

Task

In this example, your environment contains separate Dev and Prod instances, each of which has a different set of users.

| Item | Dev | Prod |
| --- | --- | --- |
| Environment | http://wrangle-dev.example.com:3005 | http://wrangle-prod.example.com:3005 |
| User | User1 | Admin2 |
| Source DB | devWrangleDB | prodWrangleDB |
| Source Table | Dev-Orders | Prod-Orders |
| Connection Name | Dev Redshift Conn | Prod Redshift Conn |

Tip

Dev environment work can be done through the UI, which may be easier.

Note

User1 has no access to Prod.

Example Flow:

User1 creates a flow, which is used to wrangle weekly batches of orders for the enterprise. The flow contains:

  • A single imported dataset that is created from a Redshift database table.

  • A single recipe that modifies the imported dataset.

  • A single output to a JSON file.

  • Production data is hosted in a different Redshift database, so the Prod connection is different from the Dev connection.

Steps:

  1. Build in Dev instance: User1 creates the flow and iterates on building the recipe and running jobs until a satisfactory output can be generated in JSON format.

  2. Export: When User1 is ready to push the flow to production, User1 exports the flow and downloads the export package ZIP file to the local desktop.

  3. Deploy to Prod instance:

    1. Admin2 creates a new deployment in the Prod instance.

    2. Admin2 creates a new connection (Prod Redshift Conn) in the Prod instance.

    3. Admin2 creates new import rules in the Prod instance to map from the old connection (Dev Redshift Conn) to the new one (Prod Redshift Conn).

    4. Admin2 uploads the export ZIP package.

  4. Test deployment: Through Flow View in the Prod instance, Admin2 runs a job and verifies that the results are as expected.

  5. Set schedule: Using cron, Admin2 sets a schedule to run the active release for this deployment once per week.

    1. Each week, the Prod-Orders table must be refreshed with data.

    2. The dataset is now operational in the Prod environment.

Step - Get Flow Id

The first general step is for the Dev user (User1) to get the flowId and export the flow from the Dev instance.

Steps:

Tip

If it's easier, you can gather the flowId from the user interface in Flow View. In the following example, the flowId is 21:

http://www.wrangle-dev.example.com:3005/flows/21

  1. Through the APIs, you can list the flows using the following call:

    Endpoint

    http://www.wrangle-dev.example.com:3005/v4/flows

    Authentication

    Required

    Method

    GET

    Request Body

    None.

  2. The response should be status code 200 - OK with a response body like the following:

    {
        "data": [
            {
                "id": 21,
                "name": "Intern Training",
                "description": null,
                "createdAt": "2019-01-08T18:14:37.851Z",
                "updatedAt": "2019-01-08T18:57:26.824Z",
                "creator": {
                    "id": 2
                },
                "updater": {
                    "id": 2
                },
                "folder": {
                    "id": 1
                },
                "workspace": {
                    "id": 1
                }
            },
            {
                "id": 19,
                "name": "example Flow",
                "description": null,
                "createdAt": "2019-01-08T17:25:21.392Z",
                "updatedAt": "2019-01-08T17:30:30.959Z",
                "creator": {
                    "id": 2
                },
                "updater": {
                    "id": 2
                },
                "folder": {
                    "id": 4
                },
                "workspace": {
                    "id": 1
                }
            }
        ]
    }
  3. Retain the flow identifier (21) for later use.

Note

You have identified the flow to export.

For more information, see https://api.trifacta.com/ee/9.7/index.html#operation/listFlows
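Given the listFlows response shape above, a short helper can pull the flowId for a flow by name. This is an illustrative sketch, not part of the product API; `find_flow_id` is a hypothetical helper name, and the sample body is abridged from the response shown above:

```python
def find_flow_id(list_flows_response: dict, flow_name: str) -> int:
    """Return the id of the first flow whose name matches flow_name.

    Expects the response body of GET /v4/flows, i.e. {"data": [...]}.
    """
    for flow in list_flows_response.get("data", []):
        if flow.get("name") == flow_name:
            return flow["id"]
    raise KeyError(f"no flow named {flow_name!r}")

# Abridged sample mirroring the response body above:
response_body = {
    "data": [
        {"id": 21, "name": "Intern Training"},
        {"id": 19, "name": "example Flow"},
    ]
}
flow_id = find_flow_id(response_body, "Intern Training")
```

Issuing the GET request itself can be done with any HTTP client; only the response parsing is shown here.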

Step - Export a Flow

Export the flow to your local desktop.

Tip

This step may be easier to do through the UI in the Dev instance.

Steps:

  1. Export flowId=21:

    Endpoint

    http://www.wrangle-dev.example.com:3005/v4/flows/21/package

    Authentication

    Required

    Method

    GET

    Request Body

    None.

  2. The response should be status code 200 - OK. The response body is the exported flow package itself (a ZIP file).

  3. Download and save this file to your local desktop. Let's assume that the filename you choose is flow-WrangleOrders.zip.

For more information, see https://api.trifacta.com/ee/9.7/index.html#operation/getFlowPackage
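The export call is a plain GET whose response body is the package bytes. As a sketch, assuming the example base URL from this task (`flow_package_url` is an illustrative helper, not a product API):

```python
def flow_package_url(base_url: str, flow_id: int) -> str:
    """Build the export endpoint: GET /v4/flows/{flowId}/package."""
    return f"{base_url}/v4/flows/{flow_id}/package"

url = flow_package_url("http://www.wrangle-dev.example.com:3005", 21)
# The response body is the ZIP package itself; save the raw response
# bytes to a local file such as flow-WrangleOrders.zip.
```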

Step - Create Deployment

In the Prod environment, you can create the deployment from which you can manage the new flow. Note that the following information has changed for this environment:

| Item | Prod env value |
| --- | --- |
| userId | Admin2 |
| baseURL | http://www.wrangle-prod.example.com:3005 |

Steps:

  1. Through the APIs, you can create a deployment using the following call:

    Endpoint

    http://www.wrangle-prod.example.com:3005/v4/deployments

    Authentication

    Required

    Note

    Username and password credentials must be submitted for the Admin2 account.

    Method

    POST

    Request Body

    {
        "name": "Production Orders"
    }
  2. The response should be status code 201 - Created with a response body like the following:

    {
        "id": 3,
        "name": "Production Orders",
        "updatedAt": "2017-11-27T23:48:54.340Z",
        "createdAt": "2017-11-27T23:48:54.340Z",
        "creator": {
            "id": 1
        },
        "updater": {
            "id": 1
        }
    }
  3. Retain the deploymentId (3) for later use.

For more information, see https://api.trifacta.com/ee/9.7/index.html#operation/createDeployment
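The create-and-retain sequence above can be sketched as follows; the helper names are illustrative, and the response sample is abridged from the body shown in step 2:

```python
import json

def create_deployment_request(base_url: str, name: str):
    """Assemble the POST /v4/deployments call as (url, JSON body)."""
    return f"{base_url}/v4/deployments", json.dumps({"name": name})

def deployment_id_from(response_body: dict) -> int:
    """Retain the deploymentId from the 201 - Created response."""
    return response_body["id"]

url, body = create_deployment_request(
    "http://www.wrangle-prod.example.com:3005", "Production Orders")
dep_id = deployment_id_from({"id": 3, "name": "Production Orders"})
```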

Step - Create Connection

When a flow is exported, its connections are not included in the export. Before you import the flow into a new environment:

  • Connections must be created or recreated in the Prod environment. In some cases, you may need to point to production versions of the data contained in completely different databases.

  • Rules must be created to remap the connection to use in the imported flow.

This section and the next one step through these processes.

Steps:

  1. From the Dev environment, you collect the connection information for the flow:

    Endpoint

    http://www.wrangle-dev.example.com:3005/v4/connections

    Authentication

    Required

    Note

    Username and password credentials must be submitted for the User1 account.

    Method

    GET

    Request Body

    None.

  2. The response should be status code 200 - OK with a response body like the following:

    {
        "data": [
            {
                "id": 9,
                "host": "dev-redshift.example.com",
                "port": 5439,
                "vendor": "redshift",
                "params": {
                    "connectStrOpts": "",
                    "defaultDatabase": "devWrangleDB",
                    "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
                },
                "ssl": false,
                "vendorName": "redshift",
                "name": "Dev Redshift Conn",
                "description": "",
                "type": "jdbc",
                "isGlobal": true,
                "credentialType": "iamRoleArn",
                "credentialsShared": true,
                "uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
                "disableTypeInference": false,
                "createdAt": "2017-11-21T00:55:50.770Z",
                "updatedAt": "2017-11-21T00:55:50.770Z",
                "credentials": [
                    {
                        "user": "devDBuser"
                    }
                ],
                "creator": {
                    "id": 2
                },
                "updater": {
                    "id": 2
                },
                "workspace": {
                    "id": 1
                }
            }
        ],
        "count": {
            "owned": 1,
            "shared": 0,
            "count": 1
        }
    }
  3. You retain the above information for use in Production.

  4. In the Prod environment, you create the new connection using the following call:

    Endpoint

    http://www.wrangle-prod.example.com:3005/v4/connections

    Authentication

    Required

    Note

    Username and password credentials must be submitted for the Admin2 account.

    Method

    POST

    Request Body

    {
         "host": "prod-redshift.example.com",
         "port": 5439,
         "vendor": "redshift",
         "params": {
           "connectStrOpts": "",
           "defaultDatabase": "prodWrangleDB",
           "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
         },
         "vendorName": "redshift",
         "name": "Prod Redshift Conn",
         "description": "",
         "isGlobal": true,
         "type": "jdbc",
         "ssl": false,
         "credentialType": "iamRoleArn",
         "credentials": [
            {
              "username": "prodDBUser",
              "password": "<password>",
              "iamRoleArn": "iam:aws:12345"
            }
         ]
    }
  5. The response should be status code 201 - Created with a response body like the following:

    {
      "id": 12,
      "host": "prod-redshift.example.com",
      "port": 5439,
      "vendor": "redshift",
         "params": {
           "connectStrOpts": "",
           "defaultDatabase": "prodWrangleDB",
           "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
         },
      "ssl": false,
      "name": "Prod Redshift Conn",
      "description": "",
      "type": "jdbc",
      "isGlobal": true,
      "credentialType": "iamRoleArn",
      "credentialsShared": true,
      "uuid": "fa7e06c0-0143-11e8-8faf-27c0392328c5",
      "disableTypeInference": false,
      "createdAt": "2018-01-24T20:20:11.181Z",
      "updatedAt": "2018-01-24T20:20:11.181Z",
      "credentials": [
          {
              "username": "prodDBUser"
          }
      ],
      "creator": {
          "id": 2
      },
      "updater": {
          "id": 2
      }
    }
  6. When you query the /v4/connections endpoint again, you can retrieve the connectionId for this connection. In this case, the connectionId value is 12, as shown in the id field of the response above.

See https://api.trifacta.com/ee/9.7/index.html#operation/createConnection
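One way to keep the Dev and Prod connections aligned is to derive the Prod creation payload from the Dev connection record collected in step 1, swapping in only the production-specific values. A sketch, using the example hosts, database names, and credentials from this task (substitute your own; the helper name is illustrative):

```python
def prod_connection_payload(dev_conn: dict) -> dict:
    """Clone the Dev connection settings for the Prod instance,
    overriding the host, database, name, and credentials."""
    return {
        "host": "prod-redshift.example.com",
        "port": dev_conn["port"],  # 5439, the Redshift default
        "vendor": dev_conn["vendor"],
        "vendorName": dev_conn["vendorName"],
        # Same load parameters, but pointed at the production database.
        "params": dict(dev_conn["params"], defaultDatabase="prodWrangleDB"),
        "name": "Prod Redshift Conn",
        "description": "",
        "type": dev_conn["type"],
        "ssl": dev_conn["ssl"],
        "isGlobal": True,
        "credentialType": dev_conn["credentialType"],
        "credentials": [{
            "username": "prodDBUser",
            "password": "<password>",
            "iamRoleArn": "iam:aws:12345",
        }],
    }

# Abridged Dev connection record from the step 2 response:
dev_conn = {
    "port": 5439, "vendor": "redshift", "vendorName": "redshift",
    "params": {"connectStrOpts": "", "defaultDatabase": "devWrangleDB",
               "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"},
    "type": "jdbc", "ssl": False, "credentialType": "iamRoleArn",
}
payload = prod_connection_payload(dev_conn)
```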

Step - Create Import Rules

Now that you have defined the connection to use to acquire the production data from within the production environment, you must create an import rule to remap from the Dev connection to the Prod connection within the flow definition. This rule is applied during the import process to ensure that the flow is working after it has been imported.

In this case, you must remap the uuid value for the Dev connection, which is written into the flow definition, to the connectionId value of the new connection in the Prod instance.

For more information on import rules, see API Task - Define Deployment Import Mappings.

Steps:

  1. From the Dev environment, you collect the connection information for the flow. This is the same call described in the previous step: a GET against http://www.wrangle-dev.example.com:3005/v4/connections, authenticated as User1, with no request body.

  2. The response should be status code 200 - OK with the same response body shown in the previous step, which includes the uuid of the Dev connection.
  3. From the above information, you retain the following, which uniquely identifies the connection object, regardless of the instance to which it belongs:

    "uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
  4. Against the Prod environment, you now create an import mapping rule:

    Endpoint

    http://www.wrangle-prod.example.com:3005/v4/deployments/3/objectImportRules

    Authentication

    Required

    Method

    PATCH

    Request Body:

    [
        {
            "tableName": "connections",
            "onCondition": {"uuid": "b8014610-ce56-11e7-9739-27deec2c3249"},
            "withCondition": {"id": 12}
        }
    ]
  5. The response should be status code 200 - OK with a response body like the following:

    {
        "deleted": []
    }

    Since the method is a PATCH, you are updating the rules set that applies to all imports for this deployment. In this case, there were no pre-existing rules, so the response indicates that nothing was deleted. If another set of import rules is submitted, then the one you just created is deleted.

See https://api.trifacta.com/ee/9.7/index.html#operation/updateObjectImportRules

See https://api.trifacta.com/ee/9.7/index.html#operation/updateValueImportRules
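When you script deployments, the object import rule shown above can be generated programmatically; `connection_import_rule` is an illustrative helper name, not part of the product API:

```python
import json

def connection_import_rule(dev_uuid: str, prod_connection_id: int) -> list:
    """Build the objectImportRules body: match the Dev connection by
    its uuid and remap it to the Prod connectionId."""
    return [{
        "tableName": "connections",
        "onCondition": {"uuid": dev_uuid},
        "withCondition": {"id": prod_connection_id},
    }]

rules = connection_import_rule("b8014610-ce56-11e7-9739-27deec2c3249", 12)
body = json.dumps(rules)  # submit as the PATCH request body
```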

Step - Import Package to Create Release

You are now ready to import the package to create the release.

Steps:

  1. Against the Prod environment, you now import the package:

    Endpoint

    http://www.wrangle-prod.example.com:3005/v4/deployments/3/releases

    Authentication

    Required

    Method

    POST

    Request Body

    The request body must include the following key and value combination submitted as form data:

    | key | value |
    | --- | --- |
    | data | "@path-to-flow-WrangleOrders.zip" |

  2. The response should be status code 201 - Created with a response body like the following:

    {
        "importRuleChanges": {
            "object": [{"tableName":"connections","onCondition":{"uuid": "b8014610-ce56-11e7-9739-27deec2c3249"},"withCondition":{"id":12}}],
            "value": []
        },
        "flowName": "Wrangle Orders"
    }

See https://api.trifacta.com/ee/9.7/index.html#operation/importPackageForDeployment
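The import call is a multipart upload: the ZIP package travels under the form key data. A sketch of assembling the pieces (the helper name is illustrative; opening and sending the file is left to your HTTP client of choice):

```python
def import_release_request(base_url: str, deployment_id: int,
                           package_path: str):
    """Assemble the POST /v4/deployments/{id}/releases import call:
    the endpoint URL plus the form-data mapping. The package file must
    be submitted under the key 'data'."""
    url = f"{base_url}/v4/deployments/{deployment_id}/releases"
    form = {"data": package_path}  # send as multipart form data
    return url, form

url, form = import_release_request(
    "http://www.wrangle-prod.example.com:3005", 3, "flow-WrangleOrders.zip")
```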

Step - Activate Release

When a package is imported into a release, the release is automatically set as the active release for the deployment. If, at some point in the future, you need to change the active release, you can use the following endpoint to do so.

Steps:

  1. Against the Prod environment, use the following endpoint:

    Endpoint

    http://www.wrangle-prod.example.com:3005/v4/releases/5

    Authentication

    Required

    Method

    PATCH

    Request Body

    {
        "active": true
    }
  2. The response should be status code 200 - OK with a response body like the following:

    {
        "id": 3,
        "updater": {
            "id": 3
        },
        "updatedAt": "2017-11-28T00:06:12.147Z"
    }

See https://api.trifacta.com/ee/9.7/index.html#operation/patchRelease
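Scripted, the activation call reduces to a one-line PATCH body; the helper name below is illustrative:

```python
import json

def activate_release_request(base_url: str, release_id: int):
    """Assemble the PATCH /v4/releases/{id} call that makes this
    release the active one for its deployment."""
    return f"{base_url}/v4/releases/{release_id}", json.dumps({"active": True})

url, body = activate_release_request(
    "http://www.wrangle-prod.example.com:3005", 5)
```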

Step - Run Deployment

You can now execute a test run of the deployment to verify that the job executes properly.

Note

When you run a deployment, you run the primary flow in the active release for that deployment. Running the flow generates the output objects for all recipes in the flow.

Note

For datasets with parameters, you can apply parameter overrides through the request body through the following API call. For more information, see https://api.trifacta.com/ee/9.7/index.html#operation/runDeployment

Steps:

  1. Against the Prod environment, use the following endpoint:

    Endpoint

    http://www.wrangle-prod.example.com:3005/v4/deployments/3/run

    Authentication

    Required

    Method

    POST

    Request Body

    None.

  2. The response should be status code 201 - Created with a response body like the following:

    {
        "data": [
            {
                "reason": "JobStarted",
                "sessionId": "dd6a90e0-c353-11e7-ad4e-7f2dd2ae4621",
                "id": 33
            }
        ]
    }

See https://api.trifacta.com/ee/9.7/index.html#operation/runDeployment
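When scripting the test run, the useful piece of the 201 response is the set of started job ids, which you can check on afterward. A parsing sketch (`started_job_ids` is an illustrative helper; the sample mirrors the response body in step 2):

```python
def started_job_ids(run_response: dict) -> list:
    """Collect the ids of jobs reported as started by runDeployment."""
    return [item["id"] for item in run_response.get("data", [])
            if item.get("reason") == "JobStarted"]

ids = started_job_ids({"data": [{
    "reason": "JobStarted",
    "sessionId": "dd6a90e0-c353-11e7-ad4e-7f2dd2ae4621",
    "id": 33,
}]})
```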

Step - Iterate

If you need to make changes to fix issues related to running the job:

  • Recipe changes should be made in the Dev environment and then passed through export and import of the flow into the Prod deployment.

  • Connection issues:

    • Check Flow View in the Prod instance to see if there are any red dots on the objects in the package. If so, your import rules need to be fixed.

    • Verify that you can import data through the connection.

  • Output problems could be related to permissions on the target location.

Step - Set up Production Schedule

When you are satisfied with how the production version of your flow is working, you can set up periodic schedules using a third-party tool to execute the job on a regular basis.

The tool must call the Run Deployment endpoint and then verify that the output has been properly generated.
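As a sketch of the scheduling side: the scheduler only needs to issue the same run call once per week and then check the output. The URL builder, the script path, and the crontab entry below are illustrative assumptions, not part of the product:

```python
def run_deployment_url(base_url: str, deployment_id: int) -> str:
    """The endpoint the scheduled job must call each week."""
    return f"{base_url}/v4/deployments/{deployment_id}/run"

url = run_deployment_url("http://www.wrangle-prod.example.com:3005", 3)

# Example crontab entry (hypothetical script path), running every
# Monday at 06:00 after the Prod-Orders table has been refreshed:
#   0 6 * * 1  /usr/bin/python3 /opt/jobs/run_weekly_orders.py
```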