Overview

In this workflow, you learn how to deploy a flow from a development instance to a production instance of the platform. After you have created and finished a flow in a Development (Dev) instance, you can deploy it to an environment designed primarily for production execution of jobs for finished flows (a Prod instance). For more information on managing these deployments, see Overview of Deployment Management.

Prerequisites

Finished flow: This example assumes that you have finished development of a flow with the following characteristics:

  • Single dataset imported from a table through a Redshift connection
  • Single JSON output

Separate Dev and Prod instances: Although it is possible to deploy flows to the same instance in which they are developed, this example assumes that you are deploying from a Dev instance to a completely separate Prod instance. The following implications apply:

  • Separate user accounts to access Dev (User1) and Prod (Admin2) instances.

    Tip: You should do all of your recipe development and testing in Dev/Test. Avoid making changes in a Prod environment.

    NOTE: Although these are separate user accounts, the assumption is that the same admin-level user is using these accounts through the APIs.

  • New connections must be created in the Prod instance to access the production version of the database table. 

Workflow

In this example, your environment contains separate Dev and Prod instances, each of which has a different set of users.

Item             Dev                                   Prod
Environment      http://wrangle-dev.example.com:3005   http://wrangle-prod.example.com:3005
User             User1                                 Admin2
Source DB        devWrangleDB                          prodWrangleDB
Source Table     Dev-Orders                            Prod-Orders
Connection Name  Dev Redshift Conn                     Prod Redshift Conn

Tip: Dev environment work can be done through the UI, which may be easier.

NOTE: User1 has no access to the Prod instance.

 

Example Flow:

User1 is creating a flow, which is used to wrangle weekly batches of orders for the enterprise. The flow contains:

  • A single imported dataset that is created from a Redshift database table.
  • A single recipe that modifies the imported dataset.
  • A single output to a JSON file.

Production data is hosted in a different Redshift database, so the Prod connection is different from the Dev connection.

Steps:

  1. Build in Dev instance: User1 creates the flow and iterates on building the recipe and running jobs until a satisfactory output can be generated in JSON format.
  2. Export: When User1 is ready to push the flow to production, User1 exports the flow and downloads the export package ZIP file to the local desktop.
  3. Deploy to Prod instance: 
    1. Admin2 creates a new deployment in the Prod instance.
    2. Admin2 creates a new connection (Prod Redshift Conn) in the Prod instance.
    3. Admin2 creates new import rules in the Prod instance to map from the old connection (Dev Redshift Conn) to the new one (Prod Redshift Conn).
    4. Admin2 uploads the export ZIP package.
  4. Test deployment: Through Flow View in the Prod instance, Admin2 runs a job. The results look fine.
  5. Set schedule: Using cron, Admin2 sets a schedule to run the active release for this deployment once per week.
    1. Each week, the Prod-Orders table must be refreshed with data.
    2. The dataset is now operational in the Prod environment.
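The steps above map onto the v3 API endpoints used in the rest of this page. As a minimal Python sketch of the sequence (the base URLs, flow id 21, and deployment id 3 are the example values used throughout this walkthrough):

```python
# Sketch of the API call sequence for promoting a flow from Dev to Prod.
DEV = "http://www.wrangle-dev.example.com:3005"
PROD = "http://www.wrangle-prod.example.com:3005"

CALL_SEQUENCE = [
    ("GET",   DEV + "/v3/flows"),                             # find the flowId
    ("GET",   DEV + "/v3/flows/21/package"),                  # export the flow
    ("POST",  PROD + "/v3/deployments"),                      # create the deployment
    ("POST",  PROD + "/v3/connections"),                      # create the Prod connection
    ("PATCH", PROD + "/v3/deployments/3/objectImportRules"),  # remap the connection
    ("POST",  PROD + "/v3/deployments/3/releases"),           # import the package
    ("POST",  PROD + "/v3/deployments/3/run"),                # test-run the deployment
]
```

Each call is detailed, with its request and response bodies, in the sections that follow.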

Step - Get Flow Id

The first general step is for the Dev user (User1) to get the flowId and export the flow from the Dev instance.

Steps:

Tip: If it's easier, you can gather the flowId from the user interface in Flow View. In the following example, the flowId is 21:

http://www.wrangle-dev.example.com:3005/flows/21
  1. Through the APIs, you can retrieve the list of flows using the following call:

    Endpoint        http://www.wrangle-dev.example.com:3005/v3/flows
    Authentication  Required
    Method          GET
    Request Body    None.

  2. The response should be status code 200 - OK with a response body like the following:

    [
        {
            "id": 21,
            "name": "Wrangle Orders",
            "description": null,
            "deleted_at": null,
            "cpProject": null,
            "createdAt": "2017-11-27T18:19:12.763Z",
            "updatedAt": "2017-11-27T18:19:12.763Z",
            "createdBy": 2,
            "updatedBy": 2,
            "associatedPeople": [
                {
                    "outputHomeDir": "/trifacta-hdp26/queryResults/user1@example.com",
                    "name": "User 1",
                    "email": "user1@example.com",
                    "id": 2,
                    "flowpermission": {
                        "flowId": 21,
                        "personId": 2,
                        "role": "owner"
                    }
                }
            ]
        },
        {
            "id": 19,
            "name": "example Flow",
            "description": null,
            "deleted_at": null,
            "cpProject": null,
            "createdAt": "2017-11-15T23:00:24.263Z",
            "updatedAt": "2017-11-15T23:00:24.263Z",
            "createdBy": 2,
            "updatedBy": 2,
            "associatedPeople": [
                {
                    "outputHomeDir": "/trifacta-hdp26/queryResults/user1@example.com",
                    "name": "User 1",
                    "email": "user1@example.com",
                    "id": 2,
                    "flowpermission": {
                        "flowId": 19,
                        "personId": 2,
                        "role": "owner"
                    }
                }
            ]
        }
    ]
  3. Retain the flow identifier (21) for later use.
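If you script this step, you can extract the flowId from the response by flow name. A minimal Python sketch over an abbreviated copy of the response above:

```python
import json

# Abbreviated copy of the /v3/flows response shown above.
flows_response = json.loads("""
[
    {"id": 21, "name": "Wrangle Orders"},
    {"id": 19, "name": "example Flow"}
]
""")

def find_flow_id(flows, name):
    """Return the id of the first flow whose name matches, or None."""
    for flow in flows:
        if flow["name"] == name:
            return flow["id"]
    return None

print(find_flow_id(flows_response, "Wrangle Orders"))  # 21
```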

Checkpoint: You have identified the flow to export.

For more information, see API Flows Get v3.

Step - Export a Flow

Export the flow to your local desktop.

Tip: This step may be easier to do through the UI in the Dev instance.

Steps:

  1. Export flowId=21:

    Endpoint        http://www.wrangle-dev.example.com:3005/v3/flows/21/package
    Authentication  Required
    Method          GET
    Request Body    None.

  2. The response should be status code 200 - OK. The response body is the flow itself. 

  3. Download and save this file to your local desktop. Let's assume that the filename you choose is flow-WrangleOrders.zip.
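If you script the export instead, a sketch along these lines may help; the requests usage, credentials, and filename below are assumptions based on the example values:

```python
def package_url(base_url, flow_id):
    """Build the flow export endpoint for a given flow id."""
    return f"{base_url}/v3/flows/{flow_id}/package"

# Hypothetical usage (requires network access and valid credentials):
# import requests
# resp = requests.get(package_url("http://www.wrangle-dev.example.com:3005", 21),
#                     auth=("user1@example.com", "<password>"))
# with open("flow-WrangleOrders.zip", "wb") as f:
#     f.write(resp.content)
```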

For more information, see API Flows Package Get v3.

Step - Create Deployment

In the Prod environment, you can create the deployment from which you can manage the new flow. Note that the following information has changed for this environment:

Item     Prod env value
userId   Admin2
baseURL  http://www.wrangle-prod.example.com:3005

Steps:

  1. Through the APIs, you can create a deployment using the following call:

    Endpoint        http://www.wrangle-prod.example.com:3005/v3/deployments
    Authentication  Required

    NOTE: Username and password credentials must be submitted for the Admin2 account.

    Method          POST
    Request Body
    {
        "name": "Production Orders"
    }
  2. The response should be status code 201 - Created with a response body like the following:

    {
        "id": 3,
        "name": "Production Orders",
        "createdBy": 1,
        "updatedBy": 1,
        "updatedAt": "2017-11-27T23:48:54.340Z",
        "createdAt": "2017-11-27T23:48:54.340Z"
    }
  3. Retain the deploymentId (3) for later use.

For more information, see API Deployments Create v3.

Step - Create Connection

When a flow is exported, its connections are not included in the export. Before you import the flow into a new environment:

  • Connections must be created or recreated in the Prod environment. In some cases, you may need to point to production versions of the data contained in completely different databases.
  • Rules must be created to remap the connection to use in the imported flow.

This section and the following one step through these processes.

Steps:

  1. From the Dev environment, you collect the connection information for the flow: 

    Endpoint        http://www.wrangle-dev.example.com:3005/v3/connections
    Authentication  Required

    NOTE: Username and password credentials must be submitted for the User1 account.

    Method          GET
    Request Body    None.

  2. The response should be status code 200 - OK with a response body like the following:

    {
        "data": [
            {
                "connectParams": {
                    "vendor": "redshift",
                    "host": "dev-redshift.example.com",
                    "port": "5439",
                    "defaultDatabase": "devWrangleDB",
                    "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
                },
                "id": 9,
                "host": "dev-redshift.example.com",
                "port": 5439,
                "vendor": "redshift",
                "params": {
                    "connectStrOpts": "",
                    "defaultDatabase": "devWrangleDB",
                    "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
                },
                "ssl": false,
                "name": "Dev Redshift Conn",
                "description": "",
                "type": "jdbc",
                "createdBy": 1,
                "isGlobal": true,
                "credentialType": "custom",
                "credentialsShared": true,
                "uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
                "createdAt": "2017-11-21T00:55:50.770Z",
                "updatedAt": "2017-11-21T00:55:50.770Z",
                "updatedBy": 2,
                "credentials": [
                    {
                        "user": "devDBuser"
                    }
                ]
            }
        ],
        "count": {
            "owned": 1,
            "shared": 0,
            "count": 1
        }
    }
  3. You retain the above information for use in Production.

  4. In the Prod environment, you create the new connection using the following call:

    Endpoint        http://www.wrangle-prod.example.com:3005/v3/connections
    Authentication  Required

    NOTE: Username and password credentials must be submitted for the Admin2 account.

    Method          POST
    Request Body
    {
         "name": "Redshift Conn Prod",
         "description": "",
         "isGlobal": true,
         "type": "jdbc",
         "host": "prod-redshift.example.com",
     "port": 5439,
         "vendor": "redshift",
         "params": {
           "connectStrOpts": "",
           "defaultDatabase": "prodWrangleDB",
           "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
         },
         "ssl": false,
         "credentialType": "custom",
         "credentials": [
            {
              "username": "prodDBUser",
              "password": "<password>"
            }
         ]
    }
  5. The response should be status code 201 - Created with a response body like the following:

    {
      "host": "prod-redshift.example.com",
      "port": 5439,
      "vendor": "redshift",
      "params": {
        "connectStrOpts": "",
        "defaultDatabase": "prodWrangleDB",
        "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
      },
      "ssl": false,
      "name": "Redshift Conn Prod",
      "description": "",
      "type": "jdbc",
      "isGlobal": true,
      "credentialType": "custom",
      "credentialsShared": true,
      "credentials": [
        {
          "username": "prodDBUser"
        }
      ]
    }
  6. When you hit the /v3/connections endpoint again, you can retrieve the connectionId for this connection. In this case, let's assume that the connectionId value is 12.
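Because the Prod connection largely mirrors the Dev connection, you can derive its request body from the Dev connection object returned by /v3/connections. A sketch, where the helper name and the hard-coded Prod values are assumptions based on this example:

```python
# Abbreviated Dev connection settings, from the /v3/connections response above.
dev_conn = {
    "vendor": "redshift",
    "type": "jdbc",
    "ssl": False,
    "params": {
        "connectStrOpts": "",
        "defaultDatabase": "devWrangleDB",
        "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS",
    },
}

def prod_connection_body(dev, name, host, port, database, username, password):
    """Copy the Dev connection settings, swapping in Prod-specific values."""
    params = dict(dev["params"], defaultDatabase=database)
    return {
        "name": name,
        "description": "",
        "isGlobal": True,
        "type": dev["type"],
        "vendor": dev["vendor"],
        "host": host,
        "port": port,
        "params": params,
        "ssl": dev["ssl"],
        "credentialType": "custom",
        "credentials": [{"username": username, "password": password}],
    }

body = prod_connection_body(dev_conn, "Redshift Conn Prod",
                            "prod-redshift.example.com", 5439,
                            "prodWrangleDB", "prodDBUser", "<password>")
```

The resulting dictionary matches the request body shown in step 4.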

See API Connections Create v3.

Step - Create Import Rules

Now that you have defined the connection to use to acquire the production data from within the production environment, you must create an import rule to remap from the Dev connection to the Prod connection within the flow definition. This rule is applied during the import process to ensure that the flow is working after it has been imported.

In this case, you must remap the uuid value for the Dev connection, which is written into the flow definition, to the connectionId value from the Prod instance.

For more information on import rules, see Define Import Mapping Rules.

Steps:

  1. From the Dev environment, you collect the connection information for the flow: 

    Endpoint        http://www.wrangle-dev.example.com:3005/v3/connections
    Authentication  Required

    NOTE: Username and password credentials must be submitted for the User1 account.

    Method          GET
    Request Body    None.

  2. The response should be status code 200 - OK with a response body like the following:

    {
        "data": [
            {
                "connectParams": {
                    "vendor": "redshift",
                    "host": "dev-redshift.example.com",
                    "port": "5439",
                    "defaultDatabase": "devWrangleDB",
                    "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
                },
                "id": 9,
                "host": "dev-redshift.example.com",
                "port": 5439,
                "vendor": "redshift",
                "params": {
                    "connectStrOpts": "",
                    "defaultDatabase": "devWrangleDB",
                    "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
                },
                "ssl": false,
                "name": "Dev Redshift Conn",
                "description": "",
                "type": "jdbc",
                "createdBy": 1,
                "isGlobal": true,
                "credentialType": "custom",
                "credentialsShared": true,
                "uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
                "createdAt": "2017-11-21T00:55:50.770Z",
                "updatedAt": "2017-11-21T00:55:50.770Z",
                "updatedBy": 2,
                "credentials": [
                    {
                        "user": "devDBuser"
                    }
                ]
            }
        ],
        "count": {
            "owned": 1,
            "shared": 0,
            "count": 1
        }
    }
  3. From the above information, you retain the following, which uniquely identifies the connection object, regardless of the instance to which it belongs:

    "uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
  4. Against the Prod environment, you now create an import mapping rule:

    Endpoint        http://www.wrangle-prod.example.com:3005/v3/deployments/3/objectImportRules
    Authentication  Required
    Method          PATCH
    Request Body
    [{"tableName":"connections","onCondition":{"uuid": "b8014610-ce56-11e7-9739-27deec2c3249"},"withCondition":{"id":12}}]
  5. The response should be status code 200 - OK with a response body like the following:

    {
        "deleted": []
    }

    Since the method is a PATCH, you are updating the rules set that applies to all imports for this deployment. In this case, there were no pre-existing rules, so the response indicates that nothing was deleted. If another set of import rules is submitted, then the one you just created is deleted.
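If you generate the rule body in code, a minimal sketch (the helper name is an assumption; the uuid and connectionId are the example values):

```python
def connection_import_rule(dev_uuid, prod_connection_id):
    """Build the objectImportRules body that remaps one connection."""
    return [{
        "tableName": "connections",
        "onCondition": {"uuid": dev_uuid},
        "withCondition": {"id": prod_connection_id},
    }]

rules = connection_import_rule("b8014610-ce56-11e7-9739-27deec2c3249", 12)
```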

See API Deployments Object Import Rules Patch v3.

See API Deployments Value Import Rules Patch v3.

Step - Import Package to Create Release

You are now ready to import the package to create the release.

Steps:

  1. Against the Prod environment, you now import the package:

    Endpoint        http://www.wrangle-prod.example.com:3005/v3/deployments/3/releases
    Authentication  Required
    Method          POST
    Request Body

    The request body must include the following key and value combination submitted as form data:

    key     value
    data    "@path-to-flow-WrangleOrders.zip"
  2. The response should be status code 201 - Created with a response body like the following:

    {
        "importRuleChanges": {
            "object": [{"tableName":"connections","onCondition":{"uuid": "b8014610-ce56-11e7-9739-27deec2c3249"},"withCondition":{"id":12}}],
            "value": []
        },
        "flowName": "Wrangle Orders"
    }
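Scripted, the form-data upload might look like the following sketch; the requests usage and Admin2 credentials are assumptions, and only the endpoint builder is exercised here:

```python
def releases_endpoint(base_url, deployment_id):
    """Build the release-import endpoint for a deployment."""
    return f"{base_url}/v3/deployments/{deployment_id}/releases"

# Hypothetical usage (requires network access and Admin2 credentials):
# import requests
# with open("flow-WrangleOrders.zip", "rb") as f:
#     resp = requests.post(
#         releases_endpoint("http://www.wrangle-prod.example.com:3005", 3),
#         auth=("admin2@example.com", "<password>"),
#         files={"data": f},  # the "data" key carries the export package
#     )
```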

See API Releases Create v3.

Step - Activate Release

When a package is imported into a release, that release is automatically set as the active release for the deployment. If you need to change the active release at some point in the future, you can use the following endpoint to do so.

Steps:

  1. Against the Prod environment, use the following endpoint:

    Endpoint        http://www.wrangle-prod.example.com:3005/v3/releases/5
    Authentication  Required
    Method          PATCH
    Request Body
    {
        "active": true
    }
  2. The response should be status code 200 - OK with a response body like the following:

    {
        "id": 3,
        "updatedBy": 3,
        "updatedAt": "2017-11-28T00:06:12.147Z"
    }

See API Releases Patch v3.

Step - Run Deployment

You can now execute a test run of the deployment to verify that the job executes properly.

NOTE: When you run a deployment, you run the primary flow in the active release for that deployment. Running the flow generates the output objects for all recipes in the flow.

Steps:

  1. Against the Prod environment, use the following endpoint:

    Endpoint        http://www.wrangle-prod.example.com:3005/v3/deployments/3/run
    Authentication  Required
    Method          POST
    Request Body    None.

  2. The response should be status code 201 - Created with a response body like the following:

    {
        "data": [
            {
                "reason": "JobStarted",
                "sessionId": "dd6a90e0-c353-11e7-ad4e-7f2dd2ae4621",
                "id": 33,
                "jobs": {
                    "data": [
                        {
                            "id": 68
                        },
                        {
                            "id": 69
                        },
                        {
                            "id": 70
                        }
                    ]
                }
            }
        ]
    }
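To monitor the test run, you can collect the job ids from this response and poll them for completion. A minimal Python sketch over an abbreviated copy of the response above:

```python
import json

# Abbreviated copy of the /v3/deployments/3/run response shown above.
run_response = json.loads("""
{"data": [{"reason": "JobStarted", "id": 33,
           "jobs": {"data": [{"id": 68}, {"id": 69}, {"id": 70}]}}]}
""")

def job_ids(response):
    """Flatten the job ids from every flow run in the response."""
    return [job["id"]
            for run in response["data"]
            for job in run["jobs"]["data"]]

print(job_ids(run_response))  # [68, 69, 70]
```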

See API Deployments Run v3.

Step - Iterate

If you need to make changes to fix issues related to running the job:

  • Recipe changes should be made in the Dev environment and then passed through export and import of the flow into the Prod deployment.
  • Connection issues:
    • Check Flow View in the Prod instance to see if there are any red dots on the objects in the package. If so, your import rules need to be fixed. 
    • Verify that you can import data through the connection.
  • Output problems could be related to permissions on the target location.

Step - Set up Production Schedule

When you are satisfied with how the production version of your flow is working, you can set up periodic schedules using a third-party tool to execute the job on a regular basis. 

The tool must hit the Run Deployment endpoint and then verify that the output has been properly generated.
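For example, a hypothetical crontab entry for the Admin2 account that runs the deployment once per week (Monday at 02:00); the curl flags and the credential placeholder shown here are assumptions, not values from this walkthrough:

```shell
# m h dom mon dow  command
0 2 * * 1  curl -s -X POST -u admin2:<password> \
    http://www.wrangle-prod.example.com:3005/v3/deployments/3/run
```

A production scheduler should also capture the response and check the returned job ids for successful completion.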
