Install for Docker

This guide steps through the process of acquiring and deploying a Docker image of the Designer Cloud Powered by Trifacta platform in your Docker environment. Optionally, you can build the Docker image locally, which enables further configuration options.

Deployment Scenario

  • Designer Cloud Powered by Trifacta Enterprise Edition deployed into a customer-managed environment: On-premises, AWS, or Azure.

  • PostgreSQL 12.3 or MySQL 5.7 installed either:

    • Locally

    • On a remote server

  • On-premises Hadoop:

    • Connected to a supported Hadoop cluster.

    • Kerberos integration is supported.

Limitations

Note

For Docker installs and upgrades, only the dependencies for the latest supported version of each supported major Hadoop distribution are available for use after upgrade. For more information on the supported versions, please see the hadoop-deps directory in the installer. Dependencies for versions other than those available on the installer are not supported.

  • You cannot upgrade to a Docker image from a non-Docker deployment.

  • You cannot switch an existing installation to a Docker image.

  • Only supported Cloudera distributions can be used. See Supported Deployment Scenarios for Cloudera.

  • The base storage layer of the platform must be S3 or ABFSS.

  • High availability for the Designer Cloud Powered by Trifacta platform in Docker is not supported.

  • SSO integration is not supported.

Requirements

Orchestration is supported through Docker Compose only.

  • Docker version 17.12 or later. The Docker version must be compatible with the version of Docker Compose below.

  • Docker Compose 1.24.1. This version must be compatible with your version of Docker.
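
Before proceeding, you can confirm the versions installed on the Docker host from a shell:

# Verify the installed Docker and Docker Compose versions
docker --version
docker-compose --version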

Infrastructure

Before you begin a Dockerized install, please verify that your enterprise infrastructure meets the following integration requirements.

Note

The Docker image contains all components and requirements for the Trifacta node for the appropriate infrastructure. In the following pages, you should verify connectivity, account permissions, cluster and datastore availability, and other aspects of connecting the Trifacta node to your infrastructure resources.

For each type of infrastructure, see the corresponding installation documentation:

  • On-premises Hadoop: Install On-Premises

  • AWS: Install for AWS

  • Azure: Install for Azure

Docker Daemon

  • CPU cores: 8 CPU minimum, 16 CPU recommended

  • Available RAM: 64 GB minimum, 128 GB recommended

Database client

Installation or upgrade of the product in a Dockerized environment requires installation of the appropriate database client on the Trifacta node.

  • PostgreSQL 12.3: The database client is included as part of the image and is automatically installed.

  • MySQL 5.7: The database client must be downloaded and installed by the customer. It is not available in the Docker image. The database client must be referenced through the Docker image file.

Note

Before you perform an upgrade of your deployment that connects to a MySQL database, please contact Alteryx Customer Success and Services.

Preparation

  1. Review the Browser Requirements in the Planning Guide.

    Note

    Designer Cloud Powered by Trifacta Enterprise Edition requires the installation of a supported browser on each desktop.

  2. Acquire your License Key.

Acquire Image

You can acquire the latest Docker image using one of the following methods:

  1. Acquire from FTP site.

  2. Build your own Docker image.

Acquire from FTP site

Steps:

  1. Download the following files from the FTP site:

    1. trifacta-docker-setup-bundle-x.y.z.tar

    2. trifacta-docker-image-x.y.z.tar

      Note

      x.y.z refers to the version number (e.g. 6.4.0).

  2. Untar the setup-bundle file:

    tar xvf trifacta-docker-setup-bundle-x.y.z.tar
  3. Files are extracted into a docker folder. Key files:

    • docker-compose-local-postgres.yaml: Runtime configuration file for the Docker image when PostgreSQL is to be running on the same machine. More information is provided below.

    • docker-compose-local-mysql.yaml: Runtime configuration file for the Docker image when MySQL is to be running on the same machine. More information is provided below.

    • docker-compose-remote-db.yaml: Runtime configuration file for the Docker image when the database is deployed on a remote server. Note: You must manage this instance of the database. More information is provided below.

    • docker-compose-remote-db-postgres-s3.yaml: Runtime configuration file for the Docker image when the Postgres database is deployed in AWS, and Designer Cloud Powered by Trifacta Enterprise Edition is configured for S3 + EMR.

    • docker-compose-remote-db-postgres-databricks-adls.yaml: Runtime configuration file for the Docker image when the Postgres database is deployed in Azure, and Designer Cloud Powered by Trifacta Enterprise Edition is configured for ADLS Gen2 + Databricks.

    • README-running-trifacta-container-aws.md: Instructions for running the Alteryx container on AWS. Note: These instructions are referenced later in this task.

    • README-running-trifacta-container-azure.md: Instructions for running the Alteryx container on Azure. Note: These instructions are referenced later in this task.

    • README-building-trifacta-container.md: Instructions for building the Alteryx container. Note: This file does not apply if you are using the provided Docker image.

  4. Load the Docker image into your local Docker environment:

    docker load < trifacta-docker-image-x.y.z.tar
  5. Confirm that the image has been loaded. Execute the following command, which should list the Docker image:

    docker images
  6. You can now configure the Docker image. Skip the next section and continue with Configure Docker Image.

Build your own Docker image

As needed, you can build your own Docker image.

Requirements

  • Docker version 17.12 or later. The Docker version must be compatible with the version of Docker Compose below.

  • Docker Compose 1.24.1. This version must be compatible with your version of Docker.

Build steps

  1. Acquire the RPM file from the FTP site:

    Note

    You must acquire the el7 RPM file for this release.

  2. In your Docker environment, copy the trifacta-server*.rpm file to the same level as the Dockerfile.

  3. Verify that the docker-files folder and its contents are present.

  4. Use the following command to build the image:

    docker build -t trifacta/server-enterprise:latest .
  5. This process could take about 10 minutes. When it is complete, you should see the built image in the Docker list of local images.

    Note

    To reduce the size of the Docker image, the Dockerfile installs the trifacta-server RPM file in one stage and then copies over the results to the final stage. The RPM is not actually installed in the final stage. All of the files are properly located.

  6. You can now configure the Docker image.

Configure Docker Image

Before you start the Docker container, you should review the properties for the Docker image. Open the appropriate docker-compose file:

  • docker-compose-local-postgres.yaml: Database properties in this file are pre-configured to work with the installed instance of PostgreSQL, although you may wish to change some of the properties for security reasons.

  • docker-compose-local-mysql.yaml: Database properties in this file are pre-configured to work with the installed instance of MySQL, although you may wish to change some of the properties for security reasons.

  • docker-compose-remote-db.yaml: The Alteryx databases are to be installed on a remote server that you manage. Note: Additional configuration is required.

  • docker-compose-remote-db-postgres-s3.yaml: The Alteryx databases are to be installed on a remote Postgres server in AWS that you manage.

  • docker-compose-remote-db-postgres-databricks-adls.yaml: The Alteryx databases are to be installed on a remote Postgres server in Azure that you manage.

Note

You may want to create a backup of the file before you edit it.

Key general properties:

Note

Avoid modifying properties that are not listed below.

  • image: This reference must match the name of the image that you have acquired.

  • container_name: Name of the container in your Docker environment.

  • ports: Defines the listening port for the Trifacta Application. Default is 3005. Note: If you must change the listening port, additional configuration is required after the image is deployed. See Change Listening Port.
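
For reference, a minimal sketch of how these general properties might appear in the compose file. The image tag matches the name used in the build step earlier; the container name and port mapping are placeholders, and the layout of your shipped file may differ:

services:
  trifacta:
    # Must match the name shown by "docker images" for the loaded or built image
    image: trifacta/server-enterprise:latest
    # Name of the container in your Docker environment
    container_name: trifacta
    # Host port : container port for the Trifacta Application (default 3005)
    ports:
      - "3005:3005"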

Database properties:

These properties pertain to the database installation to which the Trifacta Application connects.

  • DB_TYPE: Set this value to postgresql or mysql.

  • DB_HOST_NAME: Hostname of the machine hosting the databases. Leave the value as localhost for a local installation.

  • DB_HOST_PORT: (Remote only) Port number to use to connect to the databases. Default is 5432. Note: If you modify this value, additional configuration is required after installation is complete. See Change Database Port in the Databases Guide.

  • DB_ADMIN_USERNAME: Admin username to be used to create DB roles and databases. Modify this value for a remote installation. Note: If you modify this value, additional configuration is required. Please see the documentation for your database version.

  • DB_ADMIN_PASSWORD: Admin password to be used to create DB roles and databases. Modify this value for a remote installation.

  • DB_AZURE_INSTANCE_NAME: Name of the Azure Database for PostgreSQL server. This setting is applicable only when the setup is on Azure with Databricks and ADLS Gen2.
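
As a sketch, these properties are typically set in the environment block of the compose file. The hostname, port, and credentials below are placeholders for a remote PostgreSQL example; verify the exact layout against the shipped file:

environment:
  # postgresql or mysql
  - DB_TYPE=postgresql
  # Hostname of the machine hosting the databases (localhost for a local install)
  - DB_HOST_NAME=db.example.internal
  # Port used to connect to the databases
  - DB_HOST_PORT=5432
  # Admin credentials used to create the database roles and databases
  - DB_ADMIN_USERNAME=dbadmin
  - DB_ADMIN_PASSWORD=<admin-password>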

Kerberos properties:

If your Hadoop cluster is protected by Kerberos, please review the following properties.

  • KERBEROS_KEYTAB_FILE: Full path inside the container where the Kerberos keytab file is located. Default value: /opt/trifacta/conf/trifacta.keytab. Note: The keytab file must be imported and mounted to this location. Configuration details are provided later.

  • KERBEROS_KRB5_CONF: Full path inside the container where the Kerberos krb5.conf file is located. Default: /opt/krb-config/krb5.conf
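
A sketch of how the Kerberos properties and the corresponding mounts might be wired together. The container-side paths are the defaults listed above; the host-side krb5.conf location is an assumption, and the keytab staging path matches the location described later in this guide:

environment:
  - KERBEROS_KEYTAB_FILE=/opt/trifacta/conf/trifacta.keytab
  - KERBEROS_KRB5_CONF=/opt/krb-config/krb5.conf
volumes:
  # Mount the staged keytab and your krb5.conf into the container paths above
  - ./trifacta-data/conf/trifacta.keytab:/opt/trifacta/conf/trifacta.keytab
  - /etc/krb5.conf:/opt/krb-config/krb5.conf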

Hadoop distribution client JARs:

Please enable the appropriate path to the client JAR files for your Hadoop distribution. In the following example, the Cloudera path has been enabled:

# Mount folder from outside for necessary hadoop client jars
# For CDH
- /opt/cloudera:/opt/cloudera

Volume properties:

These properties govern where volumes are mounted in the container.

Note

These values should not be modified unless necessary.

  • volumes.conf: Full path in the container to the Alteryx configuration directory. Default: /opt/trifacta/conf

  • volumes.logs: Full path in the container to the Alteryx logs directory. Default: /opt/trifacta/logs

  • volumes.license: Full path in the container to the Alteryx license directory. Default: /trifacta-license
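
For reference, the volume mappings generally follow this pattern, with the host-side directories created during container setup on the left and the container defaults above on the right (a sketch only; verify against the shipped file before changing anything):

volumes:
  # Alteryx configuration directory
  - ./trifacta-data/conf:/opt/trifacta/conf
  # Alteryx logs directory
  - ./trifacta-data/logs:/opt/trifacta/logs
  # Alteryx license directory (holds license.json)
  - ./trifacta-license:/trifacta-license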

Setup Container

Steps:

  1. After you have performed the above configuration, execute the following to initialize the Docker container directories:

    docker-compose -f <docker-compose-filename>.yaml run --no-deps --rm trifacta initfiles
  2. When the above is started for the first time, the following directories are created on the localhost:

    • ./trifacta-data: Used by the Alteryx container to expose the conf and logs directories.

    • ./trifacta-license: Place the license.json file in this directory.

  3. Generate a .sql file containing the SQL statements that create the users and databases required by the Alteryx services:

    docker-compose -f <docker-compose-filename>.yaml run --no-deps --rm trifacta initdatabase
  4. The following file is created on localhost:

    • ./trifacta-data/db_setup/trifacta_<database_type>_DB_objects.sql: Contains the SQL statements that create the users and databases required by the Alteryx services.

  5. Create users and database:

    • Postgres database:

      docker-compose -f <docker-compose-filename>.yaml run postgresdb sh -c "PGPASSWORD=<DB_ADMIN_PASSWORD> psql --username=<DB_ADMIN_USERNAME> --host=<DB_HOST_NAME> --port=<DB_HOST_PORT> --dbname=postgres -f /opt/trifacta/db_setup/trifacta_<DB_TYPE>_DB_objects.sql"
    • MySQL database:

      docker-compose -f <docker-compose-filename>.yaml run mysqldb sh -c "mysql --host=<DB_HOST_NAME> --port=<DB_HOST_PORT> --user=<DB_ADMIN_USERNAME> --password=<DB_ADMIN_PASSWORD> --database=mysql < /opt/trifacta/db_setup/trifacta_<DB_TYPE>_DB_objects.sql"
  6. Run configuration and database migrations:

    Note

    During installation, the following command also creates required tables in the above databases.

    docker-compose -f <docker-compose-filename>.yaml run --no-deps --rm trifacta run-migrations
  7. Start Alteryx container:

    docker-compose -f <docker-compose-filename>.yaml up -d trifacta

Note

If the Alteryx container is running but nothing is listening at port 3005, please confirm that you have started the container using the appropriate docker-compose commands.
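
As a quick check, you can confirm that the container is up and that the application answers on the listening port (this assumes the default port 3005; adjust if you changed it):

# Confirm that the Alteryx container is running
docker ps

# Confirm that the application responds on the default listening port
curl -I http://localhost:3005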

Import Additional Configuration Files

After you have started the new container, additional configuration files must be imported.

Import license key file

The Alteryx license file must be staged for use by the platform. Stage the file in the following location in the container:

Note

If you are using a non-default path or filename, you must update the <docker-compose-filename>.yaml file.

trifacta-license/license.json
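
For example, assuming the default path and filename, copy the license key file from wherever you downloaded it (the source path below is a placeholder) into the mounted license directory on the host:

# Stage the license key file in the directory mounted into the container
cp /path/to/license.json ./trifacta-license/license.json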

Additional setup for Azure

For more information on setup on Azure using ADLS Gen2 storage, see ADLS Gen2 Access.

Additional setup for Hadoop on-premises

Import Hadoop distribution libraries

If the container you are creating is on the edge node of your Hadoop cluster, you must provide the Hadoop libraries.

  1. You must mount the Hadoop distribution libraries into the container. For more information on the libraries, see the documentation for your Hadoop distribution.

  2. The Docker Compose file must be made aware of these libraries. Details are below.

Import Hadoop cluster configuration files

Some core cluster configuration files from your Hadoop distribution must be provided to the container. These files must be copied into the following directory within the container:

./trifacta-data/conf/hadoop-site
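
For example, if the cluster client configuration is available on the node under /etc/hadoop/conf (an assumption; the location and the exact set of required files vary by distribution and integration), you could stage the files as follows:

# Copy the core cluster configuration files into the container's hadoop-site directory
cp /etc/hadoop/conf/core-site.xml ./trifacta-data/conf/hadoop-site/
cp /etc/hadoop/conf/hdfs-site.xml ./trifacta-data/conf/hadoop-site/
cp /etc/hadoop/conf/yarn-site.xml ./trifacta-data/conf/hadoop-site/
cp /etc/hadoop/conf/mapred-site.xml ./trifacta-data/conf/hadoop-site/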

For more information, see Configure for Hadoop in the Configuration Guide.

Install Kerberos client

If Kerberos is enabled, you must install the Kerberos client and the keytab in the container. Copy the keytab file to the following staging location:

./trifacta-data/conf/trifacta.keytab
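
For example, assuming the keytab has been extracted to a local path on the host (the source path below is a placeholder), stage it and restrict its permissions as a precaution:

# Stage the Kerberos keytab for the container and limit access to it
cp /path/to/trifacta.keytab ./trifacta-data/conf/trifacta.keytab
chmod 600 ./trifacta-data/conf/trifacta.keytab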

See Configure for Kerberos Integration in the Configuration Guide.

Perform configuration changes as necessary

The primary configuration file for the platform is in the following location in the launched container:

trifacta-conf.json

Note

Unless you are comfortable working with this file, you should avoid direct edits to it. All subsequent configuration can be applied from within the application, which supports some forms of data validation. It is possible to corrupt the file using direct edits.

Configuration topics are covered later.

Start and Stop the Container

Stop container

Stops the container but does not destroy it.

Note

Application and local database data is not destroyed. As long as the <docker-compose-filename>.yaml properties point to the correct location of the *-data files, data should be preserved. You can start new containers to use this data, too. Do not change ownership on these directories.

docker-compose -f <docker-compose-filename>.yaml stop

Restart container

Restarts an existing container.

docker-compose -f <docker-compose-filename>.yaml start

Recreate container

Recreates a container using existing local data.

docker-compose -f <docker-compose-filename>.yaml up --force-recreate -d

Stop and destroy the container

Stops the container and destroys it.

Warning

The following also destroys all application configuration, logs, and database data. You may want to back up these directories first.

docker-compose -f <docker-compose-filename>.yaml down

Local PostgreSQL:

sudo rm -rf trifacta-data/ postgres-data/

Local MySQL or remote database:

sudo rm -rf trifacta-data/

Verify Deployment

  1. Verify access to the server where the Designer Cloud Powered by Trifacta platform is to be installed.

  2. Cluster Configuration: Additional steps are required to integrate the Designer Cloud Powered by Trifacta platform with the cluster. See Prepare Hadoop for Integration with the Platform in the Planning Guide.

  3. Start the platform within the container. See Start and Stop the Platform.

Configuration

After installation is complete, additional configuration is required. You can complete this configuration from within the application.

Steps:

  1. Log in to the application. See Login.

  2. The primary configuration interface is the Admin Settings page. From the left menu, select User menu > Admin console > Admin settings. For more information, see Admin Settings Page in the Admin Guide.

  3. In the Admin Settings page, you should do the following:

    1. Configure password criteria. See Configure Password Criteria.

    2. Change the Admin password. See Change Admin Password.

  4. Workspace-level configuration can also be applied. From the left menu, select User menu > Admin console > Workspace settings. For more information, see Workspace Settings Page in the Admin Guide.

Warning

The Designer Cloud Powered by Trifacta platform requires additional configuration for a successful integration with the datastore. Please review and complete the necessary configuration steps. For more information, see Configure in the Configuration Guide.