...

  • You cannot upgrade to a Docker image from a non-Docker deployment.
  • You cannot switch an existing installation to a Docker image.
  • Hadoop integration defaults: The platform can integrate with supported versions of Cloudera only. The provided image is configured to work with CDH 5.15. To change this version, you must build your own Docker image.
  • Supported distributions of Cloudera or Hortonworks:
  • The base storage layer of the platform must be HDFS. Base storage of S3 is not supported.
  • High availability for the platform in Docker is not supported.
  • SSO integration is not supported.

Requirements

...

You can acquire the latest Docker image using one of the following methods:

...


...

...


  1. Acquire from FTP site.
  2. Build your own Docker image.

...

  1. Download the following files from the FTP site:
    1. trifacta-docker-setup-bundle-x.y.z.tar
    2. trifacta-docker-image-x.y.z.tar 

      NOTE: x.y.z refers to the version number (e.g. 6.4.0).


  2. Untar the setup-bundle file:

    Code Block
    tar xvf trifacta-docker-setup-bundle-x.y.z.tar


  3. Files are extracted into a docker folder. Key files:

    | File | Description |
    | --- | --- |
    | docker-compose-local-postgres.yaml | Runtime configuration file for the Docker image when PostgreSQL is to be running on the same machine. More information is provided below. |
    | docker-compose-local-mysql.yaml | Runtime configuration file for the Docker image when MySQL is to be running on the same machine. More information is provided below. |
    | docker-compose-remote-db.yaml | Runtime configuration file for the Docker image when the database is to be accessed from a remote server. NOTE: You must manage this instance of the database. More information is provided below. |
    | README-running-trifacta-container.md | Instructions for running the platform container. NOTE: These instructions are referenced later in this workflow. |
    | README-building-trifacta-container.md | Instructions for building the platform container. NOTE: This file does not apply if you are using the provided Docker image. |


  4. Load the Docker image into your local Docker environment:

    Code Block
    docker load < trifacta-docker-image-x.y.z.tar


  5. Confirm that the image has been loaded. Execute the following command, which should list the Docker image:

    Code Block
    docker images


  6. You can now configure the Docker image. Please skip ahead to that section.
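
The extract, load, and verify steps above can be chained in a small shell helper. This is a minimal sketch, not part of the provided bundle: the function names `bundle_tar`, `image_tar`, and `load_platform_image` are hypothetical, and it assumes both tar files have already been downloaded into the current directory.

```shell
# Hypothetical helpers: build the archive names for a given version string.
bundle_tar() { printf 'trifacta-docker-setup-bundle-%s.tar\n' "$1"; }
image_tar() { printf 'trifacta-docker-image-%s.tar\n' "$1"; }

# Run the extract, load, and verify steps in order; stop at the first failure.
load_platform_image() {
  version="$1"
  tar xvf "$(bundle_tar "$version")" &&
    docker load < "$(image_tar "$version")" &&
    docker images
}

# Usage (after downloading both files into the current directory):
#   load_platform_image 6.4.0
```

Passing the version as an argument keeps the x.y.z placeholder out of the commands themselves, so the same helper can be reused for a later release.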

...

| File | Description |
| --- | --- |
| docker-compose-localpostgres-db.yaml | PostgreSQL will be automatically installed and managed on the local machine. Database properties in this file are pre-configured to work with the installed instance of PostgreSQL, although you may wish to change some of the properties for security reasons. |
| docker-compose-mysql-db.yaml | Database properties in this file are pre-configured to work with the installed instance of MySQL, although you may wish to change some of the properties for security reasons. |
| docker-compose-remote-db.yaml | The platform databases are to be installed on a remote instance of PostgreSQL server that you manage. NOTE: Additional configuration is required. |


...

| Property | Description |
| --- | --- |
| image | This reference must match the name of the image that you have acquired. |
| container_name | Name of the container in your Docker environment. |
| depends_on | (Local only) Set this value to postgresdb to integrate with a PostgreSQL database. |
| ports | Defines the listening port for the web application. Default is 3005. NOTE: If you must change the listening port, additional configuration is required after the image is deployed. See Change Listening Port. For more information, see System Ports. |

...

These properties pertain to the installation of the database to which the web application connects.

| Property | Description |
| --- | --- |
| DB_INIT | If set to true, database initialization steps are performed at startup. NOTE: This step applies only if you are starting the container for the first time, and PostgreSQL databases will be installed locally. |
| DB_TYPE | Set this value to postgresql or mysql. |
| DB_HOST_NAME | Hostname of the machine hosting the databases. Leave value as localhost for local installation. |
| DB_HOST_PORT | (Remote only) Port number to use to connect to the databases. Default is 5432. NOTE: If you are modifying this value, additional configuration is required after installation is complete. See Change Database Port. |
| DB_ADMIN_USERNAME | Admin username to be used to create DB roles/databases. Modify this value for remote installation. NOTE: If you are modifying this value, additional configuration is required. Please see the PostgreSQL documentation for your database version. |
| DB_ADMIN_PASSWORD | Admin password to be used to create DB roles/databases. Modify this value for remote installation. |

Hadoop properties:

If you are integrating the platform with a Hadoop cluster, please review the following properties.

...

Set this value to hdfs.

NOTE: This value must correspond to the setting for the base storage layer. For Docker installation, only HDFS is supported. For more information, see Set Base Storage Layer.

...

Set this value to 5.15.

NOTE: Docker-based installation only integrates with CDH 5.15.

Kerberos properties:

If your Hadoop cluster is protected by Kerberos, please review the following properties.

| Property | Description |
| --- | --- |
| KERBEROS_KEYTAB_FILE | Full path inside of the container where the Kerberos keytab file is located. Default value: /opt/trifacta/conf/trifacta.keytab. NOTE: The keytab file must be imported and mounted to this location. Configuration details are provided later. |
| KERBEROS_KRB5_CONF | Full path inside of the container where the Kerberos krb5.conf file is located. Default: /opt/krb-config/krb5.conf |


Hadoop distribution client JARs:

Please enable the appropriate path to the client JAR files for your Hadoop distribution. In the following example, the Cloudera path has been enabled, and the Hortonworks path has been disabled:

Code Block
     # Mount folder from outside for necessary hadoop client jars
     # For CDH
     - /opt/cloudera:/opt/cloudera
     # For HDP
     #- /usr/hdp:/usr/hdp

Please modify these lines if you are using Hortonworks.
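
Before choosing which mount line to enable, it can help to confirm which client JAR directory actually exists on the Docker host. The following is a minimal sketch, assuming a POSIX shell on the host; the function name `first_existing_dir` is hypothetical, and the two paths in the usage comment are the ones from the volume mounts above.

```shell
# Print the first argument that is an existing directory; return 1 if none is.
first_existing_dir() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Usage on the Docker host, with the two paths from the volume mounts above:
#   first_existing_dir /opt/cloudera /usr/hdp || echo "no Hadoop client JAR directory found"
```

The directory that is printed indicates which of the two mount lines should be left uncommented.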

Volume properties:

These properties govern where volumes are mounted in the container.

...

| Property | Description |
| --- | --- |
| volumes.conf | Full path in container to the platform configuration directory. Default: /opt/trifacta/conf |
| volumes.logs | Full path in container to the platform logs directory. Default: /opt/trifacta/logs |
| volumes.license | Full path in container to the platform license directory. Default: /trifacta-license |
| postgresdb.volumes | Full path in container to the directory where PostgreSQL data is stored. NOTE: This property applies only if you are connecting to the default local instance of PostgreSQL, which is installed with the image. Default: /var/lib/postgresql/data |



Start Server Container

After you have performed the above configuration, execute the following to initialize the Docker container:

...

| Directory | Description |
| --- | --- |
| ./trifacta-data | Used by the platform container to expose the conf and logs directories. |
| ./postgres-data | Used by the PostgreSQL container to expose its data. NOTE: This directory is created only if PostgreSQL is hosted locally. |

Import Additional Configuration Files

...

Code Block
docker-compose -f <docker-compose-filename>.yaml down

Local PostgreSQL:

Code Block
sudo rm -rf trifacta-data/ postgres-data/

Local MySQL or remote database:

Code Block
sudo rm -rf trifacta-data/
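
The two removal commands above differ only in whether postgres-data/ is included. A minimal sketch of a dry-run helper that prints the directories for a given setup, so that they can be reviewed before anything is deleted; the function name `cleanup_dirs` and the mode labels (local-postgres, local-mysql, remote) are hypothetical and are not defined by the Docker bundle.

```shell
# Print the data directories to remove for a given database mode.
# The mode names (local-postgres, local-mysql, remote) are hypothetical labels.
cleanup_dirs() {
  case "$1" in
    local-postgres) printf '%s\n' "trifacta-data/ postgres-data/" ;;
    local-mysql|remote) printf '%s\n' "trifacta-data/" ;;
    *) echo "unknown database mode: $1" >&2; return 1 ;;
  esac
}

# Usage: review the output before deleting anything.
#   cleanup_dirs local-postgres
```

The helper only prints directory names; the actual deletion is still done with the sudo rm -rf commands shown above.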


Verify Deployment

  1. Verify access to the server where the platform is to be installed.

  2. Cluster Configuration: Additional steps are required to integrate the platform with the cluster. See Prepare Hadoop for Integration with the Platform.

  3. Start the platform within the container. See Start and Stop the Platform.

...