Outdated release! Latest docs are Release 8.7: Install Steps

   

After you have prepared your environment for installation, please complete the following steps to install the Trifacta® platform.

Special instructions for Ubuntu installs

These steps manually install the correct and supported version of the following:

  • Node.js
  • NGINX
  • Supervisor

Due to a known issue resolving package dependencies on Ubuntu, please complete the following steps prior to installation of other dependencies or software. 

  1. Log in to the Trifacta node as an administrator.

  2. Execute the following command to install Node.js, NGINX, and Supervisor:

    1. Ubuntu 16.04 (Xenial):

      sudo apt-get install supervisor=3.2.4 nginx=1.12.2-1~xenial nodejs=12.16.1-1nodesource1
    2. Ubuntu 18.04 (Bionic Beaver):

      sudo apt-get install supervisor=3.2.4 nginx=1.14.2-1~bionic nodejs=12.16.1-1nodesource1
  3. Continue with the installation process.
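
The two commands above differ only in the nginx build, so a provisioning script can select the pinned versions from the release codename. The following is a minimal sketch and not part of the Trifacta tooling: the function name pinned_packages is hypothetical, and the commented example assumes that lsb_release is available on the node.

```shell
# Hypothetical helper: print the pinned package list for an Ubuntu codename.
# The version strings are copied from the commands above.
pinned_packages() {
  case "$1" in
    xenial) echo "supervisor=3.2.4 nginx=1.12.2-1~xenial nodejs=12.16.1-1nodesource1" ;;
    bionic) echo "supervisor=3.2.4 nginx=1.14.2-1~bionic nodejs=12.16.1-1nodesource1" ;;
    *) echo "unsupported Ubuntu codename: $1" >&2; return 1 ;;
  esac
}

# Example use (left commented so that sourcing this file installs nothing):
# sudo apt-get install $(pinned_packages "$(lsb_release -cs)")
```

Any codename other than xenial or bionic is rejected instead of guessed, because this page lists pinned versions only for those two releases.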

1. Install OpenJDK

You must install a supported version of OpenJDK on the  Trifacta node.

NOTE: If you are integrating with S3, a different version of Java may be required. For more information, see System Requirements.

Install commands:

CentOS/RHEL:

sudo yum install java-1.8.0-openjdk-1:1.8.0.242.b08-0.el8_1.x86_64 java-1.8.0-openjdk-devel

NOTE: If java-1.8.0-openjdk-devel is not included, the batch job runner service, which is required, fails to start.

Ubuntu:

sudo apt-get install openjdk-8-jre-headless

JAVA_HOME:

By default, the JAVA_HOME environment variable is configured to point to a default install location for the OpenJDK package. The property value must be updated in the following locations:

  1. Edit the following file: /opt/trifacta/conf/env.sh

  2. Save changes.
  3. To apply this configuration change, log in as an administrator to the Trifacta node. Then, edit trifacta-conf.json. Some of these settings may not be available through the Admin Settings Page. For more information, see Platform Configuration Methods.
  4. Update the following parameter value:

    "env.JAVA_HOME": "/usr/lib/jvm/java-1.8.0-openjdk.x86_64",
  5. Save changes.
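
Before saving a JAVA_HOME value, it can help to confirm that the directory really contains a Java install. The sketch below assumes the usual layout in which the launcher is at bin/java and, when a development package is installed, the compiler is at bin/javac. The function name check_java_home is hypothetical.

```shell
# Hypothetical check: fail if the directory has no java launcher, and warn
# (without failing) if it has no javac compiler.
check_java_home() {
  home="$1"
  if [ ! -x "$home/bin/java" ]; then
    echo "no java launcher under $home" >&2
    return 1
  fi
  if [ ! -x "$home/bin/javac" ]; then
    echo "warning: no javac under $home (is the -devel package installed?)" >&2
  fi
  return 0
}

# Example use with the path from the parameter above (commented out):
# check_java_home /usr/lib/jvm/java-1.8.0-openjdk.x86_64
```

The javac test is only a warning because the Ubuntu command on this page installs a headless JRE, which does not include the compiler.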

2. Install Dependencies

NOTE: Install curl if not present on your system.

Install dependencies with Internet access for CentOS or RHEL:

Use the following to add the hosted package repository for CentOS/RHEL, which will automatically install the proper packages for your environment. These steps also install the proper version of PostgreSQL and the Trifacta database.

 

# If the client has curl installed ...
curl https://packagecloud.io/install/repositories/trifacta/dependencies/script.rpm.sh | sudo bash
 
# Otherwise, you can also use wget ...
wget -qO- https://packagecloud.io/install/repositories/trifacta/dependencies/script.rpm.sh | sudo bash

Install dependencies with Internet access for Ubuntu:

Use the following to add the hosted package repository for Ubuntu, which will automatically install the proper packages for your environment. 

NOTE:  When dependencies are acquired for Ubuntu, the operating system grabs the latest version of a dependency, even if it is later than the version on which the software is dependent. In some cases, this mismatch can result in installation errors, which can be fixed by manually installing the dependency with the correct version.

Then, execute the following command: 

curl https://packagecloud.io/install/repositories/trifacta/dependencies/script.deb.sh | sudo bash
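
The commands above pipe a remote script straight into a root shell. Where that is a concern, one alternative is to save the script first, review it, and then run it. This is a sketch under assumptions: the helper name fetch_script and the local filename repo-setup.sh are hypothetical, and the URL in the example is the one given above.

```shell
# Hypothetical helper: download a URL to a local file with curl,
# falling back to wget when curl is not installed.
fetch_script() {
  url="$1"
  dest="$2"
  if command -v curl >/dev/null 2>&1; then
    curl -fsSL "$url" -o "$dest"
  elif command -v wget >/dev/null 2>&1; then
    wget -q -O "$dest" "$url"
  else
    echo "neither curl nor wget is installed" >&2
    return 1
  fi
}

# Example use (commented out so that sourcing this file changes nothing):
# fetch_script https://packagecloud.io/install/repositories/trifacta/dependencies/script.deb.sh repo-setup.sh
# less repo-setup.sh
# sudo bash repo-setup.sh
```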



Offline dependencies should be included in the URL location that Trifacta® provided to you. Please use the *deps* file.

NOTE: If your installation server is connected to the Internet, the required dependencies are automatically downloaded and installed for you. You may skip this section.

Use the steps below to acquire and install dependencies required by the Trifacta platform. If you need further assistance, please contact  Trifacta Support.

Install CentOS or RHEL dependencies without Internet access 

Install CentOS or RHEL software dependencies

  1. In a CentOS or RHEL environment, the dependencies repository must be installed into the following directory: 

    /var/local/trifacta
  2. The following commands configure Yum to point to the repository in /var/local/trifacta, which yum knows as local. Repo permissions are set appropriately. Commands:

    tar xvzf <DEPENDENCIES_ARCHIVE>.tar.gz
    mv local.repo /etc/yum.repos.d
    mv trifacta /var/local
    chown -R root:root /var/local/trifacta
    chmod -R o-w+r /var/local/trifacta
  3. The following command installs the RPM while disabling all repos other than local, which prevents the installer from reaching out to the Internet for package updates:

    NOTE: The disabling of repositories only applies to this command.

    sudo yum --disablerepo=* --enablerepo=local install <INSTALLER>.rpm
  4. If the above command fails and complains about a missing repo, you can add the missing repo to the enablerepo list. For example, if the centos-base repo is reported as missing, then the command would be the following:

    sudo yum --disablerepo=* --enablerepo=local,centos-base install <INSTALLER>.rpm
  5. If you do not have a supported version of a Java Developer Kit installed on the Trifacta node, you can use the following command to install OpenJDK, which is included in the offline dependencies:

    sudo yum --disablerepo=* --enablerepo=local,centos-base install java-1.8.0-openjdk-1.8.0 java-1.8.0-openjdk-devel
  6. For CentOS 8.x: If you are installing on CentOS 8.x, you must complete the following manual dependency install for NodeJS.

    sudo yum --disablerepo=* --enablerepo=local install nodejs-12.16.1-1nodesource.x86_64.rpm
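
The offline yum commands in the steps above share one pair of options, with extra repositories appended to --enablerepo as needed. The sketch below builds that option string. The function name offline_yum_opts is hypothetical, and any repository ID other than local depends on your environment.

```shell
# Hypothetical helper: print the yum options that restrict a command to the
# local repository plus any extra repository IDs passed as arguments.
offline_yum_opts() {
  repos="local"
  for extra in "$@"; do
    repos="$repos,$extra"
  done
  printf '%s\n' "--disablerepo=* --enablerepo=$repos"
}

# Example use (commented out; replace the placeholder with your installer file):
# sudo yum $(offline_yum_opts centos-base) install <INSTALLER>.rpm
```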

Install CentOS or RHEL database dependencies

If you are installing the databases on a CentOS node without Internet access, you can install the dependencies using either of the following commands:

NOTE: This step is only required if you are installing the databases on the same node where the software is installed.


For PostgreSQL:

sudo yum --disablerepo=* --enablerepo=local install postgresql96-server


For MySQL:

sudo yum --disablerepo=* --enablerepo=local install mysql-community-server

NOTE: You must also install the MySQL JARs on the Trifacta node. These instructions are provided later.

Databases are installed after the software is installed. For more information, see Install Databases in the Databases Guide.

Install Ubuntu dependencies without Internet access

Install Ubuntu software dependencies

In an Ubuntu environment, you can use the following sequence of commands to install the dependencies without Internet access. 

  1. Extract the tarball and change to the trifacta-repo directory. The following example filename is for Release 7.0.0 and Ubuntu 18.04 (Bionic Beaver):

    tar xvzf trifacta-server-deps-7.0.0-ubuntu-18.04.tar.gz
    cd trifacta-repo
  2. Execute the following commands to install the dependencies:

    sudo dpkg -i $(ls | grep minimal | sort)
    sudo dpkg -i $(ls | grep -v ^python | sort)
    sudo dpkg -i $(ls | grep python | sort)
    sudo apt-get -f -y install
    sudo dpkg -i <TRIFACTA_DEB_INSTALLER>
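
The three dpkg passes in step 2 select files by name: first anything containing "minimal", then everything that does not start with "python", then anything containing "python". To preview what each pass would pick up without installing anything, a dry-run sketch such as the following may help. The function name list_dpkg_pass is hypothetical; the filters mirror the commands above.

```shell
# Hypothetical dry run: print the files that a given dpkg pass would select.
# Usage: list_dpkg_pass <1|2|3> [directory]
list_dpkg_pass() {
  pass="$1"
  dir="${2:-.}"
  case "$pass" in
    1) ls "$dir" | grep minimal | sort ;;
    2) ls "$dir" | grep -v '^python' | sort ;;
    3) ls "$dir" | grep python | sort ;;
    *) echo "pass must be 1, 2, or 3" >&2; return 2 ;;
  esac
}

# Example use from inside the trifacta-repo directory (commented out):
# list_dpkg_pass 1
# list_dpkg_pass 2
# list_dpkg_pass 3
```

A file can appear in more than one pass. For example, a name that contains both "python" and "minimal" but does not start with "python" matches all three filters.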

3. Install  Trifacta package

NOTE: Installing the Trifacta platform in a directory other than the default one is not supported.

For CentOS or RHEL:

Install the package with yum, using root:

sudo yum install <rpm file>

For Ubuntu:

Install the package with dpkg, using root:

sudo dpkg -i <deb file>

The previous line may return an error message, which you may ignore. Continue with the following command: 

sudo apt-get -f -y install

The product is installed in the following directory: 

/opt/trifacta

4. Install License Key

Please install the license key provided to you by  Trifacta into the following directory:

/opt/trifacta/license

For more information, see License Key.

5. Store install packages

For safekeeping, you should retain all install packages that have been installed with this Trifacta deployment.

6. Install databases

Use the following commands to install the Trifacta databases in a local instance of PostgreSQL.


NOTE: The following distributions and commands are for PostgreSQL 9.6.


For CentOS 7.x:

wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-redhat-repo-42.0-9.noarch.rpm
sudo yum -y install pgdg-redhat-repo-42.0-9.noarch.rpm
sudo yum -y install postgresql96-server

For CentOS 8.x:

wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-8.1-x86_64/pgdg-redhat-repo-42.0-9.noarch.rpm
sudo yum -y install pgdg-redhat-repo-42.0-9.noarch.rpm
sudo yum -y install postgresql96-server

For Red Hat Enterprise Linux 7.x:

wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-redhat-repo-42.0-9.noarch.rpm
sudo yum -y install pgdg-redhat-repo-42.0-9.noarch.rpm
sudo yum -y install postgresql96-server


For Red Hat Enterprise Linux 8.x:

wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-8.1-x86_64/pgdg-redhat-repo-42.0-9.noarch.rpm
sudo yum -y install pgdg-redhat-repo-42.0-9.noarch.rpm
sudo yum -y install postgresql96-server

For Ubuntu 16.04:

Add the repository's archive key to your apt-key keyring:

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

Create a file named /etc/apt/sources.list.d/pgdg.list, containing the following:

deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main

Run the following commands:

sudo apt-get update
sudo apt-get install -y postgresql-9.6

For Ubuntu 18.04:

Add the repository's archive key to your apt-key keyring:

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

Create a file named /etc/apt/sources.list.d/pgdg.list, containing the following:

deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main

Run the following commands:

sudo apt-get update
sudo apt-get install -y postgresql-9.6
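
The pgdg.list content for the two Ubuntu releases differs only in the codename. An apt source line has the form "deb URI suite component", with a space between the repository URL and the suite name. The sketch below generates the line for a given codename. The function name pgdg_source_line is hypothetical, and the commented example assumes that lsb_release is available.

```shell
# Hypothetical helper: print the PostgreSQL apt source line for a codename.
# Only the two Ubuntu releases covered on this page are accepted.
pgdg_source_line() {
  case "$1" in
    xenial|bionic)
      echo "deb http://apt.postgresql.org/pub/repos/apt/ $1-pgdg main"
      ;;
    *)
      echo "codename not covered by this page: $1" >&2
      return 1
      ;;
  esac
}

# Example use (commented out so that sourcing this file writes nothing):
# pgdg_source_line "$(lsb_release -cs)" | sudo tee /etc/apt/sources.list.d/pgdg.list
```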

7. Configure databases


Initialize

Use the following steps to initialize the databases of the Trifacta® platform.

NOTE: These steps assume that the Trifacta node is the host of these databases. Please modify the following steps if you are connecting to databases on other nodes.

Prerequisites:

  • The initializing user must have write permissions to the directory from which the commands are executed.
  • The initializing user must have sudo privileges.

PostgreSQL

NOTE: In the following steps, the default version is PostgreSQL 9.6.

  1. For CentOS 7.x, CentOS 8.x:

    sudo /usr/pgsql-9.6/bin/postgresql96-setup initdb
  2. For CentOS 7.x - PostgreSQL 12.3:

    sudo /usr/pgsql-12/bin/postgresql12-setup initdb
  3. For RHEL 7.x, RHEL 8.x:

    sudo /usr/pgsql-9.6/bin/postgresql96-setup initdb
  4. For RHEL 7.x - PostgreSQL 12.3:

    NOTE: This feature is in Beta release.

    sudo /usr/pgsql-12/bin/postgresql12-setup initdb
  5. For Ubuntu 16.04 / 18.04:

    pg_createcluster -d /var/lib/postgresql/9.6/main 9.6 main
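
Initialization needs to run only once per data directory. An initialized PostgreSQL data directory contains a PG_VERSION file, so a script can test for it before calling the setup command. This is a minimal sketch: the function name pg_initialized is hypothetical, and the data directory in the example is assumed from the CentOS/RHEL pg_hba.conf location used elsewhere on this page.

```shell
# Hypothetical check: succeed if the directory already holds an initialized
# PostgreSQL cluster. Run as root or as the postgres user, since the data
# directory is normally not readable by other accounts.
pg_initialized() {
  [ -f "$1/PG_VERSION" ]
}

# Example use on CentOS/RHEL with PostgreSQL 9.6 (commented out):
# pg_initialized /var/lib/pgsql/9.6/data || sudo /usr/pgsql-9.6/bin/postgresql96-setup initdb
```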

MySQL

No additional steps are required to initialize the databases in MySQL.

Set custom database parameters

Use the following steps to set custom database names, usernames, and passwords in the Trifacta platform:

  1. Edit the following file:

    /opt/trifacta/conf/trifacta-conf.json

  2. For each database, you can review the parameters in the listed area and make modifications as needed.

    NOTE: For each database, you should change the default password. This change must also be applied on the database server. See Change Database Passwords for PostgreSQL and Change Database Passwords for MySQL.

    NOTE: The type is set to POSTGRESQL by default. Modify the value if you are installing the databases into a different database server.


    Database                            Parameter area
    Main database                       webapp.database.*
    Jobs database                       batch-job-runner.database.*
    Scheduling database                 scheduling-service.database.*
    Time-Based Trigger database         time-based-trigger-service.database.*
    Configuration Service database      configuration-service.database.*
    Job Metadata Service database       job-metadata-service.database.*
    Artifact Storage Service database   artifact-storage-service.database.*
    Authorization Service database      authorization-service.database.*
    Orchestration Service database      orchestration-service.database.*
    Optimizer Service database          optimizer-service.database.*

    For more information, see Database Parameter Reference.

  3. Make changes in the file as needed and save.

Apply customizations on upgrade

If you have customized database properties, you must apply the edits from the new sample file to the existing configuration file after you have upgraded the Trifacta platform.

If you are using all defaults, you can just overwrite the existing file with the new version's sample file. 

PostgreSQL:

  1. Locate the sample Postgres configuration file:

    /opt/trifacta/bin/setup-utils/db/pg_hba.conf.SAMPLE
  2. If you are upgrading and have customized your existing version, apply the edits from the sample file to the file listed below for your operating system. Otherwise, overwrite that file with the sample file:
    1. CentOS/RHEL: /var/lib/pgsql/9.6/data/pg_hba.conf
    2. Ubuntu: /etc/postgresql/9.6/main/pg_hba.conf

  3. From the SAMPLE file, copy the following declarations and paste them into the production pg_hba.conf file above any other declarations:

    NOTE: You can substitute different database usernames and groups for the ones listed below (trifacta and trifacta). These values may be needed for other configuration.

     

    1. Trifacta database:

      local   trifacta         trifacta                               md5
      host    trifacta         trifacta         127.0.0.1/32          md5
      host    trifacta         trifacta         ::1/128               md5
    2. Jobs database:

      local   trifacta-activiti         trifactaactiviti                               md5
      host    trifacta-activiti         trifactaactiviti         127.0.0.1/32          md5
      host    trifacta-activiti         trifactaactiviti         ::1/128               md5
    3. Scheduling database: 

      local   trifactaschedulingservice         trifactaschedulingservice                               md5
      host    trifactaschedulingservice         trifactaschedulingservice         127.0.0.1/32          md5
      host    trifactaschedulingservice         trifactaschedulingservice         ::1/128               md5
    4. Time-based Trigger database:

      local   trifactatimebasedtriggerservice         trifactatimebasedtriggerservice                               md5
      host    trifactatimebasedtriggerservice         trifactatimebasedtriggerservice         127.0.0.1/32          md5
      host    trifactatimebasedtriggerservice         trifactatimebasedtriggerservice         ::1/128               md5
    5. Configuration Service database:

      local   trifactaconfigurationservice         trifactaconfigurationservice                               md5
      host    trifactaconfigurationservice         trifactaconfigurationservice         127.0.0.1/32          md5
      host    trifactaconfigurationservice         trifactaconfigurationservice         ::1/128               md5
    6. Artifact Storage Service database:

      local   trifactaartifactstorageservice         trifactaartifactstorageservice                               md5
      host    trifactaartifactstorageservice         trifactaartifactstorageservice         127.0.0.1/32          md5
      host    trifactaartifactstorageservice         trifactaartifactstorageservice         ::1/128               md5
    7. Job Metadata Service database:

      local   trifactajobmetadataservice         trifactajobmetadataservice                               md5
      host    trifactajobmetadataservice         trifactajobmetadataservice         127.0.0.1/32          md5
      host    trifactajobmetadataservice         trifactajobmetadataservice         ::1/128               md5
    8. Authorization Service database:

      local   trifactaauthorizationservice         trifactaauthorizationservice                               md5
      host    trifactaauthorizationservice         trifactaauthorizationservice         127.0.0.1/32          md5
      host    trifactaauthorizationservice         trifactaauthorizationservice         ::1/128               md5
    9. Orchestration Service database:

      local   trifactaorchestrationservice         trifactaorchestrationservice                               md5
      host    trifactaorchestrationservice         trifactaorchestrationservice         127.0.0.1/32          md5
      host    trifactaorchestrationservice         trifactaorchestrationservice         ::1/128               md5
    10. Optimizer Service database:

      local   trifactoptimizerservice         trifactoptimizerservice                               md5
      host    trifactoptimizerservice         trifactoptimizerservice         127.0.0.1/32          md5
      host    trifactoptimizerservice         trifactoptimizerservice         ::1/128               md5
    11. Save the file.
  4. Restart the databases:

    1. If you have also restarted the operating system, please execute the following first, followed by the O/S-specific commands:

      NOTE: This command is valid only if the Postgres DB is also hosted in the Trifacta node.


      chkconfig postgresql-9.6 on

      For PostgreSQL 12.3 on CentOS 7:

      chkconfig postgresql-12 on
    2. CentOS/RHEL:

      sudo service postgresql-9.6 start
    3. CentOS/RHEL 7 (PostgreSQL 12.3):

      sudo service postgresql-12 start
    4. Ubuntu:

      sudo service postgresql start
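
Each database listed in step 3 gets the same three pg_hba.conf declarations: one rule for local socket connections and one each for the IPv4 and IPv6 loopback addresses, all using md5 authentication. If you substitute your own database names or usernames, the sketch below prints the matching declarations. The function name hba_entries is hypothetical. PostgreSQL treats runs of whitespace in pg_hba.conf as field separators, so the column alignment is cosmetic.

```shell
# Hypothetical helper: print the three pg_hba.conf declarations for one database.
# Usage: hba_entries <database> [user]   (user defaults to the database name)
hba_entries() {
  db="$1"
  user="${2:-$1}"
  printf 'local   %s   %s   md5\n' "$db" "$user"
  printf 'host    %s   %s   127.0.0.1/32   md5\n' "$db" "$user"
  printf 'host    %s   %s   ::1/128   md5\n' "$db" "$user"
}

# Example use (commented out): print the declarations for the Jobs database.
# hba_entries trifacta-activiti trifactaactiviti
```

The output is meant to be reviewed and pasted above any other declarations in the production pg_hba.conf file, as described in step 3.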

MySQL:

Upgrading MySQL versions is not supported in this release.

Next Steps

  1. If the configuration files indicate that the databases are listening on a port other than the default, this port number must be applied within the Trifacta platform configuration. For more information, see Change Database Port.
  2. If you are using non-default usernames and passwords, they must be applied within the Trifacta platform configuration. For more information, see Change Database Passwords for PostgreSQL.
  3. When you have completed the above configuration, you can create the databases and their roles (users) and perform additional configuration. See Create Databases and Users.

For more information, see Configure the Databases.

If needed, you can change the default database port. See Change Database Port.
