
After you have prepared your environment for installation, please complete the following steps to install the Trifacta® platform.

1. Install OpenJDK

You must install a supported version of OpenJDK on the Trifacta node.

NOTE: If you are integrating with S3, a different version of Java may be required. For more information, see System Requirements.

Install commands:

CentOS/RHEL:

sudo yum install java-1.8.0-openjdk-1.8.0 java-1.8.0-openjdk-devel

NOTE: If java-1.8.0-openjdk-devel is not included, the batch job runner service, which is required, fails to start.

Ubuntu:

sudo apt-get install openjdk-8-jre-headless
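After installation, it can help to confirm which Java release is active before continuing. The following is a minimal sketch and not part of the Trifacta tooling: the helper name java_feature_version is hypothetical, and it only parses the banner text that java -version prints.

```shell
# Hypothetical helper: extract the Java feature version from a
# `java -version` banner. "1.8.0_292" -> 8, "11.0.2" -> 11.
java_feature_version() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$ver" in
    "")  return 1 ;;                              # no version string found
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;    # legacy 1.x numbering
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;    # Java 9 and later
  esac
}

# Usage (java prints its banner on stderr):
#   java_feature_version "$(java -version 2>&1)"
```

On CentOS/RHEL, a path returned by `command -v javac` is a quick indication that the -devel package is also present.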

JAVA_HOME:

By default, the JAVA_HOME environment variable is configured to point to a default install location for the OpenJDK package. The property value must be updated in the following locations:

  1. Edit the following file: /opt/trifacta/conf/env.sh

  2. Save changes.
  3. To apply this configuration change, log in as an administrator to the Trifacta node. Then, edit trifacta-conf.json. Some of these settings may not be available through the Admin Settings Page. For more information, see Platform Configuration Methods.
  4. Update the following parameter value:

    "env.JAVA_HOME": "/usr/lib/jvm/java-1.8.0-openjdk.x86_64",
  5. Save changes.
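Before saving a JAVA_HOME value, you may want to check that the directory actually contains a Java runtime. This is a small sketch under that assumption; is_java_home is a hypothetical helper name, and the path in the usage comment is the one from the parameter above.

```shell
# Hypothetical helper: succeed only if the directory looks like a Java home,
# that is, it contains an executable bin/java.
is_java_home() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Usage:
#   is_java_home /usr/lib/jvm/java-1.8.0-openjdk.x86_64 || echo "check JAVA_HOME" >&2
```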

2. Install Dependencies

NOTE: Install curl if not present on your system.

Install dependencies with Internet access for CentOS or RHEL:

Use the following to add the hosted package repository for CentOS/RHEL, which will automatically install the proper packages for your environment. These steps also install the proper version of PostgreSQL and the Trifacta database.

 

# If the client has curl installed ...
curl https://packagecloud.io/install/repositories/trifacta/dependencies/script.rpm.sh | sudo bash
 
# Otherwise, you can also use wget ...
wget -qO- https://packagecloud.io/install/repositories/trifacta/dependencies/script.rpm.sh | sudo bash

Install dependencies with Internet access for Ubuntu:

Use the following to add the hosted package repository for Ubuntu, which will automatically install the proper packages for your environment. 

NOTE: When dependencies are acquired for Ubuntu, the operating system installs the latest version of a dependency, even if it is later than the version that the software requires. In some cases, this mismatch can result in installation errors, which can be fixed by manually installing the correct version of the dependency.

Then, execute the following command: 

curl https://packagecloud.io/install/repositories/trifacta/dependencies/script.deb.sh | sudo bash
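The commands above pipe a remote script directly into a root shell. If you prefer to review the script first, the following sketch selects the URL for your distribution so that it can be downloaded to a file. The helper name repo_script_url and the file name repo-setup.sh are hypothetical; the URLs are the ones shown above.

```shell
# Base URL taken from the commands above.
REPO_BASE="https://packagecloud.io/install/repositories/trifacta/dependencies"

# Hypothetical helper: print the repository setup script URL for a
# distribution ID, as found in the ID field of /etc/os-release.
repo_script_url() {
  case "$1" in
    centos|rhel) printf '%s/script.rpm.sh\n' "$REPO_BASE" ;;
    ubuntu)      printf '%s/script.deb.sh\n' "$REPO_BASE" ;;
    *)           echo "unsupported distribution: $1" >&2; return 1 ;;
  esac
}

# Usage: download to a file, review it, then run it.
#   curl -fsSL "$(repo_script_url centos)" -o repo-setup.sh
#   less repo-setup.sh
#   sudo bash repo-setup.sh
```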


Offline dependencies should be included in the URL location that Trifacta® provided to you. Please use the *deps* file.

NOTE: If your installation server is connected to the Internet, the required dependencies are automatically downloaded and installed for you. You may skip this section.

Use the steps below to acquire and install dependencies required by the Trifacta platform. If you need further assistance, please contact Trifacta Support.

Install dependencies without Internet access for CentOS or RHEL:

  1. In a CentOS or RHEL environment, the dependencies repository must be installed into the following directory: 

    /var/local/trifacta
  2. The following commands configure Yum to point to the repository in /var/local/trifacta, which yum knows as local. Repo permissions are set appropriately. Commands:

    tar xvzf <DEPENDENCIES_ARCHIVE>.tar.gz
    mv local.repo /etc/yum.repos.d
    mv trifacta /var/local
    chown -R root:root /var/local/trifacta
    chmod -R o-w+r /var/local/trifacta
  3. The following command installs the RPM while disabling all repos other than local, which prevents the installer from reaching out to the Internet for package updates:

    NOTE: The disabling of repositories only applies to this command.

    sudo yum --disablerepo=* --enablerepo=local install <INSTALLER>.rpm
  4. If the above command fails and complains about a missing repo, you can add the missing repo to the enablerepo list. For example, if the centos-base repo is reported as missing, then the command would be the following:

    sudo yum --disablerepo=* --enablerepo=local,centos-base install <INSTALLER>.rpm
  5. If you do not have a supported version of a Java Developer Kit installed on the Trifacta node, you can use the following command to install OpenJDK, which is included in the offline dependencies:

    sudo yum --disablerepo=* --enablerepo=local,centos-base install java-1.8.0-openjdk-1.8.0 java-1.8.0-openjdk-devel
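The yum commands in steps 3 through 5 differ only in the list of enabled repositories and the packages to install. As an illustration, the sketch below composes the command line for a given combination and prints it instead of running it. The helper name yum_offline_cmd is hypothetical.

```shell
# Hypothetical helper: print the offline yum command.
# Usage: yum_offline_cmd "<extra repos, comma-separated, or empty>" <package>...
yum_offline_cmd() {
  repos=local
  if [ -n "$1" ]; then
    repos="$repos,$1"
  fi
  shift
  # The quoted '*' keeps the shell from treating it as a file glob.
  printf "sudo yum --disablerepo='*' --enablerepo=%s install %s\n" "$repos" "$*"
}

# Usage:
#   yum_offline_cmd "" "<INSTALLER>.rpm"
#   yum_offline_cmd centos-base java-1.8.0-openjdk-1.8.0 java-1.8.0-openjdk-devel
```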

Install dependencies without Internet access in Ubuntu:

If you are trying to perform a manual installation of dependencies in Ubuntu, please contact Trifacta Support.

3. Install Trifacta package

NOTE: Installing the Trifacta platform in a directory other than the default one is not supported.

For CentOS or RHEL:

Install the package with yum, using root:

sudo yum install <rpm file>

For Ubuntu:

Install the package with dpkg, using root:

sudo dpkg -i <deb file>

The previous command may return an error message, which you may ignore. Continue with the following command, which installs any missing dependencies:

sudo apt-get -f -y install

The product is installed in the following directory: 

/opt/trifacta
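The two install paths above can be summarized by the package file type. The following sketch, with a hypothetical helper name install_cmds_for, prints the commands that apply to a given package file rather than executing them.

```shell
# Hypothetical helper: print the install commands for a package file,
# chosen by its extension.
install_cmds_for() {
  case "$1" in
    *.rpm) printf 'sudo yum install %s\n' "$1" ;;
    *.deb) printf 'sudo dpkg -i %s\nsudo apt-get -f -y install\n' "$1" ;;
    *)     echo "unrecognized package type: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   install_cmds_for "<rpm file>"
```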

4. Install License Key

Please install the license key provided to you by Trifacta into the following directory:

/opt/trifacta/license

For more information, see License Key.

5. Store install packages

For safekeeping, you should retain all install packages that have been installed with this Trifacta deployment.

6. Install databases

Use the following commands to install the Trifacta databases in a local instance of PostgreSQL.


NOTE: The following distributions and commands are for PostgreSQL 9.6.


For CentOS 6.x:

wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-6-x86_64/pgdg-centos96-9.6-3.noarch.rpm
sudo yum -y install pgdg-centos96-9.6-3.noarch.rpm
sudo yum -y install postgresql96-server

For CentOS 7.x:

wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-centos96-9.6-3.noarch.rpm
sudo yum -y install pgdg-centos96-9.6-3.noarch.rpm
sudo yum -y install postgresql96-server

For Red Hat Enterprise Linux 6.x:

wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-6-x86_64/pgdg-redhat96-9.6-3.noarch.rpm
sudo yum -y install pgdg-redhat96-9.6-3.noarch.rpm
sudo yum -y install postgresql96-server

For Red Hat Enterprise Linux 7.x:

wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-redhat96-9.6-3.noarch.rpm
sudo yum -y install pgdg-redhat96-9.6-3.noarch.rpm
sudo yum -y install postgresql96-server
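The four CentOS and RHEL variants above differ only in the major release number and the distribution name embedded in the repository RPM. A sketch that builds the URL from those two values follows; pgdg_rpm_url is a hypothetical helper, and the URL pattern is inferred from the four commands above.

```shell
# Hypothetical helper: print the PostgreSQL 9.6 repository RPM URL.
# Usage: pgdg_rpm_url <centos|redhat> <6|7>
pgdg_rpm_url() {
  case "$1" in centos|redhat) ;; *) return 1 ;; esac
  case "$2" in 6|7) ;; *) return 1 ;; esac
  printf 'https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-%s-x86_64/pgdg-%s96-9.6-3.noarch.rpm\n' "$2" "$1"
}

# Usage:
#   wget "$(pgdg_rpm_url centos 7)"
```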

For Ubuntu 14.04:

Add the repository's archive key to your apt-key keyring:

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

Create a file named /etc/apt/sources.list.d/pgdg.list, containing the following:

deb http://apt.postgresql.org/pub/repos/apt/ trusty-pgdg main
deb-src http://apt.postgresql.org/pub/repos/apt/ trusty-pgdg main

Run the following commands:

sudo apt-get update
sudo apt-get install -y postgresql-9.6


For Ubuntu 16.04:

Add the repository's archive key to your apt-key keyring:

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

Create a file named /etc/apt/sources.list.d/pgdg.list, containing the following:

deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main

Run the following commands:

sudo apt-get update
sudo apt-get install -y postgresql-9.6
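For Ubuntu, the repository line depends on the release codename (trusty for 14.04, xenial for 16.04). As a sketch, the hypothetical helper pgdg_apt_line below prints the deb line for a release number. It does not emit the optional deb-src line shown for 14.04.

```shell
# Hypothetical helper: print the pgdg apt source line for an Ubuntu release.
pgdg_apt_line() {
  case "$1" in
    14.04) codename=trusty ;;
    16.04) codename=xenial ;;
    *)     echo "no mapping for Ubuntu $1" >&2; return 1 ;;
  esac
  printf 'deb http://apt.postgresql.org/pub/repos/apt/ %s-pgdg main\n' "$codename"
}

# Usage:
#   pgdg_apt_line 16.04 | sudo tee /etc/apt/sources.list.d/pgdg.list
```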

7. Configure databases

Initialize

Use the following steps to initialize the databases of the Trifacta® platform.

NOTE: These steps assume that the Trifacta node is the host of these databases. Please modify the following steps if you are connecting to databases on other nodes.

Prerequisites:

  • The initializing user must have write permissions to the directory from which the commands are executed.
  • The initializing user must have sudo privileges.

PostgreSQL

NOTE: The following steps are for configuring PostgreSQL 9.6.


  1. For CentOS 7.x:

    sudo /usr/pgsql-9.6/bin/postgresql96-setup initdb
  2. For CentOS 6.x, RHEL 6.x:

    sudo service postgresql-9.6 initdb
  3. For RHEL 7.x:

    sudo /usr/pgsql-9.6/bin/postgresql96-setup initdb
  4. For Ubuntu 14.04 / 16.04:

    pg_createcluster -d /var/lib/postgresql/9.6/main 9.6 main
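The initialization command depends on the operating system and release. The following sketch collects the four cases above in one lookup and prints the matching command. The helper name pg_init_cmd and its argument format are assumptions made for illustration.

```shell
# Hypothetical helper: print the PostgreSQL 9.6 initialization command.
# Usage: pg_init_cmd <centos|rhel|ubuntu> <release>
pg_init_cmd() {
  case "$1:$2" in
    centos:7|rhel:7)
      echo "sudo /usr/pgsql-9.6/bin/postgresql96-setup initdb" ;;
    centos:6|rhel:6)
      echo "sudo service postgresql-9.6 initdb" ;;
    ubuntu:14.04|ubuntu:16.04)
      echo "pg_createcluster -d /var/lib/postgresql/9.6/main 9.6 main" ;;
    *)
      echo "unsupported platform: $1 $2" >&2; return 1 ;;
  esac
}

# Usage:
#   pg_init_cmd centos 7
```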

Verify datestyle value

NOTE: This configuration step only applies if the Trifacta databases are installed on a PostgreSQL instance that was not created as part of a Trifacta Wrangler Enterprise installation.

Please verify the datestyle setting for your PostgreSQL instance.

Steps:

  1. Execute the following in PostgreSQL:

    trifacta=# show datestyle;
     DateStyle
    -----------
     ISO, MDY
    (1 row)
  2. If the first value is not ISO, please edit the following file (PostgreSQL 9.6):
    1. RHEL/CentOS:  /var/lib/pgsql/9.6/data/postgresql.conf
    2. Ubuntu:  /etc/postgresql/9.6/main/postgresql.conf
  3. Set the datestyle property to the following:

    datestyle = 'ISO,MDY'
  4. Save the file. Restart the Trifacta platform.
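The check in step 1 can also be scripted. The sketch below tests whether the first value of a datestyle string is ISO. The helper name datestyle_is_iso is hypothetical, and the usage comment assumes that psql can connect to a database named trifacta with your current credentials.

```shell
# Hypothetical helper: succeed if the first datestyle value is ISO.
datestyle_is_iso() {
  case "$1" in
    ISO|ISO,*) return 0 ;;
    *)         return 1 ;;
  esac
}

# Usage (-t and -A make psql print only the bare value):
#   datestyle_is_iso "$(psql -tA -d trifacta -c 'show datestyle')" \
#     || echo "datestyle needs to be updated in postgresql.conf" >&2
```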

MySQL

No additional steps are required to initialize the databases in MySQL.

Set custom database parameters

Use the following steps to set custom database names, usernames, and passwords in the Trifacta platform:

  1. Edit /opt/trifacta/conf/trifacta-conf.json

  2. For each database below, you can review the database name, username, and password. 

    Main database:
      • webapp.database.name
      • webapp.database.username
      • webapp.database.password: You should change the default password. This change must also be applied on the database server. See Change Database Passwords for PostgreSQL. See Change Database Passwords for MySQL.
      • webapp.database.type: Change this value only if you are installing the databases in a non-PostgreSQL environment.

    Jobs database:
      • batch-job-runner.database.name
      • batch-job-runner.database.username
      • batch-job-runner.database.password: You should change the default password. This change must also be applied on the database server. See Change Database Passwords for PostgreSQL. See Change Database Passwords for MySQL.
      • batch-job-runner.database.type: Change this value only if you are installing the databases in a non-PostgreSQL environment.

    Scheduling database:
      • scheduling-service.database.name
      • scheduling-service.database.username
      • scheduling-service.database.password: You should change the default password. This change must also be applied on the database server. See Change Database Passwords for PostgreSQL. See Change Database Passwords for MySQL.
      • scheduling-service.database.type: Change this value only if you are installing the databases in a non-PostgreSQL environment.

        NOTE: H2 database type is used for internal testing. It is not a supported database.

    Time-Based Trigger database:
      • time-based-trigger-service.database.name
      • time-based-trigger-service.database.username
      • time-based-trigger-service.database.password: You should change the default password. This change must also be applied on the database server. See Change Database Passwords for PostgreSQL. See Change Database Passwords for MySQL.
      • time-based-trigger-service.database.type: Change this value only if you are installing the databases in a non-PostgreSQL environment.

        NOTE: H2 database type is used for internal testing. It is not a supported database.

    Configuration Service database:
      • configuration-service.database.name
      • configuration-service.database.username
      • configuration-service.database.password: You should change the default password. This change must also be applied on the database server. See Change Database Passwords for PostgreSQL. See Change Database Passwords for MySQL.
      • configuration-service.database.type: Change this value only if you are installing the databases in a non-PostgreSQL environment.

        NOTE: H2 database type is used for internal testing. It is not a supported database.

    Artifact Storage Service database:
      • artifact-storage-service.database.name
      • artifact-storage-service.database.username
      • artifact-storage-service.database.password: You should change the default password. This change must also be applied on the database server. See Change Database Passwords for PostgreSQL. See Change Database Passwords for MySQL.
      • artifact-storage-service.database.type: Change this value only if you are installing the databases in a non-PostgreSQL environment.

        NOTE: H2 database type is used for internal testing. It is not a supported database.

  3. Make changes in the file as needed and save.

Apply customizations on upgrade

If you have customized database properties, you must apply the edits from the new sample file to the existing configuration file after you have upgraded the Trifacta platform.

If you are using all defaults, you can just overwrite the existing file with the new version's sample file. 

PostgreSQL:

  1. Locate the sample Postgres configuration file:

    /opt/trifacta/bin/setup-utils/db/pg_hba.conf.SAMPLE
  2. If you are upgrading and have customizations in your existing version, you must apply the edits from the sample file to the following file. Otherwise, overwrite the following file with the sample file. The file location depends on your operating system:
    1. CentOS/RHEL: /var/lib/pgsql/9.6/data/pg_hba.conf
    2. Ubuntu: /etc/postgresql/9.6/main/pg_hba.conf

  3. From the SAMPLE file, copy the following declarations and paste them into the production pg_hba.conf file above any other declarations:

    NOTE: You can substitute different database usernames and groups for the ones listed below (trifacta and trifacta). These values may be needed for other configuration.

     

    1. Trifacta database:

      local   trifacta         trifacta                               md5
      host    trifacta         trifacta         127.0.0.1/32          md5
      host    trifacta         trifacta         ::1/128               md5
    2. Jobs database:

      local   trifacta-activiti         trifactaactiviti                               md5
      host    trifacta-activiti         trifactaactiviti         127.0.0.1/32          md5
      host    trifacta-activiti         trifactaactiviti         ::1/128               md5



    3. Scheduling database: 

      local   trifactaschedulingservice         trifactaschedulingservice                               md5
      host    trifactaschedulingservice         trifactaschedulingservice         127.0.0.1/32          md5
      host    trifactaschedulingservice         trifactaschedulingservice         ::1/128               md5

      For more information on scheduling, see Configure Automator.

    4. Time-based Trigger database:

      local   trifactatimebasedtriggerservice         trifactatimebasedtriggerservice                               md5
      host    trifactatimebasedtriggerservice         trifactatimebasedtriggerservice         127.0.0.1/32          md5
      host    trifactatimebasedtriggerservice         trifactatimebasedtriggerservice         ::1/128               md5

      For more information on scheduling, see Configure Automator.

    5. Configuration Service database:

      local   trifactaconfigurationservice         trifactaconfigurationservice                               md5
      host    trifactaconfigurationservice         trifactaconfigurationservice         127.0.0.1/32          md5
      host    trifactaconfigurationservice         trifactaconfigurationservice         ::1/128               md5
    6. Artifact Storage Service database:

      local   trifactaartifactstorageservice         trifactaartifactstorageservice                               md5
      host    trifactaartifactstorageservice         trifactaartifactstorageservice         127.0.0.1/32          md5
      host    trifactaartifactstorageservice         trifactaartifactstorageservice         ::1/128               md5
    7. Save the file.

  4. Restart the databases:

    1. If you have also restarted the operating system, please execute the following first, followed by the O/S-specific commands:

      NOTE: This command is valid only if the Postgres DB is also hosted in the Trifacta node.


      chkconfig postgresql-9.6 on

       

    2. CentOS/RHEL:

      sudo service postgresql-9.6 start
    3. Ubuntu:

      sudo service postgresql start
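The pg_hba.conf declarations in step 3 follow the same three-line pattern for every database: one local entry plus one host entry each for the IPv4 and IPv6 loopback addresses, all using md5 authentication. If you substitute your own database names and usernames, a sketch such as the following can generate the lines. The helper name pg_hba_entries is hypothetical, and the output is tab-separated, which pg_hba.conf accepts like any other whitespace.

```shell
# Hypothetical helper: print the three pg_hba.conf declarations for one
# database and user, following the pattern in step 3.
pg_hba_entries() {
  db=$1
  user=$2
  printf 'local\t%s\t%s\tmd5\n' "$db" "$user"
  printf 'host\t%s\t%s\t127.0.0.1/32\tmd5\n' "$db" "$user"
  printf 'host\t%s\t%s\t::1/128\tmd5\n' "$db" "$user"
}

# Usage: review the output before pasting it into pg_hba.conf.
#   pg_hba_entries trifacta trifacta
#   pg_hba_entries trifacta-activiti trifactaactiviti
```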

MySQL:

Upgrading MySQL versions is not supported in this release.

Next Steps

  1. If the configuration files indicate that the databases are listening on a port other than the default, this port number must be applied within the Trifacta platform configuration. For more information, see Change Database Port.
  2. If you are using non-default usernames and passwords, they must be applied within the Trifacta platform configuration. For more information, see Change Database Passwords for PostgreSQL.
  3. When you have completed the above configuration, you can create the databases and their roles (users) and perform additional configuration. See Create Databases and Users.

For more information, see Configure the Databases.

If needed, you can change the default database port. See Change Database Port.
