Page tree

 

Contents:


This guide takes you through installation steps for Trifacta® Wrangler Enterprise.

Preparation

Before you begin, please complete the following.

NOTE: Except for database installation and configuration, all install commands should be run as the root user or a user with similar privileges. For database installation, you will be asked to switch the postgres user.

 

Steps:

  1. Set the node where Trifacta Wrangler Enterprise is to be installed. 
    1. Review the System Requirements and verify that all required components have been installed.
    2. Verify that all required system ports are opened on the node. See System Ports.
  2. Review the Desktop Requirements.

    NOTE: Trifacta Wrangler Enterprise requires the installation of Google Chrome on each desktop. Additionally, two plugins must be enabled and of sufficient versions to properly use the Photon running environment.

  3. Review the System Dependencies.

    NOTE: If you are installing on node without access to the Internet, you must download the offline dependencies before you begin. See Install Dependencies without Internet Access.

  4. Acquire your License Key.
  5. Install and verify operations of the datastore, if used.

    NOTE: In some cases, access to the Hadoop cluster is required.

  6. Verify access to the server where the Trifacta platform is to be installed.

  7. Databases: Initialize the Trifacta databases in the Postgres environment. See Set up the Databases.

  8. Hadoop: Additional steps are required to integrate the Trifacta platform with Hadoop. See Prepare Hadoop for Integration with the Platform.

Installation

1. Install Dependencies

For CentOS or RHEL with Internet access

Use the following to add the hosted package repository for CentOS/RHEL, which will automatically install the proper packages for your environment. 

 

# If the client has curl installed ...
curl https://packagecloud.io/install/repositories/trifacta/dependencies/script.rpm.sh | sudo bash
 
# Otherwise, you can also use wget ...
wget -qO- https://packagecloud.io/install/repositories/trifacta/dependencies/script.rpm.sh | sudo bash

 

For Ubuntu with Internet access

Use the following to add the hosted package repository for Ubuntu, which will automatically install the proper packages for your environment. 

NOTE: Install curl if not present on your system.

Then, execute the following command: 

NOTE: Run the following command as the root user. In proxied environments, the script may encounter issues with detecting proxy settings.

curl https://packagecloud.io/install/repositories/trifacta/dependencies/script.deb.sh | sudo bash

Install dependencies without Internet access

You may also download the dependency bundle with your release directly from Trifacta. For more information, see Install Dependencies without Internet Access.

Special instructions for Ubuntu installs

These steps manually install the correct and supported version of the following:

  • nodeJS
  • nginX

Due to a known issue resolving package dependencies on Ubuntu, please complete the following steps prior to installation of other dependencies or software. 

  1. Login to the Trifacta node as an administrator.

  2. Execute the following command to install the appropriate versions of nodeJS and nginX.

    1. Ubuntu 14.04:

      sudo apt-get install nginx=1.12.2-1~trusty nodejs=6.12.2-1nodesource1
    2. Ubuntu 16.04 

      sudo apt-get install nginx=1.12.2-1~xenial nodejs=6.12.2-1nodesource1
  3. Continue with the installation process.

2. Install JDK

By default, the  Trifacta node uses OpenJDK for accessing Java libraries and components. In some environments, basic setup of the node may include installation of a JDK. Please review your environment to verify that an appropriate JDK version has been installed on the node.

NOTE: Use of Java Development Kits other than OpenJDK is not currently supported. However, the platform may work with the Java Development Kit of your choice, as long as it is compatible with the supported version(s) of Java. See System Requirements.

NOTE: OpenJDK is included in the offline dependencies, which can be used to install the platform without Internet access. For more information, see Install Dependencies without Internet Access.

The following commands can be used to install OpenJDK. These commands can be modified to install a separate compatible version of the JDK.

Centos/RHEL:

sudo yum install java-1.8.0-openjdk-1.8.0 java-1.8.0-openjdk-devel

NOTE: If java-1.8.0-openjdk-devel is not included, the batch job runner service, which is required, fails to start.

Ubuntu:

sudo apt-get install openjdk-8-jre-headless

JAVA_HOME:

By default, the JAVA_HOME environment variable is configured to point to a default install location for the OpenJDK package. 

NOTE: If you have installed a JDK other than the OpenJDK version provided with the software, you must set the JAVA_HOME environment variable on the Trifacta node to point to the correct install location. 

The property value must be updated in the following locations:

  1. Edit the following file: /opt/trifacta/conf/env.sh

  2. Save changes.
  3. Edit the following file:  trifacta-conf.json
  4. Update the following parameter value:

    "env": {
      "JAVA_HOME": "/usr/lib/jvm/java-1.8.0-openjdk.x86_64"
    },
  5. Save changes.

3. Install and configure Trifacta databases

The Trifacta platform requires installation of four Postgres databases. 

Use the following command to check for other instances of Postgres on the Trifacta node:

grep -e '^\s*PGPORT.*=' postgresql postgresql-9.3

If the output indicates other instances, you may have to remove it or change the port used by the Trifacta platform instance. For more information, see Check for Other PostgreSQL Instances

If you have not done so already, you must install and configure the Postgres database used to store . See Set up the Databases.

4. Install Trifacta package

NOTE: If you are installing without Internet access, you must reference the local repository. The command to execute the installer is slightly different. See Install Dependencies without Internet Access.

NOTE: Installing the Trifacta platform in a directory other than the default one is not supported or recommended.

For CentOS or RHEL

Install the package with yum, using root:

sudo yum install <rpm file>

For Ubuntu

Install the package with apt, using root:

sudo dpkg -i <deb file>

The previous line may return an error message, which you may ignore. Continue with the following command: 

sudo apt-get -f -y install

5. Verify Install

The product is installed in the following directory: 

/opt/trifacta

6. Install License Key

Please install the license key provided to you by Trifacta. See License Key.

7. Store install packages

For safekeeping, you should retain all install packages that have been installed with this Trifacta deployment.

Configuration

After installation is complete, additional configuration is required.

The Trifacta platform requires additional configuration for a successful integration with the datastore. Please review and complete the necessary configuration steps. For more information, see Configure.

This page has no comments.