As needed, the Trifacta® databases can be installed as PostgreSQL DBs on Amazon RDS. Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud. 

Limitations

  • MySQL is not supported for installation on Amazon RDS.
  • SSL connectivity is not supported.

Pre-requisites

NOTE: You can use the suggested defaults below for sizing your RDS instance. If you have questions or concerns about sizing recommendations, please contact Trifacta Support.

  • Admin access to an Amazon RDS account

Initialize RDS instance

Steps:

  1. In your RDS dashboard, click Launch a DB instance. 

    NOTE: The RDS instance must be launched in the same Amazon region as the Trifacta node.

  2. For Select Engine: Select PostgreSQL.
  3. For Production?: Choose Yes if you are deploying the database for a production instance of the Trifacta platform. Otherwise, select No.
  4. DB Engine: postgres
  5. For the DB details, see below:

    NOTE: Except as noted below, properties should be specified according to your enterprise requirements.

    1. Instance Specifications:

      1. License Model: postgresql-license
      2. DB Engine Version: For more information on the supported versions of PostgreSQL, see System Requirements.
      3. Allocated Storage: at least 10 GB

  6. For Advanced Settings, please apply the following settings:

    1. Network and Security:

      1. VPC security group must allow for access from the Trifacta platform

    2. Database Options:

      1. Database Name: trifacta

      2. Database Port: 5432

        1. The port number can be changed as needed. See System Ports.

  7. Populate other properties according to your enterprise requirements.

  8. To complete the setup, click Launch DB Instance.

  9. When the RDS DB instance is up and running, please collect the following information, which is used later:
    1. Public DNS
    2. Port Number
    3. Admin username
    4. Admin password
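
Before configuring the platform, it can help to confirm from the Trifacta node that the collected Public DNS and port are reachable through the VPC security group. The following is a minimal Python sketch and is not part of the product. The function name can_reach and the endpoint in the usage comment are hypothetical. It checks only that a TCP connection can be opened; it does not authenticate against the database.

```python
import socket


def can_reach(host: str, port: int = 5432, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Usage (hypothetical endpoint; substitute the Public DNS and port you collected):
#   can_reach("mydb.example.us-east-1.rds.amazonaws.com", 5432)
```

A False result usually points to a security group rule, a region or VPC mismatch, or a non-default port.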

Configure the Trifacta platform for RDS

Please complete the following steps to integrate the Trifacta platform with the DB instance you just created.

Steps:

  1. In the RDS console, you must find the Public DNS endpoint for the RDS instance you created: 

    1. Under Instances, expand the name of the instance you created.

    2. The DNS endpoint should be listed under the name in the Endpoint section.

  2. Set the host for each database to the Public DNS endpoint for the RDS instance:

    Database                      Property
    Main database                 webapp.db.host
    Jobs database                 batch-job-runner.db.host
    Scheduling database           scheduling-service.database.host
    Time-based Trigger database   time-based-trigger-service.database.host
  3. To set custom database names, usernames, and passwords:

    1. Edit  /opt/trifacta/conf/trifacta-conf.json

    2. For each database below, you can review the database name, username, and password. 

      Database                      Property
      Main database                 webapp.db.name
                                    webapp.db.username
                                    webapp.db.password
      Jobs database                 batch-job-runner.db.name
                                    batch-job-runner.db.username
                                    batch-job-runner.db.password
      Scheduling database           scheduling-service.database.name
                                    scheduling-service.database.user
                                    scheduling-service.database.password
      Time-based Trigger database   time-based-trigger-service.database.name
                                    time-based-trigger-service.database.user
                                    time-based-trigger-service.database.password
    3. Make changes in the file as needed and save.
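
The host changes above can also be scripted. The sketch below is one possible approach in Python, not a supported tool. It assumes that trifacta-conf.json stores each dotted property (for example, webapp.db.host) as nested JSON objects, one level per dot-separated segment; verify this against your own file first. The function names are hypothetical. Because the file is rewritten, back it up before running anything like this.

```python
import json
from pathlib import Path

# Host properties listed in the steps above, one per database.
HOST_PROPERTIES = [
    "webapp.db.host",
    "batch-job-runner.db.host",
    "scheduling-service.database.host",
    "time-based-trigger-service.database.host",
]


def set_property(config: dict, dotted_key: str, value) -> None:
    """Set a value at a dotted path, assuming one nested object per segment."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


def point_databases_at(config: dict, rds_host: str) -> dict:
    """Point all four database host properties at the RDS endpoint."""
    for key in HOST_PROPERTIES:
        set_property(config, key, rds_host)
    return config


def update_file(path: str, rds_host: str) -> None:
    """Load the JSON file, update the host properties, and write it back."""
    conf_path = Path(path)
    config = json.loads(conf_path.read_text())
    point_databases_at(config, rds_host)
    conf_path.write_text(json.dumps(config, indent=2) + "\n")
```

Other properties in the file are left untouched, although key order and whitespace may change when the file is rewritten.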

Initialize the databases

Use the following steps to initialize the databases required by the platform.

Pre-requisites:

  • The installing user must have write permissions to the directory from which the commands are executed.
  • The installing user must have sudo privileges.

Steps:

  1. Switch to the postgres user:

    sudo su - postgres

    NOTE: Unless the port number for postgres has been modified, it should be listening at the default value: 5432.

  2. Launch psql using the following command, entering the admin password when prompted:

    psql --host=${RDS_HOST_NAME} --port=${RDS_PORT} --username=${RDS_ADMIN_USERNAME} --password

    where:

    Parameter   Description
    host        Public DNS value for the RDS instance
    port        Port number value for the RDS instance
    username    Admin username for the RDS instance
    password    Forces psql to prompt for the password of the admin username. This option does not take a value.
  3. Execute the following commands using the postgres user.

    NOTE: The values in platform configuration must match the values that you use below. Below are the default values.

     

    1. For Trifacta database:

      CREATE ROLE trifacta LOGIN ENCRYPTED PASSWORD '<pwd_trifacta>';
      CREATE DATABASE trifacta WITH OWNER trifacta;

       

    2. For Jobs database:

      CREATE ROLE trifactaactiviti WITH LOGIN ENCRYPTED PASSWORD '<pwd_trifactaactiviti>';
      CREATE DATABASE "trifacta-activiti" WITH OWNER trifactaactiviti;

       

    3. For Scheduling database:

      CREATE ROLE trifactaschedulingservice LOGIN ENCRYPTED PASSWORD '<pwd_trifactaschedulingservice>';
      CREATE DATABASE trifactaschedulingservice WITH OWNER trifactaschedulingservice;

      For more information on scheduling, see Configure Scheduling.

    4. For Time-based Trigger database:

      CREATE ROLE trifactatimebasedtriggerservice LOGIN ENCRYPTED PASSWORD '<pwd_trifactatimebasedtriggerservice>';
      CREATE DATABASE trifactatimebasedtriggerservice WITH OWNER trifactatimebasedtriggerservice;

      For more information on scheduling, see Configure Scheduling.

    5. Exit:

      \q
      exit
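
If you use non-default names or want to avoid typing the statements by hand, the SQL from step 3 can be generated. The following Python sketch reproduces the eight statements above from the default role and database names; the function names are hypothetical and the passwords are supplied by the caller. The literal quoting assumes the PostgreSQL default of standard_conforming_strings = on.

```python
from typing import Dict, List

# (role, database) pairs taken from the default values in step 3.
DATABASES = [
    ("trifacta", "trifacta"),
    ("trifactaactiviti", "trifacta-activiti"),
    ("trifactaschedulingservice", "trifactaschedulingservice"),
    ("trifactatimebasedtriggerservice", "trifactatimebasedtriggerservice"),
]


def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote an SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_statements(passwords: Dict[str, str]) -> List[str]:
    """Return CREATE ROLE and CREATE DATABASE statements for each database.

    passwords maps each role name to the password chosen for it.
    """
    statements = []
    for role, database in DATABASES:
        statements.append(
            f"CREATE ROLE {quote_ident(role)} LOGIN ENCRYPTED PASSWORD "
            f"{quote_literal(passwords[role])};"
        )
        statements.append(
            f"CREATE DATABASE {quote_ident(database)} WITH OWNER {quote_ident(role)};"
        )
    return statements
```

The resulting statements can be pasted into the psql session or saved to a file and run with the psql --file option. Whatever values you choose must match the values in platform configuration.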

       

Configure non-default connections

If you have used non-default values for the username, password, host, or port value for any of the databases, you must update platform configuration.

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods. 

NOTE: Do not modify the other properties in these sections unless necessary.

Trifacta database

"webapp.db.username": "trifacta",
"webapp.db.logging": false,
"webapp.db.name": "trifacta",
"webapp.db.host": "localhost",
"webapp.db.password": "<pwd_trifactaDB>",
"webapp.db.type": "postgresql",
"webapp.db.port": 5432,
"webapp.db.pool.maxIdleTimeInMillis": 30000,
"webapp.db.pool.maxConnections": 10,

The following parameters apply to the Trifacta database only:

Parameter                  Description
logging                    Set this value to true to enable logging on the Trifacta database.
pool.maxIdleTimeInMillis   Specifies the maximum permitted idle time for a database connection before it is automatically closed.
pool.maxConnections        Defines the maximum permitted database connections for the Trifacta database.

Additional parameters are described below.

Jobs database

Modify the batch-job-runner.db settings:

"batch-job-runner.db.username": "trifactaactiviti", 
"batch-job-runner.db.name": "trifacta-activiti", 
"batch-job-runner.db.driver": "org.postgresql.Driver", 
"batch-job-runner.db.host": "localhost", 
"batch-job-runner.db.password": "<pwd_trifactaactivitiDB>", 
"batch-job-runner.db.port": 5432,  

Scheduling service database 

"scheduling-service.database.type": "POSTGRESQL",
"scheduling-service.database.host": "localhost",
"scheduling-service.database.port": "5432",
"scheduling-service.database.name": "trifactaschedulingservice",
"scheduling-service.database.user": "trifactaschedulingservice",
"scheduling-service.database.password": "<pwd_schedulingserviceDB>" 

Time-based trigger service database 

"time-based-trigger-service.database.type": "POSTGRESQL",
"time-based-trigger-service.database.host": "localhost",
"time-based-trigger-service.database.port": "5432",
"time-based-trigger-service.database.name": "trifactatimebasedtriggerservice",
"time-based-trigger-service.database.user": "trifactatimebasedtriggerservice",
"time-based-trigger-service.database.password": "<pwd_triggerserviceDB>"

Database Parameter Reference

The following generalized parameters apply to one or more of the databases. 

Parameter          Description
host               Host of the database. Default value is localhost, meaning the database is hosted on the Trifacta node.
port               Port number for the database. Default value is 5432 for all databases.
name               Name of the database. This value should match what was used during installation.
user or username   The username to use to connect to the database.
password           Password to use to connect to the database.
type               This value should be set to POSTGRESQL. Do not modify.
driver             Name of the database driver. Do not modify.
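
Since the defaults point every database at localhost, a quick consistency check of the four sets of settings can catch a missed property before the platform is restarted. The following Python sketch is illustrative only. It assumes you have the relevant properties available as a flat dictionary keyed by the dotted names shown above; the function name find_problems is hypothetical.

```python
from typing import Dict, List

# Property prefix for each database and the key it uses for the username.
USER_KEY_BY_PREFIX = {
    "webapp.db": "username",
    "batch-job-runner.db": "username",
    "scheduling-service.database": "user",
    "time-based-trigger-service.database": "user",
}


def find_problems(settings: Dict[str, object], expected_host: str) -> List[str]:
    """Return human-readable problems found in the database settings."""
    problems = []
    for prefix, user_key in USER_KEY_BY_PREFIX.items():
        # Every database needs host, port, name, user, and password.
        for suffix in ("host", "port", "name", user_key, "password"):
            if f"{prefix}.{suffix}" not in settings:
                problems.append(f"missing {prefix}.{suffix}")
        # The host should be the RDS endpoint, not the localhost default.
        host = settings.get(f"{prefix}.host")
        if host is not None and host != expected_host:
            problems.append(f"{prefix}.host is {host!r}, expected {expected_host!r}")
        # Ports appear as numbers or strings in the examples; accept both.
        port = settings.get(f"{prefix}.port")
        if port is not None:
            try:
                valid = 1 <= int(port) <= 65535
            except (TypeError, ValueError):
                valid = False
            if not valid:
                problems.append(f"{prefix}.port is not a valid port: {port!r}")
    return problems
```

An empty list means that all four databases define the expected properties and point at the same host.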

Save your changes and restart the platform.

Logging

  1. To review database logs in RDS, locate the Instance details page in the RDS console.
  2. Click Recent Events and Logs.
  3. If your account has the appropriate permissions, all Trifacta database logs are available here. 
