This section provides an overview of the key data and metadata that should be managed by your enterprise backup and recovery policies.

NOTE: This section covers how to perform a basic cold backup of the product. Hot backups are not supported.


All backups should be performed in accordance with your enterprise's backup and recovery policies.

Stop All Services

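Before you begin, stop the Trifacta platform so that nothing is written to the databases during the backup. See Start and Stop the Platform. The database server itself must remain running while you execute the dump commands in this section, because pg_dump and mysqldump connect to a live server.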

Backup Platform Files

The following directories on the Trifacta node should be backed up on a regular basis:

Configuration

You can back up all key configuration files into the /tmp directory using the following commands:

cp /opt/trifacta/conf/trifacta-conf.json /tmp/trifacta-conf.json
cp /opt/trifacta/conf/env.sh /tmp/env.sh
cp /etc/init.d/trifacta /tmp/trifacta.service

License

You should back up your license key, which is stored in the following directory:

/opt/trifacta/license

See License Key.
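
On many systems, /tmp is cleared at reboot or by periodic cleanup, so copies kept there may not survive. As a minimal sketch, assuming a destination directory of your choosing, the files listed above could be bundled into one dated archive. The function name and archive name are hypothetical; only the source paths come from this section.

```shell
#!/usr/bin/env bash
# Sketch only: archive the configuration files and license directory into a
# dated tarball. The destination directory and the archive name are
# assumptions; the source paths are the ones listed in this section.

# Usage: archive_platform_files DEST_DIR PATH...
archive_platform_files() {
  local dest_dir="$1" archive
  shift
  archive="${dest_dir}/trifacta_files_$(date +%Y%m%d).tar.gz"
  # tar stores the listed paths; the leading "/" is stripped inside the archive.
  tar -czf "$archive" "$@" || return 1
  # Print the archive path so that callers can log or copy it.
  echo "$archive"
}
```

Example call, with /var/backups/trifacta as a hypothetical destination:

archive_platform_files /var/backups/trifacta /opt/trifacta/conf/trifacta-conf.json /opt/trifacta/conf/env.sh /etc/init.d/trifacta /opt/trifacta/license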

Backup Databases

The Trifacta platform utilizes the following databases as part of normal operations. These databases should be backed up on a regular basis:

| Database Name | Database ID | Description |
| --- | --- | --- |
| Trifacta DB | trifacta | Stores users and metadata for flows, including datasets and recipes. |
| Jobs DB | trifacta-activiti | Stores and maintains job execution status and details. |
| Scheduling DB | trifactaschedulingservice | Stores metadata for scheduled jobs. |
| Time-based Trigger DB | trifactatimebasedtriggerservice | Additional database required for scheduled jobs. |

For more information on setting up these databases, see Set up the Databases.

Location of backup and recovery tools

PostgreSQL

Depending on your operating system, you can find the backup tool pg_dump and the restore client psql in the following locations.

NOTE: These locations apply to PostgreSQL 9.6.


CentOS/RHEL:

/usr/pgsql-9.6/bin/pg_dump
/usr/pgsql-9.6/bin/psql

 

Ubuntu: 

/usr/lib/postgresql/9.6/bin/pg_dump
/usr/lib/postgresql/9.6/bin/psql

MySQL

Please locate the following programs in your MySQL distribution:

mysqldump
mysql

Backup commands

The following commands can be used to back up the databases.

PostgreSQL

For more information on command options, see https://www.postgresql.org/docs/9.6/static/backup.html.

NOTE: These commands must be executed as the trifacta user.

NOTE: The following commands are for PostgreSQL 9.6 for all supported operating systems. For specific commands for other versions, please see the database documentation.


Trifacta DB:

pg_dump trifacta > trif_triDB_bkp_<date>.sql

Jobs DB:

pg_dump trifacta-activiti > trif_actDB_bkp_<date>.sql

Scheduling DB:

pg_dump trifactaschedulingservice > trif_schDB_bkup_<date>.sql

Time-Based Trigger DB:

pg_dump trifactatimebasedtriggerservice > trif_tbtsDB_bkup_<date>.sql
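
The four PostgreSQL commands above can be collected into one script so that every dump carries the same date stamp. The following is a sketch under stated assumptions, not a supported tool: the function names, the PG_DUMP variable, and the output directory are hypothetical, while the database names and file prefixes are the ones used above.

```shell
#!/usr/bin/env bash
# Sketch only: dump the four platform databases with a shared date stamp.
# Run as the trifacta user with the platform stopped (cold backup).
# PG_DUMP and the output directory are assumptions; adjust for your system,
# for example PG_DUMP=/usr/pgsql-9.6/bin/pg_dump on CentOS/RHEL.

PG_DUMP="${PG_DUMP:-pg_dump}"

# Map a database name to the file prefix used in the commands above.
backup_prefix() {
  case "$1" in
    trifacta)                        echo "trif_triDB_bkp" ;;
    trifacta-activiti)               echo "trif_actDB_bkp" ;;
    trifactaschedulingservice)       echo "trif_schDB_bkup" ;;
    trifactatimebasedtriggerservice) echo "trif_tbtsDB_bkup" ;;
    *) return 1 ;;
  esac
}

# Dump every database into the directory given as $1.
# Stops at the first failure so that a partial backup is not mistaken for a full one.
backup_all_postgres() {
  local out_dir="$1" stamp db prefix
  stamp="$(date +%Y%m%d)"
  for db in trifacta trifacta-activiti trifactaschedulingservice trifactatimebasedtriggerservice; do
    prefix="$(backup_prefix "$db")" || return 1
    "$PG_DUMP" "$db" > "${out_dir}/${prefix}_${stamp}.sql" || return 1
  done
}
```

Example call, with a hypothetical output directory: backup_all_postgres /var/backups/trifacta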

MySQL

For more information on command options, see https://dev.mysql.com/doc/refman/5.7/en/mysqldump-sql-format.html.

Before you run the commands below, switch to the mysql user:

su - mysql

NOTE: The following commands are for MySQL 5.7 for all supported operating systems. For specific commands for other versions, please see the database documentation.


Trifacta DB:

mysqldump trifacta > trif_triDB_bkp_<date>.sql

Jobs DB:

mysqldump trifacta-activiti > trif_actDB_bkp_<date>.sql

Scheduling DB:

mysqldump trifactaschedulingservice > trif_schDB_bkup_<date>.sql

Time-Based Trigger DB:

mysqldump trifactatimebasedtriggerservice > trif_tbtsDB_bkup_<date>.sql

Scheduling

You can schedule nightly execution of these backups using a third-party scheduler such as cron.
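
Because only cold backups are supported, a scheduled job has to stop the platform, run the dumps, and start the platform again. The wrapper below is a minimal sketch of that sequence. The script paths, the log path, the SERVICE_CMD and BACKUP_CMD variables, and the use of "service trifacta start" (by symmetry with the stop command shown on this page) are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: wrapper intended to be called from root's crontab.
# BACKUP_CMD is a hypothetical script that performs the database dumps.
#
# Example crontab entry (02:00 every night; paths are hypothetical):
#   0 2 * * * /usr/local/bin/trifacta_cold_backup.sh >> /var/log/trifacta_backup.log 2>&1

SERVICE_CMD="${SERVICE_CMD:-service}"
BACKUP_CMD="${BACKUP_CMD:-/usr/local/bin/trifacta_db_backup.sh}"

nightly_cold_backup() {
  local status=0
  # Do not back up if the platform could not be stopped.
  "$SERVICE_CMD" trifacta stop || return 1
  "$BACKUP_CMD" || status=$?
  # Restart even if the backup failed, so the platform is not left down.
  "$SERVICE_CMD" trifacta start || return 1
  # Report the backup's own exit status to cron.
  return "$status"
}
```

The script file would end with a call to nightly_cold_backup. A non-zero exit status indicates that the stop, the backup, or the restart failed.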

Restart

You can restart the Trifacta platform now. See Start and Stop the Platform.

Recovery

Verify

Before you begin, please verify that you have valid backups for the following data from the version to which you are rolling back:

NOTE: When the databases are restored, internal identifiers, such as job IDs, are reset in an order that may not correspond to the expected order. Consequently, references to specific identifiers may be corrupted. After restoring the databases, you should clear the job logs.

NOTE: If you do not have any of these items, you may not be able to recover your instance of the Trifacta platform to its previous state.

| Item | Description |
| --- | --- |
| Configuration files | Backup of your configuration files |
| Databases | Backup of your databases |
| RPM | RPM installer |
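
As a first check before rolling back, you can confirm that each backup file exists and is not empty. The sketch below does only that: it does not prove that a dump is complete or restorable, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: report backup files that are missing or empty.
# Returns 0 when every file passed as an argument exists and has content.
check_backups() {
  local missing=0 f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Example call, where the .sql file name is a placeholder for your own dated dump files: check_backups /tmp/trifacta-conf.json /tmp/env.sh /tmp/trifacta.service "trif_triDB_bkp_<date>.sql"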

 

Rollback steps

To recover the Trifacta platform based on backups:

NOTE: If any of the hosts, pathnames, or credentials have changed since the backups were performed, these updates must be applied through trifacta-conf.json or through the Admin Settings page after the restoration is complete.

Steps:

  1. Log in to the Trifacta node as the root user.

  2. Stop the Trifacta service:

    service trifacta stop
  3. Clear each current database and restore its backup from the release to which you are rolling back. In some cases, a database may not exist in the previous release; the release that introduced each database is noted below where applicable.
    1. PostgreSQL:

      1. Log in as a user that can run admin commands for PostgreSQL. This user may vary between deployments.

      2. Trifacta database:

        psql -c "DROP DATABASE trifacta;"
        psql -c "CREATE DATABASE trifacta WITH OWNER trifacta;"
        psql --dbname=trifacta < trif_triDB_bkp_<date>.sql
      3. (Release 3.2 and later) Jobs database:

        NOTE: The database name is wrapped in escaped quotes in the DROP DATABASE and CREATE DATABASE commands. The quotes are required because the name contains a hyphen.

        psql -c "DROP DATABASE \"trifacta-activiti\";"
        psql -c "CREATE DATABASE \"trifacta-activiti\" WITH OWNER trifacta;"
        psql --dbname="trifacta-activiti" < trif_actDB_bkp_<date>.sql
      4. (Release 4.1 and later) Scheduling database:

        psql -c "DROP DATABASE trifactaschedulingservice;"
        psql -c "CREATE DATABASE trifactaschedulingservice WITH OWNER trifacta;"
        psql --dbname=trifactaschedulingservice < trif_schDB_bkup_<date>.sql
        
      5. (Release 4.1 and later) Time-based Trigger Service database:

        psql -c "DROP DATABASE trifactatimebasedtriggerservice;"
        psql -c "CREATE DATABASE trifactatimebasedtriggerservice WITH OWNER trifacta;"
        psql --dbname=trifactatimebasedtriggerservice < trif_tbtsDB_bkup_<date>.sql
    2. MySQL: For details, see https://dev.mysql.com/doc/refman/5.7/en/reloading-sql-format-dumps.html.
      1. Log in:

        su - mysql
      2. Trifacta database:

        mysql trifacta < trif_triDB_bkp_<date>.sql
      3. Jobs database:

        mysql trifacta-activiti < trif_actDB_bkp_<date>.sql
      4. (Release 4.1 and later) Scheduling database:

        mysql trifactaschedulingservice < trif_schDB_bkup_<date>.sql
      5. (Release 4.1 and later) Time-based Trigger Service database:

        mysql trifactatimebasedtriggerservice < trif_tbtsDB_bkup_<date>.sql
  4. Perform a clean install of the Trifacta software provided in your distribution. See Install.
  5. Restore your configuration files. The following commands assume that they were backed up to the /tmp directory on the node:

    cp /tmp/trifacta-conf.json /opt/trifacta/conf/trifacta-conf.json
    cp /tmp/env.sh /opt/trifacta/conf/env.sh
    cp /tmp/trifacta.service /etc/init.d/trifacta
    
  6. Apply any patches or maintenance updates that may have been provided to you. See Maintenance Release Updater.
  7. Restart the platform. See Start and Stop the Platform.
  8. Log in and verify operations. See Verify Operations.
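
The PostgreSQL drop, create, and restore commands in step 3 follow the same pattern for each database, so they can be wrapped in one function. This is a hedged sketch rather than the documented procedure: the function name and the PSQL variable are hypothetical, and DROP DATABASE IF EXISTS is swapped in for the plain DROP DATABASE shown above so that a database absent from the current installation does not stop the restore.

```shell
#!/usr/bin/env bash
# Sketch only: drop, recreate, and reload one PostgreSQL database.
# Run as a user that can run admin commands for PostgreSQL, with the
# platform stopped. PSQL is an assumption; point it at your psql binary.

PSQL="${PSQL:-psql}"

# Usage: restore_pg_db DATABASE_NAME DUMP_FILE
restore_pg_db() {
  local db="$1" dump_file="$2"
  # Refuse to drop anything unless the dump file exists and is non-empty.
  if [ ! -s "$dump_file" ]; then
    echo "dump not found or empty: $dump_file" >&2
    return 1
  fi
  # The name is always quoted, which also covers the hyphenated trifacta-activiti.
  "$PSQL" -c "DROP DATABASE IF EXISTS \"${db}\";" || return 1
  "$PSQL" -c "CREATE DATABASE \"${db}\" WITH OWNER trifacta;" || return 1
  "$PSQL" --dbname="$db" < "$dump_file"
}
```

Example call, where the file name is a placeholder for your own dated dump: restore_pg_db trifacta-activiti "trif_actDB_bkp_<date>.sql"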

