This section provides an overview of the key data and metadata that should be covered by your enterprise backup and recovery policies.
NOTE: This section covers how to perform a basic cold backup of the product. Hot backups are not supported.
All backups should be performed in accordance with your enterprise's backup and recovery policies.
Stop All Services
Before you begin, the Trifacta platform and databases should be stopped. See Start and Stop the Platform.
Backup Platform Files
The following directories on the Trifacta node should be backed up on a regular basis:
Configuration
You can back up all key configuration files into the /tmp directory using the following commands:
cp /opt/trifacta/conf/trifacta-conf.json /tmp/trifacta-conf.json
cp /opt/trifacta/conf/env.sh /tmp/env.sh
cp /etc/init.d/trifacta /tmp/trifacta.service
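Because /tmp is typically cleared on reboot, you may prefer to copy these files into a dated directory elsewhere. The following is a minimal sketch, not a supported tool: the function name backup_conf and the destination path /var/backups/trifacta are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: copy configuration files into a dated backup directory.
# backup_conf and the example destination are hypothetical, not platform defaults.

backup_conf() {
  dest="$1"   # target directory for this backup
  shift       # remaining arguments: the files to copy
  mkdir -p "$dest" || return 1
  for f in "$@"; do
    # -p preserves mode and timestamps; stop at the first file that cannot be copied
    cp -p "$f" "$dest/" || return 1
  done
}

# Example call, using the source paths from the commands above:
# backup_conf "/var/backups/trifacta/conf-$(date +%Y%m%d)" \
#   /opt/trifacta/conf/trifacta-conf.json \
#   /opt/trifacta/conf/env.sh \
#   /etc/init.d/trifacta
```

Note that this sketch keeps the original file names, so the init script is stored as trifacta rather than trifacta.service. Adjust any restore commands to match the location and names that you choose.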
License
You should back up your license key:
/opt/trifacta/license
See License Key.
Backup Databases
The Trifacta platform utilizes the following databases as part of normal operations. These databases should be backed up on a regular basis:
Database Name | DatabaseId | Description |
---|---|---|
Trifacta DB | trifacta | Stores users and metadata for flows, including datasets and recipes. |
Jobs DB | trifacta-activiti | Stores and maintains job execution status and details. |
Scheduling DB | trifactaschedulingservice | Stores metadata for scheduled jobs. |
Time-based Trigger DB | trifactatimebasedtriggerservice | Additional database required for scheduled jobs. |
For more information on setting up these databases, see Set up the Databases.
Location of backup and recovery tools
PostgreSQL
Depending on your operating system, you can find the backup tool pg_dump and the psql client in the following locations.
NOTE: These locations apply to PostgreSQL 9.6.
CentOS/RHEL:
/usr/pgsql-9.6/bin/pg_dump
/usr/pgsql-9.6/bin/psql
Ubuntu:
/usr/lib/postgresql/9.6/bin/pg_dump
/usr/lib/postgresql/9.6/bin/psql
MySQL
Please locate the following programs in your MySQL distribution:
mysqldump
mysql
Backup commands
The following commands can be used to back up the databases.
PostgreSQL
For more information on command options, see https://www.postgresql.org/docs/9.6/static/backup.html.
NOTE: These commands must be executed as the trifacta user.
NOTE: The following commands are for PostgreSQL 9.6 for all supported operating systems. For specific commands for other versions, please see the database documentation.
Trifacta DB:
pg_dump trifacta > trif_triDB_bkp_<date>.sql
Jobs DB:
pg_dump trifacta-activiti > trif_actDB_bkp_<date>.sql
Scheduling DB:
pg_dump trifactaschedulingservice > trif_schDB_bkup_<date>.sql
Time-Based Trigger DB:
pg_dump trifactatimebasedtriggerservice > trif_tbtsDB_bkup_<date>.sql
MySQL
For more information on command options, see https://dev.mysql.com/doc/refman/5.7/en/mysqldump-sql-format.html.
Login:
su - mysql
NOTE: The following commands are for MySQL 5.7 for all supported operating systems. For specific commands for other versions, please see the database documentation.
Trifacta DB:
mysqldump trifacta > trif_triDB_bkp_<date>.sql
Jobs DB:
mysqldump trifacta-activiti > trif_actDB_bkp_<date>.sql
Scheduling DB:
mysqldump trifactaschedulingservice > trif_schDB_bkup_<date>.sql
Time-Based Trigger DB:
mysqldump trifactatimebasedtriggerservice > trif_tbtsDB_bkup_<date>.sql
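The per-database commands above, for either engine, can be collapsed into one loop. This is a sketch under stated assumptions: prefix_for, backup_all, and the DUMP_CMD variable are hypothetical names introduced here, and the file prefixes simply mirror the ones used in the commands above. Run it as the same user that the corresponding commands require.

```shell
#!/bin/sh
# Sketch: dump all four platform databases into date-stamped files.
# prefix_for, backup_all and DUMP_CMD are illustrative names, not part of the product.

# Map a database name to the file prefix used in the documented commands.
prefix_for() {
  case "$1" in
    trifacta)                        echo trif_triDB_bkp ;;
    trifacta-activiti)               echo trif_actDB_bkp ;;
    trifactaschedulingservice)       echo trif_schDB_bkup ;;
    trifactatimebasedtriggerservice) echo trif_tbtsDB_bkup ;;
    *) return 1 ;;
  esac
}

backup_all() {
  out_dir="$1"                      # directory that receives the dump files
  stamp="${2:-$(date +%Y%m%d)}"     # date suffix; defaults to today
  mkdir -p "$out_dir" || return 1
  for db in trifacta trifacta-activiti \
            trifactaschedulingservice trifactatimebasedtriggerservice; do
    prefix="$(prefix_for "$db")" || return 1
    # DUMP_CMD defaults to pg_dump; set DUMP_CMD=mysqldump for MySQL.
    "${DUMP_CMD:-pg_dump}" "$db" > "$out_dir/${prefix}_${stamp}.sql" || return 1
  done
}

# Example: backup_all /var/backups/trifacta/db
```

The loop stops at the first failed dump so that a partial backup set is not mistaken for a complete one.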
Scheduling
You can schedule nightly execution of these backups using a third-party scheduler such as cron.
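Because only cold backups are supported, a scheduled job needs to stop the platform before the backup and start it again afterward. The wrapper below is a rough sketch of that flow. The script paths, the log path, the nightly_backup function, and the SERVICE_CMD and BACKUP_CMD variables are all assumptions for illustration, and "service trifacta start" is assumed to be the counterpart of the documented stop command.

```shell
#!/bin/sh
# Sketch of a nightly cold-backup wrapper, intended to be called from cron.
# Hypothetical crontab entry for root (02:00 every night):
#   0 2 * * * /usr/local/bin/trifacta-nightly-backup.sh >> /var/log/trifacta-backup.log 2>&1

# SERVICE_CMD and BACKUP_CMD can be overridden so the flow can be dry-run.
nightly_backup() {
  service_cmd="${SERVICE_CMD:-service}"
  backup_cmd="${BACKUP_CMD:-/usr/local/bin/trifacta-backup.sh}"  # hypothetical script

  "$service_cmd" trifacta stop || return 1
  backup_rc=0
  "$backup_cmd" || backup_rc=$?
  # Always attempt to start the platform again, even if the backup failed.
  "$service_cmd" trifacta start || return 1
  return "$backup_rc"
}

# In the real script, finish with:
# nightly_backup
```

Returning the exit status of the backup lets cron mail or logging report a failed run while still bringing the platform back up.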
Restart
You can restart the Trifacta platform now. See Start and Stop the Platform.
Recovery
Verify
Before you begin, please verify that you have valid backups for the following data from the version to which you are rolling back:
NOTE: When the databases are restored, internal identifiers, such as job IDs, are reset in an order that may not correspond to the expected order. Consequently, references to specific identifiers may be corrupted. After restoring the databases, you should clear the job logs.
NOTE: If you do not have any of these items, you may not be able to recover your instance of the Trifacta platform to its previous state.
Item | Description |
---|---|
Configuration files | Backup of your configuration files |
Databases | Backup of your databases |
RPM | RPM installer |
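A quick pre-check can confirm that the expected files are present before you start a rollback. The sketch below only tests that each file exists and is non-empty; it does not prove that a dump is restorable. The function name check_backups and the example paths are assumptions.

```shell
#!/bin/sh
# Sketch: confirm that each expected backup file exists and is non-empty.
# check_backups is a hypothetical helper; it does not validate file contents.

check_backups() {
  missing=0
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      missing=1
    fi
  done
  # Returns 0 only if every listed file passed the check.
  return "$missing"
}

# Example call (replace 20240101 with the date of your backup set):
# check_backups /tmp/trifacta-conf.json /tmp/env.sh /tmp/trifacta.service \
#   trif_triDB_bkp_20240101.sql trif_actDB_bkp_20240101.sql
```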
Rollback steps
To recover the Trifacta platform based on backups:
NOTE: If any of the hosts, pathnames, or credentials have changed since the backups were performed, these updates must be applied through trifacta-conf.json or through the Admin Settings page after the restoration is complete.
Steps:
Log in to the Trifacta node as the root user.
Stop the Trifacta service:
service trifacta stop
- Clear each current database and restore the backup of the version from the preceding release. In some cases, the database may not exist in the previous version.
PostgreSQL:
Log in as a user that can run admin commands for PostgreSQL. This user may vary between deployments.
Trifacta database:
psql -c "DROP DATABASE trifacta;"
psql -c "CREATE DATABASE trifacta WITH OWNER trifacta;"
psql --dbname=trifacta < trif_triDB_bkp_<date>.sql
(Release 3.2 and later) Jobs database:
NOTE: Please note the escaped quotes around the database name in the DROP DATABASE and CREATE DATABASE commands.
psql -c "DROP DATABASE \"trifacta-activiti\";"
psql -c "CREATE DATABASE \"trifacta-activiti\" WITH OWNER trifacta;"
psql --dbname="trifacta-activiti" < trif_actDB_bkp_<date>.sql
(Release 4.1 and later) Scheduling database:
psql -c "DROP DATABASE trifactaschedulingservice;"
psql -c "CREATE DATABASE trifactaschedulingservice WITH OWNER trifacta;"
psql --dbname=trifactaschedulingservice < trif_schDB_bkup_<date>.sql
(Release 4.1 and later) Time-based Trigger Service database:
psql -c "DROP DATABASE trifactatimebasedtriggerservice;"
psql -c "CREATE DATABASE trifactatimebasedtriggerservice WITH OWNER trifacta;"
psql --dbname=trifactatimebasedtriggerservice < trif_tbtsDB_bkup_<date>.sql
- MySQL: For details, see https://dev.mysql.com/doc/refman/5.7/en/reloading-sql-format-dumps.html.
Login:
su - mysql
Trifacta database:
mysql trifacta < trif_triDB_bkp_<date>.sql
Jobs database:
mysql trifacta-activiti < trif_actDB_bkp_<date>.sql
(Release 4.1 and later) Scheduling database:
mysql trifactaschedulingservice < trif_schDB_bkup_<date>.sql
(Release 4.1 and later) Time-based Trigger Service database:
mysql trifactatimebasedtriggerservice < trif_tbtsDB_bkup_<date>.sql
- Perform a clean install of the Trifacta software provided in your distribution. See Install.
Restore your configuration files. The following commands assume that they were backed up to the /tmp directory on the node:
cp /tmp/trifacta-conf.json /opt/trifacta/conf/trifacta-conf.json
cp /tmp/env.sh /opt/trifacta/conf/env.sh
cp /tmp/trifacta.service /etc/init.d/trifacta
- Apply any patches or maintenance updates that may have been provided to you. See Maintenance Release Updater.
- Restart the platform. See Start and Stop the Platform.
- Log in and verify operations. See Verify Operations.
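The PostgreSQL drop, create, and restore sequence in the steps above repeats for each database, so it can be sketched as one helper. This is an illustration only: restore_pg_db and the PSQL_CMD variable are hypothetical names, and the sketch uses DROP DATABASE IF EXISTS (rather than the plain DROP shown above) so that a database that does not exist yet does not abort the run. The database name is always double-quoted, which is required for trifacta-activiti and harmless for the others.

```shell
#!/bin/sh
# Sketch: recreate one PostgreSQL database and load a dump into it.
# restore_pg_db and PSQL_CMD are illustrative names, not part of the product.
# Run as a user that can execute admin commands for PostgreSQL.

restore_pg_db() {
  db="$1"     # database name, e.g. trifacta-activiti
  dump="$2"   # path to the .sql file produced by pg_dump
  if [ ! -s "$dump" ]; then
    echo "missing or empty dump: $dump" >&2
    return 1
  fi
  psql_cmd="${PSQL_CMD:-psql}"
  "$psql_cmd" -c "DROP DATABASE IF EXISTS \"$db\";" || return 1
  "$psql_cmd" -c "CREATE DATABASE \"$db\" WITH OWNER trifacta;" || return 1
  # Feed the dump to psql on standard input, as in the documented commands.
  "$psql_cmd" --dbname="$db" < "$dump"
}

# Example (replace 20240101 with the date of your backup):
# restore_pg_db trifacta-activiti trif_actDB_bkp_20240101.sql
```

Checking the dump file before dropping anything avoids destroying a database when the backup to restore is not actually available.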