This section describes how you interact with your databases through the Trifacta® platform. The platform supports connections to multiple kinds of databases.
- Specific versions of each database are supported. See System Requirements.
- Connections must be enabled and configured for each type of supported database. See Connection Types.
Uses of Databases
NOTE: The Trifacta platform currently supports read-only access to databases. You can read in database tables or views as your datasets and use them as sources in the product. Additional features may be available in future releases.
Before You Begin
Read Access: Your database administrator must configure read permissions to the appropriate databases, tables and views for your use.
NOTE: To ensure that all user credentials used to access the database system are securely stored, you must first deploy the encryption key file to the Trifacta node. See Enable Relational Connections.
Database access is managed through connections.
- Individual users can create private connections through the application. See Create Connection Window.
- An administrator can make your connection public, or create public connections, through the application or the Trifacta command line interface. See CLI for Connections.
Storing Data in Relational Databases
NOTE: The Trifacta platform neither modifies source data nor stores transformed data in the relational systems. Datasets sourced from database tables or views are read without modification from their source locations.
Reading from Database Tables and Views
You can create a Trifacta dataset from a table or view stored in a connected database.
Tip: In some scenarios, you can improve the performance of loading from database tables by creating a view on the table that restricts the loaded data to only the required fields. Additionally, you can pre-filter the dataset using custom SQL statements. See Create Dataset with SQL.
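The tip above can be illustrated with a minimal sketch. It is not Trifacta code: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a connected relational source, and the table `orders`, the view `orders_slim`, and all column names are hypothetical. The view exposes only the required fields, and the final query plays the role of a custom SQL pre-filter.

```python
import sqlite3


def build_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with a wide table and a narrow view.

    All object names here are hypothetical and exist only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer TEXT,
            region TEXT,
            amount REAL,
            internal_notes TEXT
        );
        INSERT INTO orders VALUES
            (1, 'Acme', 'EMEA', 120.0, 'n/a'),
            (2, 'Globex', 'APAC', 75.5, 'n/a'),
            (3, 'Initech', 'EMEA', 310.0, 'n/a');

        -- The view restricts the data to the fields that are actually needed.
        CREATE VIEW orders_slim AS
            SELECT order_id, region, amount FROM orders;
        """
    )
    return conn


conn = build_demo_connection()

# Column names exposed by the view (a subset of the base table's columns).
cursor = conn.execute("SELECT * FROM orders_slim")
view_columns = [description[0] for description in cursor.description]

# A pre-filter of the kind a custom SQL statement could apply:
# only rows for one region are returned to the caller.
emea_rows = conn.execute(
    "SELECT order_id, amount FROM orders_slim WHERE region = ? ORDER BY order_id",
    ("EMEA",),
).fetchall()

print(view_columns)
print(emea_rows)
```

In a real deployment the view would be created by a database administrator on the source system, and the filtering statement would be entered when creating the dataset with custom SQL. The exact SQL dialect depends on the database type.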
Additional Notes on Database Views
- Some metadata, such as row counts, is not available for database views.
- For complex view definitions that require significant processing on the database, there may be a significant delay when previewing the contents of those views. In some cases, the preview may time out waiting for the database to respond with the view contents.
For more information, see Database Browser.
Running Jobs from Database Sources
NOTE: When executing a job on the Spark running environment using a relational source, the job fails if one or more columns have been dropped from the underlying source table. As a workaround, the recipe panel may show steps referencing the missing columns, which can be used to fix either the script or the source data.
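One way to catch this situation before launching a job is to compare the columns that your steps rely on against the current schema of the source table. The following is a rough sketch of that idea, not a platform feature: it uses Python's sqlite3 module as a stand-in for the relational source, and the helper `missing_columns`, the table `customers`, and the expected column list are all hypothetical.

```python
import sqlite3


def missing_columns(conn: sqlite3.Connection, table: str, expected: list[str]) -> list[str]:
    """Return the expected columns that no longer exist in the table.

    Hypothetical helper for illustration; it is not part of any Trifacta API.
    """
    # PRAGMA statements cannot take bound parameters, so only accept
    # a plain identifier as the table name.
    if not table.isidentifier():
        raise ValueError(f"unsupported table name: {table!r}")
    # Each row of table_info describes one column; index 1 holds its name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    current = {row[1] for row in rows}
    return [column for column in expected if column not in current]


# Demo: the source table once had an 'email' column that has since been dropped.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")

expected = ["id", "name", "email"]  # columns the recipe steps still reference
dropped = missing_columns(conn, "customers", expected)

if dropped:
    print("Columns missing from source:", ", ".join(dropped))
```

Other databases expose the same information through their own catalogs, such as `information_schema.columns`, so the schema lookup would need to be adapted to the database type in use.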