This page provides tips and guidelines for maintaining your backend storage.
Note
For safety reasons, Designer Cloud Powered by Trifacta Enterprise Edition does not remove files from the backend storage, except for temporary files that it creates as part of normal operations and storage used as part of feature execution. Unless resources have been provided to you by Alteryx, management of the backend datastore is the responsibility of the customer.
Note
Designer Cloud Powered by Trifacta Enterprise Edition does not store data on the Trifacta node where the software is installed.
Note
Designer Cloud Powered by Trifacta Enterprise Edition does not modify source data.
Log files are stored by default in the following location on the Trifacta node:
/opt/trifacta/logs
Service log files are automatically rotated at 50 MB. For more information on configuring log rotation, see Configure Logging for Services.
Logs related to job execution are not automatically rotated.
Note
Job log files can accumulate over time. As a rule of thumb, set up a recurring job through an external scheduler to purge job logs that are older than six months.
Job log files are stored in the following directories:
/opt/trifacta/logs/jobs
/opt/trifacta/logs/jobgroups
They are organized by job identifier in sub-directories.
For more information on job logs, see Diagnose Failed Jobs.
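A recurring purge of job logs could be sketched as follows. This is a minimal example, not a supported utility. It assumes that the job log directories listed above are on the local filesystem of the Trifacta node, that "six months" is approximated as 180 days, and that a job's sub-directory can be removed once every file in it is older than the cutoff. The function names are hypothetical. The script defaults to a dry run that only reports what it would remove.

```python
import shutil
import time
from pathlib import Path

# Job log locations named on this page; adjust if your installation differs.
JOB_LOG_DIRS = [
    Path("/opt/trifacta/logs/jobs"),
    Path("/opt/trifacta/logs/jobgroups"),
]
SECONDS_PER_DAY = 86400


def find_stale_job_dirs(root, max_age_days=180, now=None):
    """Return sub-directories of root whose newest entry is older than max_age_days.

    The 180-day default is an assumption standing in for "six months".
    """
    root = Path(root)
    if not root.is_dir():
        return []
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for entry in sorted(root.iterdir()):
        if entry.is_symlink() or not entry.is_dir():
            continue
        # Newest modification time of the directory and everything beneath it.
        mtimes = [entry.lstat().st_mtime]
        mtimes.extend(p.lstat().st_mtime for p in entry.rglob("*"))
        if max(mtimes) < cutoff:
            stale.append(entry)
    return stale


def purge_stale_job_dirs(root, max_age_days=180, dry_run=True, now=None):
    """Delete stale job log sub-directories; with dry_run=True, only report them."""
    stale = find_stale_job_dirs(root, max_age_days, now)
    if not dry_run:
        for job_dir in stale:
            shutil.rmtree(job_dir)
    return stale


if __name__ == "__main__":
    for log_root in JOB_LOG_DIRS:
        for job_dir in purge_stale_job_dirs(log_root, dry_run=True):
            print(f"would remove {job_dir}")
```

To run this on a schedule, invoke the script from whichever external scheduler you already use, and switch `dry_run` to `False` only after reviewing the dry-run output.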
Temporary files may be written to the temporary directory on the backend datastore, particularly during job execution.
/tmp
Note
These files may be purged during restarts of the platform.
During job execution, Spark may use the following directories on backend storage for temporary files:
/user/<UserID>
/trifacta/tempfiles
The Designer Cloud Powered by Trifacta platform generates your samples and profiling statistics in one of the following directories for each user:
The default directory:
/trifacta/queryResults/.trifacta
The user-defined output directory
Note
These files should be removed on a periodic basis.
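Because the backend datastore varies by deployment, the selection step of a periodic cleanup can be sketched independently of any storage API. The example below is an illustration only. It assumes you can obtain a listing of (path, last-modified time) pairs from your storage layer by your own means. The function name, the 30-day retention window, and the sample file names are hypothetical. This page does not prescribe a retention period.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple


def select_expired(
    entries: Iterable[Tuple[str, datetime]],
    prefix: str,
    max_age: timedelta,
    now: datetime,
) -> List[str]:
    """Return paths under prefix whose last-modified time is older than max_age."""
    cutoff = now - max_age
    return sorted(
        path
        for path, modified in entries
        if path.startswith(prefix) and modified < cutoff
    )


# Hypothetical listing; in practice this comes from your backend datastore.
listing = [
    ("/trifacta/queryResults/.trifacta/sample-a", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    ("/trifacta/queryResults/.trifacta/sample-b", datetime(2024, 6, 20, tzinfo=timezone.utc)),
    ("/user/data/source.csv", datetime(2023, 1, 1, tzinfo=timezone.utc)),
]

expired = select_expired(
    listing,
    prefix="/trifacta/queryResults/.trifacta/",
    max_age=timedelta(days=30),  # assumed retention window
    now=datetime(2024, 7, 1, tzinfo=timezone.utc),
)
print(expired)
```

Restricting the selection to a prefix keeps the cleanup away from source data and other directories. If you use a user-defined output directory, pass that directory as the prefix instead.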
While samples and job results may be retained on backend storage, the Designer Cloud Powered by Trifacta platform does not store your source data.
Note
When a dataset is removed from the Library, only the reference to it in the product is removed. The underlying data is not deleted.
The following features do store data on the base storage layer.
Data sources that are stored in a binary format, such as PDF or Excel, or that require additional processing, such as JSON, must be converted to a file format that can be natively ingested by the Designer Cloud Powered by Trifacta platform. Typically, these converted files are stored in the base storage layer in CSV format.
This feature is enabled by default.
When JDBC ingestion is enabled, some objects used in sampling that are sourced from JDBC sources may be stored in the base storage layer for faster retrieval. After job execution, these objects are deleted or, if datasource caching is enabled, moved to the appropriate datasource cache.
For more information, see Configure JDBC Ingestion.
If datasource caching has been enabled, cached objects can be stored in either a global or user-specific cache. For more information, see Configure Data Source Caching.