Terminology applicable to Dataprep by Trifacta.
Note
This list is not comprehensive.
These terms apply to the underlying Dataprep by Trifacta platform.
A platform service for managing the storage of user-specified data, such as value mappings.
A platform service for managing access levels for Trifacta Application objects such as flows, connections, and plans.
A data serialization format for Hadoop. For more information, see Supported File Formats.
Short for Application Programming Interface. The platform APIs give developers programmatic access to platform actions from outside of the application interface. For more information, see API Reference.
A platform service for queuing and managing the execution of jobs through external running environments. For more information, see Configure Batch Job Runner.
A data serialization format for Hadoop. For more information, see Supported File Formats.
The Trifacta Application can be served through a supported version of Google Chrome. For more information, see Browser Requirements.
A platform service for managing system, edition, and workspace/project levels of configuration.
A platform service for managing the configuration of platform-level connectors, their defaults, and their overrides. See Configure Connector Configuration Service.
A platform service for converting binary, relational or interpreted datasources into formats that are natively understood by the Dataprep by Trifacta platform.
Time-based job scheduling format. The Dataprep by Trifacta platform supports a modified form of cron. For more information, see cron Schedule Syntax Reference.
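As a rough illustration of what a time-based schedule expression looks like, the following sketch splits a classic five-field cron expression into named fields. The five-field layout (minute, hour, day of month, month, day of week) is an assumption taken from standard cron, and the helper name describe_cron is hypothetical. The platform's modified syntax may use different fields, so check cron Schedule Syntax Reference before relying on this layout.

```python
# Field names for classic five-field cron (assumption: the platform's
# modified cron syntax may define different or additional fields).
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Split a five-field cron expression into a dict of named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}"
        )
    return dict(zip(CRON_FIELDS, parts))


# In standard cron, "30 2 * * 1" means 02:30 every Monday.
schedule = describe_cron("30 2 * * 1")
print(schedule)
```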
A platform service for managing connections and interactions with relational storage. For more information, see Configure Data Service.
The Trifacta Application can be served through a supported version of Mozilla Firefox. For more information, see Browser Requirements.
A file format for compression and decompression. For more information, see Supported File Formats.
A native format for the Tableau data visualization platform. The Dataprep by Trifacta platform can generate results in Hyper format. For more information, see Supported File Formats.
The process by which relational datasources can be retrieved from their origin and transferred to the backend datastore of the platform, which improves performance in sampling and job execution. For more information, see Configure JDBC Ingestion.
A platform service for managing the deployment and execution of user-defined functions (UDFs) authored in Java. See Java UDFs.
A Java-based platform service for managing the connectivity with file-based storage systems through a virtual file system (VFS). See Configure Java VFS Service.
A platform service for storing metadata related to job execution.
JavaScript Object Notation (JSON) is a human-readable format for transmitting data objects. For more information, see Supported File Formats.
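To show what "human-readable format for transmitting data objects" means in practice, here is a minimal sketch that serializes a data object to JSON text and parses it back. The record and its field names are hypothetical and are not taken from the platform.

```python
import json

# Hypothetical data object; the field names are for illustration only.
record = {"name": "example", "rowCount": 3, "tags": ["a", "b"]}

# Serialize the object to human-readable JSON text.
text = json.dumps(record, indent=2)
print(text)

# Parse the text back into an equivalent object.
restored = json.loads(text)
```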
Microsoft Excel workbooks and worksheets can be used as imported datasets in the platform. For more information, see Import Excel Data.
A platform service for processing user activities for improving platform recommendations.
An open-source relational database management system. MySQL can host the Alteryx databases. For more information, see System Requirements.
The process by which computer systems use data as inputs for algorithms and statistical models to make decisions and perform tasks.
The process by which actions in the platform can be applied and scheduled in production environments.
A platform service for managing optimizations of flow execution within supported relational datastores through SQL queries. See Configure Optimizer Service.
A platform service for managing the execution of plans. See Configure Orchestration Service.
An in-memory running environment for running jobs. Embedded in the Trifacta node, Trifacta Photon is fast and best-suited for small- to medium-sized jobs.
An in-browser client for managing the sampling and transformation of data on the web client. For more information, see Configure Photon Client.
An open-source relational database management system. PostgreSQL can host the Alteryx databases. For more information, see System Requirements.
Specific to the Dataprep by Trifacta platform, predictive transformation serves as the foundation of design principles for how users interact with their data. For more information, see Overview of Predictive Transformation.
When a job is executed against a dataset, users can optionally choose to generate a visual profile of the results, which is processed as a separate job after the transformation job has completed. For more information, see Run Job Page.
One of several environments where transformation, profiling, and sampling jobs can be executed. The platform integrates with these environments and manages the queuing and monitoring of the jobs asynchronously, minimizing performance impacts on the Trifacta node. For more information, see Running Environment Options.
A platform service for managing the execution of schedules based on defined triggers. See Configure Scheduling.
A platform service for managing the use of secure tokens for access to third-party systems. See Configure Secure Token Service.
Users can optionally share flows and connections with other users. For more information, see Overview of Sharing.
A platform service for managing job execution on Spark-based systems. See Configure for Spark.
Short for Single Sign-On, SSO enables users to access multiple systems within the enterprise domain through one set of credentials. The Dataprep by Trifacta platform can integrate with multiple types of SSO.
A fast compression and decompression format. For more information, see Supported File Formats.
A platform service for monitoring triggers for executing scheduled jobs. See Configure Scheduling.
The process by which a recipe is applied across the entire dataset to generate results at the specified output locations. For more information, see Run Job Page.
trifacta-conf.json
The primary configuration file of the Dataprep by Trifacta platform. This file is stored in JSON format on the Trifacta node.
Note
Administrators should perform platform configuration operations through the Admin Settings page, where possible. See Admin Settings Page.
For more information, see Platform Configuration Methods.
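Because the configuration file is stored as JSON, its settings form a nested structure that can be read with any JSON parser. The sketch below is read-only and works on an in-memory sample, not on the real file. The key names (exampleService, port), the dotted-path convention, and the helper get_setting are all hypothetical.

```python
import json

# Hypothetical sample content; these are not real platform settings.
SAMPLE = '{"exampleService": {"enabled": true, "port": 1234}}'


def get_setting(config, dotted_path):
    """Walk a nested dict using a dotted path such as 'a.b.c'."""
    node = config
    for key in dotted_path.split("."):
        node = node[key]  # raises KeyError if the setting is absent
    return node


config = json.loads(SAMPLE)
port = get_setting(config, "exampleService.port")
print(port)
```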
Short for user-defined function, a UDF is an externally developed function that can be used in your recipes to apply custom transformation logic. Building UDFs requires developer skills. For more information, see User-Defined Functions.
A JavaScript-based platform service for managing the connectivity with file-based storage systems through a virtual file system (VFS). See Configure VFS Service.
This service has been superseded by the Java VFS Service.
A platform service that can be optionally invoked to generate visual profiles on generated results for display in the Trifacta Application. For more information, see Overview of Visual Profiling.
A platform service for loading data through connections into the Trifacta Application for user interaction.
A Webhook is a message sent over HTTP through a REST API request from one application to another. In the Dataprep by Trifacta platform, you can configure Webhooks to be sent to a third-party application based on the success or failure of a job execution. For more information, see Create Flow Webhook Task.
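As a generic illustration of a Webhook message, the sketch below builds (but does not send) an HTTP POST request with a JSON body reporting a job outcome. The URL, the payload fields (jobId, status), and the helper build_webhook_request are hypothetical. The actual request that the platform sends is defined in the flow Webhook task.

```python
import json
import urllib.request


def build_webhook_request(url, job_id, status):
    """Build an HTTP POST request carrying a JSON job-status message."""
    # Hypothetical payload shape, for illustration only.
    body = json.dumps({"jobId": job_id, "status": status}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The request object is only constructed here; nothing is sent.
request = build_webhook_request("https://example.com/hooks/job", 42, "Complete")
print(request.get_method(), request.full_url)
```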