Trifacta Dataprep



The following settings can be customized for the user experience in your Dataprep by Trifacta project. When you modify a setting, the change is immediately applied to the project. To access the page, select User menu > Admin console > Project settings.


NOTE: Users may not experience the changed environment until each user refreshes the application page or logs out and in again.

Enablement Options:

NOTE: Any values specified on this page apply exclusively to this project and override any system-level defaults.

Option: Description

Default: The default value is applied. This value may be inherited from higher-level configuration.

Tip: You can review the default value as part of the help text.

Enabled: The setting is enabled.

NOTE: If the setting applies to a feature, the feature is enabled. Additional configuration may be required. See below.

Disabled: The setting is disabled.

Edit: Click Edit to enter a specific value for the setting.
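
As a rough illustration of how the Default, Enabled, and Disabled options resolve to an effective state, consider the minimal sketch below. The `Option` enum, the `resolve_setting` helper, and the boolean `inherited_default` are hypothetical names used only for this example; they are not part of the product.

```python
from enum import Enum


class Option(Enum):
    """Enablement options available for a project setting."""
    DEFAULT = "default"
    ENABLED = "enabled"
    DISABLED = "disabled"


def resolve_setting(project_option: Option, inherited_default: bool) -> bool:
    """Return the effective on/off state of a project-level setting.

    `inherited_default` stands in for the value inherited from
    higher-level configuration (a hypothetical representation).
    """
    if project_option is Option.DEFAULT:
        # No explicit project value: fall back to the inherited default.
        return inherited_default
    # An explicit project value overrides the inherited default.
    return project_option is Option.ENABLED


print(resolve_setting(Option.DEFAULT, inherited_default=True))   # True
print(resolve_setting(Option.DISABLED, inherited_default=True))  # False
```

An explicit Enabled or Disabled value always wins in this sketch, which mirrors the note that project-level values override system-level defaults.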

Disable Dataprep

To disable Dataprep by Trifacta for this project, click the link.

NOTE: To remove a user and the user's assets from a project, contact Trifacta Support.

For more information, see Enable or Disable Dataprep.

General

Locale

Set the locale to use for inferring or validating data in the application, such as numeric values or dates. The default is United States.

NOTE: After saving changes to your locale, refresh your page. Subsequent executions of the data inference service use the new locale settings.

For more information, see Locale Settings.

Session duration

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Premium by Trifacta
  • Dataprep Standard by Trifacta

Specify the length of time in minutes before a session expires. Default is 10080 (one week).
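
Because the value is entered in minutes, it can help to convert from a more familiar unit first. The following sketch uses a hypothetical `session_minutes` helper (not part of the product) to show the arithmetic behind the default.

```python
from datetime import timedelta


def session_minutes(duration: timedelta) -> int:
    """Convert a duration to whole minutes, the unit used by this setting."""
    return int(duration.total_seconds() // 60)


print(session_minutes(timedelta(weeks=1)))  # 10080, the default
print(session_minutes(timedelta(hours=8)))  # 480
```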

API

Allow users to generate access tokens

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Premium by Trifacta

When enabled, individual users can generate their own personal access tokens, which enable access to REST APIs. For more information, see Manage API Access Tokens.
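
A minimal sketch of attaching an access token to a REST request is shown below. It assumes the token is sent as a bearer token in the `Authorization` header; the URL and the `build_request` helper are placeholders for illustration only. The request is built but not sent.

```python
import urllib.request


def build_request(url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries an access token.

    Assumption: the token is passed as a bearer token in the
    Authorization header. The URL is supplied by the caller.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


# Placeholder URL and token, for illustration only.
req = build_request("https://example.com/v4/flows", "<your-access-token>")
print(req.get_header("Authorization"))
```

Keeping the token out of source code (for example, reading it from an environment variable) is generally a safer choice than hard-coding it.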

Maximum lifetime for user generated access tokens (days)

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Premium by Trifacta

Defines the maximum number of days that a user-generated access token is permitted for use in the product.

Tip: To permit generation of access tokens that never expire, set this value to -1.

For more information, see Manage API Access Tokens.
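
One way to reason about this limit is sketched below. The `is_lifetime_allowed` helper is hypothetical and only illustrates how a requested token lifetime might be checked against the configured maximum, including the special value -1; the product's actual validation may differ.

```python
NEVER_EXPIRES = -1  # special value described in the tip above


def is_lifetime_allowed(requested_days: int, max_lifetime_days: int) -> bool:
    """Check a requested token lifetime against the project maximum.

    Assumption: a maximum of -1 permits any positive lifetime as well as
    tokens that never expire; otherwise the request must be between 1 day
    and the maximum, inclusive.
    """
    if max_lifetime_days == NEVER_EXPIRES:
        return requested_days == NEVER_EXPIRES or requested_days > 0
    return 0 < requested_days <= max_lifetime_days


print(is_lifetime_allowed(30, 90))   # True
print(is_lifetime_allowed(120, 90))  # False
print(is_lifetime_allowed(-1, 90))   # False
print(is_lifetime_allowed(-1, -1))   # True
```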

Connectivity

Custom SQL query

When enabled, users can create custom SQL queries to import datasets from relational tables. For more information, see Create Dataset with SQL.

Enable conversion of standard JSON files via conversion service

When enabled, the Trifacta application utilizes the conversion service to ingest JSON files and convert them to a tabular format that is easier to import into the application. For more information, see Working with JSON v2.

NOTE: This feature is enabled by default but can be disabled as needed. The conversion process performs cleanup and re-organization of the ingested data for display in tabular format.
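
To give a sense of what converting JSON to a tabular format can involve, the sketch below flattens nested objects into columns with dotted names. This is an illustration of the general idea only, not the conversion service's actual algorithm; the `flatten` helper and the sample data are invented for this example.

```python
import json
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested objects into dotted column names (illustrative only)."""
    row: Dict[str, Any] = {}
    for key, value in record.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the column prefix.
            row.update(flatten(value, prefix=f"{column}."))
        else:
            row[column] = value
    return row


raw = (
    '[{"id": 1, "customer": {"name": "Ana", "city": "Lima"}},'
    ' {"id": 2, "customer": {"name": "Bo", "city": "Oslo"}}]'
)
rows = [flatten(record) for record in json.loads(raw)]
print(rows[0])  # {'id': 1, 'customer.name': 'Ana', 'customer.city': 'Lima'}
```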

When disabled, the Trifacta application uses the old version of JSON import, which does not restructure the data and may require additional recipe steps to manually structure it into tabular format.

NOTE: Although imported datasets and recipes created under v1 of the JSON importer continue to work without interruption, the v1 version is likely to be deprecated in a future release. You should switch your old imported datasets and recipes to using the new version. Instructions to migrate are provided at the link below.

NOTE: The legacy version of JSON import is required if you are working with compressed JSON files or only Newline JSON files.

For more information, see Working with JSON v1.


Manage access to data using user IAM permissions

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Premium by Trifacta

When enabled, user access to data services in Google Cloud, such as Cloud Storage and BigQuery, is determined by the permissions defined in a user's assigned IAM role.

NOTE: When this feature is enabled, all Dataprep Premium by Trifacta users that belong to the project are automatically logged out of all Trifacta application sessions across all projects. For example, if a Dataprep Premium by Trifacta user is logged into the product through another project, the user is logged out of their Trifacta application session when this feature is enabled. When each user logs in to the Trifacta application again, any changes to the user's permissions are applied. Because each API request requires authentication in the header, API users are not automatically logged out.

For more information on IAM-based permissions, see Required Dataprep User Permissions.

Flows, recipes, and plans

Column from examples

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Premium by Trifacta
  • Dataprep Standard by Trifacta

When enabled, users can access a tool through the column menus that enables creation of new columns based on example mappings from the selected column. For more information, see Overview of TBE.

Editor Scheduling

When enabled, flow editors are also permitted to create and edit schedules. For more information, see Flow View Page.

NOTE: The Scheduling feature may need to be enabled in your environment. When enabled, flow owners can always create and edit schedules.


When this feature is enabled, plan collaborators are also permitted to create and edit schedules. For more information, see Plan View Page.


Export

When enabled, users are permitted to export their flows and plans. Exported flows can be imported into other work areas or product editions. 

NOTE: If plans have been enabled in your project settings, enabling this flag applies to flows and plans.

For more information, see Export Flow and Export Plan.

Import

When enabled, users are permitted to import exported flows and plans.

NOTE: If plans have been enabled in your project settings, enabling this flag applies to flows and plans.

For more information, see Import Flow and Import Plan.


Plan feature

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Premium by Trifacta

When enabled, users can create plans to execute sequences of recipes across one or more flows. For more information, see Plans Page.

For more information on plans and orchestration, see Overview of Operationalization.


Schematized output

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Premium by Trifacta

When enabled, all output columns are typecast to their annotated types. This feature is enabled by default.
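
Conceptually, typecasting to annotated types resembles the sketch below, in which each column value is converted according to a per-column type annotation. The type labels, the `CASTERS` mapping, and the `cast_row` helper are hypothetical and do not reflect the product's internal type system.

```python
from datetime import date
from typing import Any, Callable, Dict

# Hypothetical type labels mapped to Python conversion functions.
CASTERS: Dict[str, Callable[[str], Any]] = {
    "Integer": int,
    "Decimal": float,
    "String": str,
    "Date": date.fromisoformat,
}


def cast_row(row: Dict[str, str], schema: Dict[str, str]) -> Dict[str, Any]:
    """Cast each value in a row to the type annotated for its column."""
    return {column: CASTERS[schema[column]](value) for column, value in row.items()}


schema = {"order_id": "Integer", "total": "Decimal", "placed": "Date"}
row = {"order_id": "42", "total": "19.90", "placed": "2021-03-05"}
typed = cast_row(row, schema)
print(typed)
```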


Webhooks

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Premium by Trifacta

When enabled, webhook notification tasks can be configured on a per-flow basis in the Flow View page. Webhook notifications allow you to deliver messages to third-party applications based on the success or failure of your job executions. For more information, see Create Flow Webhook Task.

Job execution

BigQuery execution

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Premium by Trifacta
  • Dataprep Standard by Trifacta

When enabled, the Trifacta application can execute transformation jobs inside BigQuery when all data sources and outputs for the job are located in BigQuery.

NOTE: Logical and physical optimization of jobs must also be enabled.

To enable BigQuery execution on your flow jobs, you must enable all general and BigQuery optimizations within the flow. For more information, see Flow Optimization Settings Dialog.

For more information on BigQuery as a running environment, see Overview of Job Execution.
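
The conditions described in this section can be summarized as a single check. The sketch below is an illustration under stated assumptions: the `Endpoint` class, the `"bigquery"` location label, and the `can_run_in_bigquery` helper are hypothetical, and the application's actual eligibility rules may include additional criteria.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Endpoint:
    """A data source or output of a job (hypothetical representation)."""
    name: str
    location: str  # e.g. "bigquery" or "gcs" (hypothetical labels)


def can_run_in_bigquery(
    sources: Iterable[Endpoint],
    outputs: Iterable[Endpoint],
    execution_enabled: bool,
    optimization_enabled: bool,
    flow_optimizations_enabled: bool,
) -> bool:
    """Return True only if every condition described above holds."""
    if not (execution_enabled and optimization_enabled and flow_optimizations_enabled):
        return False
    endpoints = list(sources) + list(outputs)
    # All sources and outputs must be located in BigQuery.
    return bool(endpoints) and all(e.location == "bigquery" for e in endpoints)


orders = Endpoint("orders", "bigquery")
report = Endpoint("report", "bigquery")
export = Endpoint("export.csv", "gcs")

print(can_run_in_bigquery([orders], [report], True, True, True))  # True
print(can_run_in_bigquery([orders], [export], True, True, True))  # False
```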


Logical and physical optimization of jobs

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Starter Edition by Trifacta
  • Dataprep Premium by Trifacta
  • Dataprep Standard by Trifacta

When enabled, the Trifacta application attempts to optimize job execution through logical optimizations of your recipe and physical optimizations of your recipe's interactions with data.

This project setting can be overridden for individual flows.

Tip: You should keep this feature enabled. Please enable it at the project level and disable it only if needed at the flow level.

For more information, see Flow Optimization Settings Dialog.

Require a companion service account for running jobs

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Premium by Trifacta

By default, Dataprep by Trifacta utilizes a default compute service account for running jobs on Dataflow. Optionally, you can enable this feature, which requires each user in the project to provide their own companion service account to run jobs. This feature is disabled by default.

Prerequisites:

  • Service accounts must be created in the Google Cloud platform.
  • Companion service accounts must have a minimum set of permissions.
  • For more information, see Google Service Account Management.

When this feature is enabled:

  • Project administrators can review and specify companion service accounts for individual users of the project. For more information, see Service Accounts Page.
  • Individual users can specify their companion service account. For more information, see User Profile Page.
  • At runtime, an override service account can be applied if needed. See Run Job Page.

When this feature is disabled:

  • By default, all users of the project use the Compute Engine service account specified for the project.
  • If the companion service account requirement was previously enabled and is then disabled, the default service account for the project is used.
  • For more information, see Google Service Account Management.
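
The behavior described above can be read as an order of precedence. The following sketch is one interpretation: the `resolve_service_account` helper, its parameters, and the account names are hypothetical, and raising an error when no companion account is set is an assumption rather than documented behavior.

```python
from typing import Optional


def resolve_service_account(
    project_default: str,
    companion_required: bool,
    user_companion: Optional[str] = None,
    runtime_override: Optional[str] = None,
) -> str:
    """Pick the service account used to run a job (illustrative only)."""
    if not companion_required:
        # Feature disabled: everyone uses the project's default account.
        return project_default
    if runtime_override:
        # Feature enabled: an override supplied at runtime takes precedence.
        return runtime_override
    if user_companion:
        return user_companion
    # Assumption: a job cannot run without a companion account.
    raise ValueError("A companion service account is required but none is set")


default = "default-compute@example.invalid"  # hypothetical account names
print(resolve_service_account(default, companion_required=False))
print(resolve_service_account(default, True, user_companion="ana-sa@example.invalid"))
```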

SQL Scripts

When enabled, users may define SQL scripts to execute as part of a job's run. Scripts can be executed before data ingestion, after output publication, or both through any write-supported relational connection to which the user has access.

For more information, see Create Output SQL Scripts.
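
The ordering of pre-ingestion and post-publication scripts can be sketched as follows. This example uses an in-memory SQLite database purely as a stand-in for a relational connection; the `run_job_with_scripts` helper, the table names, and the SQL statements are invented for illustration.

```python
import sqlite3
from typing import Callable, Optional


def run_job_with_scripts(
    conn: sqlite3.Connection,
    job: Callable[[sqlite3.Connection], None],
    pre_sql: Optional[str] = None,
    post_sql: Optional[str] = None,
) -> None:
    """Run an optional script before the job and another after it."""
    if pre_sql:
        conn.executescript(pre_sql)   # before data ingestion
    job(conn)                         # stand-in for the job itself
    if post_sql:
        conn.executescript(post_sql)  # after output publication


def fake_job(conn: sqlite3.Connection) -> None:
    """Stand-in for a job that publishes one output table."""
    conn.execute("CREATE TABLE output (value INTEGER)")
    conn.execute("INSERT INTO output VALUES (1)")


conn = sqlite3.connect(":memory:")
run_job_with_scripts(
    conn,
    fake_job,
    pre_sql="CREATE TABLE audit_log (event TEXT);"
            "INSERT INTO audit_log VALUES ('job started');",
    post_sql="INSERT INTO audit_log VALUES ('job published');",
)
events = [row[0] for row in conn.execute("SELECT event FROM audit_log ORDER BY rowid")]
print(events)  # ['job started', 'job published']
```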


Trifacta Photon execution

Feature Availability: This feature is available in all editions except Dataprep Legacy by Trifacta.

When enabled, users can choose to execute their jobs on Trifacta Photon, a proprietary running environment built for execution of small- to medium-sized jobs in memory on the Trifacta node.

NOTE: Jobs executed in Trifacta Photon are executed within the Trifacta VPC. Data is temporarily streamed to the Trifacta VPC during job execution and is not persisted.

NOTE: Jobs that are executed on Trifacta Photon may be limited to run for a maximum of 10 minutes, after which they fail with a timeout error. If your job fails due to this limit, please switch to running the job on Dataflow.

Tip: When enabled, you can choose to run jobs on Trifacta Photon through the Run Job page. The default running environment is the one that is best for the size of your job.

When Trifacta Photon is disabled:

  • You cannot run jobs on the local running environment. All jobs must be executed on a clustered running environment.
  • Trifacta Photon is used for Quick Scan sampling jobs. If Trifacta Photon is disabled, the Trifacta application attempts to run the Quick Scan job on another available running environment. If that job fails or no suitable running environment is available, the Quick Scan sampling job fails.

For more information, see Run Job Page.

Scheduling and parameterization

Scheduling feature

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Premium by Trifacta
  • Dataprep Standard by Trifacta

When enabled, project users can schedule the execution of flows. See Add Schedule Dialog.

Publishing

Notifications

Email notifications: on plan/flow share

When email notifications are enabled, users automatically receive a notification whenever an owner shares a plan or flow with them.

Individual users can opt out of receiving notifications. For more information, see Preferences Page.

Experimental features

These experimental features are not supported. 

Experimental features are in active development. Their functionality may change from release to release, and they may be removed from the product at any time. Do not use experimental features in a production environment.

These settings may or may not change application behavior.

Default language

Select the default language to use in the Trifacta application.

Language localization

When enabled, the Trifacta application is permitted to display text in the selected language. 

Show user language preference

When enabled, users are permitted to select a preferred language in their preferences. See Preferences Page.
