Trifacta Dataprep




This section provides an overview of how to configure the running environments accessible from your deployment of the Trifacta application.

A running environment is the set of services that are used to execute a job.

  • A job can include tasks to do the following:
    • Ingest data
    • Transform data
    • Profile data
    • Sample data
    • Generate results
  • A running environment can be hosted on the Trifacta node or across a cluster that is connected to the product.
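The relationship between jobs, tasks, and running environments described above can be sketched as a small data model. This is only an illustration and not part of the Trifacta product or its API: the class names `Task`, `Host`, `RunningEnvironment`, and `Job`, and the `describe` method, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Task(Enum):
    """Kinds of tasks a job can include (taken from the list above)."""
    INGEST = "ingest data"
    TRANSFORM = "transform data"
    PROFILE = "profile data"
    SAMPLE = "sample data"
    GENERATE_RESULTS = "generate results"


class Host(Enum):
    """Where a running environment can be hosted."""
    NODE = "Trifacta node"
    CLUSTER = "connected cluster"


@dataclass(frozen=True)
class RunningEnvironment:
    # Hypothetical model: a named set of services hosted in one place.
    name: str
    host: Host


@dataclass(frozen=True)
class Job:
    # A job is a sequence of tasks executed on one running environment.
    tasks: Tuple[Task, ...]
    environment: RunningEnvironment

    def describe(self) -> str:
        steps = ", ".join(task.value for task in self.tasks)
        return f"{self.environment.name} ({self.environment.host.value}): {steps}"


# Example: a three-task job on a node-hosted running environment.
photon = RunningEnvironment("Trifacta Photon", Host.NODE)
job = Job((Task.INGEST, Task.TRANSFORM, Task.GENERATE_RESULTS), photon)
print(job.describe())
```

The model keeps the job separate from the running environment, which mirrors the text: the same kinds of tasks can be executed by services hosted either on the node or on a connected cluster.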

Trifacta Photon

Hosted on the Trifacta node, Trifacta Photon is an in-memory running environment designed for high performance on small- to medium-sized jobs.


NOTE: Trifacta Photon runs in the Trifacta VPC. Jobs launched on Trifacta Photon are not executed in a customer VPC. Data is streamed to the Trifacta VPC for transformation and is not stored there.

NOTE: You cannot cancel jobs that have been launched on Trifacta Photon.

Configuration:

Trifacta Photon may require enablement in your project or workspace.



Dataflow

Dataflow is a fully managed, serverless data processing service that is hosted in the Trifacta platform. Managed by Google, this service is enabled by default when you enable Dataprep by Trifacta in any of your Trifacta platform projects.

Configuration:

  • Access to the Dataflow service is governed through permissions in the IAM roles for users. Access is enabled by default. For more information, see Required Dataprep User Permissions.
  • Jobs are run on Dataflow using service accounts. The default Compute Engine service account deployed to your project has sufficient permissions to run Dataflow jobs. For more information, see Google Service Account Management.
