Trifacta Dataprep




By default, Dataprep by Trifacta can access data within the project from which the product is run. To let your project access a Google BigQuery dataset owned by a different project, you must make that dataset accessible to the service accounts in your Dataprep by Trifacta project.

NOTE: If you grant Dataprep by Trifacta access to a dataset in another project, disabling Dataprep by Trifacta does not remove these permissions. The permissions must be manually removed to fully revoke product access to datasets in other projects. For more information, see https://cloud.google.com/dataprep/docs/concepts/cross-bq-datasets#removing_service_account_access_to_a_bigquery_dataset.

To view your current project in the Google Cloud Console, see https://console.cloud.google.com/dataprep/.

Project Service Accounts

In the Google Cloud Console, select IAM > Service Accounts. The following service accounts are used by the product:

Service Account      | Owner        | Service Account Name
Compute Engine       | Google, Inc. | <project-number>-compute@developer.gserviceaccount.com
Dataprep by Trifacta | Trifacta     | service-<project-number>@trifacta-gcloud-prod.iam.gserviceaccount.com

where:

  • <project-number> is the numeric project identifier.
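As a minimal sketch, the two service account names above can be derived from a project number as follows. The function name and the sample project number are hypothetical and used only for illustration; the name patterns come from the table above.

```python
from typing import Dict


def dataprep_service_accounts(project_number: str) -> Dict[str, str]:
    """Build the two service account names for a numeric project identifier.

    Hypothetical helper: only the name patterns come from the documentation.
    """
    if not project_number.isdigit():
        raise ValueError("project_number must be the numeric project identifier")
    return {
        "compute_engine": f"{project_number}-compute@developer.gserviceaccount.com",
        "dataprep": f"service-{project_number}@trifacta-gcloud-prod.iam.gserviceaccount.com",
    }


# Sample project number, made up for this example.
accounts = dataprep_service_accounts("123456789012")
for label, email in accounts.items():
    print(f"{label}: {email}")
```

Validating that the identifier is numeric guards against passing the project ID (a string such as a project name) where the project number is expected.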

Minimum Required Permissions for Granting Dataset Access

To be able to assign or update dataset access controls, you must be granted bigquery.datasets.update and bigquery.datasets.get permissions at a minimum. The following predefined IAM roles include bigquery.datasets.update and bigquery.datasets.get permissions:

  • roles/bigquery.dataOwner
  • roles/bigquery.admin

NOTE: If a user has bigquery.datasets.create permissions, when that user creates a dataset, they are granted roles/bigquery.dataOwner access to it. roles/bigquery.dataOwner access gives users the ability to update datasets they create.
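The permission requirement above amounts to a simple set check. The sketch below assumes you have already obtained the list of permissions granted to a user by some other means; the function name and sample data are hypothetical.

```python
from typing import Iterable, List

# The two permissions named above as the minimum for managing dataset access.
REQUIRED_PERMISSIONS = frozenset(
    {"bigquery.datasets.get", "bigquery.datasets.update"}
)


def missing_dataset_permissions(granted: Iterable[str]) -> List[str]:
    """Return the required permissions that are absent from `granted`."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Sample input, made up for this example: the user can read but not update.
missing = missing_dataset_permissions(["bigquery.datasets.get"])
print(missing)
```

An empty result means the user holds both permissions and can assign or update dataset access controls.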

For more information on IAM roles and permissions in BigQuery, see https://cloud.google.com/bigquery/docs/access-control.

Methods for Granting Access

You can provide access to remote BigQuery datasets through one of the following methods: 

NOTE: When using a named service account to access data or run jobs in other projects, each user requesting access must be granted the roles/iam.serviceAccountUser role on the service account.


NOTE: OAuth users of the product also require the following roles and permissions.

Grant access through BigQuery IAM role

To the IAM role used to access the BigQuery datasets, you must add the following service accounts:

  • Dataprep by Trifacta Service Account for the Dataprep by Trifacta project. This service account is required for reading the data.
  • Compute Engine Service Account for the Dataprep by Trifacta project. This service account is for running your Dataprep by Trifacta job on Dataflow using the BigQuery datasets.

This method of access enables all users of the Dataprep by Trifacta project to access all BigQuery datasets governed by the IAM role. For more information, see https://console.cloud.google.com/iam-admin/roles.
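To illustrate the role-based method, the sketch below adds both service accounts to a role binding in an IAM policy represented as a plain dictionary (a "bindings" list of role/members entries, with service accounts written as "serviceAccount:<email>"). It only edits the in-memory structure and does not call any Google Cloud API. The function name, the sample project number, and the choice of roles/bigquery.dataViewer are assumptions for illustration; use whichever role governs your datasets. Conditional bindings and policy etags are ignored here.

```python
import copy
from typing import Any, Dict, Iterable


def add_service_accounts_to_role(
    policy: Dict[str, Any], role: str, service_account_emails: Iterable[str]
) -> Dict[str, Any]:
    """Return a copy of `policy` with the service accounts added to `role`."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    binding = next((b for b in bindings if b.get("role") == role), None)
    if binding is None:
        binding = {"role": role, "members": []}
        bindings.append(binding)
    members = binding.setdefault("members", [])
    for email in service_account_emails:
        member = f"serviceAccount:{email}"
        if member not in members:  # avoid duplicate members
            members.append(member)
    return updated


# Sample policy and project number, made up for this example.
policy = {
    "bindings": [
        {"role": "roles/bigquery.dataViewer", "members": ["user:analyst@example.com"]}
    ]
}
new_policy = add_service_accounts_to_role(
    policy,
    "roles/bigquery.dataViewer",  # assumed role; substitute your own
    [
        "service-123456789012@trifacta-gcloud-prod.iam.gserviceaccount.com",
        "123456789012-compute@developer.gserviceaccount.com",
    ],
)
print(new_policy["bindings"][0]["members"])
```

Returning a modified copy keeps the original policy available for comparison before any change is submitted.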

Grant access to specific BigQuery datasets

You can grant access to individual BigQuery datasets for specific service accounts. In this method, you grant access to your BigQuery dataset for the following service accounts:  

  • Dataprep by Trifacta Service Account for the Dataprep by Trifacta project. This service account is required for reading the data.
  • Compute Engine Service Account for the Dataprep by Trifacta project. This service account is for running your Dataprep by Trifacta job on Dataflow using the BigQuery datasets.

This method constrains Dataprep by Trifacta access to specific datasets. Access must be granted separately for each BigQuery dataset in another project that you wish to use in Dataprep by Trifacta.

For more information on setting up service account access to specific BigQuery datasets, see https://cloud.google.com/bigquery/docs/dataset-access-controls#controlling_access_to_a_dataset.
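As a rough sketch of the dataset-level method, the code below adds entries for both service accounts to a dataset's access list, modeled as the list of role/userByEmail dictionaries used in the BigQuery dataset resource. It works purely on in-memory data; submitting the updated list to BigQuery is left out. The function name, the sample project number, and the READER role are assumptions for illustration, since this page does not state which role to grant.

```python
from typing import Dict, Iterable, List


def grant_dataset_access(
    access: List[Dict[str, str]],
    service_account_emails: Iterable[str],
    role: str = "READER",  # assumed role; choose the one your datasets need
) -> List[Dict[str, str]]:
    """Return a new access list that includes the given service accounts."""
    updated = [dict(entry) for entry in access]  # copy; do not mutate input
    existing = {(e.get("role"), e.get("userByEmail")) for e in updated}
    for email in service_account_emails:
        if (role, email) not in existing:  # skip entries already present
            updated.append({"role": role, "userByEmail": email})
            existing.add((role, email))
    return updated


# Sample access list and project number, made up for this example.
current_access = [{"role": "OWNER", "userByEmail": "owner@example.com"}]
new_access = grant_dataset_access(
    current_access,
    [
        "service-123456789012@trifacta-gcloud-prod.iam.gserviceaccount.com",
        "123456789012-compute@developer.gserviceaccount.com",
    ],
)
for entry in new_access:
    print(entry)
```

Because the grant is per dataset, the same call would be repeated for each dataset in another project that Dataprep by Trifacta needs to use.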
