
Our documentation site is moving!

For up-to-date documentation of Dataprep, please visit us at https://help.alteryx.com/Dataprep/.

   


By default, Dataprep by Trifacta can access data within the project from which the product is run. To let your project access a Google BigQuery dataset owned by a different project, you must make that dataset accessible to the service accounts in your Dataprep by Trifacta project.

NOTE: If you grant Dataprep by Trifacta access to a dataset in another project, disabling Dataprep by Trifacta does not remove these permissions. The permissions must be manually removed to fully revoke product access to datasets in other projects. For more information, see https://cloud.google.com/dataprep/docs/concepts/cross-bq-datasets#removing_service_account_access_to_a_bigquery_dataset.

To visit your current project on the Google Cloud console, see https://console.cloud.google.com/dataprep/.

Project Service Accounts

In the Google Cloud Console, select IAM > Service Accounts. The following service accounts are used by the product:

Service Account          Owner          Service Account Name
Compute Engine           Google, Inc.   <project-number>-compute@developer.gserviceaccount.com
Dataprep Service Agent   Alteryx Inc    service-<project-number>@trifacta-gcloud-prod.iam.gserviceaccount.com

where:

  • <project-number> is the numeric project identifier.
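As an illustration, the two service account names above can be derived from the project number. The following is a minimal sketch; the function name and dictionary keys are hypothetical, and only the two address patterns come from the table above.

```python
def dataprep_service_accounts(project_number):
    """Build the two service account names used by the product.

    The address patterns are taken from the table above; the function
    name and the dictionary keys are illustrative, not part of any API.
    """
    number = str(project_number)
    if not number.isdigit():
        raise ValueError("project_number must be the numeric project identifier")
    return {
        "compute_engine": f"{number}-compute@developer.gserviceaccount.com",
        "dataprep_service_agent": (
            f"service-{number}@trifacta-gcloud-prod.iam.gserviceaccount.com"
        ),
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder project number.
    for label, email in dataprep_service_accounts(123456789012).items():
        print(label, email)
```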

Minimum Required Permissions for Granting Dataset Access

To be able to assign or update dataset access controls, you must be granted bigquery.datasets.update and bigquery.datasets.get permissions at a minimum. The following predefined IAM roles include bigquery.datasets.update and bigquery.datasets.get permissions:

  • roles/bigquery.dataOwner
  • roles/bigquery.admin

NOTE: If a user has bigquery.datasets.create permissions, when that user creates a dataset, they are granted roles/bigquery.dataOwner access to it. roles/bigquery.dataOwner access gives users the ability to update datasets they create.

For more information on IAM roles and permissions in BigQuery, see https://cloud.google.com/bigquery/docs/access-control.
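The permission requirement above can be expressed as a simple set comparison. This sketch assumes you already have the list of permissions granted to a user (for example, copied from the IAM console); the helper name is hypothetical and it does not call any Google Cloud API.

```python
# The two permissions named above as the minimum for managing dataset access.
REQUIRED_PERMISSIONS = frozenset(
    {"bigquery.datasets.get", "bigquery.datasets.update"}
)


def missing_dataset_permissions(granted):
    """Return the required permissions that are absent from `granted`, sorted."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


if __name__ == "__main__":
    # A user holding only the "get" permission still lacks "update".
    print(missing_dataset_permissions(["bigquery.datasets.get"]))
```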

Methods for Granting Access

You can provide access to remote BigQuery datasets through one of the following methods: 

NOTE: When using a named service account to access data or run jobs in other projects, each user requesting access must be granted the roles/iam.serviceAccountUser role on the service account.


NOTE: OAuth users of the product require the following roles and permissions, too.

Grant access through BigQuery IAM role

If you want users outside your project to access BigQuery and related assets for your project, you must assign the appropriate BigQuery role for your project to the Dataprep by Trifacta service accounts. You must assign the following service accounts to the appropriate IAM roles that access BigQuery:

  • Dataprep by Trifacta Service Account for the Dataprep by Trifacta project. This service account is required for reading the data.
  • Compute Engine Service Account for the Dataprep by Trifacta project. This service account is for running your Dataprep by Trifacta job on Dataflow using the BigQuery datasets.

This method of access enables all users of the Dataprep by Trifacta project to access all BigQuery datasets governed by the IAM role. For more information, see https://console.cloud.google.com/iam-admin/roles.

Steps:

NOTE: This process must be repeated for each project to which you wish to grant access.

  1. In the Google Cloud Platform console, select the project to which you wish to grant access.
  2. When the project is loaded, navigate to the IAM console: https://console.cloud.google.com/iam-admin/iam.
  3. Click Add.
  4. For each of the above Dataprep by Trifacta Service Accounts:
    1. Enter the name of the Service Account.
    2. Assign the appropriate BigQuery role to the Service Account.
    3. Save your changes.
  5. Repeat all of these steps for each project to which you wish to provide access to your project's BigQuery assets.
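The console steps above amount to adding the service accounts as members of a role binding in the project's IAM policy. The following sketch shows that change on a policy in its JSON form (a "bindings" list of role and members entries). The function name is hypothetical, and the role used in the example, roles/bigquery.dataViewer, is only an assumption: this page does not state which BigQuery role is appropriate for your case.

```python
import copy


def add_role_binding(policy, role, service_account_emails):
    """Return a copy of an IAM policy dict with the accounts bound to `role`.

    `policy` follows the JSON shape of an IAM policy:
    {"bindings": [{"role": "...", "members": ["serviceAccount:..."]}]}.
    The input policy is not modified.
    """
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])

    # Reuse an existing unconditional binding for the role if there is one.
    target = None
    for binding in bindings:
        if binding.get("role") == role and "condition" not in binding:
            target = binding
            break
    if target is None:
        target = {"role": role, "members": []}
        bindings.append(target)

    members = target.setdefault("members", [])
    for email in service_account_emails:
        member = f"serviceAccount:{email}"
        if member not in members:  # keep the operation idempotent
            members.append(member)
    return updated


if __name__ == "__main__":
    # Placeholder project number and an assumed role, for illustration only.
    policy = {"bindings": []}
    accounts = [
        "123456789012-compute@developer.gserviceaccount.com",
        "service-123456789012@trifacta-gcloud-prod.iam.gserviceaccount.com",
    ]
    print(add_role_binding(policy, "roles/bigquery.dataViewer", accounts))
```

The sketch only computes the updated policy; writing it back to the project (through the console, the CLI, or an API client) is left out.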

Grant access to specific BigQuery datasets

You can grant access to individual BigQuery datasets for specific service accounts. In this method, you grant access to your BigQuery dataset for the following service accounts:

  • Dataprep by Trifacta Service Account for the Dataprep by Trifacta project. This service account is required for reading the data.
  • Compute Engine Service Account for the Dataprep by Trifacta project. This service account is for running your Dataprep by Trifacta job on Dataflow using the BigQuery datasets.

This method constrains Dataprep by Trifacta access to specific datasets. This method of access must be granted for each BigQuery dataset in other projects that you wish to use in Dataprep by Trifacta.

For more information on setting up service account access to specific BigQuery datasets, see https://cloud.google.com/bigquery/docs/dataset-access-controls#controlling_access_to_a_dataset.
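As a rough sketch of dataset-level access, the example below adds the two service accounts to a dataset's access list in the JSON form used by the BigQuery REST API, where each entry pairs a role with a userByEmail identity. The function name is hypothetical, and the READER role is an assumption: this page only says the accounts need to read the data, so confirm the role your jobs require.

```python
def add_dataset_access(access, service_account_emails, role="READER"):
    """Return a new dataset access list that includes the given accounts.

    `access` is a list of entries shaped like the BigQuery REST API's
    dataset `access` field, e.g. {"role": "READER", "userByEmail": "..."}.
    The default role is an assumption made for this example.
    """
    updated = [dict(entry) for entry in access]  # leave the input untouched
    for email in service_account_emails:
        entry = {"role": role, "userByEmail": email}
        if entry not in updated:  # skip accounts that already have this role
            updated.append(entry)
    return updated


if __name__ == "__main__":
    # Placeholder owner and project number, for illustration only.
    current = [{"role": "OWNER", "userByEmail": "owner@example.com"}]
    accounts = [
        "service-123456789012@trifacta-gcloud-prod.iam.gserviceaccount.com",
        "123456789012-compute@developer.gserviceaccount.com",
    ]
    for entry in add_dataset_access(current, accounts):
        print(entry)
```

Removing the same entries from the list is the reverse operation, which is what the manual revocation mentioned at the top of this page requires.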

Grant access to BigQuery authorized views

In this scenario, the Dataprep by Trifacta project is attempting to access a BigQuery authorized view in another target project. This view is connected to an underlying BigQuery dataset. The following additional permissions are required:

  • Google IAM: The BigQuery User role (roles/bigquery.user) must be granted to any Dataprep by Trifacta user account or service account that is trying to access the target project.
  • BigQuery: In the target project, the BigQuery Data Viewer role (roles/bigquery.dataViewer) must be granted to any Dataprep by Trifacta user account or service account that is trying to access the target project.
  • The view must be an authorized view.
  • The dataset containing the tables that underlie the authorized view must grant access to the view. For more information, see https://cloud.google.com/bigquery/docs/share-access-views.
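The last requirement, authorizing the view on the underlying dataset, corresponds to a view entry in that dataset's access list. The sketch below builds such an entry in the JSON form used by the BigQuery REST API (a view reference with projectId, datasetId, and tableId). The function name and the identifiers in the example are hypothetical.

```python
def authorize_view(source_access, view_project, view_dataset, view_name):
    """Return a copy of the source dataset's access list with the view authorized.

    The entry shape mirrors the `view` form of a dataset access entry in the
    BigQuery REST API; view entries carry no role.
    """
    entry = {
        "view": {
            "projectId": view_project,
            "datasetId": view_dataset,
            "tableId": view_name,
        }
    }
    updated = [dict(e) for e in source_access]  # leave the input untouched
    if entry not in updated:  # avoid duplicate authorizations
        updated.append(entry)
    return updated


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    access = [{"role": "OWNER", "userByEmail": "owner@example.com"}]
    print(authorize_view(access, "target-project", "shared_views", "sales_view"))
```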

