Access Cross-Project BigQuery Datasets

By default, Dataprep by Trifacta can access data within the project from which the product is run. To enable access for your project to a Google BigQuery dataset owned by a different project, you must make the dataset accessible to the service accounts in your Dataprep by Trifacta project.

Note

If you grant Dataprep by Trifacta access to a dataset in another project, disabling Dataprep by Trifacta does not remove these permissions. The permissions must be manually removed to fully revoke product access to datasets in other projects. For more information, see https://cloud.google.com/dataprep/docs/concepts/cross-bq-datasets#removing_service_account_access_to_a_bigquery_dataset.

To visit your current project on the Google Cloud console, see https://console.cloud.google.com/dataprep/.

Project Service Accounts

In the Google Cloud Console, select IAM > Service Accounts. The following service accounts are used by the product:

| Service Account Name | Owner | Service Account Email |
| --- | --- | --- |
| Compute Engine | Google, Inc. | <project-number>-compute@developer.gserviceaccount.com |
| Dataprep Service Agent | Alteryx | service-<project-number>@trifacta-gcloud-prod.iam.gserviceaccount.com |

where:

  • <project-number> is the numeric project identifier.
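As a minimal sketch, the two addresses above can be derived from the project number. The helper name dataprep_service_accounts and the sample project number 123456789012 are illustrative only and are not part of the product.

```python
from typing import Dict


def dataprep_service_accounts(project_number: str) -> Dict[str, str]:
    """Build the two service account addresses listed above for one project."""
    if not project_number.isdigit():
        raise ValueError("project_number must be the numeric project identifier")
    return {
        # Compute Engine default service account (owned by Google)
        "compute_engine": f"{project_number}-compute@developer.gserviceaccount.com",
        # Dataprep Service Agent (owned by Alteryx)
        "dataprep_service_agent": (
            f"service-{project_number}@trifacta-gcloud-prod.iam.gserviceaccount.com"
        ),
    }


# Hypothetical project number, used only for illustration.
accounts = dataprep_service_accounts("123456789012")
```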

Minimum Required Permissions for Granting Dataset Access

To be able to assign or update dataset access controls, you must be granted bigquery.datasets.update and bigquery.datasets.get permissions at a minimum. The following predefined IAM roles include bigquery.datasets.update and bigquery.datasets.get permissions:

  • roles/bigquery.dataOwner

  • roles/bigquery.admin

Note

If a user has bigquery.datasets.create permissions, when that user creates a dataset, they are granted roles/bigquery.dataOwner access to it. roles/bigquery.dataOwner access gives users the ability to update datasets they create.

For more information on IAM roles and permissions in BigQuery, see https://cloud.google.com/bigquery/docs/access-control.

Methods for Granting Access

You can provide access to remote BigQuery datasets through one of the following methods:

Note

When using a named service account to access data or run jobs in other projects, each user requesting access must be granted the roles/iam.serviceAccountUser role on the service account.

Note

OAuth users of the product require the following roles and permissions, too.

Grant access through BigQuery IAM role

If you want users outside your project to access BigQuery and related assets for your project, you must assign the appropriate BigQuery role for your project to the Dataprep by Trifacta service accounts. Assign the appropriate BigQuery IAM role to each of the following service accounts:

  • Alteryx Service Account for the Dataprep by Trifacta project. This service account is required for reading the data.

  • Compute Engine Service Account for the Dataprep by Trifacta project. This service account is for running your Dataprep by Trifacta job on Dataflow using the BigQuery datasets.

This method of access enables all users of the Dataprep by Trifacta project to access all BigQuery datasets governed by the IAM role. For more information, see https://console.cloud.google.com/iam-admin/roles.

Steps:

Note

This process must be repeated for each project to which you wish to grant access.

  1. In the Google Cloud console, select the project to which you wish to grant access.

  2. When the project is loaded, navigate to the IAM console: https://console.cloud.google.com/iam-admin/iam.

  3. Click Add.

  4. For each of the above Dataprep by Trifacta Service Accounts:

    1. Enter the name of the Service Account.

    2. Assign the appropriate BigQuery role to the Service Account.

    3. Save your changes.

  5. Repeat the above step for each Dataprep by Trifacta Service Account.

  6. Repeat all of these steps for each project to which you wish to provide access to your project's BigQuery assets.
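The console steps above amount to adding both service accounts as members of a role binding in the project's IAM policy. The following is a minimal sketch of that change, assuming the policy is held as a dictionary in the JSON shape printed by gcloud projects get-iam-policy --format=json. The function name add_role_binding, the sample project number, and the choice of roles/bigquery.dataViewer are assumptions for illustration; this document only specifies "the appropriate BigQuery role".

```python
import copy
from typing import Any, Dict, List


def add_role_binding(
    policy: Dict[str, Any], role: str, service_accounts: List[str]
) -> Dict[str, Any]:
    """Return a copy of an IAM policy with the service accounts added to a role."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    members = [f"serviceAccount:{email}" for email in service_accounts]

    for binding in bindings:
        # Reuse an existing unconditional binding for the role if there is one.
        if binding.get("role") == role and "condition" not in binding:
            existing = binding.setdefault("members", [])
            for member in members:
                if member not in existing:
                    existing.append(member)
            break
    else:
        bindings.append({"role": role, "members": members})
    return updated


# Hypothetical values: project number 123456789012 and an example role.
new_policy = add_role_binding(
    {"bindings": []},
    "roles/bigquery.dataViewer",
    [
        "service-123456789012@trifacta-gcloud-prod.iam.gserviceaccount.com",
        "123456789012-compute@developer.gserviceaccount.com",
    ],
)
```

The resulting policy could then be written to a file and applied with gcloud projects set-iam-policy, once per project, as in the console steps.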

Grant access to specific BigQuery datasets

You can grant access to individual BigQuery datasets for specific service accounts. In this method, you grant access to your BigQuery dataset for the following service accounts:

  • Alteryx Service Account for the Dataprep by Trifacta project. This service account is required for reading the data.

  • Compute Engine Service Account for the Dataprep by Trifacta project. This service account is for running your Dataprep by Trifacta job on Dataflow using the BigQuery datasets.

This method constrains Dataprep by Trifacta access to specific datasets. This method of access must be granted for each BigQuery dataset in other projects that you wish to use in Dataprep by Trifacta.

For more information on setting up service account access to specific BigQuery datasets, see https://cloud.google.com/bigquery/docs/dataset-access-controls#controlling_access_to_a_dataset.
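As a rough sketch of dataset-level access, the example below adds the two service accounts to a dataset's access list. The first function works on the access list in the shape used by the BigQuery dataset resource; the second applies the same idea with the google-cloud-bigquery client library and requires that package and valid credentials. The function names, the dataset ID in the comment, and the READER access level are assumptions for illustration; choose the level that fits your use.

```python
from typing import Any, Dict, Iterable, List


def with_reader_access(
    access: List[Dict[str, Any]], emails: Iterable[str]
) -> List[Dict[str, Any]]:
    """Return a new access list granting READER to each email not already listed."""
    updated = [dict(entry) for entry in access]
    for email in emails:
        # Service accounts are identified by email, like users.
        entry = {"role": "READER", "userByEmail": email}
        if entry not in updated:
            updated.append(entry)
    return updated


def grant_reader_on_dataset(dataset_id: str, emails: Iterable[str]) -> None:
    """Apply the same grant to a live dataset, e.g. "other-project.sales_data"."""
    # Imported lazily so the pure helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    dataset = client.get_dataset(dataset_id)
    entries = list(dataset.access_entries)
    already = {e.entity_id for e in entries if e.role == "READER"}
    for email in emails:
        if email not in already:
            entries.append(
                bigquery.AccessEntry(
                    role="READER", entity_type="userByEmail", entity_id=email
                )
            )
    dataset.access_entries = entries
    client.update_dataset(dataset, ["access_entries"])  # update only this field


# Hypothetical project number, used only for illustration.
example_access = with_reader_access(
    [{"role": "OWNER", "userByEmail": "owner@example.com"}],
    ["service-123456789012@trifacta-gcloud-prod.iam.gserviceaccount.com"],
)
```

Because access is set per dataset, the call would be repeated for every dataset in another project that you want to use.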

Grant access to BigQuery authorized views

In this scenario, the Dataprep by Trifacta project is attempting to access a BigQuery authorized view in another target project. This view is connected to an underlying BigQuery dataset. The following additional permissions are required:

  • Google IAM: The BigQuery User role (roles/bigquery.user) must be granted to any Dataprep by Trifacta user account or service account that is trying to access the target project.

  • BigQuery: In the target project, the BigQuery Data Viewer role (roles/bigquery.dataViewer) must be granted to any Dataprep by Trifacta user account or service account that is trying to access the target project.

  • The view must be an authorized view.

  • The table dataset underlying the authorized view must grant permissions to the view. For more information, see https://cloud.google.com/bigquery/docs/share-access-views.
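The last requirement above, that the underlying dataset grants permissions to the view, corresponds to a "view" entry in that dataset's access list. The sketch below shows the shape of such an entry, assuming the access list format of the BigQuery dataset resource. The function name authorize_view and the project, dataset, and view names are hypothetical.

```python
from typing import Any, Dict, List


def authorize_view(
    access: List[Dict[str, Any]], project_id: str, dataset_id: str, view_id: str
) -> List[Dict[str, Any]]:
    """Return a new access list for the source dataset that authorizes a view."""
    entry = {
        # Authorized-view entries name the view and carry no role.
        "view": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": view_id,
        }
    }
    updated = [dict(item) for item in access]
    if entry not in updated:
        updated.append(entry)
    return updated


# Hypothetical names: the view lives in target-project.shared_views.
source_access = authorize_view(
    [{"role": "OWNER", "userByEmail": "owner@example.com"}],
    "target-project",
    "shared_views",
    "orders_view",
)
```

The updated list applies to the dataset that holds the underlying tables, not to the dataset that holds the view.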