Required Dataprep User Permissions

In the Google Cloud Platform, Identity and Access Management (IAM) allows you to control user and group access to your project's resources. This section describes the IAM permissions relevant to Dataprep by Trifacta and the IAM roles that grant those permissions. To access the IAM console, see https://cloud.google.com/iam.

  • A role is a set of one or more permissions. A role is assigned to users and groups.

  • A permission grants access to a resource. Different permissions can grant different access levels to the same resource.
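
The relationship between roles and permissions can be sketched as a mapping from role names to permission sets. This is a minimal illustration, not an IAM client: the `ROLE_PERMISSIONS` table is a trimmed, hand-written sample and `effective_permissions` is a hypothetical helper name.

```python
# Trimmed sample mapping for illustration only; real roles carry more
# permissions than are listed here.
ROLE_PERMISSIONS = {
    "roles/dataprep.projects.user": {
        "dataprep.projects.use",
        "resourcemanager.projects.get",
    },
    "roles/bigquery.user": {"bigquery.jobs.create"},
}


def effective_permissions(assigned_roles):
    """Return the union of the permissions granted by each assigned role."""
    granted = set()
    for role in assigned_roles:
        # Roles missing from the sample table contribute nothing.
        granted |= ROLE_PERMISSIONS.get(role, set())
    return granted


print(sorted(effective_permissions(
    ["roles/dataprep.projects.user", "roles/bigquery.user"]
)))
```

A user or group holding several roles has the union of their permissions, which is why the same resource can be reachable at different access levels.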

Tools for managing IAM policies:

  • Google Cloud Console

  • API

  • gcloud CLI

For more information, see https://cloud.google.com/iam/docs/granting-changing-revoking-access.

Required Roles and Their Permissions

To use Dataprep by Trifacta, the following roles are required. Below, you can review each required role, its purpose, and the permissions that are enabled by it.

Role

Use

Permissions and roles

roles/dataprep.projects.user

Enables a user to run Dataprep by Trifacta in a project. See below.

Permissions:

  • dataprep.projects.use

  • resourcemanager.projects.get

  • serviceusage.quotas.get

  • serviceusage.services.get

  • serviceusage.services.list

roles/dataprep.serviceAgent

Enables the platform to access and modify datasets and storage and to run and manage Dataflow jobs on behalf of the user within the project. For more information on this role, see https://console.cloud.google.com/iam-admin/roles/details/roles%3Cdataprep.serviceAgent.

Note

When the product is enabled within a project, this role is granted by the project owner as part of the enablement process. For more information, see Enable or Disable Dataprep.

Permissions:

  • storage.buckets.get

  • storage.buckets.list

  • storage.objects.create

  • storage.objects.delete

  • storage.objects.get

  • storage.objects.getIamPolicy

  • storage.objects.list

  • storage.objects.setIamPolicy

  • storage.objects.update

Roles:

  • roles/dataflow.developer

  • roles/bigquery.user

  • roles/bigquery.dataEditor

  • roles/storage.objectAdmin

  • roles/iam.serviceAccountUser

roles/dataprep.projects.user IAM Role

All users of any version of Dataprep by Trifacta must be assigned the roles/dataprep.projects.user IAM Role.

Warning

This role and its related permissions enable access to all data in a project. Other permissions do not apply.
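
As a rough sketch of what assigning this role looks like in an IAM policy document, the helper below adds a member to a role binding in a policy held as a plain dictionary. It assumes the standard `bindings` list of `role` and `members` entries. The function name and the example.com accounts are hypothetical, and conditions and the policy `etag` are ignored for brevity.

```python
import copy

REQUIRED_ROLE = "roles/dataprep.projects.user"


def add_member_to_role(policy, role, member):
    """Return a copy of policy in which member is bound to role."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    # No binding for this role yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical starting policy and user.
policy = {"bindings": [{"role": "roles/viewer",
                        "members": ["user:bob@example.com"]}]}
updated = add_member_to_role(policy, REQUIRED_ROLE, "user:alice@example.com")
print(updated["bindings"])
```

The helper is idempotent: applying it twice for the same member leaves the policy unchanged after the first call.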

Dataprep by Trifacta Application Permissions

Accessing the product requires a base set of IAM permissions plus some additional permissions. The required permissions for this product edition are listed below.

Note

These permissions provide basic access to the Trifacta Application. Additional features within the product or available through external integrations are considered optional.

General

| Permission | Product Use |
| --- | --- |
| dataprep.projects.use | Allow a user to use Dataprep by Trifacta |
| resourcemanager.projects.get | Get Dataprep by Trifacta project details |

Data access

Access to data stored in the Google Cloud Platform from the Trifacta Application when previewing or displaying sampled data is determined based on whether fine-grained access controls and access control lists (ACLs) are enabled in the Google Cloud Platform:

| Fine-grained access controls | Permissions |
| --- | --- |
| Disabled | The appropriate service account for the individual or project is used. Note: The service account used to access the Trifacta Application must have the same IAM role permissions that are required to execute jobs. For more information, see Google Service Account Management. |
| Enabled | Each user's credentials for Google Cloud Platform are used for requesting data from the platform. Note: Each user's account must have IAM role permissions that allow previewing and sampling data. |

For more information on fine-grained access controls, please see https://cloud.google.com/storage/docs/access-control.
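
The rule above reduces to choosing which identity's permissions are checked when data is previewed or sampled. The sketch below encodes that choice. The function names are hypothetical, and the `required` set is supplied by the caller because this page does not enumerate the exact permissions needed for previews.

```python
def preview_identity(fine_grained_enabled):
    """Return which identity's credentials are used to request data."""
    return "user" if fine_grained_enabled else "service_account"


def can_preview(fine_grained_enabled, user_permissions,
                service_account_permissions, required):
    """Check the required permissions against the identity in use."""
    if preview_identity(fine_grained_enabled) == "user":
        held = user_permissions
    else:
        held = service_account_permissions
    return set(required) <= set(held)


# Assumed sample values for illustration.
required = {"storage.objects.get", "storage.objects.list"}
print(can_preview(False, set(), required, required))  # service account is checked
print(can_preview(True, set(), required, required))   # user is checked
```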

Dataflow

Run Alteryx jobs on Dataflow:

| Permission | Product Use |
| --- | --- |
| compute.machineTypes.get | List available machine types for Dataflow jobs |
| dataflow.jobs.create | Create a Dataflow job |
| dataflow.jobs.get | List Dataflow jobs |
| dataflow.messages.list | Get Dataflow job details |
| dataflow.metrics.get | Get Dataflow job details |
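
One way to confirm that an identity holds these permissions is the Resource Manager `projects.testIamPermissions` method, which accepts a list of permissions and returns the subset that the caller holds. The sketch below only builds the request body and interprets a response; it makes no API call. The helper names and `sample_response` are made up for illustration.

```python
DATAFLOW_PERMISSIONS = [
    "compute.machineTypes.get",
    "dataflow.jobs.create",
    "dataflow.jobs.get",
    "dataflow.messages.list",
    "dataflow.metrics.get",
]


def build_test_request(permissions):
    """Build the JSON body for a testIamPermissions request."""
    return {"permissions": list(permissions)}


def missing_permissions(requested, response):
    """List requested permissions that are absent from the response."""
    # The "permissions" field is omitted when nothing is granted.
    granted = set(response.get("permissions", []))
    return [p for p in requested if p not in granted]


# Made-up response in which only two permissions are granted.
sample_response = {"permissions": ["dataflow.jobs.create", "dataflow.jobs.get"]}
print(missing_permissions(DATAFLOW_PERMISSIONS, sample_response))
```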

Connection Permissions

These permissions are required for connections that are common in Dataprep by Trifacta.

Cloud Storage

Read and write to Cloud Storage, the base storage for Dataprep by Trifacta:

| Permission | Product Use | Requirement |
| --- | --- | --- |
| storage.buckets.list | List Cloud Storage buckets in project | Required at project level |
| storage.buckets.get | Get bucket metadata | Required for staging bucket only |
| storage.objects.create | Create files | Required for staging bucket only |
| storage.objects.delete | Delete files | Required for staging bucket only |
| storage.objects.get | Read files | Required for staging bucket only |
| storage.objects.list | List files | Required for staging bucket only |
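
Because the requirements differ by scope, a check has to look at project-level and bucket-level grants separately. The following sketch reports the gaps per scope. It assumes the caller has already collected the permissions granted at each level, and `storage_gaps` is a hypothetical helper name.

```python
PROJECT_LEVEL = {"storage.buckets.list"}
STAGING_BUCKET = {
    "storage.buckets.get",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.get",
    "storage.objects.list",
}


def storage_gaps(project_grants, staging_bucket_grants):
    """Return the missing Cloud Storage permissions for each scope."""
    project_grants = set(project_grants)
    # Grants made on the project are inherited by the buckets inside it.
    bucket_effective = project_grants | set(staging_bucket_grants)
    return {
        "project": sorted(PROJECT_LEVEL - project_grants),
        "staging_bucket": sorted(STAGING_BUCKET - bucket_effective),
    }


print(storage_gaps({"storage.buckets.list"}, {"storage.objects.get"}))
```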

BigQuery

Read and write to BigQuery, including views and custom SQL:

| Permission | Product Use | Requirement |
| --- | --- | --- |
| bigquery.jobs.create | For Custom SQL query support and launching Dataflow jobs with BigQuery data sources. | Required at project level to use BigQuery |
| bigquery.datasets.get | List and get metadata about datasets in project | Can be applied at project level or at individual dataset level |
| bigquery.tables.create | Execute custom queries and create tables in dataset | Can be applied at project level or at individual dataset level |
| bigquery.tables.get | Get table metadata | Can be applied at project level or at individual dataset level |
| bigquery.tables.getData | Get table contents | Can be applied at project level or at individual dataset level |
| bigquery.tables.list | List tables in dataset | Can be applied at project level or at individual dataset level |
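
Since most of these permissions can be granted either on the project or on individual datasets, a check can work out which datasets are usable for a given identity. This is a sketch under the assumption that grants have already been gathered into plain sets. `usable_datasets` and the dataset names are hypothetical.

```python
PROJECT_ONLY = {"bigquery.jobs.create"}
PROJECT_OR_DATASET = {
    "bigquery.datasets.get",
    "bigquery.tables.create",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
}


def usable_datasets(project_grants, grants_by_dataset):
    """Return the datasets whose combined grants cover every permission."""
    project_grants = set(project_grants)
    # bigquery.jobs.create must be held on the project itself.
    if not PROJECT_ONLY <= project_grants:
        return []
    return sorted(
        dataset
        for dataset, grants in grants_by_dataset.items()
        if PROJECT_OR_DATASET <= (project_grants | set(grants))
    )


# Hypothetical grants: one dataset fully covered, one read-only.
grants = {
    "sales": PROJECT_OR_DATASET,
    "archive": {"bigquery.tables.get", "bigquery.tables.list"},
}
print(usable_datasets({"bigquery.jobs.create"}, grants))
```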

Feature Permissions

Additional permissions may be required to use specific features. Individual users may be required to permit Dataprep by Trifacta access when the feature is first used.

Dataflow job cancellation

Permission

Product Use

dataflow.jobs.cancel

This permission is required to cancel jobs on Dataflow from within the Trifacta Application. It is not required for the product to work but may be helpful to add via IAM roles.

Note

The ability to cancel a job from within the Trifacta Application is temporarily disabled. When it is re-enabled, this permission will be required. You should leave this permission enabled, if possible.

Note

A user may be able to cancel a job from the Trifacta Application even though the user is not permitted to cancel the job in the running environment. This happens when the service account associated with the user's Alteryx account has the appropriate permissions but the user's personal account does not. For more information, see Google Service Account Management.

BigQuery publishing options

The following permissions are required to publish to BigQuery:

Permission

Product Use

bigquery.datasets.create

Create datasets in BigQuery

Note

In some environments, users may not be permitted the bigquery.datasets.create permission, which causes Dataflow jobs on BigQuery sources to fail. As a workaround, an administrator can define a BigQuery dataset for use by these jobs. Dataprep by Trifacta uses this dataset to store temporary tables of intermediate query results when running Dataflow jobs on BigQuery sources, which eliminates the need for the bigquery.datasets.create permission in the service account. For more information on defining this BigQuery query dataset, see Dataprep Project Settings Page.

bigquery.datasets.update

Update datasets in BigQuery

The following permission is not required to publish to BigQuery.

Permission

Product Use

bigquery.datasets.delete

If this permission is not granted to a user, that user must meet one of the following conditions to drop or truncate table data in BigQuery:

  • The user is granted editor or owner role on the project.

  • The user is granted bigquery.tables.delete for the project.

Note

If a user does not have this permission when publishing to a table, the user receives a warning that the target dataset is read-only.
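
The fallback rule for dropping or truncating table data can be written as a single predicate. The sketch below assumes that the "editor or owner role" refers to the basic roles `roles/editor` and `roles/owner`. The function name and inputs are hypothetical.

```python
PRIVILEGED_ROLES = {"roles/editor", "roles/owner"}  # assumed basic role names


def can_drop_or_truncate(permissions, project_roles):
    """Apply the fallback rule for dropping or truncating table data."""
    permissions = set(permissions)
    if "bigquery.datasets.delete" in permissions:
        return True
    # Without datasets.delete, either fallback condition is enough.
    if PRIVILEGED_ROLES & set(project_roles):
        return True
    return "bigquery.tables.delete" in permissions


print(can_drop_or_truncate({"bigquery.tables.get"}, ["roles/viewer"]))
print(can_drop_or_truncate({"bigquery.tables.delete"}, []))
```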

BigQuery job execution

To enable execution of jobs in BigQuery, the following permission must be enabled. Additional configuration may be required. For more information on this feature, see BigQuery Running Environment.

Permission

Product Use

bigquery.jobs.create

This permission enables execution of jobs within BigQuery. It is also used for custom SQL queries, which is enabled by default. In most projects, this permission is enabled by default.

BigQuery job execution on Cloud Storage files

If you have enabled execution of jobs in BigQuery, you can extend that capability to execute jobs for data sources hosted in Cloud Storage. GCS execution in BigQuery requires that external tables be enabled in BigQuery. The following permissions are required to create and use external tables.

Tip

In most projects, these permissions are enabled by default.

Permission

Product Use

bigquery.tables.create

Enabled in the default Dataprep by Trifacta role.

bigquery.tables.getData

Enabled in the default Dataprep by Trifacta role.

bigquery.jobs.create

Required for job execution in BigQuery. See previous section.

Google Sheets access

Note

This feature may not be available in all product editions. For more information on available features, see Compare Editions.

  • drive.readonly

For more information, see Import Google Sheets Data.

Additional Permissions for Cloud IAM

Note

This feature may not be available in all product editions. For more information on available features, see Compare Editions.

Note

Any change in a user's permissions in Google Cloud Platform must be reflected in the service account assigned to the user.

Run Dataflow jobs

Every Dataprep by Trifacta job requires the use of a service account through which the job is submitted to Dataflow for execution. Each project user must have access to a service account. For more information, see Google Service Account Management.

Data access

In addition to the IAM roles above, users must also be granted the following to enable data access based on their Cloud IAM:

These permissions ensure that users can access the appropriate data within Dataprep by Trifacta.