In the Dataprep by Trifacta Cloud, Identity and Access Management (IAM) allows you to control user and group access to your project's resources. This section describes the IAM permissions relevant to Dataprep by Trifacta and the IAM roles that grant those permissions. To access the IAM console, see https://cloud.google.com/iam.
- A role is a set of one or more permissions. A role is assigned to users and groups.
- A permission grants access to a resource. Different permissions can grant different access levels to the same resource.
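The relationship between roles and permissions can be sketched as sets. This is a minimal illustration, not how IAM is implemented: the custom role name, its contents, and the principals below are hypothetical, and real role definitions are maintained in Google Cloud IAM.

```python
from typing import Dict, Set

# Hypothetical custom role, for illustration only. Real role definitions
# live in Google Cloud IAM and may contain different permissions.
ROLES: Dict[str, Set[str]] = {
    "projects/my-project/roles/dataprepBasic": {
        "dataprep.projects.use",
        "resourcemanager.projects.get",
    },
}

# Hypothetical assignments: principal -> roles granted in the project.
ASSIGNMENTS: Dict[str, Set[str]] = {
    "user:alice@example.com": {"projects/my-project/roles/dataprepBasic"},
    "user:bob@example.com": set(),
}


def effective_permissions(principal: str) -> Set[str]:
    """Return the union of permissions from every role assigned to the principal."""
    perms: Set[str] = set()
    for role in ASSIGNMENTS.get(principal, set()):
        perms |= ROLES.get(role, set())
    return perms


def has_permission(principal: str, permission: str) -> bool:
    """Check whether any assigned role grants the given permission."""
    return permission in effective_permissions(principal)
```

In this sketch, a principal with no assigned roles has no permissions, which mirrors the point that access is granted only through role assignment.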
For more information on the service accounts used by Dataflow to manage security and permissions while running Dataprep by Trifacta jobs, see https://cloud.google.com/dataflow/docs/concepts/security-and-permissions#security_and_permissions_for_pipelines_on_google_cloud_platform.
Tools for managing IAM policies:
- Google Cloud Console
- API
- gcloud CLI
For more information, see https://cloud.google.com/iam/docs/granting-changing-revoking-access.
Required Roles and Their Permissions
To use Dataprep by Trifacta, the following roles are required. Below, you can review each required role, its purpose, and the permissions that are enabled by it.
Role | Use | Permissions and roles |
---|---|---|
roles/dataprep.projects.user | Enables a user to run Dataprep by Trifacta in a project. See below. | Permissions:
|
roles/dataprep.serviceAgent | Enables the platform to access and modify datasets and storage, and to run and manage Dataflow jobs on behalf of the user within the project. NOTE: When the product is enabled within a project, this role is granted by the project owner as part of the enablement process. For more information, see Enable or Disable Dataprep. | Permissions:
Roles:
|
roles/dataprep.projects.user IAM Role
All users of any version of Dataprep by Trifacta must be assigned the roles/dataprep.projects.user IAM role.
This role and its related permissions enable access to all data in a project. Other permissions do not apply.
Dataprep by Trifacta Application Permissions
The following base set of IAM permissions and some additional permissions are required for accessing the product. Below, you can review the required permissions for this product edition.
NOTE: These permissions provide basic access to the Dataprep by Trifacta application. Additional features within the product or available through external integrations are considered optional.
General
Permission | Product Use |
---|---|
dataprep.projects.use | Allow a user to use Dataprep by Trifacta |
resourcemanager.projects.get | Get Dataprep by Trifacta project details |
Dataflow
Run Dataprep by Trifacta jobs on Dataflow:
Permission | Product Use |
---|---|
compute.machineTypes.get | List available machine types for Dataflow jobs |
dataflow.jobs.create | Create a Dataflow job |
dataflow.jobs.get | List Dataflow jobs |
dataflow.messages.list | Get Dataflow job details |
dataflow.metrics.get | Get Dataflow job details |
Connection Permissions
These permissions are required for connections that are common in Dataprep by Trifacta.
Base Storage
Read and write to Base Storage, the base storage for Dataprep by Trifacta:
Permission | Product Use | Requirement |
---|---|---|
storage.buckets.list | List Base Storage buckets in project | Required at project level |
storage.buckets.get | Get bucket metadata | Required for staging bucket only |
storage.objects.create | Create files | Required for staging bucket only |
storage.objects.delete | Delete files | Required for staging bucket only |
storage.objects.get | Read files | Required for staging bucket only |
storage.objects.list | List files | Required for staging bucket only |
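The scoping in the table above (one permission at project level, the rest on the staging bucket only) can be sketched as a small grant plan. The function name and the `project:` / `bucket:` key format are hypothetical conventions chosen for this illustration; they are not IAM resource names.

```python
from typing import Dict, List

# Permissions taken from the Base Storage table, grouped by where they apply.
PROJECT_LEVEL: List[str] = ["storage.buckets.list"]
STAGING_BUCKET_LEVEL: List[str] = [
    "storage.buckets.get",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.get",
    "storage.objects.list",
]


def storage_grant_plan(project_id: str, staging_bucket: str) -> Dict[str, List[str]]:
    """Map each scope (hypothetical key format) to the permissions needed there."""
    return {
        f"project:{project_id}": list(PROJECT_LEVEL),
        f"bucket:{staging_bucket}": list(STAGING_BUCKET_LEVEL),
    }
```

A plan like this can help when reviewing custom roles, because it keeps the bucket-scoped permissions separate from the single project-wide one.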
BigQuery
Read and write to BigQuery, including views and custom SQL:
Permission | Product Use | Requirement |
---|---|---|
bigquery.jobs.create | For Custom SQL query support and launching Dataflow jobs with BigQuery data sources. | Required at project level to use BigQuery |
bigquery.datasets.get | List and get metadata about datasets in project | Can be applied at project level or at individual dataset level |
bigquery.tables.create | Create tables in dataset, including when executing custom queries | Can be applied at project level or at individual dataset level |
bigquery.tables.get | Get table metadata | Can be applied at project level or at individual dataset level |
bigquery.tables.getData | Get table contents | Can be applied at project level or at individual dataset level |
bigquery.tables.list | List tables in dataset | Can be applied at project level or at individual dataset level |
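As a rough aid for auditing the application and connection permissions listed above, the following sketch reports which of them are missing from a set of granted permissions. The permission names come from the tables in this section; the function name and report shape are assumptions for illustration. The granted set would have to be obtained separately, for example from the IAM console or the Resource Manager `testIamPermissions` method.

```python
from typing import Dict, Iterable, List

# Required permissions, grouped as in the tables above.
REQUIRED: Dict[str, List[str]] = {
    "General": ["dataprep.projects.use", "resourcemanager.projects.get"],
    "Dataflow": [
        "compute.machineTypes.get",
        "dataflow.jobs.create",
        "dataflow.jobs.get",
        "dataflow.messages.list",
        "dataflow.metrics.get",
    ],
    "Base Storage": [
        "storage.buckets.list",
        "storage.buckets.get",
        "storage.objects.create",
        "storage.objects.delete",
        "storage.objects.get",
        "storage.objects.list",
    ],
    "BigQuery": [
        "bigquery.jobs.create",
        "bigquery.datasets.get",
        "bigquery.tables.create",
        "bigquery.tables.get",
        "bigquery.tables.getData",
        "bigquery.tables.list",
    ],
}


def missing_permissions(granted: Iterable[str]) -> Dict[str, List[str]]:
    """Return, per group, the required permissions absent from `granted`."""
    granted_set = set(granted)
    report: Dict[str, List[str]] = {}
    for group, perms in REQUIRED.items():
        missing = [p for p in perms if p not in granted_set]
        if missing:
            report[group] = missing
    return report
```

An empty result means every permission listed in these tables is present. It does not check the scope (project, bucket, or dataset) at which each permission was granted.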
Feature Permissions
Additional permissions may be required to use specific features. Individual users may be required to permit Dataprep by Trifacta access when the feature is first used.
Dataflow job cancellation
Permission | Product Use |
---|---|
dataflow.jobs.cancel | Enables users to cancel their jobs in progress. It is not required for the product to work but may be helpful to add via IAM roles. |
BigQuery publishing options
The following permissions are required to publish to BigQuery:
Permission | Product Use |
---|---|
bigquery.datasets.create | Create datasets in BigQuery |
bigquery.datasets.update | Update datasets in BigQuery |
The following permission is not required to publish to BigQuery.
Permission | Product Use |
---|---|
bigquery.tables.delete | If this permission is not granted to a user, that user requires one of the following permissions to drop or truncate table data in BigQuery:
NOTE: If a user does not have this permission when publishing to a table, the user receives a warning that the target dataset is read-only. |
Google Sheets access
- drive.readonly
For more information, see Import Google Sheets Data.
Additional Permissions for Cloud IAM
Run Dataflow jobs
To run jobs on Dataflow, one of the following must apply:
- The user must have the iam.serviceAccounts.actAs permission on a compute service account, which must be specified during job execution.
- The user must have the iam.serviceAccounts.actAs permission at the project level or on the default compute service account: <project-number>-compute@developer.gserviceaccount.com
Project owners require no additional permissions on the projects that they own.
For more information, see https://cloud.google.com/dataflow/docs/concepts/security-and-permissions#security_and_permissions_for_pipelines_on_google_cloud_platform.
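The default compute service account name follows the pattern shown above, so it can be derived from the project number. The sketch below also builds an IAM policy binding for the predefined role roles/iam.serviceAccountUser, which includes iam.serviceAccounts.actAs. The function names, the example project number, and the example user are hypothetical, and the binding is only constructed here, not applied.

```python
import json
from typing import Dict, List, Union

Binding = Dict[str, Union[str, List[str]]]


def default_compute_service_account(project_number: str) -> str:
    """Build the default compute service account email for a project number."""
    return f"{project_number}-compute@developer.gserviceaccount.com"


def act_as_binding(user_email: str) -> Binding:
    """Build a policy binding that would give a user iam.serviceAccounts.actAs.

    roles/iam.serviceAccountUser is the predefined role containing that
    permission. Applying the binding to a service account or project is a
    separate step done through the console, API, or gcloud CLI.
    """
    return {
        "role": "roles/iam.serviceAccountUser",
        "members": [f"user:{user_email}"],
    }


if __name__ == "__main__":
    # Hypothetical values for illustration.
    print(default_compute_service_account("123456789012"))
    print(json.dumps(act_as_binding("alice@example.com"), indent=2))
```

Granting the role on the specific service account rather than on the whole project keeps the actAs permission narrower, which is usually the safer choice.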
Data access
In addition to the IAM roles above, users must also be granted the following to enable data access based on their Cloud IAM:
- Dataset-level permissions in BigQuery: See https://cloud.google.com/bigquery/docs/dataset-access-controls.
- Cloud Storage object ACLs: See https://cloud.google.com/storage/docs/access-control.
These permissions ensure that users can access the appropriate data within Dataprep by Trifacta.