Comment: Published by Scroll Versions from space DEV and version r089


Dataprep uses service accounts to execute all of its jobs on Dataflow. Access to Dataflow is governed by Google service accounts. A service account is used by the Dataprep application to access services and resources in the Google Cloud Platform.

Project Service Accounts


Compute Engine Service Account

The Compute Engine service account is the default account for all project users to run jobs on Dataflow. The Compute Engine service account enables access to platform services for the Compute Engine instance on which the Dataprep application is hosted. When the product is enabled for your project, the appropriate Compute Engine service account is assigned at the project level. By default, all users of the project are automatically assigned this account.

This service account has the following name:

<project-number>-compute@developer.gserviceaccount.com

where:

  • <project-number> - identifier for the project using the compute service account.

Tip: For all project users to use the default compute service engine account, no additional configuration is required.
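
As a minimal sketch of the naming pattern above, the following Python helper builds the default Compute Engine service account name from a project number. The function name and the sample project number are hypothetical, and the numeric check reflects the assumption that Google Cloud project numbers contain digits only.

```python
import re


def default_compute_service_account(project_number: str) -> str:
    """Build the default Compute Engine service account name for a project."""
    # Assumption: project numbers are made up of digits only.
    if not re.fullmatch(r"\d+", project_number):
        raise ValueError(f"project number must be numeric: {project_number!r}")
    return f"{project_number}-compute@developer.gserviceaccount.com"


# Hypothetical project number, used for illustration only.
print(default_compute_service_account("123456789012"))
# prints 123456789012-compute@developer.gserviceaccount.com
```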

Service Account Permissions

NOTE: A user may be able to cancel a job from the Dataprep application, even though the user is not permitted to cancel the job in the running environment. The service account associated with the user's account may have the appropriate permissions, even if the user's personal account does not.

A service account must have the following permissions.

Required Dataflow permissions

For the list of minimum required permissions for access to Dataflow, see https://cloud.google.com/dataflow/docs/concepts/access-control#roles.

Canceling jobs

To enable users to cancel Dataflow jobs, the service account must have the following permission:

dataflow.jobs.cancel

actAs permission

To run Dataprep jobs on Dataflow, the actAs permission must be provisioned according to whichever of the following scenarios applies:

  • When not using companion service accounts: The user must have the iam.serviceAccounts.actAs permission, granted either at the project level or on the default Compute Engine service account.
  • When using companion service accounts: The user must have the iam.serviceAccounts.actAs permission on the companion service account, or the permission must be granted explicitly to the user.
    • IAM disabled: If you are not using IAM roles and have enabled companion service accounts, the Dataprep Service account role, which is assigned to the default service account, has the actAs permission for the project.
  • Project owners require no additional permissions on the projects that they own.

For more information, see https://cloud.google.com/dataflow/docs/concepts/security-and-permissions#security_and_permissions_for_pipelines_on_google_cloud_platform.
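
The scenarios above can be sketched as a check against IAM policy documents. This is an illustration under stated assumptions, not the product's own logic: it assumes the policies have already been exported as JSON-style dictionaries with the standard "bindings" list, and it looks only for the predefined role roles/iam.serviceAccountUser, which carries iam.serviceAccounts.actAs. Custom roles, group membership, and IAM conditions are not handled, and the function names and member identifiers are hypothetical.

```python
from typing import Optional

# Predefined IAM role that includes the iam.serviceAccounts.actAs permission.
ACT_AS_ROLE = "roles/iam.serviceAccountUser"


def has_role(policy: dict, member: str, role: str = ACT_AS_ROLE) -> bool:
    """Return True if the policy binds the member directly to the role."""
    for binding in policy.get("bindings", []):
        if binding.get("role") == role and member in binding.get("members", []):
            return True
    return False


def can_act_as(
    member: str,
    project_policy: dict,
    service_account_policy: Optional[dict] = None,
) -> bool:
    """Check the project-level grant first, then the service account grant."""
    if has_role(project_policy, member):
        return True
    return service_account_policy is not None and has_role(
        service_account_policy, member
    )


# Hypothetical policies, used for illustration only.
project_policy = {"bindings": []}
companion_policy = {
    "bindings": [{"role": ACT_AS_ROLE, "members": ["user:ana@example.com"]}]
}
print(can_act_as("user:ana@example.com", project_policy, companion_policy))
# prints True
```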

Data connectivity permissions

NOTE: Any service account that is used to run jobs must have at least the same data access permissions that the user's IAM role provides for connecting to data through the Dataprep application. For example, to run a job sourced from Cloud Storage datasets, the service account must be able to read the datasets that the user accesses through the user's IAM role. The same applies to publishing datasets.

For more information on Cloud Storage and BigQuery permissions, see Required Dataprep User Permissions.
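
One way to reason about the requirement in the note above is as a set comparison: every permission that the user's IAM role provides for reaching data should also be present on the service account. The sketch below assumes that both permission lists have already been collected by some other means. The function name is hypothetical, and the permission names are ordinary Cloud Storage permissions chosen only as examples.

```python
from typing import Iterable, List


def missing_permissions(
    user_permissions: Iterable[str],
    service_account_permissions: Iterable[str],
) -> List[str]:
    """List permissions the user has that the service account lacks."""
    return sorted(set(user_permissions) - set(service_account_permissions))


# Illustrative permission sets, not a complete or required list.
user_perms = ["storage.objects.get", "storage.objects.list", "storage.objects.create"]
account_perms = ["storage.objects.get", "storage.objects.list"]

print(missing_permissions(user_perms, account_perms))
# prints ['storage.objects.create']
```

An empty result means that the service account covers everything in the user's list. It does not show that the list itself is sufficient for a given job.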

Cross-Project Access to Data

If users are permitted to access data in Cloud Storage or BigQuery that is owned by another project, additional permissions are required.

Using Service Accounts

Dataprep supports running jobs only on Dataflow. Every job must be executed with a service account.

The service account that is used for a job is determined by the following order of priority, from highest to lowest:

Priority  Description

1         Job-level overrides: Individual users can override the default service account or their companion service account when executing individual jobs. Scheduled jobs use job-level overrides.

          For more information, see Dataflow Execution Settings.

2         User preferences: If defined here, these service accounts are applied to individual users.

          NOTE: If preferred, project owners can require the use of individual service accounts for each user of the project. Companion service accounts are described below.

          See Execution Settings Page.

3         Compute Engine service account: If no other service account is specified, then the project default service account is used.

          NOTE: When the product is enabled, the default Compute Engine service account is provisioned for each user of the project.
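
The priority order in the table above can be sketched as a simple fallback chain. This is an illustration of the documented ordering only, not the product's implementation. The function name, parameter names, and account names are hypothetical, and the sketch does not model companion service accounts, which the sections below describe.

```python
from typing import Optional


def resolve_service_account(
    job_override: Optional[str],
    user_preference: Optional[str],
    compute_engine_default: str,
) -> str:
    """Pick the service account for a job, highest priority first."""
    # Priority 1: a job-level override wins over everything else.
    if job_override:
        return job_override
    # Priority 2: the service account from the user's preferences.
    if user_preference:
        return user_preference
    # Priority 3: fall back to the project's Compute Engine default.
    return compute_engine_default


# Hypothetical account names, used for illustration only.
default = "123456789012-compute@developer.gserviceaccount.com"
print(resolve_service_account(None, "ana-sa@example-project.iam.gserviceaccount.com", default))
# prints ana-sa@example-project.iam.gserviceaccount.com
```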

Project service accounts

At the project level, all users are assigned the Compute Engine service account by default. See above.

Companion Service Account:

Editions: gdpent, gdppr

Optionally, project owners can require that service accounts be assigned to individual users. When enabled, companion service accounts can be assigned by the project owner or by the individual user.

NOTE: When companion service accounts are enabled, the Compute Engine service account is no longer available.

See "Companion Service Accounts" below.

User preferences

Users can specify the service account to use for all of their jobs. For example, if a user is invited into multiple projects, that user may be required to submit jobs in all projects using the same service account. 

NOTE: A service account assigned to a user's preferences takes precedence over the project-level service account.

For more information, see Execution Settings Page.

A user's service account can be assigned by the user or, if companion service accounts are enabled, by the project owner, or by both. See "Companion Service Accounts" below.

Job overrides

For individual jobs, a user can select the service account to use. This value overrides user preferences and project owner selections. For more information, see Run Job Page.

Custom Service Accounts

Any custom service accounts must be created through the Google IAM console. 

Custom service account requirements

Any custom service account or companion service account must meet the following requirements:

  1. The service account must be defined in the IAM console.
    1. The minimum set of permissions to access Dataprep and any related datastores must be included in the custom service account. See "Service Account Permissions" above.
    2. Permissions in each user's IAM role must be reflected in any custom service account applied to the user's account. Changes in one must be reflected in the other.
  2. The service account must be applied to the project in the IAM console.

Tip: After custom service accounts are specified in the IAM console and assigned to the project, they can be used in the product. Custom service accounts can be applied at the project level, user level, or job execution level.

Companion Service Accounts

Editions: gdpent, gdppr
A companion service account is a replacement for the single Compute Engine service account for submitting jobs to Dataflow on behalf of the user. For example, separate companion service accounts can be specified to give different users access to different BigQuery tables. In this manner, a project owner can provide finer-grained access controls to individual users.

This service account must be specified in the Google IAM console and contain all of the permissions required to access a user's data and run jobs on Dataflow. For more information on permissions, see "Service Account Permissions" above.

When companion service accounts are enabled:

  • Companion service accounts must be specified for individual users of the project, instead of all users relying on the default Compute Engine service account in the project.
    • Project owners can apply them for each user. 
    • Individual users can apply their own.
  • The default Compute Engine service account is no longer available for use. 
  • Companion service accounts can be overridden for individual jobs when defining the job to execute.
  • Previously created scheduled jobs automatically inherit and use the companion service account specified for the user. 

Tip: Before enabling the feature, you should create and specify the companion service accounts. Then, when the feature is enabled, there is no service disruption.
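
As a rough pre-enablement check in the spirit of the tip above, the sketch below lists project users who have no companion service account from either source. It assumes that the project owner's assignments and the users' own preference selections are available as plain dictionaries. The function name, data shapes, and account names are hypothetical and do not represent a product API.

```python
from typing import Dict, Iterable, List, Optional


def users_without_companion_account(
    project_users: Iterable[str],
    owner_assignments: Dict[str, Optional[str]],
    user_preferences: Dict[str, Optional[str]],
) -> List[str]:
    """Return users with no companion service account from either source."""
    missing = []
    for user in project_users:
        # A user's own selection or an owner assignment both count as covered.
        assigned = user_preferences.get(user) or owner_assignments.get(user)
        if not assigned:
            missing.append(user)
    return sorted(missing)


# Hypothetical users and assignments, used for illustration only.
users = ["ana@example.com", "bo@example.com", "cy@example.com"]
owner = {"ana@example.com": "ana-sa@example-project.iam.gserviceaccount.com"}
prefs = {"bo@example.com": "bo-sa@example-project.iam.gserviceaccount.com"}

print(users_without_companion_account(users, owner, prefs))
# prints ['cy@example.com']
```

An empty list suggests that every user already has an account in place before the feature is switched on.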


NOTE: Changes to a user's permissions must be reflected in Dataprep and in the related companion service account.

Create companion service accounts

Like any custom service account, companion service accounts must be created in the IAM console and applied to the project. See "Custom Service Accounts" above. 

Manage companion service accounts

After these service accounts have been created, you can assign a companion service account to each user of the project. For more information, see Service Accounts Page.

Tip: Individual users can also specify the companion service account through their user preferences. User preference selections override any selections made by the project owner. See Execution Settings Page.

Enable companion service accounts

A project owner must enable the use of companion service accounts.

NOTE: If the use of companion service accounts is later disabled, all project users revert to using the Compute Engine service account.

For more information, see Dataprep Project Settings Page.