Comment: Published by Scroll Versions from space DEV and version r080

...

When you specify a Dataflow job, you may pass to the running environment a set of property values to apply to the execution of the job. Overrides are defined in the Run Job page and are applied to the configured job.

...

Setting | Description
Worker IP address configuration

If the VPC Network mode is set to custom, then choose one of the following:

  • Allow public IP addresses - Use Dataflow workers that are available through public IP addresses. No further configuration is required.
  • Use internal IP addresses only - Dataflow workers use private IP addresses for all communication.
    • If a Subnetwork is specified, then the Network value is ignored.
    • The specified Network or Subnetwork must have Private Google Access enabled.
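
The precedence rule above (a specified Subnetwork causes the Network value to be ignored) can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation: the function name `resolve_worker_network` and the dictionary keys are hypothetical, and the sketch cannot verify that Private Google Access is enabled.

```python
from typing import Optional


def resolve_worker_network(
    internal_ips_only: bool,
    network: Optional[str] = None,
    subnetwork: Optional[str] = None,
) -> dict:
    """Return the effective worker network settings (hypothetical helper)."""
    settings = {"use_public_ips": not internal_ips_only}
    if subnetwork:
        # A specified Subnetwork takes precedence; the Network value is ignored.
        settings["subnetwork"] = subnetwork
    elif network:
        settings["network"] = network
    # Note: Private Google Access on the chosen network or subnetwork
    # must be confirmed separately; this sketch does not check it.
    return settings
```

Calling the helper with both a network and a subnetwork returns only the subnetwork, which mirrors the rule in the list above.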
Autoscaling Algorithms

The type of algorithm to use to scale the number of Google Compute Engine instances to accommodate the size of your job. Possible values:

  • Throughput based - Scaling is determined by the volume of data expected to be passed through Dataflow.
  • None - No autoscaling algorithm is applied.
    • If None is selected, use Initial number of workers to specify a fixed number of Google Compute Engine instances.
Initial number of workers

Number of Google Compute Engine instances with which to launch the job. This number may be adjusted as part of job execution. This number must be an integer between 1 and 1000, inclusive.
Maximum number of workers

Maximum number of Google Compute Engine instances to use during execution. This value must be greater than the initial number of workers and must be an integer between 1 and 1000, inclusive.
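
The limits stated for the initial and maximum number of workers can be checked before a job is submitted. The following is a minimal sketch of those documented rules only; the function name `validate_worker_counts` and the constants are hypothetical and are not part of the product.

```python
MIN_WORKERS = 1
MAX_WORKERS = 1000


def validate_worker_counts(initial: int, maximum: int) -> None:
    """Raise ValueError if the counts violate the documented limits."""
    for name, value in (("initial", initial), ("maximum", maximum)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} number of workers must be an integer")
        if not MIN_WORKERS <= value <= MAX_WORKERS:
            raise ValueError(
                f"{name} number of workers must be between "
                f"{MIN_WORKERS} and {MAX_WORKERS}, inclusive"
            )
    # The maximum must be strictly greater than the initial number.
    if maximum <= initial:
        raise ValueError(
            "maximum number of workers must be greater than the initial number"
        )
```

For example, `validate_worker_counts(2, 10)` passes silently, while `validate_worker_counts(5, 5)` raises `ValueError` because the maximum is not greater than the initial number.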

Service account

Email address of the service account under which to run the job.

Every job that the product executes in Dataflow must be submitted through a service account. By default, the product uses a default Compute Engine service account under which to run jobs.

Optionally, you can specify a different service account under which to run your job.

NOTE: When using a named service account to access data and run jobs in other projects, the user running the job must be granted the roles/iam.serviceAccountUser role on the service account to use it.

For more information on service account usage and requirements, see Google Service Account Management.

Labels

Create or assign labels to apply to the billing for the product jobs run in your project. You may reference up to 64 labels.

NOTE: Each label must have a unique key name.

For more information, see https://cloud.google.com/resource-manager/docs/creating-managing-labels.
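
The two label constraints stated above (at most 64 labels, each with a unique key name) can be checked with a short helper. This is a sketch of those two rules only; the function name `build_labels` is hypothetical, and any additional naming rules that Google Cloud applies to label keys and values are not covered here.

```python
MAX_LABELS = 64


def build_labels(pairs):
    """Collect (key, value) pairs into a dict, enforcing the documented limits."""
    labels = {}
    for key, value in pairs:
        # Each label must have a unique key name.
        if key in labels:
            raise ValueError(f"duplicate label key: {key!r}")
        labels[key] = value
    # A job may reference up to 64 labels.
    if len(labels) > MAX_LABELS:
        raise ValueError(f"at most {MAX_LABELS} labels are allowed, got {len(labels)}")
    return labels
```

The helper accepts any iterable of pairs, so labels read from a form or a configuration file can be passed in directly.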

...