...

When you specify a job, you can pass a set of Spark property values to the Spark running environment, which are applied to the execution of the job. These property values override the global Spark settings for your deployment.

NOTE: A workspace administrator must enable Spark job overrides and configure the set of available parameters. For more information, see Enable Spark Job Overrides.

Spark overrides are applied to individual output objects. 
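To make the mapping concrete, the following is a minimal PySpark sketch of what such overrides amount to: the same standard Spark properties applied to one job's session instead of being taken from the global deployment defaults. The job name and property values are hypothetical examples; in the product, overrides are entered on the output object rather than in code.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("job-with-overrides")           # hypothetical job name
        # Note: spark.driver.memory only takes effect if it is set before the
        # driver JVM starts (for example, at submit time); it is shown here
        # purely for illustration.
        .config("spark.driver.memory", "4g")     # e.g. 4 GB for the driver
        .config("spark.executor.memory", "8g")   # e.g. 8 GB per executor
        .config("spark.executor.cores", "4")     # e.g. 4 cores per executor
        .getOrCreate()
    )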

...

The following Spark parameters can be overridden. Each parameter is listed with its description.

spark.driver.memory

Amount of physical RAM in GB on each Spark node that is made available for the Spark drivers.

By raising this number:

  • The drivers for your job are allocated more memory on each Spark node.
  • There is less memory available for other uses on the node.

spark.executor.memory

Amount of physical RAM in GB on each Spark node that is made available for the Spark executors.

By raising this number:

  • The Spark executors for your job are allocated more memory.
  • There is less memory available for other uses on the node.

spark.executor.cores

Number of cores on each Spark executor that are made available to Spark.

By raising this number:

  • The maximum number of cores available for your job is raised on each Spark executor.
  • There are fewer cores for other uses on the node.

transformer.dataframe.checkpoint.threshold

When checkpointing is enabled, the Spark DAG is checkpointed after approximately the number of expressions specified in this parameter has been added to the DAG. Checkpointing helps to manage the volume of work that is processed through Spark at one time; by checkpointing after a set of steps, the platform can reduce the chances of execution errors for your jobs. A sketch of the underlying mechanism appears after the list below.

By raising this number:

  • You increase the upper limit of steps between checkpoints.
  • You may reduce processing time.
  • You may increase the number of job failures.
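
For readers curious about the mechanism behind transformer.dataframe.checkpoint.threshold, the following is a minimal PySpark sketch of DAG checkpointing using the standard Spark checkpoint API. The threshold parameter itself is specific to the platform; the step count, column names, and checkpoint directory below are hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("checkpoint-demo").getOrCreate()
    spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")  # hypothetical path

    df = spark.range(1000)
    for step in range(100):                  # each added expression grows the DAG
        df = df.withColumn("c{}".format(step), F.col("id") + step)
        if (step + 1) % 25 == 0:             # e.g. a threshold of 25 expressions
            df = df.checkpoint()             # materialize and truncate lineage here
    df.count()
    spark.stop()

Each checkpoint materializes the current result and discards the accumulated lineage, so an oversized plan or a failure affects only the steps since the last checkpoint. This is why a lower threshold trades processing time for reliability.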

...