...

Warning

Before you modify these parameters, review the appropriate settings for each parameter with your cluster administrator. Some values can cause failures on the cluster. Input values are not validated.


Spark parameter

Description

Spark Driver Memory

Amount of physical RAM in GB on each Spark node that is made available for the Spark drivers.

By raising this number:

  • The drivers for your job are allocated more memory on each Spark node.
  • There is less memory available for other uses on the node.

Spark Executor Memory

Amount of physical RAM in GB on each Spark node that is made available for the Spark executors.

By raising this number:

  • The Spark executors for your job are allocated more memory.
  • There is less memory available for other uses on the node.

Spark Executor Cores

Number of cores on each Spark executor that are made available to Spark.

By raising this number:

  • The maximum number of cores available for your job is raised on each Spark executor.
  • There are fewer cores for other uses on the node.

Transformer Dataframe Checkpoint Threshold

When checkpointing is enabled, the Spark DAG is checkpointed each time approximately the number of expressions specified in this parameter has been added to the DAG. Checkpointing helps manage the volume of work that is processed through Spark at one time; by checkpointing after a set of steps, the product can reduce the chances of execution errors for your jobs.

By raising this number:

  • You increase the upper limit of steps between checkpoints.
  • You may reduce processing time.
  • It may result in a higher number of job failures.
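
The effect of the checkpoint threshold can be illustrated with a small simulation. This is a sketch only: the function name `plan_checkpoints`, the per-step expression counts, and the rule "checkpoint once the running count reaches the threshold" are assumptions made for illustration and do not describe the product's actual implementation.

```python
def plan_checkpoints(expression_counts, threshold):
    """Return the indices of the steps after which a checkpoint would occur.

    expression_counts: expressions each step adds to the DAG (hypothetical input).
    threshold: approximate number of expressions allowed between checkpoints.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    checkpoints = []
    pending = 0  # expressions added since the last checkpoint
    for index, count in enumerate(expression_counts):
        pending += count
        if pending >= threshold:
            checkpoints.append(index)
            pending = 0  # the DAG lineage is truncated at a checkpoint
    return checkpoints


# Illustrative values only: six steps adding different numbers of expressions.
steps = [40, 70, 30, 90, 20, 60]
low = plan_checkpoints(steps, threshold=100)   # [1, 3]
high = plan_checkpoints(steps, threshold=200)  # [3]
```

In this sketch, doubling the threshold halves the number of checkpoints, which saves checkpointing time but lets more expressions accumulate in the DAG between checkpoints.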

For more details on setting these parameters, see Tune Cluster Performance.
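
Because input values are not validated, it may help to sanity-check them before applying them. The following is a minimal sketch, not part of the product: the function `check_spark_settings`, its arguments, and the assumption that a driver and an executor may share one node are all hypothetical. Node capacity figures should come from your cluster administrator.

```python
def check_spark_settings(driver_memory_gb, executor_memory_gb, executor_cores,
                         node_memory_gb, node_cores):
    """Return a list of problems found; an empty list means none were detected.

    Assumes (hypothetically) that one driver and one executor can be
    scheduled on the same node, so their memory requests are summed.
    """
    problems = []
    settings = (
        ("Spark Driver Memory", driver_memory_gb),
        ("Spark Executor Memory", executor_memory_gb),
        ("Spark Executor Cores", executor_cores),
    )
    for name, value in settings:
        if value <= 0:
            problems.append(f"{name} must be positive, got {value}")

    requested_memory = driver_memory_gb + executor_memory_gb
    if requested_memory > node_memory_gb:
        problems.append(
            f"Driver plus executor memory ({requested_memory} GB) exceeds "
            f"node memory ({node_memory_gb} GB)"
        )
    if executor_cores > node_cores:
        problems.append(
            f"Executor cores ({executor_cores}) exceed node cores ({node_cores})"
        )
    return problems


# Illustrative values only: a node with 32 GB of RAM and 8 cores.
for problem in check_spark_settings(16, 24, 12, node_memory_gb=32, node_cores=8):
    print(problem)
```

A check like this only catches obvious over-allocation. It does not account for memory overhead or for other services running on the node.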

...