When you specify a job in the Run Job page, you can pass a set of Spark property values to the Spark execution environment for that job. These property values override the global Spark settings for your deployment.
NOTE: A workspace administrator must enable the Custom Spark Options feature in the Workspace Settings page. For more information, see Workspace Settings Page.
In the Run Job page, click the Advanced Execution Settings caret.
Spark Execution Properties
|Property|Description|
|---|---|
|Transformer Dataframe Checkpoint Threshold|When checkpointing is enabled, the Spark DAG is checkpointed when approximately this number of expressions has been added to the DAG. Checkpointing helps manage the volume of work that is processed through Spark at one time; by checkpointing after a set of steps, you can reduce the chance of execution errors for your jobs. Raising this number causes checkpoints to occur less frequently, which reduces checkpointing overhead but increases the volume of work processed between checkpoints and with it the chance of execution errors.|
|Enable whole-stage code generation for Spark|When enabled, whole-stage code generation optimizes Spark SQL queries for execution performance on the cluster.|
|Maximum number of fields that whole-stage code generation supports|Defines the maximum number of fields (columns) permitted in a whole-stage code generation query. If the number of fields in the query exceeds this value, whole-stage code generation is disabled for that query to prevent performance issues and memory exceptions. Default value: `100`|
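As a sketch, the two whole-stage code generation options above correspond to standard Spark SQL configuration properties; whether your deployment accepts these exact keys as job-level overrides is an assumption.

```properties
# Standard Spark SQL configuration properties (assumed to be the keys
# backing the options described above):
spark.sql.codegen.wholeStage=true   # enable whole-stage code generation
spark.sql.codegen.maxFields=100     # beyond this field count, codegen is disabled
```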
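The field-count guard described above can be illustrated with a small, hypothetical sketch in plain Python (not Spark internals): the optimized code path applies only while the query's column count stays at or below the threshold.

```python
# Toy illustration (not Spark source code) of the threshold behavior
# described above: whole-stage code generation applies only when the
# query's field count does not exceed the configured maximum.
MAX_FIELDS_DEFAULT = 100  # mirrors the documented default value


def uses_whole_stage_codegen(num_fields: int,
                             enabled: bool = True,
                             max_fields: int = MAX_FIELDS_DEFAULT) -> bool:
    """Return True when whole-stage code generation would apply."""
    return enabled and num_fields <= max_fields


print(uses_whole_stage_codegen(40))    # True: under the threshold
print(uses_whole_stage_codegen(250))   # False: disabled for wide queries
```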