Can specifying --num-executors in the spark-submit command override already enabled dynamic allocation (spark.dynamicAllocation.enabled=true)?
You can see from the log:
INFO util.Utils: Using initial executors = 60,
max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
That means Spark will take max(spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors, spark.executor.instances) as the initial number of executors.
spark.executor.instances is the property set by --num-executors.
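As a worked example (the 60 comes from the log above; the other two values are hypothetical): with spark.dynamicAllocation.initialExecutors=2, spark.dynamicAllocation.minExecutors=1 and --num-executors 60 (i.e. spark.executor.instances=60), Spark starts with max(2, 1, 60) = 60 executors, matching "Using initial executors = 60".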
In your spark-defaults.conf file you can set the following to control the behaviour of dynamic allocation on Spark2:
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.initialExecutors=1
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=5
If your spark2-submit command does not specify anything, then your job starts with 1 executor and increases to 5 if required.
If your spark2-submit command specifies the following
--num-executors=3
then your job will start with 3 executors and still grow to 5 executors if required.
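As a minimal sketch of such a submission (the application class and JAR names are hypothetical, and the spark.dynamicAllocation.* settings above are assumed to be in spark-defaults.conf):
spark2-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 3 \
  --class com.example.MyApp \
  my-app.jar
This starts with max(1, 1, 3) = 3 executors and can still grow to the configured maximum of 5 under load.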
Check your log messages for
Using initial executors = [initialExecutors], max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
Additionally, if you do not specify spark.dynamicAllocation.maxExecutors at all, then a resource-hungry job will continue to use as many executors as it can acquire (on YARN this may be restricted by a limit defined on the queue you submitted your job to). I have seen "rogue" Spark jobs on YARN hog huge amounts of resources on shared clusters, starving other jobs. Your YARN administrators should prevent resource starvation by configuring sensible defaults and splitting different types of workloads across different queues; a sketch of such a setup follows.
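For illustration (the cap value and queue name are hypothetical; dynamic allocation on YARN also requires the external shuffle service):
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.maxExecutors=20
spark.shuffle.service.enabled=true
Then submit the job to a dedicated queue, e.g.:
spark2-submit --queue analytics ...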
I would advise performance testing any changes you intend to make when overriding the defaults, particularly trying to simulate busy periods of your system.
To explicitly control the number of executors, you can override dynamic allocation by setting the --num-executors command-line option or the spark.executor.instances configuration property.
The --num-executors option of spark-submit is incompatible with spark.dynamicAllocation.enabled: "If both spark.dynamicAllocation.enabled and spark.executor.instances are specified, dynamic allocation is turned off and the limited number of spark.executor.instances is used."
Also, it will log the warning: WARN SparkContext: Dynamic Allocation and num executors are both set, thus dynamic allocation is disabled.
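For instance (the class and JAR names are hypothetical), a submission that sets both:
spark-submit \
  --conf spark.dynamicAllocation.enabled=true \
  --num-executors 10 \
  --class com.example.MyApp \
  my-app.jar
would, per the behaviour quoted above, log that warning and run with a fixed 10 executors.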
"If --num-executors (or spark.executor.instances) is set and larger than this value, it will be used as the initial number of executors." – Slype