At first, I read this article, which says spark.dynamicAllocation.maxExecutors
will have value equal to num-executors
if spark.dynamicAllocation.maxExecutors
is not explicitly set. However, from the following part of this article, it says "--num-executors
or spark.executor.instances
acts as a minimum number of executors with a default value of 2", And it confuses me.
My first question is what is the usage of
--num-executors
in Spark 2.x or later versions? Does it act like a obsolete option which is useful before dynamicAllocation? When dynamicAllocation is introduced,--num-executors
and--max-executors
act more like some default values tospark.dynamicAllocation.*
?What is the difference between
--conf spark.dynamicAllocation.maxExecutors
and--max-executor
? Does the later acts like an alias to the former?
Meanwhile, the article does not mention the relationship between num-executors
and spark.dynamicAllocation.initialExecutors
. So I make a experiment in which I send the following arguments:
--conf spark.dynamicAllocation.minExecutors=2 --conf spark.dynamicAllocation.maxExecutors=20 --conf spark.dynamicAllocation.enabled=true --conf spark.dynamicAllocation.initialExecutors=3 --conf spark.dynamicAllocation.maxExecutors=10 --num-executors 0 --driver-memory 1g --executor-memory 1g --executor-cores 2
And it turns out that initially 3 executor is allocated(correspond to initialExecutors
) and then reduced to 2(correspond to minExecutors
), and it seems that --num-executors
is useless here. However, the article says "-num-executors or spark.executor.instances acts as a minimum number of executors", so there is a contradiction now.
- My third question is what is the relationship between
num-executors
andspark.dynamicAllocation.initialExecutors
, isspark.dynamicAllocation.initialExecutors
prior tonum-executors
? From the document, I find whennum-executors
is set larger thanspark.dynamicAllocation.initialExecutors
, it will overridespark.dynamicAllocation.initialExecutors
.