Asking if anyone knows a way to change Spark properties (e.g. spark.executor.memory, spark.shuffle.spill.compress, etc.) at runtime, so that a change can take effect between tasks/stages during a job...
So I know that...
1) The documentation for Spark 2.0+ (and earlier versions too) states that once the SparkContext has been created, it can't be changed at runtime.
2) SparkSession.conf.set may change a few things for SQL, but I'm looking at more general, all-encompassing configurations (see the sketch right after this list).
3) I could start a new context in the program with new properties, but the goal here is to tune the properties while a job is already executing.
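For context, a minimal Scala sketch of what I mean in 2) and 3); the local[*] master, app name, and the specific values are just placeholders:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("runtime-conf-sketch")
  .master("local[*]")
  .config("spark.executor.memory", "2g")   // core property, fixed once the context starts
  .getOrCreate()

// 2) SQL runtime configurations can be adjusted between jobs/stages:
spark.conf.set("spark.sql.shuffle.partitions", "400")

// ...but core properties cannot; depending on the Spark version the call below is
// either rejected or simply has no effect on the executors already running:
// spark.conf.set("spark.executor.memory", "4g")

// 3) The only way I know to apply new core properties is to stop the context and
// build a new one, which of course kills whatever was executing:
spark.stop()
val retuned = SparkSession.builder()
  .appName("runtime-conf-sketch")
  .master("local[*]")
  .config("spark.executor.memory", "4g")
  .getOrCreate()
```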
Ideas...
1) Would killing an executor force it to read a configuration file again, or does it just keep what was configured at the beginning of the job?
2) Is there any command to force a "refresh" of the properties in the Spark context?
Hoping there might be a way, or that someone out there has other ideas (thanks in advance)...
spark.locality.wait may be appropriate for one stage processing a small amount of data, whereas a later stage processing a large amount of data should probably use a larger value for this parameter. – Aerophone
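As far as I can tell, per-stage values for core properties like spark.locality.wait are exactly what the limitation above rules out. The closest runtime knob I'm aware of is attaching local properties to the jobs submitted from the current thread, e.g. routing heavy work to a different FAIR scheduler pool. This is only a sketch of that mechanism, not a way to change spark.locality.wait itself; the pool names are made up, and since they aren't defined in an allocation file they would just get default pool settings:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("per-job-tuning-sketch")
  .master("local[*]")
  .config("spark.scheduler.mode", "FAIR")   // still has to be set at context creation
  .getOrCreate()
val sc = spark.sparkContext

val small = sc.parallelize(1 to 1000)
val large = sc.parallelize(1 to 10000000)

// Jobs triggered from this thread while the local property is set go to "quick":
sc.setLocalProperty("spark.scheduler.pool", "quick")
small.map(_ * 2).count()

// Switch the pool for the data-heavy work, then clear the property afterwards:
sc.setLocalProperty("spark.scheduler.pool", "heavy")
large.map(_ * 2).count()
sc.setLocalProperty("spark.scheduler.pool", null)
```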