In Spark, there are three primary ways to specify the options for the SparkConf used to create the SparkContext:

- As properties in `conf/spark-defaults.conf`
  - e.g., the line: `spark.driver.memory 4g`
- As arguments to `spark-shell` or `spark-submit`
  - e.g., `spark-shell --driver-memory 4g ...`
- In your source code, configuring a `SparkConf` instance before using it to create the `SparkContext`
  - e.g., `sparkConf.set("spark.driver.memory", "4g")`
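For context, option #3 normally looks something like the following in a standalone application (this is a minimal sketch with hypothetical app and class names, not spark-shell, where the context is created for you):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ConfExample {
  def main(args: Array[String]): Unit = {
    // The SparkConf must be fully configured *before*
    // the SparkContext is constructed from it.
    val sparkConf = new SparkConf()
      .setAppName("conf-example")
      .set("spark.driver.memory", "4g")

    val sc = new SparkContext(sparkConf)
    try {
      // Confirm the setting took effect.
      println(sc.getConf.get("spark.driver.memory"))
    } finally {
      sc.stop()
    }
  }
}
```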
However, when using `spark-shell`, the SparkContext is already created for you by the time you get a shell prompt, in the variable named `sc`. When using spark-shell, how do you use option #3 in the list above to set configuration options, if the SparkContext is already created before you have a chance to execute any Scala statements?
In particular, I am trying to use Kryo serialization with GraphX. The prescribed way to use Kryo with GraphX is to execute the following Scala statement when customizing the `SparkConf` instance:

    GraphXUtils.registerKryoClasses( sparkConf )

How do I accomplish this when running `spark-shell`?