To submit a Spark application to a cluster, the Spark documentation notes:

> To do this, create an assembly jar (or "uber" jar) containing your code and its dependencies. Both sbt and Maven have assembly plugins. When creating assembly jars, list Spark and Hadoop as provided dependencies; these need not be bundled since they are provided by the cluster manager at runtime.
>
> -- http://spark.apache.org/docs/latest/submitting-applications.html
So, I added the Apache Maven Shade Plugin (version 3.0.0) to my `pom.xml` file.
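For reference, a minimal sketch of that declaration inside `<build><plugins>` (the coordinates are the plugin's standard ones; the execution binding shown is the usual one, and may not match my exact setup):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <!-- Bind shading to the package phase so `mvn clean package`
           produces the uber jar -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```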
And I changed my Spark dependency's scope to `provided` (version 2.1.0).
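In the `pom.xml`, that looks roughly like this (a sketch, assuming `spark-core` with the Scala 2.11 artifact suffix that Spark 2.1.0 builds against; the exact artifact may differ):

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.1.0</version>
  <!-- Supplied by the cluster manager at runtime, so it is
       excluded from the uber jar -->
  <scope>provided</scope>
</dependency>
```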
(I also added the Apache Maven Assembly Plugin, sketched below, to ensure all of my dependencies are bundled into the jar when I run `mvn clean package`. I'm unsure if it's truly necessary.)
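Roughly what I added, using the plugin's built-in `jar-with-dependencies` descriptor (a sketch; this duplicates much of what the shade plugin already does):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <!-- Built-in descriptor that bundles all runtime dependencies -->
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```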
This is how `spark-submit` fails: it throws a `NoSuchMethodError` for one of my dependencies. (Note that the code runs fine from a local instance when compiled inside IntelliJ, as long as the `provided` scope is removed.)
```
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.createStarted()Lcom/google/common/base/Stopwatch;
```
The line of code that throws the error is irrelevant; it's simply the first line in my main method that creates a `Stopwatch`, part of the Google Guava utilities (version 21.0).
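For completeness, Guava is declared with its standard coordinates and deliberately not marked `provided`, so it should end up in the uber jar (a sketch of the declaration):

```xml
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>21.0</version>
</dependency>
```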
Other solutions online suggest the problem is a Guava version conflict, but I haven't had any luck with those suggestions yet. Any help would be appreciated, thank you.
Adding a `relocation` section in the shade plugin fixed the error. Also, I was able to remove the assembly plugin and it still worked fine. – Seaside
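For anyone hitting the same error, a minimal sketch of what such a relocation looks like inside the shade plugin's `<configuration>` (the `shadedPattern` prefix here is an arbitrary example, not necessarily the one used):

```xml
<configuration>
  <relocations>
    <relocation>
      <!-- Rewrite the bundled Guava classes into a private namespace so
           they cannot collide with the older Guava that Spark/Hadoop put
           on the runtime classpath -->
      <pattern>com.google.common</pattern>
      <shadedPattern>shaded.com.google.common</shadedPattern>
    </relocation>
  </relocations>
</configuration>
```

This resolves the error because `Stopwatch.createStarted()` only appeared in Guava 15, while the cluster's runtime classpath carries an older Guava that shadows the bundled 21.0 copy; relocating the bundled classes into a private namespace sidesteps that conflict entirely.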