Detected Guava issue #1635 which indicates that a version of Guava less than 16.01 is in use
6

I am running a Spark job on EMR and using the DataStax connector to connect to a Cassandra cluster. I am facing issues with the Guava jar; please find the details below. I am using the following Cassandra version:

cqlsh 5.0.1 | Cassandra 3.0.1 | CQL spec 3.3.1 

I am running the Spark job on EMR 4.4 with the following Maven dependencies:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.10</artifactId>
    <version>1.5.0</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.5.0</version>
</dependency>

<dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.10</artifactId>
    <version>1.5.0</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kinesis-asl_2.10</artifactId>
    <version>1.5.0</version>
</dependency>

I am facing the following exception when I submit the Spark job:

java.lang.ExceptionInInitializerError
       at com.datastax.spark.connector.cql.DefaultConnectionFactory$.clusterBuilder(CassandraConnectionFactory.scala:35)
       at com.datastax.spark.connector.cql.DefaultConnectionFactory$.createCluster(CassandraConnectionFactory.scala:87)
       at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:153)
       at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:148)
       at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:148)
       at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)
      at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)
       at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:81)
       at ampush.event.process.core.CassandraServiceManagerImpl.getAdMetaInfo(CassandraServiceManagerImpl.java:158)
       at ampush.event.config.metric.processor.ScheduledEventAggregator$4.call(ScheduledEventAggregator.java:308)
       at ampush.event.config.metric.processor.ScheduledEventAggregator$4.call(ScheduledEventAggregator.java:290)
       at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:222)
       at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:222)
       at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:902)
       at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:902)
       at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
       at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
       at org.apache.spark.scheduler.Task.run(Task.scala:88)
       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Detected Guava issue #1635 which indicates that a version of Guava less than 16.01 is in use.  This introduces codec resolution issues and potentially other incompatibility issues in the driver.  Please upgrade to Guava 16.01 or later.
       at com.datastax.driver.core.SanityChecks.checkGuava(SanityChecks.java:62)
       at com.datastax.driver.core.SanityChecks.check(SanityChecks.java:36)
       at com.datastax.driver.core.Cluster.<clinit>(Cluster.java:67)
       ... 23 more

Please let me know how to manage the Guava dependency here.

Thanks

Snifter answered 26/4, 2016 at 23:58 Comment(1)
Your dependency blocks are incomplete. – Lashondra
10

Another solution: go to the spark/jars directory, rename (or remove) guava-14.0.1.jar, and copy guava-19.0.jar in its place.
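A rough command-line sketch of that swap (assuming SPARK_HOME points at your Spark installation and pulling Guava 19 from Maven Central; adjust the paths to your setup):

cd "$SPARK_HOME/jars"
# keep the old jar as a backup; once it no longer ends in .jar it is not picked up from this directory
mv guava-14.0.1.jar guava-14.0.1.jar.bak
wget -q https://repo1.maven.org/maven2/com/google/guava/guava/19.0/guava-19.0.jar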

Radii answered 3/10, 2016 at 8:9 Comment(3)
As a note, Guava 20 won't work for this. Guava 19 does work, though. – Cowcatcher
Such a great hack! – Bernini
Renaming the old jar and adding the new one didn't work for me (Spark 2.4.0). Removing the old jar resolved the problem. – Educatee
5

I've had the same problem and resolved it by using the Maven Shade plugin to shade the Guava version that the Cassandra connector brings in.

I needed to exclude the Optional, Present and Absent classes explicitly because I was running into issues with Spark trying to cast from the non-shaded Guava Present type to the shaded Optional type. I'm not sure if this will cause any problems later on, but it seems to be working for me for now.

You can add this to the <plugins> section in your pom.xml:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.4.3</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
        </execution>
    </executions>

    <configuration>
        <minimizeJar>true</minimizeJar>
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <shadedClassifierName>fat</shadedClassifierName>

        <relocations>
            <relocation>
                <pattern>com.google</pattern>
                <shadedPattern>shaded.guava</shadedPattern>
                <includes>
                    <include>com.google.**</include>
                </includes>

                <excludes>
                    <exclude>com.google.common.base.Optional</exclude>
                    <exclude>com.google.common.base.Absent</exclude>
                    <exclude>com.google.common.base.Present</exclude>
                </excludes>
            </relocation>
        </relocations>

        <filters>
            <filter>
                <artifact>*:*</artifact>
                <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                </excludes>
            </filter>
        </filters>

    </configuration>
</plugin>
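With the plugin in place, mvn package attaches a shaded jar with the fat classifier configured above. An illustrative build-and-submit flow (the artifact and main class names below are just placeholders for your own project) would be:

mvn clean package
spark-submit --class com.example.MySparkJob target/my-app-1.0-SNAPSHOT-fat.jar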
Murther answered 4/5, 2016 at 20:57 Comment(2)
This will not solve it in my case. The root cause here is our deployment platform, EMR. The way EMR builds Spark's default classpath puts a Guava version older than 16 on it, because it pulls in Hadoop libraries that are still old in EMR 4.2/4.4/4.6. I fixed mine by adding a bootstrap step to EMR that overrides the default Spark classpath with an updated one. – Snifter
I confirm this does fix the issue for me on a Spark Standalone v1.5.2 cluster with Spark Cassandra connector v1.5.1. Thanks. – Appendectomy
3

I was facing the same issue while retrieving records from a Cassandra table using Spark (Java) on spark-submit.

Check the Guava jar version used by Hadoop and Spark on the cluster with the find command, and change it accordingly:

find / -name "guav*.jar"

Otherwise, add the Guava jar externally during spark-submit for the driver and executor, via spark.driver.extraClassPath and spark.executor.extraClassPath respectively:

spark-submit --class com.my.spark.MySparkJob --master local --conf 'spark.yarn.executor.memoryOverhead=2048' --conf 'spark.cassandra.input.consistency.level=ONE' --conf 'spark.cassandra.output.consistency.level=ONE' --conf 'spark.dynamicAllocation.enabled=false' --conf "spark.driver.extraClassPath=lib/guava-19.0.jar" --conf "spark.executor.extraClassPath=lib/guava-19.0.jar" --total-executor-cores 15 --executor-memory 15g  --jars $(echo lib/*.jar | tr ' ' ',') target/my-sparkapp.jar

This is working for me; I hope it works for you too.

Perfumery answered 5/6, 2018 at 13:47 Comment(0)
2

Just add in your POM's <dependencies> block something like this:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>19.0</version>
</dependency>

(or any version > 16.0.1 that you prefer)
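If you want to confirm which Guava version Maven actually resolves after this, a quick check (generic, not specific to this project) is:

mvn dependency:tree -Dincludes=com.google.guava:guava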

Lashondra answered 27/4, 2016 at 1:56 Comment(6)
I was going through the link groups.google.com/a/lists.datastax.com/forum/#!topic/… which says Spark 1.5 uses Guava 14 while the Cassandra driver core requires Guava 16, so the Spark Cassandra connector raises an exception. How would adding the above solve my issue? Maybe a newbie question, thanks. – Snifter
Also, as per github.com/datastax/spark-cassandra-connector, I am using connector 1.5 with Spark 1.5/1.6 and Cassandra 3.0, so I am not sure why I am getting this issue. – Snifter
Not sure what you were asking. If you want to know why Maven resolved the old version of Guava, you can use mvn dependency:tree, which shows you how every dependency is resolved (or ignored) transitively. – Lashondra
I found the root cause: the problem I mentioned here is related to EMR. Spark and Cassandra are trying to use Guava 16, but EMR adds old Hadoop libs to the Spark classpath, pulling in old Guava 11. I am working on another request with Amazon to solve this. Thanks. – Snifter
Oh man, I wish it were that easy. For me, Spark itself has a ton of jars. The jar swap shown above almost works, but for our version of Spark we had different versions of Guava all over the place. I ended up filtering them all out, everywhere, and added the right one to the classpath like @adri. – Bernini
@TonyFraser That's where <dependencyManagement> comes in handy. Declare Spark in the dependencyManagement of your parent POM with the troublesome dependencies excluded. POMs inheriting from your parent POM will then have those dependencies excluded whenever they declare Spark as a dependency. Of course, you need to add the corresponding correct dependencies yourself. – Lashondra
1

I was able to get around this by adding the Guava 16.0.1 jar externally and then specifying the classpath on spark-submit with the help of the following configuration values:

--conf "spark.driver.extraClassPath=/guava-16.0.1.jar" --conf "spark.executor.extraClassPath=/guava-16.0.1.jar"

Hope this helps someone with a similar error!

Ferdelance answered 22/1, 2018 at 20:2 Comment(0)
0

Thanks Adrian for your response.

I am on a slightly different architecture than everybody else on this thread, but the Guava problem is the same. I am using Spark 2.2 with Mesosphere. In our development environment we use sbt-native-packager to produce the Docker images we pass to Mesos.

It turns out we needed a different Guava for the spark-submit executors than for the code we run on the driver. This worked for me.

build.sbt

....
libraryDependencies ++= Seq(
  "com.google.guava" % "guava" % "19.0" force(),
  "org.apache.hadoop" % "hadoop-aws" % "2.7.3" excludeAll (
    ExclusionRule(organization = "org.apache.hadoop", name = "hadoop-common"), //this is for s3a
    ExclusionRule(organization = "com.google.guava",  name= "guava" )),
  "org.apache.spark" %% "spark-core" % "2.1.0"   excludeAll (
    ExclusionRule("org.glassfish.jersey.bundles.repackaged", name="jersey-guava"),
    ExclusionRule(organization = "com.google.guava",  name= "guava" )) ,
  "com.github.scopt" %% "scopt" % "3.7.0"  excludeAll (
    ExclusionRule("org.glassfish.jersey.bundles.repackaged", name="jersey-guava"),
    ExclusionRule(organization = "com.google.guava",  name= "guava" )) ,
  "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.6",
...
dockerCommands ++= Seq(
...
  Cmd("RUN rm /opt/spark/dist/jars/guava-14.0.1.jar"),
  Cmd("RUN wget -q http://central.maven.org/maven2/com/google/guava/guava/23.0/guava-23.0.jar  -O /opt/spark/dist/jars/guava-23.0.jar")
...

When I tried to replace Guava 14 on the executors with Guava 16.0.1 or 19, it still wouldn't work; spark-submit just died. In my fat jar, which contains the Guava actually used by my application code on the driver, I forced Guava to 19, but on the spark-submit executors I had to replace it with 23. I did try 16 and 19 there too, but Spark died there as well.

Sorry for diverting, but this question came up every time in my Google searches. I hope this helps other SBT/Mesos folks too.

Bernini answered 1/2, 2018 at 0:3 Comment(0)
