Spark Java IllegalArgumentException at org.apache.xbean.asm5.ClassReader

I'm trying to use Spark 2.3.1 with Java.

I followed the examples in the documentation, but I keep getting a poorly described exception when calling .fit(trainingData).

Exception in thread "main" java.lang.IllegalArgumentException
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:46)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:449)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:432)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:103)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:432)
at org.apache.xbean.asm5.ClassReader.a(Unknown Source)
at org.apache.xbean.asm5.ClassReader.b(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:262)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:261)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:261)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2299)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2073)
at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1358)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.take(RDD.scala:1331)
at org.apache.spark.ml.tree.impl.DecisionTreeMetadata$.buildMetadata(DecisionTreeMetadata.scala:112)
at org.apache.spark.ml.tree.impl.RandomForest$.run(RandomForest.scala:105)
at org.apache.spark.ml.classification.DecisionTreeClassifier.train(DecisionTreeClassifier.scala:116)
at org.apache.spark.ml.classification.DecisionTreeClassifier.train(DecisionTreeClassifier.scala:45)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:118)
at com.example.spark.MyApp.main(MyApp.java:36)

I took this dummy dataset for classification (data.csv):

f,label
1,1
1.5,1
0,0
2,2
2.5,2

My code:

SparkSession spark = SparkSession.builder()
        .master("local[1]")
        .appName("My App")
        .getOrCreate();

Dataset<Row> data = spark.read().format("csv")
        .option("header", "true")
        .option("inferSchema", "true")
        .load("C:\\tmp\\data.csv");

data.show();  // see output(1) below

VectorAssembler assembler = new VectorAssembler()
        .setInputCols(Collections.singletonList("f").toArray(new String[0]))
        .setOutputCol("features");

Dataset<Row> trainingData = assembler.transform(data)
        .select("features", "label");

trainingData.show();  // see output(2) below

DecisionTreeClassifier clf = new DecisionTreeClassifier();
DecisionTreeClassificationModel model = clf.fit(trainingData);  // fails here (MyApp.java:36)
Dataset<Row> predictions = model.transform(trainingData);

predictions.show();  // never reached

Output(1):

+---+-----+
|  f|label|
+---+-----+
|1.0|    1|
|1.5|    1|
|0.0|    0|
|2.0|    2|
|2.5|    2|
+---+-----+

Output(2):

+--------+-----+
|features|label|
+--------+-----+
|   [1.0]|    1|
|   [1.5]|    1|
|   [0.0]|    0|
|   [2.0]|    2|
|   [2.5]|    2|
+--------+-----+

My build.gradle file looks like this:

plugins {
    id 'java'
    id 'application'
}

group 'com.example'
version '1.0-SNAPSHOT'

sourceCompatibility = 1.8
mainClassName = 'com.example.spark.MyApp'

repositories {
    mavenCentral()
}

dependencies {
    compile group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.3.1'
    compile group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.3.1'
    compile group: 'org.apache.spark', name: 'spark-mllib_2.11', version: '2.3.1'
}

What am I missing?

Asphyxiant answered 15/7, 2018 at 22:11 Comment(0)

Which Java version do you have installed on your machine? Your problem is probably that you are running on Java 9 or newer, which Spark 2.3.1 does not support.
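
If you are not sure which JVM your application actually runs on (the IDE or Gradle may pick a different JDK than the one on your PATH), a quick check is to print the standard Java system properties before building the SparkSession. This is only an illustrative sketch, and the class name is made up for the example:

public class JavaVersionCheck {
    public static void main(String[] args) {
        // Print the JVM that actually runs this code; for Spark 2.3.1 it should be 1.8.x.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + System.getProperty("java.home"));
    }
}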

If you install and run with Java 8 (jdk-8u171, for instance), the exception will disappear, and output(3) of predictions.show() will look like this:

+--------+-----+-------------+-------------+----------+
|features|label|rawPrediction|  probability|prediction|
+--------+-----+-------------+-------------+----------+
|   [1.0]|    1|[0.0,2.0,0.0]|[0.0,1.0,0.0]|       1.0|
|   [1.5]|    1|[0.0,2.0,0.0]|[0.0,1.0,0.0]|       1.0|
|   [0.0]|    0|[1.0,0.0,0.0]|[1.0,0.0,0.0]|       0.0|
|   [2.0]|    2|[0.0,0.0,2.0]|[0.0,0.0,1.0]|       2.0|
|   [2.5]|    2|[0.0,0.0,2.0]|[0.0,0.0,1.0]|       2.0|
+--------+-----+-------------+-------------+----------+
Siloa answered 17/7, 2018 at 16:9 Comment(2)
I am on JDK 8 and I still have the problem. – Unto
I had a similar issue: I was using JDK 11 with Scala 2.11 and Spark 2.3.1. Once I updated to Scala 2.12 and Spark 2.4.4, it worked fine. – Publish

I had the same problem. My system was running Spark 2.2.0 with Java 8. We wanted to upgrade the server, but Spark 2.3.1 does not work with Java 10 yet, so in my case I kept Java 8 on the Spark server and upgraded only Spark to 2.3.1.

I read these posts about the subject:

https://issues.apache.org/jira/browse/SPARK-24421

Why does Apache Spark not work with Java 10?
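
If you cannot fully control which JDK launches the application, one option (just a sketch of the idea, not an official Spark API) is to fail fast with a readable message instead of the opaque ClassReader IllegalArgumentException:

public class RequireJava8 {
    public static void main(String[] args) {
        // Spark 2.3.x only runs on Java 8; stop early with a clear error otherwise.
        String spec = System.getProperty("java.specification.version");
        if (!"1.8".equals(spec)) {
            throw new IllegalStateException(
                    "Spark 2.3.1 requires Java 8, but this JVM is " + spec);
        }
        // ... build the SparkSession and run the job here ...
    }
}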

Wager answered 15/8, 2018 at 22:45 Comment(0)

For those of you who have Java 8 installed and still get this exception, check which Java version your Eclipse is pointing to. In your Eclipse IDE go to

Window > Preferences > Java > Installed JREs

Now check whether it is pointing to JRE 1.8. If not, click Add > Standard VM > Next > (provide your JRE directory) > Finish.

Hayashi answered 29/4, 2021 at 8:0 Comment(0)
