Apache Spark 3.3.0 breaks on Java 17 with "cannot access class sun.nio.ch.DirectBuffer"
D

12

62

A similar question was asked at Running unit tests with Spark 3.3.0 on Java 17 fails with IllegalAccessError: class StorageUtils cannot access class sun.nio.ch.DirectBuffer, but that question (and solution) was only about unit tests. For me, Spark breaks when actually running the program.

According to the Spark overview, Spark works with Java 17. I'm using Temurin-17.0.4+8 (build 17.0.4+8) on Windows 10, including Spark 3.3.0 in Maven like this:

<scala.version>2.13</scala.version>
<spark.version>3.3.0</spark.version>
...
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>

I try to run a simple program:

final SparkSession spark = SparkSession.builder().appName("Foo Bar").master("local").getOrCreate();
final Dataset<Row> df = spark.read().format("csv").option("header", "false").load("/path/to/file.csv");
df.show(5);

That breaks all over the place:

Caused by: java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x59d016c9) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x59d016c9
    at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala:213)
    at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:114)
    at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:353)
    at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:290)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:339)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:194)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:279)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:464)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
    at scala.Option.getOrElse(Option.scala:201)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)

Spark is obviously doing things one is not supposed to do in Java 17.

Disappointing. How do I get around this?

Dex answered 23/8, 2022 at 23:30 Comment(5)
Not much of a choice: you need to add the --add-opens options cited in the linked post to your program launch command. I find it strange that Spark has not already addressed such a problem, though.Gsuit
IMO it would be better for you to downgrade to JDK 8 or JDK 11 if you can. JDK 17 support was just recently added so this might not be your last issue with that...Drone
FWIW, it actually broke for me in 3.2.3 and appeared fixed in 3.3.1.Beattie
It happens on 3.2.2 too; I have to use 3.2.2 due to the spark-excel dependency.Gwen
I'm still seeing the error in 3.5.1 when running spark-submit in cluster modeIzawa
B
50

The following step helped me unblock the issue.

If you are running the application from an IDE (IntelliJ IDEA), follow the instructions below.

Add the JVM option "--add-exports java.base/sun.nio.ch=ALL-UNNAMED"

[screenshot: adding the JVM option in the IntelliJ run configuration]

source: https://arrow.apache.org/docs/java/install.html#java-compatibility

Bounteous answered 28/9, 2022 at 6:20 Comment(0)
F
40

Solution

Please consider adding the appropriate Java Virtual Machine command-line options.
The exact way to add them depends on how you run the program: from the command line, from an IDE, etc.

Examples

The command-line options have been taken from the JavaModuleOptions class: spark/JavaModuleOptions.java at v3.3.0 · apache/spark.

Command line

For example, to run the program (the .jar file) by using the command line:

java \
    --add-opens=java.base/java.lang=ALL-UNNAMED \
    --add-opens=java.base/java.lang.invoke=ALL-UNNAMED \
    --add-opens=java.base/java.lang.reflect=ALL-UNNAMED \
    --add-opens=java.base/java.io=ALL-UNNAMED \
    --add-opens=java.base/java.net=ALL-UNNAMED \
    --add-opens=java.base/java.nio=ALL-UNNAMED \
    --add-opens=java.base/java.util=ALL-UNNAMED \
    --add-opens=java.base/java.util.concurrent=ALL-UNNAMED \
    --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED \
    --add-opens=java.base/sun.nio.ch=ALL-UNNAMED \
    --add-opens=java.base/sun.nio.cs=ALL-UNNAMED \
    --add-opens=java.base/sun.security.action=ALL-UNNAMED \
    --add-opens=java.base/sun.util.calendar=ALL-UNNAMED \
    --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED \
    -jar <JAR_FILE_PATH>

IDE: IntelliJ IDEA

For example, add the same options to the VM options field of the run/debug configuration; other answers below cover this in more detail.

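Maven (unit tests)

If the failure shows up in unit tests run through Maven (the scenario in the linked question), the same options can be passed to the forked test JVM via the Surefire argLine. A minimal sketch, assuming maven-surefire-plugin (the version shown is only an example):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>3.1.2</version>
  <configuration>
    <!-- JVM flags for the forked test JVM; extend with the full list above if needed -->
    <argLine>--add-opens=java.base/sun.nio.ch=ALL-UNNAMED</argLine>
  </configuration>
</plugin>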

Forecourt answered 26/8, 2022 at 16:42 Comment(7)
Thanks for the response, but it's a pity no one investigates this further. Surely those options (copied from an email thread) are overkill. I imagine most of the options would work with --add-exports instead of --add-opens (see the docs), because surely Spark isn't using reflection on all those packages. For a simple use case of reading CSV files and saving to JSON locally, just --add-exports java.base/sun.nio.ch=ALL-UNNAMED is working for me.Dex
Does anyone intend to fix this? Is there a Spark ticket filed?Dex
Dear @GarretWilson, I have updated the answer to specify that the command-line options have been taken from the JavaModuleOptions class: spark/JavaModuleOptions.java at v3.3.0 · apache/spark.Forecourt
@GarretWilson, I don't have such information. I did a quick search and found a related ticket: [SPARK-35557] Adapt uses of JDK 17 Internal APIs - ASF JIRA. Please, note the solution comment: --add-opens is mentioned as the solution. Maybe, it is worth reopening the ticket or opening a new one.Forecourt
I'm going to assign the bounty to this answer as you put a lot of work into it and it gives some good references. Still, it doesn't provide a sufficient solution or more in-depth tests for me to consider it the accepted answer. Sure, I know I can cram in a lot of coarse, brute-force exceptions and figure one of them will cover the Spark limitations. I'm looking for something more finely tuned, and a path forward for getting this fixed in Spark.Dex
Thanks for the response, it worked from IntelliJ, which launches the app using java -jar. If I am working on a cluster, do you know if I should set these options as driver and executor extra Java options? I tried it in the code (on the Spark application builder) but there it does not seem to work.Affection
You are a life saver. This is necessary for Spark NLP if you are trying to use it with Java; I had to add it to the VM arguments of the run configuration in Eclipse.Ripp
G
22

These three methods work for me on a project using

  • Spark 3.3.2
  • Scala 2.13.10
  • Java 17.0.6 (my project is small; it even works on Java 19.0.1. However, if your project is big, it is better to wait until Spark officially supports it)

Method 1

export JAVA_OPTS='--add-exports java.base/sun.nio.ch=ALL-UNNAMED'
sbt run

Method 2

Create a .jvmopts file in your project folder, with content:

--add-exports java.base/sun.nio.ch=ALL-UNNAMED

Then you can run

sbt run

Method 3

If you are using IntelliJ IDEA: this is based on @Anil Reddaboina's answer (thanks!).

This adds more info, since I didn't have the "VM Options" field by default.

Follow this:

[screenshot: enabling the hidden "VM Options" field in the IntelliJ run configuration]

Then you should be able to add --add-exports java.base/sun.nio.ch=ALL-UNNAMED to the "VM Options" field,

or add the full set of necessary VM options:

--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED

[screenshot: the VM options added in the IntelliJ run configuration]

Goto answered 20/3, 2023 at 21:47 Comment(1)
Why does it not work with IDEA if I add the exports to .jvmopts ?Kagera
B
4

For those using Gradle to run unit tests for Spark, apply this in build.gradle.kts:

tasks.test {
    useJUnitPlatform()
 
    val sparkJava17CompatibleJvmArgs = listOf(
        "--add-opens=java.base/java.lang=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
        "--add-opens=java.base/java.io=ALL-UNNAMED",
        "--add-opens=java.base/java.net=ALL-UNNAMED",
        "--add-opens=java.base/java.nio=ALL-UNNAMED",
        "--add-opens=java.base/java.util=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED",
        "--add-opens=java.base/sun.security.action=ALL-UNNAMED",
        "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
        "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED",
    )
    jvmArgs = sparkJava17CompatibleJvmArgs
}
Baudekin answered 31/7, 2023 at 17:5 Comment(0)
P
3

You could use JDK 8. Maybe you really should.

But if you can't, you might try adding these Java options to your build.sbt file. For me they were needed for tests, so I put them into:

val projectSettings = Seq(
...
  Test / javaOptions ++= Seq(
    "base/java.lang", "base/java.lang.invoke", "base/java.lang.reflect", "base/java.io", "base/java.net", "base/java.nio",
    "base/java.util", "base/java.util.concurrent", "base/java.util.concurrent.atomic",
    "base/sun.nio.ch", "base/sun.nio.cs", "base/sun.security.action",
    "base/sun.util.calendar", "security.jgss/sun.security.krb5",
  ).map("--add-opens=java." + _ + "=ALL-UNNAMED"),
...
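
One gotcha, in case the options appear to be ignored: sbt only applies javaOptions to forked JVMs. A minimal sketch of extra settings for the same Seq (the run-scoped lines are an assumption, for when the main program needs the options too):

  // javaOptions only take effect when sbt forks a separate JVM
  Test / fork := true,

  // If "sbt run" needs the same options as well:
  run / fork := true,
  run / javaOptions += "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
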
Poker answered 28/9, 2022 at 1:14 Comment(5)
I'm really curious about this, because it's the only answer targeted toward tests, but it didn't work for me. Would you be willing to link a minimum working example or the rest of your sparkConf/sparkSession.builder command or something?Beattie
Wow - after a ton of work, I figured out how to fix this by following your first suggestion to just use Java 8. I'll try to post more tips for others soon. Thanks for that tip!Beattie
@Beattie yes I think the java options for the tests was specific to our setup at the time. Generally sticking to jdk8 is more of a broad stroke workaround. I'm sorry the specific options didn't work for your case.Poker
The above suggestion of adding Test / javaOptions ++ ... worked for me in vscode when all other suggestions didn't. Thank you.Heraclitus
JDK 17, Spark 3.3.2. Out of all the different solutions posited, this is the one that worked for me.Jailbird
M
3

Add this as an explicit dependency in the pom.xml file. Do not use a version other than 3.0.16:

<dependency>
    <groupId>org.codehaus.janino</groupId>
    <artifactId>janino</artifactId>
    <version>3.0.16</version>
</dependency>

and then add the command-line arguments. If you use VS Code, add

"vmArgs": "--add-exports java.base/sun.nio.ch=ALL-UNNAMED"

in the configurations section of the launch.json file under the .vscode folder in your project.

Mcmasters answered 25/10, 2022 at 15:27 Comment(1)
The vmArgs param should go in launch.json, as per the docs.Bronchi
V
2

Simply upgrading to Spark 3.3.2 solved my problem.

I use Java 17 and PySpark on the command line.
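
In the Maven setup from the question, that amounts to bumping the version property, for example (whether the upgrade alone is enough seems to vary; see the comment below):

<spark.version>3.3.2</spark.version>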

Vulgarity answered 25/2, 2023 at 21:48 Comment(1)
Just to add more info: in my case, for PySpark, I never had this issue no matter which Spark version. For Scala projects I am on Spark 3.3.2; unfortunately, this does not help, and I still need --add-exports java.base/sun.nio.ch=ALL-UNNAMED.Goto
H
2

I have tried this using JDK 21 and Spark 3.5.0

  <properties>
        <spark.version>3.5.0</spark.version>
        <scala.binary.version>2.12</scala.binary.version>
        <maven.compiler.source>21</maven.compiler.source>
        <maven.compiler.target>21</maven.compiler.target>
  </properties>

Add the line below as a VM option in IntelliJ IDEA:

--add-opens=java.base/sun.nio.ch=ALL-UNNAMED

And It Works!

Hutchings answered 3/11, 2023 at 13:24 Comment(0)
F
2

In case anyone else has the same problem, try adding the list of opens from chehsunliu's answer (tweaked for Groovy):

def sparkJava17CompatibleJvmArgs = [
        "--add-opens=java.base/java.lang=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
        "--add-opens=java.base/java.io=ALL-UNNAMED",
        "--add-opens=java.base/java.net=ALL-UNNAMED",
        "--add-opens=java.base/java.nio=ALL-UNNAMED",
        "--add-opens=java.base/java.util=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED",
        "--add-opens=java.base/sun.security.action=ALL-UNNAMED",
        "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
        "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
]

application {
    // Define the main class for the application.
    mainClass = 'com.cool.App'
    applicationDefaultJvmArgs = sparkJava17CompatibleJvmArgs
}
Fulbert answered 1/3, 2024 at 11:17 Comment(0)
C
1

I saw this while trying to set up and run a job against a Spark standalone cluster in Docker.

I tried all possible combinations of passing the --add-opens directives via spark.driver.defaultJavaOptions, spark.driver.extraJavaOptions, spark.executor.defaultJavaOptions, spark.executor.extraJavaOptions.

None worked. My guess is that there isn't a mechanism to decorate the spark-submit Java entrypoint with these options.

In the end I did this in the base docker image, which solved the problem:

ENV JDK_JAVA_OPTIONS='--add-opens=java.base/sun.nio.ch=ALL-UNNAMED ...'

JVM system properties (-D...) can be passed to the job via spark.driver.extraJavaOptions. No idea why the --add-opens JVM options are not picked up at this stage.

Chafee answered 22/1, 2024 at 23:1 Comment(0)
B
0

From https://blog.jdriven.com/2023/03/mastering-maven-setting-default-jvm-options-using-jvm-config/

Simple fix that worked for me:

  1. Add a .mvn folder to the root directory of your project
  2. Create a jvm.config file inside it and add --add-exports java.base/sun.nio.ch=ALL-UNNAMED (see the sketch below)
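
A minimal shell sketch of those two steps (assuming a Unix-like shell; on Windows, create the file by hand):

mkdir -p .mvn
echo '--add-exports java.base/sun.nio.ch=ALL-UNNAMED' > .mvn/jvm.config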

[screenshot: the resulting file structure]

Borzoi answered 13/1, 2024 at 7:19 Comment(0)
M
0

Update launch.json:

  1. Navigate to the launch.json file in your Visual Studio Code project (under the .vscode folder).
  2. Locate or create the configuration you use to launch your Spark application. It might have a name like "Main" and target the com.spark.Main class.
  3. Add the following property within the configuration object:
"vmArgs": ["--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"]

example:

{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "java",
      "name": "Main",
      "request": "launch",
      "mainClass": "com.spark.Main",
      "projectName": "example",
      "vmArgs": ["--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"]
    }
  ]
}

This solved the issue when running inside VS Code, but the jar files still weren't running as expected.
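
When running a packaged jar directly, the same flag presumably has to go on the java command line instead; a minimal sketch (the jar name is just a placeholder):

java --add-opens=java.base/sun.nio.ch=ALL-UNNAMED -jar example.jar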

Malignity answered 10/5, 2024 at 7:37 Comment(0)
