Understanding Kryo serialization buffer overflow error

I am trying to understand the following error. I am running in client mode.

 org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 61186304. To avoid this, increase spark.kryoserializer.buffer.max value.
        at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:300)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:313)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Basically, I am trying to narrow down the problem. Is my understanding right that this error occurs on the Spark driver side (I am on AWS EMR, so I believe the driver runs on the master node)? And should I be looking at spark.driver.memory?

Divvy answered 1/4, 2018 at 4:24 Comment(0)

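No. The problem is not driver memory: Kryo does not have enough room in its serialization buffer. Note also that the stack trace goes through org.apache.spark.executor.Executor$TaskRunner.run, so the failure happens inside a task on an executor rather than on the driver. You should raise spark.kryoserializer.buffer.max in your properties file, or pass --conf "spark.kryoserializer.buffer.max=128m" in your spark-submit command. 128m should be big enough for you.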
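To see where a figure like 128m comes from, here is a rough sketch in plain Python (no Spark needed) that reads the numbers out of the error message and turns them into a candidate setting. It rests on two assumptions that are not stated in the question: that the buffer had already grown to Spark's default limit of 64m when the write failed, and that "required" is the size of the write that did not fit. The helper names are made up for illustration. Treat the result as a lower bound, since more data may follow the failed write, and keep in mind that Spark requires this setting to stay below 2048m.

```python
import math
import re
from typing import Tuple

MIB = 1024 * 1024


def parse_overflow(message: str) -> Tuple[int, int]:
    """Extract the (available, required) byte counts from the overflow message."""
    match = re.search(r"Available: (\d+), required: (\d+)", message)
    if match is None:
        raise ValueError("not a Kryo buffer overflow message")
    return int(match.group(1)), int(match.group(2))


def suggest_buffer_max_mib(message: str, current_max_mib: int = 64) -> int:
    """Return a lower-bound suggestion in MiB, rounded up to a power of two.

    Assumption: the buffer was already at current_max_mib when the write
    failed, so at least (current max - available + required) bytes are needed.
    """
    available, required = parse_overflow(message)
    needed_bytes = current_max_mib * MIB - available + required
    needed_mib = math.ceil(needed_bytes / MIB)
    # Round up to the next power of two (128 stays 128, 123 becomes 128).
    return 1 << (needed_mib - 1).bit_length()


message = (
    "Kryo serialization failed: Buffer overflow. "
    "Available: 0, required: 61186304."
)
size = suggest_buffer_max_mib(message)
print(f'--conf "spark.kryoserializer.buffer.max={size}m"')
```

If the same error comes back with the new value, pass the current limit as current_max_mib and the new message to get the next candidate, or look at whether a single task really needs to serialize that much data at once.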

Redeploy answered 19/2, 2019 at 17:0 Comment(1)
See this question for more information too: #37710270. (Redeploy)
