java.lang.OutOfMemoryError: Java heap space when calling collect() in SparkR

The data I am collecting is about 1.3 GB, and every driver-memory setting is set to 3 GB.

Why does the out-of-memory error still happen?

Here is my detailed SparkR configuration, followed by the OOM exception message:

spark.default.confs <- list(spark.cores.max = "8",
                            spark.executor.memory = "15g",
                            spark.driver.maxResultSize = "3g",
                            spark.driver.memory = "3g",
                            spark.driver.extraJavaOptions = "-Xms3g")

sc <- sparkR.init(master = "spark://10.58.70.155:7077", sparkEnvir = spark.default.confs)
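The call that actually fails is not shown above; a minimal sketch of it, assuming a SparkDataFrame named df that holds the 1.3 GB result (the name df is hypothetical), would be:

local.df <- collect(df)   # collect() calls dfToCols, which serializes every column on the driver

which then fails with: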
ERROR RBackendHandler: dfToCols on org.apache.spark.sql.api.r.SQLUtils failed
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:142)
    at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
    at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3236)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:135)
    at java.io.DataOutputStream.writeInt(DataOutputStream.java:200)
    at org.apache.spark.api.r.SerDe$.writeString(SerDe.scala:296)
    at org.apache.spark.api.r.SerDe$.writeObject(SerDe.scala:211)
    at org.apache.spark.sql.api.r.SQLUtils$$anonfun$colToRBytes$1.apply(SQLUtils.scala:129)
    at org.apache.spark.sql.api.r.SQLUtils$$anonfun$colToRBytes$1.apply(SQLUtils.scala:127)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.sql.api.r.SQLUtils$.colToRBytes(SQLUtils.scala:127)
    at org.apache.spark.sql.api.r.SQLUtils$$anonfun$dfToCols$1.apply(SQLUtils.scala:108)
    at org.apache.spark.sql.api.r.SQLUtils$$anonfun$dfToCols$1.apply(SQLUtils.scala:107)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.sql.api.r.SQLUtils$.dfToCols(SQLUtils.scala:107)
    at org.apache.spark.sql.api.r.SQLUtils.dfToCols(SQLUtils.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:142)
    at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
Error: returnStatus == 0 is not TRUE
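The "Caused by" frames show the driver copying the whole DataFrame, column by column, into a growing ByteArrayOutputStream (SerDe.writeString via SQLUtils.dfToCols), so the serialized bytes and the final R objects must fit in the driver heap at the same time. A rough way to gauge the true in-memory footprint, again assuming the hypothetical df, is to collect a small sample and extrapolate:

sample.rows <- take(df, 1000)                            # first 1000 rows as a local R data.frame
per.row <- as.numeric(object.size(sample.rows)) / 1000   # approximate bytes per row in R
per.row * count(df) / 1024^3                             # rough size in GB once materialized in R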
Asked by Eavesdrop on 2/11/2015 at 10:15

Comments:
Karisa: I would hazard a guess that your input data set is much larger than 1.3 GB once you convert it into an in-memory data structure.
Eavesdrop: Thanks for your answer. I have tried setting the driver memory to 10g, but the exception is still thrown.
Karisa: I would hazard a guess that either that's not the right parameter ... or the space requirement is greater than 10 GB. Bear in mind that data-structure loaders often require more memory during the load phase than after the data is fully loaded.
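A hedged note following the last comment: spark.driver.memory generally only takes effect if it is applied before the driver JVM starts, so raising it from inside an already-running R session via sparkR.init may be silently ignored. One way around that, assuming the standard Spark launch scripts, is to pass the setting on the command line when starting the SparkR shell (the 10g values mirror the attempt described above):

./bin/sparkR --driver-memory 10g --conf spark.driver.maxResultSize=10g

The remaining settings (executor memory, cores) can still be passed through sparkEnvir as before, since they are read after the driver JVM is up.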
