Spark GraphX memory out of error SparkListenerBus (java.lang.OutOfMemoryError: Java heap space)
Asked Answered
E

0

6

I have problem with out of memory on Apache Spark (Graphx). Application run, but after some time shutdown. I use Spark 1.2.0. Cluster has enough memory a number of cores. Other application where I am not using GraphX, run without problem. Application use Pregel.

I submit application in Hadoop YARN mode:

HADOOP_CONF_DIR=/etc/hadoop/conf spark-submit --class DPFile --deploy-mode cluster --master yarn --num-executors 4 --driver-memory 10g --executor-memory 6g --executor-cores 8 --files log4j.properties spark_routing_2.10-1.0.jar road_cr_big2 1000

Spark configuration:

val conf = new SparkConf(true)
    .set("spark.eventLog.overwrite", "true")
    .set("spark.driver.extraJavaOptions", "-Dlog4j.configuration=log4j.properties")
    .set("spark.yarn.applicationMaster.waitTries", "60")
    .set("yarn.log-aggregation-enable","true")
    .set("spark.akka.frameSize", "500") 
    .set("spark.akka.askTimeout", "600") 
    .set("spark.core.connection.ack.wait.timeout", "600")
    .set("spark.akka.timeout","1000")
    .set("spark.akka.heartbeat.pauses","60000")
    .set("spark.akka.failure-detector.threshold","3000.0")
    .set("spark.akka.heartbeat.interval","10000")
    .set("spark.ui.retainedStages","100")
    .set("spark.ui.retainedJobs","100")
    .set("spark.driver.maxResultSize","4G")

Thank you for answers.

Log:

ERROR Utils: Uncaught exception in thread SparkListenerBus    
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:197)
at org.apache.spark.util.FileLogger.logLine(FileLogger.scala:192)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:88)
at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:113)
at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$3.apply(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$3.apply(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:83)
at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:81)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.SparkListenerBus$class.foreachListener(SparkListenerBus.scala:81)
at org.apache.spark.scheduler.SparkListenerBus$class.postToAll(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:56)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1468)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:46)
Exception in thread "SparkListenerBus" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:197)
at org.apache.spark.util.FileLogger.logLine(FileLogger.scala:192)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:88)
at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:113)
at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$3.apply(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$3.apply(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:83)
at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:81)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.SparkListenerBus$class.foreachListener(SparkListenerBus.scala:81)
at org.apache.spark.scheduler.SparkListenerBus$class.postToAll(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:56)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1468)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:46)
ERROR LiveListenerBus: SparkListenerBus thread is dead! This means SparkListenerEvents have notbeen (and will no longer be) propagated to listeners for some time.
ERROR ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
Eckert answered 8/6, 2015 at 12:28 Comment(5)
I am getting the same error while using spark sql on hdfs. Exception in thread "SparkListenerBus" Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "SparkListenerBus" Exception in thread "org.apache.hadoop.hdfs.PeerCache@2d34fee9" Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "org.apache.hadoop.hdfs.PeerCache@2d34fee9"Lhasa
@František Kolovský, did you solve the issue?Misdirection
I try solve it 3 month and I didn't solve it, but I think that problem is in YARN submit mode, because on Standalone mode everything run without problem. I performed about 130 iteration with large Graph, so I thing that somewhere in YARN doesn't clear memory.Schoolroom
Hi Frantisek, any solution for this or update?Substandard
No, I solved this problem so that I used the stardalone cluster instead of YARN.Schoolroom

© 2022 - 2024 — McMap. All rights reserved.