How to fix OutOfMemoryError (Java): GC overhead limit exceeded in R? [duplicate]

I have to read one file from each folder in a list of folders and combine the data in R. The following code works on my test data, but when I run it on the actual data I get this error:
Error: OutOfMemoryError (Java): GC overhead limit exceeded Called from: top level

This is what I have done for my test data

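library(xlsx)
library(plyr)

parent.folder <- "C:/Users/sandesh/Desktop/test_R"
# All sub-folders of the parent ([-1] drops the parent folder itself)
sub.folder <- list.dirs(parent.folder, recursive = TRUE)[-1]
file <- file.path(sub.folder, "sandesh1.xlsx")

# Read the first sheet of one workbook
fun <- function(file) {
  read.xlsx(file, sheetIndex = 1)
}
# Read every workbook and row-bind the results into one data frame
df.big <- ldply(file, fun)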
Refund answered 26/11, 2014 at 16:16 Comment(0)

This is a typical problem with rJava. It is addressed in the XLConnect documentation; XLConnect uses rJava to work with Excel files in the same way as the xlsx library. I quote from it:

"This is caused by the fact that XLConnect (same for xlsx) needs to copy your entire data object over to the JVM in order to write it to a file and the JVM has to be initialized with a fixed upper limit on its memory size. To change this amount, you can pass parameters to R’s JVM just like you can to a command line Java process via rJava’s options support:

options(java.parameters = "-Xmx1024m")

Note, however, that these parameters are evaluated exactly once per R session when the JVM is initialized - this is usually once you load the first package that uses Java support, so you should do this as early as possible."

As mentioned above, run the options() call at the very beginning of your script, before loading any libraries. If you are running the script through RStudio, restart the R session first so that the JVM has not already been started.
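Applied to the script in the question, that ordering might look like the minimal sketch below. The heap size "-Xmx4g" and the helper name read_all_workbooks are assumptions made for illustration, not values from the question or the XLConnect documentation; choose a heap size that fits the RAM actually available.

```r
# Set the JVM options first: they are read only once, when the JVM starts.
# "-Xmx4g" is an assumed value for illustration; adjust it to your machine.
options(java.parameters = "-Xmx4g")

# Hypothetical helper wrapping the code from the question.
# The xlsx namespace (and with it the JVM) is loaded only when the helper
# is first called, i.e. after the option above has been set.
read_all_workbooks <- function(parent.folder, file.name = "sandesh1.xlsx") {
  # All sub-folders of the parent ([-1] drops the parent folder itself)
  sub.folder <- list.dirs(parent.folder, recursive = TRUE)[-1]
  files <- file.path(sub.folder, file.name)
  # Read the first sheet of each workbook and row-bind the results
  plyr::ldply(files, function(f) xlsx::read.xlsx(f, sheetIndex = 1))
}

# Usage (path taken from the question):
# df.big <- read_all_workbooks("C:/Users/sandesh/Desktop/test_R")
```

If the option is set after a Java-backed package has already been loaded in the same session, it has no effect until R is restarted.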

Also, please note that even this is not guaranteed to work; it depends on the size of the file you are trying to parse.
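When the setting does not seem to help, it can be worth checking whether the JVM actually picked it up. The sketch below is one way to do that, assuming rJava is installed; the function name jvm_max_heap_mb is hypothetical. It asks the running JVM for its maximum heap size through java.lang.Runtime.

```r
# Hypothetical helper: report the JVM's maximum heap size in megabytes.
# Returns NA if rJava cannot be loaded.
jvm_max_heap_mb <- function() {
  if (!requireNamespace("rJava", quietly = TRUE)) {
    return(NA_real_)
  }
  # Starts the JVM if it is not running yet (using getOption("java.parameters"));
  # if a JVM is already running, its existing limits stay in place.
  rJava::.jinit()
  runtime <- rJava::.jcall("java/lang/Runtime", "Ljava/lang/Runtime;", "getRuntime")
  # maxMemory() returns a Java long (bytes); convert it to megabytes
  rJava::.jcall(runtime, "J", "maxMemory") / 1024^2
}

# Usage, after options(java.parameters = ...) has been set:
# jvm_max_heap_mb()
```

A value far below the requested -Xmx suggests the JVM was started before the option was set, or that the platform (for example a 32-bit build) cannot allocate that much.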

Cavell answered 26/11, 2014 at 17:45 Comment(2)
LyzandeR, thanks for the answer. I used the option as you suggested and still got the same error. I found another package called openxlsx, which I understand is considered a good alternative to JVM-dependent packages like xlsx or XLConnect. With openxlsx the script runs without an error message; however, for some reason, the resulting dataset is completely different. The values don't even remotely match the ones in the Excel file. So I am still scratching my head. - Refund
Yeah. I had the same error as well and the above did work for me, but a colleague of mine couldn't solve it that way, which is why I said it might not work. Apparently it works only in some cases, depending on the file, Java version, R version, computer specs, etc. You can also try the xlsx library, the gdata library, or the RExcel library. Check here. One of those might work, though most of them use rJava as well. Also, working on a 32-bit machine is another reason it can fail. - Cavell
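For reference, a Java-free version of the question's loop using openxlsx, as mentioned in the comments, could be sketched as follows. The helper name read_all_openxlsx is hypothetical, and the sketch assumes every workbook has the same columns in its first sheet. It does not explain the mismatch reported above; one possible, unconfirmed factor is that openxlsx and xlsx use different defaults (for example, openxlsx::read.xlsx leaves Excel dates as numbers unless detectDates = TRUE is passed).

```r
# Hypothetical helper: same folder walk as in the question, but reading with
# openxlsx, which does not start a JVM.
read_all_openxlsx <- function(parent.folder, file.name = "sandesh1.xlsx") {
  # All sub-folders of the parent ([-1] drops the parent folder itself)
  sub.folder <- list.dirs(parent.folder, recursive = TRUE)[-1]
  files <- file.path(sub.folder, file.name)
  # Read the first sheet of each workbook
  sheets <- lapply(files, function(f) openxlsx::read.xlsx(f, sheet = 1))
  # Row-bind the sheets; assumes identical column names in every workbook
  do.call(rbind, sheets)
}

# Usage (path taken from the question):
# df.big <- read_all_openxlsx("C:/Users/sandesh/Desktop/test_R")
```

Comparing a single workbook read by both packages is a quick way to see where the two results start to differ.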
