R Error: java.lang.OutOfMemoryError: Java heap space

I am trying to connect R to Teradata to pull data directly into R for analysis. However, I am getting the following error:

Error in .jcall(rp, "I", "fetch", stride, block) :
  java.lang.OutOfMemoryError: Java heap space

I have tried to increase the JVM's maximum heap size through my R options:

options(java.parameters = "-Xmx8g")

I have also tried to initialize the Java parameters with the rJava function .jinit, as .jinit(parameters = "-Xmx8g"), but that failed as well.

The calculated size of the data should be approximately 3 GB (actually slightly less).

Bouncy answered 6/1, 2016 at 1:2 Comment(3)
Can you try using less memory to verify that it works at all? Just because the raw data is only 3 GB does not preclude the possibility that the JVM needs more memory than this. – Rabideau
You have to make sure you run options(java.parameters = "-Xmx8g") before starting up your Java instance. So start in a fresh R session with NO packages loaded. Run that command and THEN load all your packages and try again. You should be fine, but it's possible the JVM needs a lot for other reasons. – Bandog
I guess the "calculated size of the data" is the size of the meaningful information stored. However, data structures are not ideal in memory consumption: they have fields for internal use, and they allocate additional memory to prevent repeated allocations when data is added, so even empty data structures consume some memory. So 3 GB of data can easily require more than 8 GB of working memory. – Hoecake

You need to make sure you're allocating the additional memory before loading rJava or any other packages. Wipe the environment first (via rm(list = ls())), restart R/RStudio if you must, and set the options at the beginning of your script.

options(java.parameters = "-Xmx8000m")

See for example https://support.snowflake.net/s/article/solution-using-r-the-following-error-is-returned-javalangoutofmemoryerror-gc-overhead-limit-exceeded
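
In case it helps to see the whole order in one place, here is a minimal sketch of a fresh session for an RJDBC/Teradata setup like the asker's. The driver class, JAR path, URL, and credentials are placeholders, not tested values:

options(java.parameters = "-Xmx8000m")  # must run before any Java starts

library(RJDBC)  # loads rJava, which starts the JVM with the options above
drv  <- JDBC("com.teradata.jdbc.TeraDriver", "/path/to/terajdbc4.jar")
conn <- dbConnect(drv, "jdbc:teradata://your-host/", "user", "password")
df   <- dbGetQuery(conn, "SELECT * FROM your_table")
dbDisconnect(conn)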

Monostich answered 23/5, 2017 at 19:40 Comment(0)

I somehow had this problem in a non-reproducible manner. I partly solved it with -Xmx8g, but still ran into problems randomly.

I have now found an option that uses a different garbage collector:

options(java.parameters = c("-XX:+UseConcMarkSweepGC", "-Xmx8192m"))
library(xlsx)

These lines go at the beginning of the script, before any other package is loaded, since other packages can load Java components themselves and the options have to be set before any Java is loaded.

So far, the problem hasn't occurred again.

It can still happen occasionally in a long session, but in that case a session restart normally solves the problem.

EDIT:

As @user2B4L2 pointed out below, adding gc() right after setting the Java options, and also calling it after each change to the Excel workbook object, seems to have solved the whole problem. Thanks!
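
For illustration, a sketch of that pattern with the xlsx package; the data frame (my_data) and file name are placeholders:

options(java.parameters = c("-XX:+UseConcMarkSweepGC", "-Xmx8192m"))
gc()                             # once, right after setting the options

library(xlsx)
wb    <- createWorkbook()
sheet <- createSheet(wb, sheetName = "data")
addDataFrame(my_data, sheet)     # my_data is a placeholder data frame
gc()                             # collect after each change to the workbook
saveWorkbook(wb, "output.xlsx")
gc()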

Neysa answered 22/2, 2019 at 12:46 Comment(2)
Yes. Somehow, this also made my code run faster than before. This really is the correct answer. Thank you. – Iand
Yes! Thank you for that, sir! Worked perfectly here. – Quechuan

Running the following two lines of code (before any packages are loaded) worked for me on a Mac:

options(java.parameters = c("-XX:+UseConcMarkSweepGC", "-Xmx8192m"))
gc()

This essentially combines two proposals previously posted here. Importantly, running the first line alone (as suggested by drmariod) did not solve the problem in my case. However, when I additionally executed gc() just after the first line (as suggested by user2961057), the problem was solved.

Should it still not work, restart your R session and then try options(java.parameters = "-Xmx8g") instead (before any packages are loaded), executing gc() directly afterwards. Alternatively, try further increasing the heap size from "-Xmx8g" to e.g. "-Xmx16g" (provided that you have at least that much RAM).

EDIT: Further solutions: While using rJava for model estimations in R (explaining y from a large number of X's), I kept receiving the above 'OutOfMemory' errors even when I scaled up to "-Xmx60000m" (the machine I am using has 64 GB of RAM). The problem was that some model specifications were simply too big (and would have required even more RAM). One solution that may help in this case is scaling the size of the problem down (e.g. by reducing the number of X's in the model) or, if possible, splitting the problem into independent pieces, estimating each separately, and putting those pieces together again, as in the sketch below.
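
As a rough illustration of the splitting idea, here is a hedged sketch assuming a plain data frame dat with response column y; the column names, block size, and the use of lm() are illustrative, not the exact model from above:

predictor_cols <- setdiff(names(dat), "y")
blocks <- split(predictor_cols, ceiling(seq_along(predictor_cols) / 50))  # ~50 X's per piece

fits <- lapply(blocks, function(cols) {
  f   <- reformulate(cols, response = "y")  # builds y ~ x1 + x2 + ... for this block
  fit <- lm(f, data = dat[, c("y", cols), drop = FALSE])
  gc()                                      # free memory between pieces
  fit
})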

Lyceum answered 22/12, 2020 at 1:36 Comment(2)
Using gc() regularly solved this problem for me. Thanks! – Blackguard
This seems to work for me as well... Thanks for that hint. I basically add the gc() command every time I add a data frame to the Excel workbook, and this seems to do the trick. Cool! – Neysa

I added garbage collection and that solved the issue for me. I am connecting to Oracle databases using RJDBC. Simply add gc().
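
A minimal sketch of that, assuming an RJDBC/Oracle connection; the driver class, JAR path, and URL are placeholders:

library(RJDBC)
drv  <- JDBC("oracle.jdbc.OracleDriver", "/path/to/ojdbc8.jar")
conn <- dbConnect(drv, "jdbc:oracle:thin:@//db-host:1521/service", "user", "pass")

df <- dbGetQuery(conn, "SELECT * FROM big_table")
gc()   # R's gc() also lets rJava finalize stale Java-side references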

Duodenum answered 11/2, 2020 at 15:6 Comment(0)
