Container Fails to Start: Insufficient memory for the Java Runtime Environment to continue

We have an enterprise application running on Java 8. The deployment environment is built and updated through Bitbucket Pipelines. I have a graphic showing the high-level architecture of the environment. We have two app servers running identical configurations apart from some application-specific environment variables.

Everything was working well until a week ago, when, after a successful pipeline run, the two app instances on one of the servers stopped working with the following error:

There is insufficient memory for the Java Runtime Environment to continue.
Cannot create GC thread. Out of system resources.

Both instances are still working fine on the other server; on this server, the containers fail to start.

Solutions Tried

The error accompanies the following information:

Possible reasons:

  • The system is out of physical RAM or swap space
  • The process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap

Possible solutions:

  • Reduce memory load on the system
  • Increase physical memory or swap space
  • Check if swap backing store is full
  • Decrease Java heap size (-Xmx/-Xms)
  • Decrease number of Java threads
  • Decrease Java thread stack sizes (-Xss)
  • Set larger code cache with -XX:ReservedCodeCacheSize=

We have tried the following (a rough sketch of these steps appears right after the list):

  1. Adding more swap space. The server has 8GB of RAM, and we have tried swap sizes from 4GB to 9GB.
  2. Varying the heap sizes (-Xms and -Xmx) from 128m to 4096m.
  3. Increasing the RAM on this server to 16GB, while the other, working server still runs fine on 8GB.
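
For reference, this is roughly what those steps amounted to. The swap file path, the sizes, and the heap values below are illustrative placeholders rather than our exact configuration, and the docker run line assumes the image's catalina.sh picks up CATALINA_OPTS:

# Add a swap file (we tried sizes from 4G up to 9G)
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Pass heap limits into the Tomcat container (we tried 128m up to 4096m)
docker run -d --name jbapp -e CATALINA_OPTS="-Xms512m -Xmx2048m" <image>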

Here is what the memory & swap consumption looks like:

free -mh
              total        used        free      shared  buff/cache   available
Mem:           15Gi       378Mi        12Gi       1.0Mi       2.9Gi        14Gi
Swap:           9Gi          0B         9Gi

I have links to several related artifacts. These include the complete docker logs output and the output of docker info on the failing server and the operational server.
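
Those were captured roughly like this (the output file names are just placeholders):

docker logs jbbatch > jbbatch.log 2>&1
docker logs jbapp > jbapp.log 2>&1
docker info > docker-info.txt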

This is what docker ps -a shows:

:~$ docker ps -a
CONTAINER ID   IMAGE                                                                                  COMMAND                  CREATED        STATUS                    PORTS                                       NAMES
d29747bf2ad3   :a7608a838625ae945bd0a06fea9451f8bf11ebe4   "catalina.sh run"        10 hours ago   Exited (1) 10 hours ago                                               jbbatch
0951b6eb5d42   :a7608a838625ae945bd0a06fea9451f8bf11ebe4   "catalina.sh run"        10 hours ago   Exited (1) 10 hours ago                                               jbapp

We are out of ideas right now, as we have tried almost all of the suggested solutions on Stack Overflow. What are we missing?

Ebenezer asked 2/7, 2022 at 18:35 Comment(5)
Have you tried monitoring the heap and non-heap memory of the application, e.g. through Prometheus/Grafana (if the application provides a monitoring endpoint) or VisualVM (if the corresponding agent is attached to the VM)? Maybe some of the environment-specific configuration leads to higher memory pressure. --- Are you setting any memory limits on the docker containers? – Schmeltzer
The containers fail to start, so the monitoring does not help. – Ebenezer
@Schmeltzer there are no implicit memory limits on docker containers. Here is what things look like memory-wise on the server that does work: pastebin.mozilla.org/xMJK1FTg – Ebenezer
Can you edit the question to include a minimal reproducible example? Make sure to include the relevant source code directly inline in the question, not behind a link. What you have so far does in fact suggest the JVM is running out of memory, but without any source code or other details it's hard to give any more than generic monitoring and tuning suggestions. – Facial
@DavidMaze It is an enterprise application, and sharing reproduction steps might be difficult. I also believe that the code might not be the issue, as the same code with the same pipeline is working on the other server. There are minor differences in the information docker info provides, like the Docker version & kernel version. Do you think that might cause an issue? – Ebenezer

I see that your Docker image uses Ubuntu 22.04 LTS as its base. Recently, the base Java images were rebuilt on top of this LTS release, which caused a lot of issues on older Docker runtimes. Most likely this is what you are experiencing. It has nothing to do with memory; it is an incompatibility between older Docker releases and the newer Linux version used as the base image.

Your operational server has Docker server version 20.10.10, while the failing server has version 20.10.9. The incompatibility was fixed precisely in Docker 20.10.10. Some more technical details on the incompatibility issue are available here.

The solution would be to upgrade the failing server to at least Docker 20.10.10.
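
To compare and upgrade, something along these lines should work; the apt commands assume an Ubuntu/Debian host with the official Docker repository already configured:

# Print the engine version on each server
docker version --format '{{.Server.Version}}'

# Upgrade the engine packages from the Docker apt repository
sudo apt-get update
sudo apt-get install --only-upgrade docker-ce docker-ce-cli containerd.io
sudo systemctl restart docker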

Hussar answered 2/7, 2022 at 19:33 Comment(3)
I faced the problem with 20.10.18 as well. In my case the server did not have swap; adding swap solved the problem. – Wendiewendin
@Icetinsoy how were you able to check for swap? – Amyl
On Linux, the free command. – Sheba

I had the same error. The output of

# docker info

was:

....
Security Options:
 seccomp
  WARNING: You're not using the default seccomp profile
  Profile: /etc/docker/seccomp.json
....

The issue was resolved by putting

  security_opt:
    - seccomp:unconfined

in the docker-compose.yml for the service, then removing and recreating the container:

docker rm <container_name>
docker-compose up -d <service_name>

Maybe the same result could be achieved by tweaking /etc/docker/seccomp.json instead; I tried and failed.
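
If you want to check whether a custom profile is involved before changing anything, these commands may help (the container name is whatever docker ps shows):

# Security options the daemon reports, including the seccomp profile in use
docker info --format '{{.SecurityOptions}}'

# Whether a container was started with an explicit seccomp override
docker inspect --format '{{.HostConfig.SecurityOpt}}' <container_name>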

Eldredge answered 6/10, 2022 at 7:28 Comment(1)
docker run (blah, blah) --security-opt seccomp=unconfined (blah, blah) solved it for me (without the blah, blah, of course). – Agrobiology
