Java heap memory full with persistence enabled
Asked Answered
F

1

6

With persistence enabled in broker.xml and when -Xms & -Xmx are set to 4GB and global-max-size set to default 200Mb (most of the setting were just created by the Artemis create command) I am getting the below exceptions either within a day or three based on the rate of data processed. Configuration is similar to this post.

2020-08-03 02:25:58,969 WARN  [org.apache.activemq.artemis.core.server] AMQ222033: Page file 000000005.page had incomplete records at position 9,259,622 at record number 4,224
2020-08-03 02:31:19,390 WARN  [org.apache.activemq.artemis.core.server] AMQ222288: Page 5, message 4,224 could not be found on offset 9,259,622, with starting message 4,224. This represents a logic error or inconsistency on the data, and the system will try once again from the beggining of the page file.
2020-08-03 03:05:10,331 WARN  [org.apache.activemq.artemis.utils.actors.OrderedExecutor] Java heap space: java.lang.OutOfMemoryError: Java heap space

I was able to confirm the heap was 100%.

Heap Usage:
G1 Heap:
   regions  = 2048
   capacity = 4294967296 (4096.0MB)
   used     = 4285561048 (4087.0295028686523MB)
   free     = 9406248 (8.970497131347656MB)
   99.78099372237921% used
G1 Young Generation:
Eden Space:
   regions  = 0
   capacity = 8388608 (8.0MB)
   used     = 0 (0.0MB)
   free     = 8388608 (8.0MB)
   0.0% used
Survivor Space:
   regions  = 0
   capacity = 0 (0.0MB)
   used     = 0 (0.0MB)
   free     = 0 (0.0MB)
   0.0% used
G1 Old Generation:
   regions  = 2048
   capacity = 4286578688 (4088.0MB)
   used     = 4285561048 (4087.0295028686523MB)
   free     = 1017640 (0.9704971313476562MB)
   99.97625985490832% used

14525 interned Strings occupying 1435888 bytes.

Using Artemis version 2.13.0. VM is a 2 core Linux machine with 7 GB RAM. I had disabled the persistence. Also another observation is the RES size is growing gradually when I use the top command.

Client configuration: (using camel and activeMq JMS component)

  <bean id="conxnFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
      <property name="brokerURL" value="tcp://domain-name:port" />
    </bean>

    <bean id="pooledconxnFactory" class="org.apache.activemq.pool.PooledConnectionFactory" init-method="start" destroy-method="stop">
      <property name="maxConnections" value="5" />
      <property name="connectionFactory" ref="conxnFactory" />
    </bean>

    <bean id="config" class="org.apache.camel.component.jms.JmsConfiguration">
      <property name="connectionFactory" ref="pooledconxnFactory" />
      <property name="concurrentConsumers" value="5" />
    </bean>

    <bean id="jmsqueue" class="org.apache.activemq.camel.component.ActiveMQComponent">
      <property name="configuration" ref="config" />
    </bean>
    
    <camelContext id="camel" xmlns="http://camel.apache.org/schema/spring">
      <endpoint id="inputQueue" uri="jmsqueue:queue:dataqueue1" />
      
      <route>
        <from uri="ref:inputQueue" />
...

I was not able to get the heapdump, but there was a hs_err_pid file generated.

...
Other Threads:
  0x00007f9d10282000 VMThread [stack: 0x00007f9d147a4000,0x00007f9d148a4000] [id=68249]
  0x00007f9d102d2800 WatcherThread [stack: 0x00007f9d00f00000,0x00007f9d01000000] [id=68257]

=>0x00007f9d10023000 (exited) GCTaskThread [stack: 0x00007f9d16eaa000,0x00007f9d16faa000] [id=68242]

VM state:at safepoint (normal execution)

VM Mutex/Monitor currently owned by a thread:  ([mutex/lock_event])
[0x00007f9d10007bd0] Threads_lock - owner thread: 0x00007f9d10282000
[0x00007f9d100080d0] Heap_lock - owner thread: 0x00007f9cf421d800

Heap:
 garbage-first heap   total 2097152K, used 2088898K [0x0000000080000000, 0x0000000080104000, 0x0000000100000000)
  region size 1024K, 0 young (0K), 0 survivors (0K)
 Metaspace       used 46102K, capacity 46842K, committed 47408K, reserved 1091584K
  class space    used 5086K, capacity 5295K, committed 5424K, reserved 1048576K

Heap Regions: (Y=young(eden), SU=young(survivor), HS=humongous(starts), HC=humongous(continues), CS=collection set, F=free, TS=gc time stamp, PTAMS=previous top-at-mark-start, NTAMS=next top-at-mark-start)
AC   0  O    TS     0 PTAMS 0x0000000080000000 NTAMS 0x0000000080000000 space 1024K,  99% used [0x0000000080000000, 0x0000000080100000)
AC   0  O    TS     0 PTAMS 0x0000000080100000 NTAMS 0x0000000080100000 space 1024K,  99% used [0x0000000080100000, 0x0000000080200000)
AC   0  O    TS     0 PTAMS 0x0000000080200000 NTAMS 0x0000000080200000 space 1024K,  99% used [0x0000000080200000, 0x0000000080300000)
AC   0  O    TS     0 PTAMS 0x0000000080300000 NTAMS 0x0000000080300000 space 1024K,  99% used [0x0000000080300000, 0x0000000080400000)
AC   0  O    TS     0 PTAMS 0x0000000080400000 NTAMS 0x0000000080400000 space 1024K,  99% used [0x0000000080400000, 0x0000000080500000)
AC   0  O    TS     0 PTAMS 0x0000000080500000 NTAMS 0x0000000080500000 space 1024K,  99% used [0x0000000080500000, 0x0000000080600000)
AC   0  O    TS     0 PTAMS 0x0000000080600000 NTAMS 0x0000000080600000 space 1024K,  98% used [0x0000000080600000, 0x0000000080700000)
....
...
Deoptimization events (10 events):
Event: 26931.254 Thread 0x00007f9ce0031000 Uncommon trap: reason=bimorphic action=maybe_recompile pc=0x00007f9d01589b58 method=java.util.AbstractCollection.toArray()[Ljava/lang/Object; @ 1
Event: 26940.746 Thread 0x00007f9cbc4e7800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f9d01c69704 method=org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.lambda$channelInactive$0(Lio/netty/channel/ChannelHandlerContext;)V @ 13
Event: 27003.328 Thread 0x00007f9cbc4e7800 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f9d01f98b44 method=org.apache.activemq.artemis.utils.UUID.toString()Ljava/lang/String; @ 4
Event: 27003.328 Thread 0x00007f9cbc4e7800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f9d02a77920 method=org.apache.activemq.artemis.core.server.impl.QueueImpl$ConsumerHolder.getPriority()I @ 4
Event: 27003.328 Thread 0x00007f9cbc4e7800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f9d011dcd40 method=org.apache.activemq.artemis.core.server.impl.QueueImpl$ConsumerHolder.getPriority()I @ 4
Event: 27003.328 Thread 0x00007f9cbc4e7800 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f9d01ba2a84 method=java.lang.AbstractStringBuilder.append(I)Ljava/lang/AbstractStringBuilder; @ 16
Event: 27142.215 Thread 0x0000000001333000 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f9d02660584 method=java.net.InetAddress.toString()Ljava/lang/String; @ 16
Event: 27142.216 Thread 0x0000000001333000 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f9d029b9650 method=org.apache.activemq.artemis.utils.ActiveMQThreadPoolExecutor$ThreadPoolQueue.offer(Ljava/lang/Object;)Z @ 2
Event: 29769.075 Thread 0x00007f9d10b14000 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f9d0267c37c method=java.util.concurrent.FutureTask.runAndReset()Z @ 115
Event: 30265.745 Thread 0x00007f9d106f3800 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f9d0267c418 method=java.util.concurrent.FutureTask.run()V @ 66

Internal exceptions (10 events):
Event: 31469.887 Thread 0x00007f9cf4007800 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004efc8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 159]
Event: 31486.605 Thread 0x00007f9cf4007800 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004efc8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 159]
Event: 31504.901 Thread 0x00007f9cf4007800 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004efc8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 159]
Event: 31522.574 Thread 0x00007f9cf4007800 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004efc8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 159]
Event: 31538.769 Thread 0x00007f9d110a4000 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004efc8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 159]
Event: 31556.358 Thread 0x00007f9d110a4000 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004efc8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 159]
Event: 31574.508 Thread 0x00007f9d110a4000 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004efc8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 159]
Event: 31590.147 Thread 0x00007f9d10e37800 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004efc8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 159]
Event: 31606.743 Thread 0x00007f9d10e37800 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004efc8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 159]
Event: 31606.743 Thread 0x00007f9d10e37800 Exception <a 'java/lang/OutOfMemoryError'> (0x000000008004f068) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u131/8869/hotspot/src/share/vm/runtime/deoptimization.cpp, line 790]

Events (10 events):
Event: 31590.147 Executing VM operation: G1CollectForAllocation done
Event: 31590.148 Executing VM operation: G1IncCollectionPause
Event: 31590.167 Executing VM operation: G1IncCollectionPause done
Event: 31590.167 Executing VM operation: G1IncCollectionPause
Event: 31590.175 Executing VM operation: G1IncCollectionPause done
Event: 31590.188 Executing VM operation: G1CollectForAllocation
Event: 31606.743 Executing VM operation: G1CollectForAllocation done
Event: 31606.743 Thread 0x00007f9d10e37800 DEOPT PACKING pc=0x00007f9d021831a8 sp=0x00007f9ce46b4f00
Event: 31606.743 Thread 0x00007f9d10e37800 DEOPT UNPACKING pc=0x00007f9d01047373 sp=0x00007f9ce46b4be0 mode 1
Event: 31606.749 Executing VM operation: G1IncCollectionPause
..
Update:

Not able to get the heap dump, as the broker instance hangs after logging the below message in logs

2020-08-07 00:17:32,835 WARNING [sun.rmi.transport.tcp] RMI TCP Accept-0: accept loop for ServerSocket[addr=localhost/127.0.0.1,localport=33088] throws: java.lang.OutOfMemoryError: Java heap space

2020-08-07 00:17:49,675 WARN  [org.apache.activemq.artemis.utils.actors.OrderedExecutor] Java heap space: java.lang.OutOfMemoryError: Java heap space
        at org.apache.activemq.artemis.core.paging.impl.Page.readMessage(Page.java:208) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.paging.cursor.impl.PageReader.getMessage(PageReader.java:94) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.paging.cursor.impl.PageSubscriptionImpl.internalGetNext(PageSubscriptionImpl.java:432) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.paging.cursor.impl.PageSubscriptionImpl.access$2200(PageSubscriptionImpl.java:67) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.paging.cursor.impl.PageSubscriptionImpl$CursorIterator.moveNext(PageSubscriptionImpl.java:1353) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.paging.cursor.impl.PageSubscriptionImpl$CursorIterator.next(PageSubscriptionImpl.java:1318) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.paging.cursor.impl.PageSubscriptionImpl$CursorIterator.tryNext(PageSubscriptionImpl.java:1445) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.paging.cursor.impl.PageSubscriptionImpl$CursorIterator.hasNext(PageSubscriptionImpl.java:1461) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.server.impl.QueueImpl.checkDepage(QueueImpl.java:2989) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.server.impl.QueueImpl.deliverAsync(QueueImpl.java:1243) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.server.impl.QueueImpl.deliverAsync(QueueImpl.java:1237) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.server.impl.ServerConsumerImpl.forceDelivery(ServerConsumerImpl.java:1154) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.server.impl.ServerConsumerImpl.promptDelivery(ServerConsumerImpl.java:1146) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.server.impl.ServerConsumerImpl.setStarted(ServerConsumerImpl.java:755) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.setStarted(ServerSessionImpl.java:2013) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.start(ServerSessionImpl.java:1667) [artemis-server-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.protocol.openwire.amq.AMQSession.start(AMQSession.java:269) [artemis-openwire-protocol-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.protocol.openwire.OpenWireConnection.addConsumer(OpenWireConnection.java:908) [artemis-openwire-protocol-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.protocol.openwire.OpenWireConnection$CommandProcessor.processAddConsumer(OpenWireConnection.java:1194) [artemis-openwire-protocol-2.13.0.jar:2.13.0]
        at org.apache.activemq.command.ConsumerInfo.visit(ConsumerInfo.java:351) [activemq-client-5.14.5.jar:5.14.5]
        at org.apache.activemq.artemis.core.protocol.openwire.OpenWireConnection.act(OpenWireConnection.java:322) [artemis-openwire-protocol-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.core.protocol.openwire.OpenWireConnection$$Lambda$124/267018299.onMessage(Unknown Source)
        at org.apache.activemq.artemis.utils.actors.Actor.doTask(Actor.java:33) [artemis-commons-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65) [artemis-commons-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$35/656479172.run(Unknown Source)
        at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65) [artemis-commons-2.13.0.jar:2.13.0]
        at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$35/656479172.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
        at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.13.0.jar:2.13.0]

With the persistence flag as false in broker.xml, the heap issue occurred only once. (as mentioned the logs above)

Enabling artemis.profile to log oom_heapdump.hprof didn't log any info but exception with Thread dump info.

The below message is also noticed frequently in the logs.

2020-08-09 17:51:03,484 WARN  [org.apache.activemq.artemis.core.server] AMQ222038: Starting paging on address 'ActiveMQ.Advisory.TempQueue'; size is currently: 2,147,471,789 bytes; max-size-bytes: -1; global-size-bytes: 2,147,486,900

The clients (consumer) uses OPENWIRE using activemq-5.13 client jar. From the broker console, the producer use CORE protocol.

Jvisual VM for Broker running straight 20 hrs

visualVM monitor

I enabled the JAVA_ARGS to print the OOM trace a 7 GB file got generated, below is the analysis using Eclipse MAT tool.

histogram of heap

1.2G shallow heap consumption by org.apache.activemq.artemis.core.paging.cursor.impl.PagePosistionImpl.

From the Docs, when address-full-policy is set to PAGE that is the time broker starts appending to page. Other values are DROP and BLOCK (blocks producer, until memory is freed). I planning to use address-full-policy to BLOCK to see if the OOM issue exists.

Fivespot answered 3/8, 2020 at 19:3 Comment(3)
Not able to get the heapdump, but hs_err_pid file got generated added few of the contents. using persistence as false so far didn't report any java memory issue. some times it takes three days. use case: Normal producer pushing the json message, and consumer processing it using camel and pushed to the endpoints. the size of the message should be only in KBs (not more than 1MB).Fivespot
Without a heap dump it's really impossible to say what the issue is. What protocol and client library do your clients use? Do the consumers keep up with the producers?Palmapalmaceous
borker hangs after logging java heap, when i try to gather heap dump using jmap/jcmd not getting the dump file. from artemis broker console, the Consumer mostly uses OPENWIRE protocol (From code we use acitvemq client 5.13.2 version of jar). The producer from the console has CORE protocol. ----Note: Updated the question with specific java heap message logged in the logs .Fivespot
F
0

The client connecting the Artemis broker was using OPENWIRE protocol. By default the Advisory queue is enabled. check Open Wire Document Link.

The problem with the ActiveMQ.Advisory.TempQueue is the messages are getting added to this queue and not getting consumed.

Most often noticed below message, rate of message is ~1.5 Million per day

AMQ222038: Starting paging on address 'ActiveMQ.Advisory.TempQueue'; size is currently: 2,147,398,989 bytes; max-size-bytes: -1; global-size-bytes: 2,147,495,495

And other issue writing to journal where page write failed.

After disabling the Advisory queue using the "supportAdvisory=false"; there was no such exception.

Fivespot answered 20/8, 2020 at 15:55 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.