WebappClassLoader memory leak even with no gc roots

HERE IS THE HEAP DUMP (UPDATED ON 10/29/2013)

I'm working on a webapp with:

  • Tomcat 7.0.24
  • Java 6
  • Spring 3 (with aop - cglib)
  • SLF4J over Log4j
  • Oracle Coherence

After a lot of work, I managed to remove all the strong references to the class loader, and it is now a candidate for garbage collection. So, memory leak solved? Of course not! After several hot deployments an OOME still appears due to PermGen space.

Thanks to YourKit, I was able to see that the WebappClassLoader was Pending Finalization, which means it is waiting in the finalizer queue (actually, it is not the WebappClassLoader itself but one of its referents). Checking a memory snapshot, I found several Finalizer references to Oracle Coherence classes...

This seems "okay": the Coherence objects are waiting to be garbage collected thanks to all the hard work done removing the strong references (killing all Coherence threads, removing Java security providers, etc.). I think there is nothing to do here.

So, I was thinking that some finalize execution breaks something and then prevents the finalizer queue from being emptied. But the weird thing is that, according to JMX or jmap -finalizerinfo, the finalizer queue seems to be empty! All this is very confusing, so I kept searching in other places...
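
For reference, the JMX figure mentioned above can also be read programmatically. The sketch below is a minimal example, assuming it runs inside the same JVM as the webapp (for a remote Tomcat the same MXBean would have to be reached through a JMX connector). The class name FinalizerQueueProbe is made up for illustration, and the returned count is documented as approximate.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper: reads the JVM's own estimate of the finalizer backlog.
final class FinalizerQueueProbe {

    /** Approximate number of objects whose finalization is pending in this JVM. */
    static int pendingFinalizationCount() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getObjectPendingFinalizationCount();
    }

    public static void main(String[] args) {
        System.out.println("Objects pending finalization: " + pendingFinalizationCount());
    }
}
```

A value that stays at zero while a profiler still reports "Pending Finalization" would suggest the two tools are measuring different things, which matches the confusion described above.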

Do you think there is something to do here? I've read something about CGLIB enhancing the finalize method. If I had access to the Enhancer I could create a callback filter as explained here, but I don't know how to manage this with Spring AOP.

Well, searching in other places, I found several weak references from java.lang.reflect.Proxy. These are JDK dynamic proxies, right? Or are they related to Introspector memory leaks? With weak references?

INFO: I'm using Spring's context listener that flushes the Introspector's caches (java.beans.Introspector.flushCaches()). What else can I do about this?
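
To make the flush concrete, here is a minimal sketch of what it amounts to, using only the standard java.beans API. The class IntrospectorFlush and its method names are hypothetical. Introspector.flushFromCaches(Class) is the narrower variant that drops the cached BeanInfo for a single class.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;

// Hypothetical helper around the standard Introspector cache calls.
final class IntrospectorFlush {

    /** Introspects a class (which populates the Introspector cache) and returns its property count. */
    static int describe(Class<?> beanClass) {
        try {
            BeanInfo info = Introspector.getBeanInfo(beanClass);
            return info.getPropertyDescriptors().length;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Drops every cached BeanInfo, as the context listener does on shutdown. */
    static void flushAll() {
        Introspector.flushCaches();
    }

    /** Drops the cached BeanInfo for a single class only. */
    static void flush(Class<?> beanClass) {
        Introspector.flushFromCaches(beanClass);
    }
}
```

Flushing only discards cached metadata; a later describe call simply rebuilds it, so the flush is safe to repeat.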

Let's continue.

Then, we have several other weak references from java.io.ObjectStreamClass$Caches. A lot of my business objects have this kind of weak reference.

Maybe I need to flush these caches. But how?
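
There is no public API for this, so any flush has to go through reflection. The sketch below is a best-effort attempt under stated assumptions: that the nested class java.io.ObjectStreamClass$Caches has static map fields named localDescs and reflectors, as in the Java 6 era sources. Newer JDKs changed these internals and block reflective access to them, in which case the helper simply reports false. The class ObjectStreamCacheFlush is hypothetical.

```java
import java.lang.reflect.Field;
import java.util.Map;

// Hypothetical, best-effort helper; relies on JDK internals that are NOT a supported API.
final class ObjectStreamCacheFlush {

    /** Tries to clear one static map field of ObjectStreamClass$Caches; true only if it was cleared. */
    static boolean clearCache(String fieldName) {
        try {
            Class<?> caches = Class.forName("java.io.ObjectStreamClass$Caches");
            Field field = caches.getDeclaredField(fieldName);
            field.setAccessible(true); // may be refused on modern JDKs
            Object value = field.get(null);
            if (value instanceof Map) {
                ((Map<?, ?>) value).clear();
                return true;
            }
            return false; // field exists but is no longer a Map
        } catch (Exception e) {
            // Missing class or field, or access denied: nothing was flushed.
            return false;
        }
    }

    /** Assumed field names from Java 6 era sources. */
    static boolean flushAll() {
        boolean descs = clearCache("localDescs");
        boolean reflectors = clearCache("reflectors");
        return descs && reflectors;
    }
}
```

Since the entries in these caches are held through weak references, clearing them mainly tidies up the snapshot; whether it changes collectability here is something to verify with another heap dump.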

Then we have these weak references related to com.sun.internal.ResourceManager, java.util.logging.Logging and java.lang.reflect.Proxy.

What can I do with these weak references? Do I need to worry about them, or is the problem in the finalizer queue? Any clue will be of help ... really :-D

Ah, another thing: I found a weak reference from a Tomcat "main" thread that will never be renewed by Tomcat. I know that my application can leave some thread-local variables in some Tomcat threads, but Tomcat 7 renews these threads to avoid class loader memory leaks.

I think this is the oddest thing in my memory snapshot, but it is a weak reference, right? What can I do about it?

EDIT: Reading the java.lang.ref javadoc, I found this:

An object is weakly reachable if it is neither strongly nor softly reachable but can be reached by traversing a weak reference. When the weak references to a weakly-reachable object are cleared, the object becomes eligible for finalization.

So, can weak references retain objects in the heap when they implement a finalize method?
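
One way to probe this empirically is a small helper that keeps requesting GC until a WeakReference is cleared or a timeout expires. This is only a sketch: System.gc() is a hint, so a false result does not prove a leak, and the name CollectionProbe is made up. Per the java.lang.ref documentation quoted above, a weak reference on its own does not keep its referent alive; an object with a finalize method is reclaimed only in a later GC cycle, after its finalizer has run.

```java
import java.lang.ref.WeakReference;

// Hypothetical helper for checking whether something (e.g. a class loader) is still reachable.
final class CollectionProbe {

    /**
     * Repeatedly requests GC until the reference is cleared or the timeout expires.
     * Returns true if the referent was cleared. Best effort only.
     */
    static boolean awaitCleared(WeakReference<?> ref, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (ref.get() != null && System.currentTimeMillis() < deadline) {
            System.gc();
            try {
                Thread.sleep(50); // also gives the finalizer thread time to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }
}
```

Wrapping the WebappClassLoader in a WeakReference before undeploying and then calling awaitCleared would show whether something other than weak references still holds it.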

While I look for an answer to this, I managed to remove all the weak references to my class loader except two: ClassLoaderLogManager.classLoaderLoggers and the one related to the Tomcat thread.

NOTE: Actually, I managed to remove the first one, but this reference is set again by Tomcat after/during undeployment.

EDIT: PLUMBR RESULTS

I've tried Plumbr and there are no reports on the web console. Only this message on the standard output:

Dumping heap to /opt/tomcat7/headdumps/java_pid9478.hprof ...
Heap dump file created [348373628 bytes in 3.984 secs]
#
# An unexpected error has been detected by Java Runtime Environment:
#
#  Internal Error (javaCalls.cpp:40), pid=9478, tid=1117813056
#  Error: guarantee(!thread->is_Compiler_thread(),"cannot make java calls from the compiler")
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (11.2-b01 mixed mode linux-amd64) [thread 1110444352 also had an error]
# An error report file with more information is saved as:
# [thread 1110444352 also had an error]
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#
******************************************************************************
*                                                                            *
* Plumbr has noticed that JVM has thrown an OutOfMemoryError: PermGen space. *
*                                                                            *
* You can increase PermGen size with -XX:MaxPermSize parameter.              *
* If you encountered this error after a redeploy, please read next article:  *
* http://plumbr.eu/blog/what-is-a-permgen-leak                               *
*                                                                            *
******************************************************************************
Lucky answered 19/10, 2013 at 19:45 Comment(6)
Do you have a heap dump taken after the OOM happened available somewhere for download? Maybe we would be able to look into it. – Hyehyena
Hey @Nikem! That would be very nice! I've also tried Plumbr without any success :-( Here is the dump: dl.dropboxusercontent.com/u/9210700/java_pid12165.hprof – Lucky
Have you tried to redeploy your application while having Plumbr attached? If you like, you can contact us at [email protected] in order to investigate why Plumbr has an issue with this case. – Hyehyena
I've already installed Plumbr and I managed to use it, but no memory leak was detected! I'll try again. I'll keep you posted! – Lucky
I recommend the following steps: start your application with Plumbr attached, run it for some time, redeploy the application, let Plumbr think for about 5 minutes. If that does not help, please contact us :) – Hyehyena
I've tried Plumbr and nothing. My application does not even appear on the Plumbr web console after the OOME. Maybe I missed something. Can you help me to understand how to use Plumbr to find my leak? Thx in advance! – Lucky

I think YourKit has led you down the wrong path.

I have looked into your heap dump using Eclipse Memory Analyzer. It showed that WebappClassLoader is referenced by the class com.inovasoftware.iap.data.access.platform.datarepository.CoherenceDataRepository$$EnhancerByCGLIB$$180c0a4e, an instance of which is alive in some thread-local variable. Some googling turned up this: https://hibernate.atlassian.net/browse/HHH-2481

So maybe upgrading the Hibernate version will help.
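
Independent of the upgrade, the usual defensive pattern against this kind of leak is to remove the thread-local value in a finally block, so that a pooled thread does not keep the value, its class, and through it the WebappClassLoader reachable. A minimal sketch with hypothetical names (ThreadLocalScope, CURRENT); this is not the application's or Hibernate's actual code:

```java
// Hypothetical wrapper that guarantees a thread-local value never outlives the unit of work.
final class ThreadLocalScope {

    // Stands in for whatever per-thread slot the application or a library uses.
    private static final ThreadLocal<Object> CURRENT = new ThreadLocal<Object>();

    /** Runs the work with the value bound to the current thread, then always unbinds it. */
    static void runWith(Object value, Runnable work) {
        CURRENT.set(value);
        try {
            work.run();
        } finally {
            // remove() deletes the entry from the thread's map instead of leaving a stale value behind.
            CURRENT.remove();
        }
    }

    /** Value bound to the current thread, or null if none. */
    static Object current() {
        return CURRENT.get();
    }
}
```

When the thread local belongs to a third-party library, the same idea applies only if the library exposes a way to clear it; otherwise the fix has to come from the library itself.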

MAT screenshot

Hyehyena answered 25/10, 2013 at 10:25 Comment(7)
Thanks for this info! I'll kill that reference right now! I'll keep you posted! – Lucky
Hey @Nikem! Thanks for this finding! I managed to release the thread local, but the class loader is still "Pending Finalization" (in YourKit). I've also checked with EMA and I saw only Finalizer references to some Coherence objects. If you can, take a look at the new heap dump: dl.dropboxusercontent.com/u/9210700/Tomcat-2013-10-25.hprof – Lucky
I will look into it. Do you still get OOM? – Hyehyena
Yep ... After 3/4 hot deployments. Thanks in advance btw :-) – Lucky
That is strange. I cannot see any duplicated class loaders. There is only one WebappClassLoader for your application. Was this dump taken after a couple of redeploys? It looks very fresh to me. – Hyehyena
Yep, that one was taken after only starting and stopping the application. But the result is the same: a WebappClassLoader with started=false that is never collected. I'll update the memory dump with the one generated after the OOME. Thanks btw – Lucky
This is the dump after the OOME: dl.dropboxusercontent.com/u/9210700/java_pid9478.hprof. Sorry for the delay! (PS: I've deleted a previous comment because it was pointing to a nonexistent dump) – Lucky
