In a nutshell
I have a program that uses more and more memory over time. I am using jmap and jhat to try to diagnose it, but am still not quite there.
Background
The program is a long-running server backed by an HBase datastore, providing a Thrift service to a number of other systems. After running for a few days, it eventually hits the allocated heap limit and thrashes, spending nearly all of its time in garbage collection. It seems references to a lot of data are being kept somewhere.
What I've done so far
After fiddling about with jstat and jconsole for a while, I ended up taking heap dumps of the running process with jmap and running them through jhat, but the numbers simply don't add up to anywhere near the memory utilisation.
jmap -F -dump:live,format=b,file=heap.dump 12765
jmap -F -dump:format=b,file=heap.all 12765
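To cross-check the dumps against what the JVM itself reports, something like the following could be logged periodically from inside the server. This is only a sketch: HeapReport and describe are made-up names, not part of my program. The calls themselves are the standard java.lang.management API. It prints heap and non-heap usage separately, which should help tell whether the "missing" memory is on the Java heap at all.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of the server: prints the JVM's own view of memory.
public class HeapReport {

    // Formats one MemoryUsage snapshot; all values are in bytes (max is -1 if undefined).
    static String describe(String label, MemoryUsage u) {
        return String.format("%s: used=%d committed=%d max=%d",
                label, u.getUsed(), u.getCommitted(), u.getMax());
    }

    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        // Heap is what jmap/jhat look at; non-heap covers class metadata, code cache, etc.
        System.out.println(describe("heap", mem.getHeapMemoryUsage()));
        System.out.println(describe("non-heap", mem.getNonHeapMemoryUsage()));
    }
}
```

Note that memory allocated outside the JVM's managed pools (native allocations, for example) would not show up in either figure.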
Here is the top of the jhat histogram:
Class                                                     Instance Count  Total Size (bytes)
class [B                                                  7493            228042570
class java.util.HashMap$Entry                             2833152         79328256
class [Ljava.util.HashMap$Entry;                          541             33647856
class [Ljava.lang.Object;                                 303698          29106440
class java.lang.Long                                      2851889         22815112
class org.apache.hadoop.hbase.KeyValue                    303593          13358092
class org.apache.hadoop.hbase.client.Result               303592          9714944
class [I                                                  14074           9146580
class java.util.LinkedList$Entry                          303841          7292184
class [Lorg.apache.hadoop.hbase.KeyValue;                 303592          7286208
class org.apache.hadoop.hbase.io.ImmutableBytesWritable   305097          4881552
class java.util.ArrayList                                 302633          4842128
class [Lorg.apache.hadoop.hbase.client.Result;            297             2433488
class [C                                                  5391            320190
The totals here don't add up to it, but at the point that heap dump was taken the process was using over 1 GB of memory.
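For reference, the rows listed above sum to roughly 452 MB. Here is a quick sketch to check that arithmetic. The class name HistogramTotal is purely illustrative, and I am assuming the Total Size column is in bytes. These are only the top rows of the histogram, so the full dump total will be somewhat higher.

```java
// Hypothetical helper: sums the Total Size column of the histogram rows shown above.
public class HistogramTotal {

    // Total Size values copied from the histogram, in the order listed (assumed bytes).
    static final long[] SIZES = {
        228042570L, 79328256L, 33647856L, 29106440L, 22815112L,
        13358092L, 9714944L, 9146580L, 7292184L, 7286208L,
        4881552L, 4842128L, 2433488L, 320190L
    };

    static long sum(long[] sizes) {
        long total = 0;
        for (long s : sizes) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        long total = sum(SIZES);
        // Prints the total in bytes and, for comparison with the process size, in MiB.
        System.out.printf("listed rows: %d bytes (%.1f MiB)%n",
                total, total / (1024.0 * 1024.0));
    }
}
```

The sum comes to 452215600 bytes, which is less than half of what the process was using.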
The immediately apparent culprit is that I seem to be leaving HBase Result and KeyValue entries all over the place. Tracing the references upwards, I eventually hit:
Object at 0x2aab091e46d0
instance of org.apache.hadoop.hbase.ipc.HBaseClient$Call@0x2aab091e46d0 (53 bytes)
Class:
class org.apache.hadoop.hbase.ipc.HBaseClient$Call
Instance data members:
done (Z) : true
error (L) : <null>
id (I) : 57316
param (L) : org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation@0x2aab091e4678 (48 bytes)
this$0 (L) : org.apache.hadoop.hbase.ipc.HBaseClient@0x2aaabfb78f30 (86 bytes)
value (L) : org.apache.hadoop.hbase.io.HbaseObjectWritable@0x2aab092e31c0 (40 bytes)
References to this object:
Other Queries
Reference Chains from Rootset
Exclude weak refs
Include weak refs
Objects reachable from here
Help needed:
There seem to be no references to this final HBaseClient$Call object (or to any of the others like it, each of which holds a thousand or so KeyValues with all their internal data). Shouldn't it be getting GCed? Am I misunderstanding how the GC works, or the extent to which jhat will verify references? If so, what further steps can I take to track down my "missing" memory?