Which debugging tool can list interned strings?

2

20

I am looking for a debugging tool that can list the strings that have been interned. Ideally, I would like to set a mark and get a list of the strings that have been added after that mark.

Thanks in advance.

Doubleness answered 30/5, 2011 at 19:21 Comment(9)
@Ed Staub -- I am using Eclipse on a Windows computer, but I am compiling with Ant on a Linux computer and running there. Why are you asking? - Topcoat
Not sure whether this is useful: you can put a debugger detail formatter on String that will show whether it's interned, e.g.: (this==this.intern())?("^"+toString()):toString(). Can you explain what you need the tool for? Is it because you rely on equality testing, or are you looking at memory usage, or... - Anthemion
@Ed Staub -- This will intern every string, so I won't be able to tell which ones were added by the application and which by the debugger. I am not relying on equality testing. The number of interned strings keeps growing in a process, and I am trying to understand why. - Topcoat
Oops - that was a dumb idea! Do you have non-standard class loading going on? If so, that's the first place I'd look - most interning should come from class loading of constant strings. Check for multiple instances of the same Class objects. - Anthemion
@Ed Staub -- No, I don't have non-standard class loading. I used jmap to see the number of interned strings and the number of classes loaded. The number of loaded classes is barely moving, but hundreds of thousands of strings are interned. - Topcoat
Here's another technique, hopefully more useful. Set a breakpoint that will be hit post-initialization, after your app should be in a steady state. When it's hit, put a method-entry breakpoint on String.intern with a large hit count - 100 or more. Examine the stack each time it hits to figure out who's provoking all the interns. Caution: method breakpoints are VERY slow (unlike line breakpoints). - Anthemion
A slight improvement: instead of using a count, just manually enable the breakpoint occasionally, check the stack, disable the breakpoint, and continue. This will give samples that are a lot farther apart, so they are less likely to all point at something that's not the culprit but happened to be doing a lot of interning at that time. - Anthemion
@Ed Staub -- Thanks for the suggestion of using a breakpoint. However, I get several hundred thousand calls that are totally legitimate. I have not run the application with a remote debugger yet; I'll try to explore that. - Topcoat
After the app is warmed up, I'd expect that most interning would be suspect, given the problem you've described. The odds should be in your favor that you'll quickly get some insight. BTW... are you parsing a lot of XML with unique element or attribute names? I suspect that might cause this. - Anthemion
P
2

Perhaps the easiest way is to use a bytecode viewer. Every String literal is interned by the JVM, and it is present in the constant_pool of the class file that contains the literal. For instance, in a recent class file from another StackOverflow question I answered, I had the following String literal in my code: "sun.awt.noerasebackground". It shows up in the constant pool as a CONSTANT_String_info entry. Note that strings interned at run time through String.intern() do not appear in any class file. The bytecode viewer (and editor, so beware!) that I use is JBE.
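For readers without a bytecode viewer at hand, here is a minimal sketch of the same idea in plain Java: it walks the constant pool of a class file and prints the values of its String entries. The class name ConstantPoolStrings and the method stringConstants are made up for this illustration and are not part of JBE or the JDK; the tag numbers are the ones defined in the class file format of the JVM specification.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of JBE or the JDK.
public class ConstantPoolStrings {

    /** Returns the values of all CONSTANT_String entries of a class file. */
    static List<String> stringConstants(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort();             // minor_version
        in.readUnsignedShort();             // major_version
        int count = in.readUnsignedShort(); // constant_pool_count
        String[] utf8 = new String[count];
        List<Integer> stringRefs = new ArrayList<>();
        // Constant pool indices start at 1.
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1:  // Utf8: u2 length followed by modified UTF-8 bytes
                    utf8[i] = in.readUTF();
                    break;
                case 8:  // String: u2 index of the Utf8 entry holding the text
                    stringRefs.add(in.readUnsignedShort());
                    break;
                case 7: case 16: case 19: case 20:
                    in.readFully(new byte[2]);
                    break;
                case 15:
                    in.readFully(new byte[3]);
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.readFully(new byte[4]);
                    break;
                case 5: case 6: // Long and Double occupy two pool slots
                    in.readFully(new byte[8]);
                    i++;
                    break;
                default:
                    throw new IOException("unknown constant pool tag " + tag);
            }
        }
        // Resolve afterwards, since a String entry may precede its Utf8 entry.
        List<String> result = new ArrayList<>();
        for (int ref : stringRefs) {
            result.add(utf8[ref]);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // With no argument, inspect java.lang.Boolean from the running JDK.
        try (InputStream in = args.length > 0
                ? Files.newInputStream(Paths.get(args[0]))
                : Boolean.class.getResourceAsStream("Boolean.class")) {
            if (in == null) {
                throw new IOException("class file not found");
            }
            for (String s : stringConstants(in)) {
                System.out.println(s);
            }
        }
    }
}
```

Run it with the path of a .class file as the only argument. It only reports literals compiled into that class, so it cannot show strings that the application interns while running.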

Prosecutor answered 2/8, 2011 at 21:42 Comment(2)
Do you have the URL of the documentation for JBE? - Topcoat
It's a small project, and has no documentation as far as I know. - Prosecutor
C
1

On a recent HotSpot VM, interned strings look just like any other; the only difference is that the underlying char array is tracked by the VM (I thought that it had an extra JNI reference, but it does not show in a YourKit dump; it would be interesting to investigate).

That said, YourKit provides a memory inspection for duplicate strings, which I believe does what you need. If you combine it with 'Trace Allocations', you can get straight to the code that allocated these strings.

See http://www.yourkit.com/docs/95/help/inspections_mem.jsp#duplicate_strings

--

Getting a list of the strings added between two points in time is easier:

  1. Get two heap-dumps using jmap or your favorite profiler
  2. Do a diff of the heaps
  3. Show all instances of the String class

Should be doable with any profiler or even jhat (if you are patient enough). If you use YourKit, you can use the bookmark feature and take only one heap snapshot.
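If attaching jmap at the right moment is awkward, the two dumps of step 1 can also be written from inside the application. The following is only a sketch under assumptions: it relies on the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean, so it will not work on other JVMs, and the class name HeapDumpMarks and the method dump are made up for this example. Diffing the two files is still left to the profiler.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; HotSpot-based JVMs only.
public class HeapDumpMarks {

    /** Writes a heap dump named label.hprof into dir and returns its path. */
    static Path dump(Path dir, String label) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        Path file = dir.resolve(label + ".hprof");
        // dumpHeap refuses to overwrite an existing file.
        Files.deleteIfExists(file);
        // true = dump only live (reachable) objects.
        bean.dumpHeap(file.toString(), true);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdumps");
        Path before = dump(dir, "before");
        // ... run the part of the application under suspicion here ...
        Path after = dump(dir, "after");
        System.out.println("Compare " + before + " with " + after);
    }
}
```

Both files can then be opened in the profiler of your choice for steps 2 and 3.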

Continence answered 17/7, 2011 at 14:23 Comment(1)
The issue is not duplicate strings. Duplicates are in heap memory, of which I have plenty. The issue is interned strings. Having a list of all the strings won't help if I cannot see which ones are interned and which ones aren't. - Topcoat
