How to analyze a java thread dump?
Asked Answered
E

2

106

I am trying to understand more about java, especially about memory management and threads. For this reason I have recently found interest in looking at thread dumps.

Here are few lines taken from a web app using VisualVM, a built-in tool for java:

"Finalizer" daemon prio=8 tid=0x02b3d000 nid=0x898 in Object.wait() [0x02d0f000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x27ef0288> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
    - locked <0x27ef0288> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

   Locked ownable synchronizers:
    - None

"Reference Handler" daemon prio=10 tid=0x02b3b800 nid=0x494 in Object.wait() [0x02cbf000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x27ef0310> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:485)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
    - locked <0x27ef0310> (a java.lang.ref.Reference$Lock)

First I have questions about some variable names:

  • what does tid and nid mean?
  • What is the figure in squared parenthesis after Object.wait?

Then for the stack trace itself:

  • what does it mean waiting on <.....> (a java.lang....) and what's the number in <..>
  • what does it mean locked <.....> (a java.lang....) same question, what's in <..>

I thought that the word locked was related in someway to a wait condition, however, I was wrong. In fact, I am wondering why locked is repeated three times, but the thread is in runnable state as seen in the same dump:

"Thread-0" prio=6 tid=0x02ee3800 nid=0xc1c runnable [0x03eaf000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:199)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    - locked <0x23963378> (a java.io.BufferedInputStream)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
    - locked <0x23968450> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:167)
    at java.io.BufferedReader.fill(BufferedReader.java:136)
    at java.io.BufferedReader.readLine(BufferedReader.java:299)
    - locked <0x23968450> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:362)
    at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:145)

Then last of all, this was the worst of them:

"CompilerThread0" daemon prio=10 tid=0x02b81000 nid=0x698 waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

This thread is in runnable state, but it's waiting on condition. What condition and what is 0x00000?

Why the stack trace is so short without any evidence of the thread class?

If you could answer to all my questions I would be very grateful.

Thanks

Eads answered 29/9, 2011 at 15:21 Comment(0)
H
119

The TID is thead id and NID is: Native thread ID. This ID is highly platform dependent. It's the NID in jstack thread dumps. On Windows, it's simply the OS-level thread ID within a process. On Linux and Solaris, it's the PID of the thread (which in turn is a light-weight process). On Mac OS X, it is said to be the native pthread_t value.

Go to this link: Java-level thread ID: for a definition and a further explanation of these two terms.

On IBM's site I found this link: How to interpret a thread dump. that covers this in greater detail:

It explains what that waiting on means: A lock prevents more than one entity from accessing a shared resource. Each object in Java™ has an associated lock (gained by using a synchronized block or method). In the case of the JVM, threads compete for various resources in the JVM and locks on Java objects.

Then it describes the monitor as a special kind of locking mechanism that is used in the JVM to allow flexible synchronization between threads. For the purpose of this section, read the terms monitor and lock interchangeably.

Then it goes further:

To avoid having a monitor on every object, the JVM usually uses a flag in a class or method block to indicate that the item is locked. Most of the time, a piece of code will transit some locked section without contention. Therefore, the guardian flag is enough to protect this piece of code. This is called a flat monitor. However, if another thread wants to access some code that is locked, a genuine contention has occurred. The JVM must now create (or inflate) the monitor object to hold the second thread and arrange for a signaling mechanism to coordinate access to the code section. This monitor is now called an inflated monitor.

Here is a more in-depth explanation of what you are seeing on the lines from the thread dump. A Java thread is implemented by a native thread of the operating system. Each thread is represented by a line in bold such as:

"Thread-1" (TID:0x9017A0, sys_thread_t:0x23EAC8, state:R, native ID:0x6E4) prio=5

*The following 6 items explains this as I've matched them from the example, values in the brackets[]:

  1. name [Thread-1],
  2. identifier [0x9017A0],
  3. JVM data structure address [0x23EAC8],
  4. current state [R],
  5. native thread identifier [0x6E4],
  6. and priority [5].

The "wait on" appears to be a daemon thread associated with the jvm itself and not the application thread perse. When you get an "in Object.wait()", that means the daemon thread, "finalizer" here, is waiting on a notification about a lock on an object, in this case it shows you what notification it's waiting on: "- waiting on <0x27ef0288> (a java.lang.ref.ReferenceQueue$Lock)"

Definition of the ReferenceQueue is: Reference queues, to which registered reference objects are appended by the garbage collector after the appropriate reachability changes are detected.

The finalizer thread runs so the garbage collection operates to clean up resources associated with an object. If I'm seeing it corectly, the finalizer can't get the lock to this object: java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) because the java object is running a method, so the finalizer thread is locked until that object is finished with it's current task.

Also, the finalizer isn't just looking to reclaim memory, it's more involved than that for cleaning up resources. I need to do more study on it, but if you have files open, sockets, etc... related to an objects methods, then the finalizer is going to work on freeing those items up as well.

What is the figure in squared parenthesis after Object.wait in the thread dump?

It is a pointer in memory to the thread. Here is a more detailed description:

C.4.1 Thread Information

The first part of the thread section shows the thread that provoked the fatal error, as follows:

Current thread (0x0805ac88):  JavaThread "main" [_thread_in_native, id=21139]
                    |             |         |            |          +-- ID
                    |             |         |            +------------- state
                    |             |         +-------------------------- name
                    |             +------------------------------------ type
                    +-------------------------------------------------- pointer

The thread pointer is the pointer to the Java VM internal thread structure. It is generally of no interest unless you are debugging a live Java VM or core file.

This last description came from: Troubleshooting Guide for Java SE 6 with HotSpot VM

Here are a few more links on thread dumps:

Hepner answered 29/9, 2011 at 15:47 Comment(1)
'Newer' Java versions (9 upwards?) have additional useful info in the thread descriptor line - like cpu=nnnn indicating how many seconds of CPU time the thread actually occupied. For example if you have a busy-looping ("100% utilized") thread you can have a look at cpu, count back, and find the approximate time that it actually entered the busy loop - for further troubleshooting through timestamped logs etcPedropedrotti
A
12

Further to @James Drinkard's excellent answer:

Note that, depending on the underlying implementation, the java.lang.Thread.State of a thread that is blocked in a native method may be reported as RUNNABLE, where A thread in the runnable state is executing in the Java virtual machine but it may be waiting for other resources from the operating system such as processor.

It turns out that this description also encompasses being blocked in an OS call such as a poll or read operation - presumably because there is no guarantee that the JVM can know when a native method call has blocked at the OS level.

Many discussions of JVM thread dumps that I've seen either ignore this possibility completely, or blithely skim over it without considering the implications - not least of which is that monitoring tools may confusingly report that several such threads are 'running', and furthermore that they are all running at 100%.

Angiology answered 6/4, 2016 at 17:47 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.