Java GC Question: How could an object become unreachable while one of its methods is still being executed?

I have been reading these slides about Java finalizers. In them, the author describes a scenario (on slide 33) whereby CleanResource.finalize() could be run by the finalizer thread while CleanResource.doSomething() is still running on another thread. How could this happen?

If doSomething() is a non-static method, then to execute that method someone, somewhere must hold a strong reference to the object... right? So how could this reference get cleared before the method returns? Can another thread swoop in and null out that reference? If that happened, would doSomething() still return normally on the original thread?

That's all I really want to know, but for a really above-and-beyond answer, you can tell me why the doSomething() on slide 38 is better than the doSomething() on slide 29. Why is it sufficient to simply invoke this keepAlive() method? Wouldn't you need to wrap the whole call to myImpl.doSomething() in a synchronized(this){} block?

Tuantuareg answered 29/7, 2010 at 13:39 Comment(0)

EDIT3:

The upshot is that the finalizer and a regular method can be executed concurrently on the same instance. Here's an explanation of how that can happen. The code is essentially:

class CleanResource {
   int myIndex;
   static ArrayList<ResourceImpl> all;

   void doSomething() {
     ResourceImpl impl = all.get(myIndex);
     impl.doSomething();
   } 

   protected void finalize() { ... }
}

Given this client code:

CleanResource resource = new CleanResource(...);
resource.doSomething();
resource = null; 

This might be JITed to something like this pseudo-C:

register CleanResource* res = ...; // call ctor etc.
// inline CleanResource.doSomething()
register int myIndex = res->myIndex;
ResourceImpl* impl = all->get(myIndex);
impl->doSomething();
// end of inline CleanResource.doSomething()
res = null;

Executed like that, res is cleared only after the inlined CleanResource.doSomething() is done, so the object cannot be collected until that method has finished executing. There is no possibility of finalize() executing concurrently with another instance method on the same instance.

However, res is not read again after myIndex has been loaded, and there are no fences, so the write res = null can be moved earlier in the execution, to immediately after the read of myIndex:

register CleanResource* res = ...; // call ctor etc.
// inline CleanResource.doSomething()
register int myIndex = res->myIndex;
res = null;    /// <-----
ResourceImpl* impl = all->get(myIndex);
impl->doSomething();
// end of inline CleanResource.doSomething()

At the marked location (<-----), there are no remaining references to the CleanResource instance, so it is eligible for collection and its finalizer may be called. Since the finalizer can run at any time after the last reference is cleared, the finalizer and the remainder of CleanResource.doSomething() can execute in parallel.
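To make the consequence concrete, here is a small, deterministic sketch that simulates that interleaving by hand on a single thread. It does not trigger a real GC or finalizer; the class FinalizerRaceSketch, the cleanup() helper, and the string stored in the table are all made up for illustration. The only point is the ordering: the field is read, the "finalizer" runs, and then the rest of the method uses the shared table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, single-threaded simulation of the reordered code above.
// Nothing here is taken from the slides; it only mirrors the ordering.
class FinalizerRaceSketch {
    // Stand-in for the static table of implementations (`all` in the answer).
    static final List<String> all = new ArrayList<>();

    // Stand-in for what CleanResource.finalize() might do: release the slot.
    static void cleanup(int index) {
        all.set(index, null);
    }

    static String run() {
        all.clear();
        all.add("native-handle");

        // Step 1: the only read that goes through `this`.
        int myIndex = 0;

        // Step 2: the reference is dead from here on, so a real finalizer
        // thread would be free to run now. We simulate that by calling it.
        cleanup(myIndex);

        // Step 3: the rest of doSomething() continues with the cached index.
        String impl = all.get(myIndex);
        return impl == null ? "resource already released" : "resource still valid";
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "resource already released"
    }
}
```

In a real JVM step 2 happens on the finalizer thread at an unpredictable time, which is why the problem is so hard to reproduce on demand.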

EDIT2: The keepAlive() call ensures that the this pointer is accessed at the end of the method, so the compiler cannot optimize away the last use of the pointer. It also guarantees that this access happens in the order specified: the synchronized keyword marks a fence that disallows reordering of reads and writes across that point.
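A minimal sketch of how that pattern might look in Java follows. I don't have the slide code in front of me, so treat the details as assumptions: ResourceImpl and its release() method are hypothetical stand-ins, and having finalize() lock the same monitor as keepAlive() is my reading of how the two are meant to pair up, not something confirmed by the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the underlying resource; not from the slides.
class ResourceImpl {
    private boolean released = false;

    synchronized String doSomething() {
        if (released) {
            throw new IllegalStateException("resource used after cleanup");
        }
        return "ok";
    }

    synchronized void release() {
        released = true;
    }
}

class CleanResource {
    // Shared table of implementations, indexed by myIndex as in the answer.
    static final List<ResourceImpl> all = new ArrayList<>();

    private final int myIndex;

    CleanResource() {
        synchronized (all) {
            all.add(new ResourceImpl());
            myIndex = all.size() - 1;
        }
    }

    String doSomething() {
        ResourceImpl impl;
        synchronized (all) {
            impl = all.get(myIndex);
        }
        try {
            return impl.doSomething();
        } finally {
            // Touch `this` under its own lock after the real work is done.
            keepAlive();
        }
    }

    // Deliberately empty: acquiring the monitor on `this` is the whole point.
    synchronized void keepAlive() { }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        // Assumption: the finalizer takes the same monitor as keepAlive(),
        // so it cannot release the resource while doSomething() still needs it.
        synchronized (this) {
            synchronized (all) {
                all.get(myIndex).release();
            }
        }
    }
}
```

Usage is unchanged for callers: new CleanResource().doSomething() returns "ok" in this sketch. On Java 9 and later, calling java.lang.ref.Reference.reachabilityFence(this) in the finally block is the library method intended for this purpose and avoids the empty synchronized method.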

Original Post:

The example is saying that once doSomething() is called, the data referenced via the this pointer can be read early (myIndex in the example). Once that data has been read, the this pointer is no longer needed in the method, and the CPU/compiler might overwrite the registers holding it, so the object is treated as no longer reachable. The GC could then call the finalizer while the object's doSomething() method is still running.

But since the this pointer is not used, it's hard to see how this will have any tangible effect.

EDIT: Well, perhaps if there are cached pointers to the object's fields, computed from this before the object was reclaimed, and the object is then reclaimed, those memory references become invalid. There's a part of me that has a hard time believing this is possible, but then again, this does seem to be a tricky corner case, and I don't think there is anything in JSR-133 to prevent this happening by default. It's a question of whether an object is considered to be referenced only by pointers to its base or by pointers to its fields as well.

East answered 29/7, 2010 at 13:54 Comment(4)
So could this happen on a stack-based VM as well? Now that I'm thinking at bytecode level like you are, I'm understanding this much better. I guess in the case of a stack machine the reference doesn't need to be on the stack for invokevirtual to complete? So once the myIndex value is retrieved, this can be popped off the stack and potentially reclaimed? - Tuantuareg
On a strict stack-based implementation this is not possible, since the this pointer will remain on the stack. But with method inlining, out-of-order execution and register allocation, a concurrent call to the finalizer becomes possible. See my latest edit in this continuing saga. :) - East
Great answer! I didn't even know anyone other than Android's Dalvik VM had implemented a register-based JVM. Is this common? - Tuantuareg
As far as I know, the JIT is not confined to stack-based conventions. For example, method inlining is very common and has become more aggressive in recent versions, since it allows the JIT to perform more optimizations on larger code blocks. See java.sun.com/products/hotspot/whitepaper.html#method - East
