In what situation(s) would a reference point to an object that was queued for garbage collection?

6

7

I'm reading through a C# topic on Dispose() and ~finalize and when to use which. The author argues that you should not use references within your ~finalize because it's possible the object you're referencing may already be collected. The example specifically stated is, ".. you have two objects that have references to each other. If object #1 is collected first, then object #2's reference to it is pointing to an object that's no longer there."

In what scenarios would an instance of an object be in a state where it has a reference in memory to an object that is being GC'd? My guess is there are at least two different scenarios, one where the object reference is pointing at an object and one where the object reference is pointing at another object reference (eg. when it was passed by ref in a method).

Rosol answered 28/2, 2012 at 21:33 Comment(4)
The exact wording is, ".. you have two objects that have references to each other. If object #1 is collected first, then object #2's reference to it is pointing to an object that's no longer there." My question is, how could object #1 be collected if object #2 has a reference to it?Rosol
Circular references are very common in .NET code, event handlers create them. The GC has no problem with them. The only problem with the finalizer is that objects are not finalized in any particular order. Not a problem either.Seasick
Updated the question - hopefully more clear as to what I'm looking for.Rosol
@HenkHolterman It's Head First C# by O'Reilly. Note, this wouldn't be the first technical error I've encountered, but I guess that's to be expected with most books like this.Rosol
12

You can have objects that reference each other, and the entire set can be eligible for GC.


Here is a simple code sample:

class Test
{
    public Test Other { get; set; }

    static void Main()
    {
        Test one = new Test();
        Test two = new Test { Other = one };
        one.Other = two;

        // Drop the only rooted references to the two objects.
        one = null;
        two = null;
        // The two objects still reference each other, but both are now eligible for GC
    }
}
Salerno answered 28/2, 2012 at 21:36 Comment(11)
Now, that I have read this answer, I think I might have misunderstood the question.Photoperiod
While this statement may be true, unfortunately it doesn't help because it still doesn't explain WHY they can be (or are) eligibleRosol
@Rosol What isn't clear? The GC doesn't use reference counting - as soon as the entire set of interconnected objects isn't rooted, they are eligible, but they still can reference each other. Take 2 objects, referenced by variables "a" and "b", where the two objects refer to each other. Set both variables to null. They still reference each other, but now are eligible for GC.Salerno
@chopperdave: Imagine a filesystem, where you have folders that contain other files and folders. When you delete a folder, all the folders and files inside it are deleted at once. With filesystems you rarely get to observe a state where you have an open handle to a file but its parent folder no longer exists (although it's easier to see this happen on Unix than Windows), but in the case where the "files" are objects in your program, a finalizer lets you "sneak in" and you can encounter a state where sibling or parent references are "phantoms" that don't really exist anymore.Gang
@ReedCopsey Prior to the comments in this post, I wasn't aware of the concepts of a 'root' or 'reference counting'. Given that you still don't appear to be answering the question I'm trying to ask, I guess I'm not asking the question properly. Let me try to clarify.Rosol
@DanielPryden Completely understand this concept but it doesn't apply to the question I'm trying to have answered at this point - per my comment right above this one, perhaps I'm not asking the question in the right way.Rosol
@Rosol I think you have a fundamental misunderstanding here. From your comment "GC still requires that an object be made inaccessible (no references) before it does anything with it" - The GC requires that there are no "roots" to the object from your code, not that there are no references. The only requirement is that there are no references to that object that are in use by your process, but there are very often many references to the object. The references from your code are the "roots" to the object - so once the root is gone (ie:the variable goes out of scope or is set to null)Salerno
@ReedCopsey Ahh. Ok I think I know what you're getting at but it's still not 100% clear. I don't suppose you can whip up a quick code snippet that demonstrates this situation or can point me to one?Rosol
... Then the object becomes eligible for GC, regardless of whether other "unrooted" objects still have a reference to that specific object. An entire "network" of inter-related, self-referencing objects can become eligible for GC at one period of time.Salerno
@Rosol Edited my post to demo.Salerno
4

Normally the GC reclaims memory only for objects that are no longer reachable from the application's roots. However, objects with finalizers are treated differently.

Here's what MSDN says about it:

Reclaiming the memory used by objects with Finalize methods requires at least two garbage collections. When the garbage collector performs a collection, it reclaims the memory for inaccessible objects without finalizers. At this time, it cannot collect the inaccessible objects that do have finalizers. Instead, it removes the entries for these objects from the finalization queue and places them in a list of objects marked as ready for finalization. [...]
The garbage collector calls the Finalize methods for the objects in this list and then removes the entries from the list. A future garbage collection will determine that the finalized objects are truly garbage because they are no longer pointed to by entries in the list of objects marked as ready for finalization.

So there is no guarantee that other objects referenced in a finalizer will still be usable when the Finalize method gets executed by the GC, since they may already have been finalized during an earlier garbage collection while the object itself was waiting to be finalized.
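The two-collection behaviour quoted above can be observed with weak references. This is a minimal sketch, not taken from MSDN: the class names `Finalizable` and `TwoPassDemo` are hypothetical, and it assumes a precise, blocking `GC.Collect()` as on the desktop CLR or .NET Core. A short weak reference is cleared as soon as the object is found unreachable, while a long one (`trackResurrection: true`) keeps tracking the object until its memory is actually reclaimed.

```csharp
using System;
using System.Runtime.CompilerServices;

class Finalizable
{
    public static bool FinalizerRan;

    // Having a finalizer is what forces the second collection.
    ~Finalizable() { FinalizerRan = true; }
}

static class TwoPassDemo
{
    // Created in a separate, non-inlined method so that no local variable
    // in Main keeps the object rooted.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static WeakReference[] Create()
    {
        var obj = new Finalizable();
        return new[]
        {
            new WeakReference(obj, false), // short: cleared once unreachable
            new WeakReference(obj, true)   // long: tracked until memory is reclaimed
        };
    }

    public static void Main()
    {
        WeakReference[] refs = Create();

        GC.Collect();                      // first pass: object is queued for finalization
        Console.WriteLine("after 1st GC: short={0}, long={1}",
            refs[0].IsAlive, refs[1].IsAlive);

        GC.WaitForPendingFinalizers();     // let the finalizer thread run ~Finalizable()
        GC.Collect();                      // second pass: memory can now be reclaimed
        Console.WriteLine("after 2nd GC: long={0}, finalizer ran={1}",
            refs[1].IsAlive, Finalizable.FinalizerRan);
    }
}
```

On a typical runtime the long reference should still report the object as alive after the first collection and as gone after the second, but garbage collection timing is not something the language guarantees, so treat the output as indicative only.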

Sprit answered 28/2, 2012 at 21:45 Comment(3)
If I read this correctly, GC still requires that an object be made inaccessible (no references) before it does anything with it - regardless of whether it has a finalize method or not. Coming back to my question, how can an object that is referenced by another object (and thus presumably accessible) ever be GC'd?Rosol
+1: Best answer so far because you explained why you shouldn't attempt to access another object in the finalizer.Scenery
There is no guarantee that the objects will be in a state that any particular user-supplied methods will "like", but the only thing the GC will ever allow to "spontaneously" happen to any object while it is accessible via any means whatsoever is for Object.Finalize() to be called upon it. The garbage collector will not do anything to disturb the contents of objects' fields other than causing Object.Finalize to be run. Object.Finalize() may, of course, do whatever it wants to objects' fields.Grindstone
2

In short, objects that are not reachable from a GC root (static field, method parameter, local variable, enregistered variable) by following the chain of references are eligible for garbage collection. So it is entirely possible that, say, object A refers to B, which refers to C, which refers to D, and then A nulls out its reference to B, at which point B, C and D can all be collected even though B still references C and C still references D.
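A small sketch of the A, B, C, D chain, assuming a hypothetical `Node` class and using a static field as the GC root. A `WeakReference` to D lets the program observe whether D is still alive without keeping it alive. Collection timing is up to the runtime, so the results are what one would typically see rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

class Node
{
    public Node Next;
}

static class ChainDemo
{
    // A static field is a GC root, so everything reachable from A stays alive.
    static Node A;

    // Built in a non-inlined helper so that no local in the caller roots B, C or D.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static WeakReference BuildChain()
    {
        var d = new Node();
        var c = new Node { Next = d };
        var b = new Node { Next = c };
        A = new Node { Next = b };
        return new WeakReference(d);   // observes D without rooting it
    }

    public static void Main()
    {
        WeakReference weakD = BuildChain();

        GC.Collect();
        Console.WriteLine("D alive while chain is rooted: " + weakD.IsAlive);

        A.Next = null;                 // A nulls out its reference to B
        GC.Collect();
        Console.WriteLine("D alive after A drops B: " + weakD.IsAlive);
    }
}
```

C still holds a reference to D throughout, yet D becomes collectable the moment no path from a root leads to it.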

Airily answered 28/2, 2012 at 21:45 Comment(2)
This is very close to what I'm looking for I think. To restate what you're saying, if my reference to an object is actually a reference to a reference (because it has been passed to me via a method), then it is possible the reference my object references is removed without me knowing it and thus when I try to use it, I get a null reference exception. Is that right? What about in the case where my object's reference is a direct reference to an object?Rosol
@chopperdave: A .net program includes substantial metadata, so that at any given point in a program's execution the garbage collector will be able to identify all of the object references that live code might possibly be able to access. If the third parameter to your method was an object reference, the GC gets triggered while your method is running, and your code hasn't reached a point where it will clearly never again look at that third parameter, the GC will not allow the object referred to by that parameter to be collected.Grindstone
2

Unfortunately, there is a lot of sloppy use of terminology surrounding garbage collection, which causes much confusion. A "disposer" or "finalizer" does not actually destroy an object, but rather serves to delay the destruction of an object which would otherwise be eligible for destruction, until after it has had a chance to put its affairs in order (i.e. generally by letting other things know that their services are no longer required).

It's simplest to think of the "stop the world" garbage collector as performing the following steps, in order:

  1. Untag all items which are new enough that they might be considered "garbage".
  2. Visit every garbage-collection root (i.e. thing which is inherently "live"), and if it hasn't been copied yet, copy it to a new heap, update the reference to point to the new object, and visit all items to which it holds references (which will copy them if they haven't been copied). If one visits an item in the old heap that had been copied, just update the reference one used to visit it.
  3. Examine every item that has registered for finalization. If it hasn't yet been copied, unregister it for finalization, but append a reference to it on a list of objects which need to be finalized as quickly as possible.
  4. Items on the immediate-finalization list are considered "live", but since they haven't been copied yet, visit every item on that list and, if not yet copied, copy it to the new heap and visit all items to which it holds references.
  5. Abandon the old heap, since nobody will hold references to anything on it anymore.
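The steps above can be sketched as a toy model. This is not how the CLR is implemented: `Obj` and `ToyCollector` are hypothetical names, and instead of copying objects to a new heap and patching pointers, the sketch just records survivors in a set. It is only meant to show why an unreachable object with a finalizer, and everything it references, survives one extra collection.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a heap object; not a real CLR object header.
class Obj
{
    public string Name;
    public List<Obj> Refs = new List<Obj>();
    public Obj(string name) { Name = name; }
}

class ToyCollector
{
    public List<Obj> Roots = new List<Obj>();
    public List<Obj> RegisteredForFinalization = new List<Obj>();
    public List<Obj> ReadyToFinalize = new List<Obj>();

    // Returns the objects that would have been copied to the new heap.
    public HashSet<Obj> Collect()
    {
        var survivors = new HashSet<Obj>();              // step 1: nothing is marked yet

        foreach (Obj root in Roots)                      // step 2: trace from the roots
            Visit(root, survivors);

        foreach (Obj o in RegisteredForFinalization.ToArray())   // step 3
        {
            if (!survivors.Contains(o))
            {
                RegisteredForFinalization.Remove(o);
                ReadyToFinalize.Add(o);
            }
        }

        foreach (Obj o in ReadyToFinalize)               // step 4: the list acts as a root
            Visit(o, survivors);

        return survivors;                                // step 5: everything else is abandoned
    }

    // Stands in for the finalizer thread draining the list.
    public void RunFinalizers()
    {
        ReadyToFinalize.Clear();
    }

    static void Visit(Obj o, HashSet<Obj> survivors)
    {
        if (!survivors.Add(o)) return;                   // already visited, also handles cycles
        foreach (Obj r in o.Refs) Visit(r, survivors);
    }
}

static class ToyGcDemo
{
    public static void Main()
    {
        var gc = new ToyCollector();
        var root = new Obj("root");
        var a = new Obj("a");
        root.Refs.Add(a);
        gc.Roots.Add(root);

        var f = new Obj("f");            // unreachable, but has a finalizer
        var g = new Obj("g");            // referenced only by f
        f.Refs.Add(g);
        gc.RegisteredForFinalization.Add(f);

        var h = new Obj("h");            // unreachable, no finalizer

        HashSet<Obj> first = gc.Collect();
        Console.WriteLine(first.Count);          // 4: root, a, f, g
        Console.WriteLine(first.Contains(h));    // False

        gc.RunFinalizers();
        HashSet<Obj> second = gc.Collect();
        Console.WriteLine(second.Count);         // 2: root, a
    }
}
```

In the model, `g` has no finalizer of its own but still survives the first pass because `f` references it, which is why a finalizer can safely dereference its fields even though the objects behind them may already have been finalized.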

It's interesting to note that while some other garbage-collection systems work by using double-indirected pointers for references, the .net garbage collector (at least the normal "stop the world" one) uses direct pointers. This increases somewhat the amount of work the collector has to do, but it improves the efficiency of code that manipulates objects. Since most programs spend more of their time manipulating objects than they spend collecting garbage, this is a net win.

Grindstone answered 29/2, 2012 at 16:15 Comment(0)
1

" ... then object #2's reference to it is pointing to an object that's no longer there."

That will never happen. Whenever your code has access to a reference then the object being referred to still exists. This is called memory safety and this still holds when an object is being finalized in the background. A reference never points to a collected instance.

But an existing, non-collected object could already have been finalized (or disposed). That is probably what your warning refers to.

class Foo
{
    private StreamWriter logFile = ...
    private StringBuilder sb = new StringBuilder("xx");

    ~Foo()
    {
        if (sb.ToString() == "xx")  // this will always be safe and just work
        {
            // the next line might work or
            // it might fail with "logFile already Disposed"
            logFile.WriteLine("Goodbye");
        }
    }
}
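One common way to act on that warning is the conventional `Dispose(bool disposing)` pattern, where managed objects are only touched when cleanup is triggered by an explicit `Dispose()` call and never from the finalizer. The sketch below is an illustration, not part of the original answer: `LogOwner` is a hypothetical class, and it takes a `TextWriter` so the example can run without a file. A class holding only managed resources would normally not need a finalizer at all; it is included here only to show the two paths.

```csharp
using System;
using System.IO;

public class LogOwner : IDisposable
{
    private readonly TextWriter logFile;
    private bool disposed;

    public LogOwner(TextWriter writer)
    {
        logFile = writer;
    }

    // Deterministic path: called by user code, other objects are still usable.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // the finalizer has nothing left to do
    }

    // Finalizer path: logFile may already have been finalized, so leave it alone.
    ~LogOwner()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;

        if (disposing)
        {
            // Safe only here: we were called from Dispose(), not from the GC.
            logFile.WriteLine("Goodbye");
            logFile.Dispose();
        }

        // Cleanup of unmanaged resources (if any) would go here, on both paths.
        disposed = true;
    }
}

static class DisposeDemo
{
    public static void Main()
    {
        var buffer = new StringWriter();
        using (var owner = new LogOwner(buffer))
        {
            // work with owner
        }
        Console.Write(buffer.GetStringBuilder().ToString());
    }
}
```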
Floriated answered 28/2, 2012 at 22:59 Comment(4)
"A reference never points to a collected instance." - A doubly linked list which has many nodes is about to be collected. Within the list for each sequential pair of nodes, Node A has a reference to Node B and Node B has a reference to Node A. Someone has to be collected first.Kendall
@thatchuck - "Someone has to be collected first" that won't be observable behaviour, ie not as far as your code can tell. And in gen-0 at least everybody is collected in the same exact clock tick.Floriated
@ThatChuckGuy: For some combination of bits to be a reference to something else, there has to be something somewhere that says that combination of bits should be interpreted as a reference, rather than as an integer, or one or more characters, or a program instruction, or whatever. When GC is running, both old and new objects will live in areas of memory that are marked as containing objects, so any references within the things within those areas of memory might be regarded as being "real references" (as opposed to something else). Once the GC is done, however, ...Grindstone
...the system will abandon anything that wasn't marked to be kept, meaning nothing will ever look at those areas of memory again until they have been overwritten with new data. While there may be some weird corner cases involving things like WeakReference types, for all intents and purposes the memory used by all of the unused objects in a given GC "generation" will become available for reuse simultaneously.Grindstone
0

There is a term called resurrection in .NET.

In short, resurrection can happen when your object is awaiting finalization and its finalizer (the ~ClassName() method), once called, makes the object reachable again, for instance by storing a reference to it in a static field. For example:

public class SomeClass
{
    public static SomeClass m_instance;

    ...

    ~SomeClass()
    {
        m_instance = this;
    }
}

You can read more on this here: Object Resurrection using GC.ReRegisterForFinalize. But I would really recommend the book CLR via C# by Jeffrey Richter, as it explains this subject in depth in one of its chapters.
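A runnable variant of the idea, as a rough sketch: `Phoenix` and `ResurrectionDemo` are hypothetical names, and the result depends on the runtime actually collecting and finalizing when asked, which `GC.Collect()` and `GC.WaitForPendingFinalizers()` normally do but which is not guaranteed in every environment.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Phoenix
{
    public static Phoenix Resurrected;

    ~Phoenix()
    {
        // Storing "this" in a static field makes the object reachable again.
        Resurrected = this;
    }
}

static class ResurrectionDemo
{
    // Allocate in a non-inlined helper so no local in Main roots the object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void CreateAndDrop()
    {
        new Phoenix();
    }

    public static void Main()
    {
        CreateAndDrop();

        GC.Collect();                    // object found unreachable, queued for finalization
        GC.WaitForPendingFinalizers();   // finalizer runs and resurrects it

        Console.WriteLine("Resurrected: " + (Phoenix.Resurrected != null));
    }
}
```

Note that the finalizer will not run a second time for the resurrected instance unless `GC.ReRegisterForFinalize` is called on it.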

Photoperiod answered 28/2, 2012 at 21:45 Comment(5)
Interesting, but how did the object I referenced get into the finalization queue if it never became inaccessible?Rosol
Yes, sorry about that. It was f-reachable indeed. Objects that have finalizer methods are placed in the finalization queue by default, and only when they are no longer accessible from the application's roots are they moved to the f-reachable queue. That is done because the system still needs references to the objects in order to call their finalizers.Photoperiod
@HenkHolterman: How did Microsoft come up with those names? They seem backward to me. The set of objects with registered finalizers is more like a bag than a queue, so I would think the term "queue" would be more applicable to the list of objects that need to have their finalizers run ASAP. It also seems odd that the name "fReachable" would be used for a list of objects that are on the list because they were unreachable. I'd like to refer to those lists using their real names, except the names are sufficiently backward-seeming that it would just add to confusion.Grindstone
@HenkHolterman: It does, but the fact that the objects in that list are reachable isn't what's significant about them. Any object could be placed in that list and described as "reachable". If I had my druthers, the data structures would be the "registered finalizers list" and the "finalization queue". BTW, I wonder if anything is really gained by having all objects include a Finalize method, as opposed to having a distinct FinalizableObject base class. If a class isn't set up to handle unmanaged resources, derived classes generally shouldn't have any. Instead they should...Grindstone
...encapsulate any unmanaged resources into managed resource classes, and then hold references to those. It seems curious that all Object derivatives are given a vtable slot for Finalize() when it's only relevant for a very small minority of them.Grindstone
