What are the roots?
Asked Answered
H

5

72

What are the roots in garbage collection?

I have read the definition of root as "any reference that you program can access to" and definition of live is that an object that is being used, which can be a local variable, static variable.

I m little confused with discriminating the difference between root and live objects.

What is path to root? How does root and live objects work?

Can someone elaborate ?

Heritor answered 16/6, 2011 at 1:32 Comment(5)
What crummy definitions :) I would start at Garbage CollectionShelving
@user177833 - where did you read those definitions?Lyonnais
cs.berkeley.edu/~jrs/61b/lec/40.pdfHeritor
the definition in that page for the root is: "any object reference your program can access directly, without going through another object". That is vastly different from "any reference that you program can access to". It is very specific in that your program holds the references to the said managed object, and that your program does not need to traverse the heap to arrive at the root.Unlisted
you'll need to visualize the JVM/CLR as the actual processes that manage the heap. The only objects in the heap, that the process is aware of, is the set of thread stack frames under execution, the classes that have been loaded, amongst a few others. This is the GC root; every other object in the heap is either reachable or unreachable from this set.Unlisted
L
92

If you think of the objects in memory as a tree, the "roots" would be the root nodes - every object immediately accessible by your program.

Person p = new Person();
p.car = new Car(RED);
p.car.engine = new Engine();
p.car.horn = new AnnoyingHorn();

There are four objects; a person, a red car, its engine and horn. Draw the reference graph:

     Person [p]
        |
     Car (red)
   /           \
Engine    AnnoyingHorn

And you'll end up with Person at the "root" of the tree. It's live because it's referenced by a local variable, p, which the program might use at any time to refer to the Person object. This also goes for the other objects, through p.car, p.car.engine, etc.

Since Person and all other objects recursively connected to it are live, there would be trouble if the GC collected them.

Consider, however, if the following is run after a while:

p.car = new Car(BLUE);

And redraw the graph:

     Person [p]
        |
     Car (blue)       Car (red)
                    /           \
                Engine    AnnoyingHorn

Now the Person is accessible through p and the blue car through p.car, but there is no way the red car or its parts can ever be accessed again - they are not connected to a live root. They can be safely collected.

So it's really a matter of taking every starting point (every local variable, globals, statics, everything in other threads and stack frames) — every root — and recursively following all the references to make up a list of all the "live" objects: objects which are in use and unsuitable for deletion. Everything else is garbage, waiting to be collected.

Lipps answered 16/6, 2011 at 2:16 Comment(5)
This answer is incorrect. GC Roots are those classes loaded by the JVM per [Veneet's answer], specifically Threads, classes loaded by the system class loader, references from the stack, JNI and objects awaiting finalization.Shorthorn
See this other answer for a list of possible roots. In this answer Person is not a root, it's accessible by a (and most likely more than one) root.Shorthorn
As the question does not specify Java or JVM anywhere (apart from the tags, which also contain .NET and CLR if you look close enough) and seems to be rather generic, so was my answer. Thanks for the clarification of the Java part, but I fail to see how it invalidates my generic answer.Lipps
In your specific example, in any managed environment, Person is not a GC root; the GC root is the thing which holds the reference to Person. The difference is subtle, but important in the context of this question. Although my answer is specific to Java it is in general correct for any managed language. Your last paragraph is actually correct, but conflicts with the example given.Shorthorn
I still like this answer as it helps clarify how GC works "in general".Fictitious
T
40

The GC (Garbage Collector) roots are objects special for garbage collector. The Garbage Collector collects those objects that are not GC roots and are not accessible by references from GC roots.

There are several kinds of GC roots. One object can belong to more than one kind of root. The root kinds are:

  • Class - class loaded by system class loader. Such classes can never be unloaded. They can hold objects via static fields. Please note that classes loaded by custom class loaders are not roots, unless corresponding instances of java.lang.Class happen to be roots of other kind(s).
  • Thread - live thread
  • Stack Local - local variable or parameter of Java method
  • JNI Local - local variable or parameter of JNI method
  • JNI Global - global JNI reference
  • Monitor Used - objects used as a monitor for synchronization
  • Held by JVM - objects held from garbage collection by JVM for its purposes. Actually the list of such objects depends on JVM implementation. Possible known cases are: the system class loader, a few important exception classes which the JVM knows about, a few pre-allocated objects for exception handling, and custom class loaders when they are in the process of loading classes. Unfortunately, JVM provides absolutely no additional detail for such objects. Thus it is up to the analyst to decide to which case a certain "Held by JVM" belongs.

(credit to YourKit's website)

Not mentioned by YourKit is the fact that objects awaiting finalization will be retained as roots until the GC runs the finalize() method. That can cause transient retention of large graphs somewhat unexpectedly. The general rule of thumb is not to use finalizers (but that's a different question).

Tolyl answered 15/3, 2013 at 7:31 Comment(3)
You might consider sourcing this copy/pasted answer: yourkit.com/docs/12/help/gc_roots.jsp, or alternatively yourkit might consider sourcing you :-).Sharrisharron
Not mentioned are objects awaiting finalization, to which the JVM holds a reference until finalization is run.Shorthorn
Sometimes the references are stored in the operand stack before they are stored in the local variables table. How does the GC solve this?Etsukoetta
U
31

Roots or garbage collection roots are the objects that are always reachable. If an object is always reachable, then it is not eligible for garbage collection; roots therefore are always ineligible for collection. It is the initial set of objects from where reachability of all other objects on the heap are determined.

Other objects on the heap reachable from the garbage collection roots are considered to be live objects, and ineligible for collection; the objects that are unreachable can be marked for reclamation.

I know Java more than the .Net platform, so I'll speak only for one. On the Java platform, the GC roots are actually implementation dependent. In most runtime however, the GC roots tend to be the operands on the stack (for they are currently in use by threads) and class (static) members of classes. Reachability is calculated from these objects in most JVMs. There are other cases where local parameters and operands used by JNI calls will be considered part of the root set, and also used to calculate reachability.

I hope this clears any lingering doubts over what is a root (set) and what is a live object.

Unlisted answered 16/6, 2011 at 1:53 Comment(5)
can i say that roots are pointers to live objects? if there is no path from a root to an object that object can be claimed by garbage collection?Heritor
Root are live objects. Don't bring pointers into this and confuse yourself (GC algorithms use the number of references to an object to determine reachability; see what you did there by considering roots as pointers). Pointers/References have to be used to determine reachability.Unlisted
The above comment should have read as "Roots are live objects known to the JVM/CLR". The problem with treating them as pointers is that the GC algorithm will be more complex, for any GC algorithm deals with the number of pointers/references to objects to distinguish between live and collectable objects. Once a root is a pointer, all root pointers (sic) have to be handled differently, for no apparent benefit.Unlisted
@VineetReynolds "the GC roots tend to be the operands on the stack (for they are currently in use by threads)" What did you mean by "operands on the stack"?Calceolaria
@Geek, the local variables for a method, it's arguments etc.Unlisted
S
18

The IBM web site lists the following as GC roots.

Note that some of these are artificial constructs done by a memory analyzer, but still important to be aware of if you are looking at a heap dump.

  • System class

    A class that was loaded by the bootstrap loader, or the system class loader. For example, this category includes all classes in the rt.jar file (part of the Java runtime environment), such as those in the java.util.* package.

  • JNI local

    A local variable in native code, for example user-defined JNI code or JVM internal code.

  • JNI global

    A global variable in native code, for example user-defined JNI code or JVM internal code.

  • Thread block

    An object that was referenced from an active thread block.

  • Thread

    A running thread.

  • Busy monitor

    Everything that called the wait() or notify() methods, or that is synchronized, for example by calling the synchronized(Object) method or by entering a synchronized method. If the method was static, the root is a class, otherwise it is an object.

  • Java local

    A local variable. For example, input parameters, or locally created objects of methods that are still in the stack of a thread. Native stack

    Input or output parameters in native code, for example user-defined JNI code or JVM internal code. Many methods have native parts, and the objects that are handled as method parameters become garbage collection roots. For example, parameters used for file, network, I/O, or reflection operations.

  • Finalizer

    An object that is in a queue, waiting for a finalizer to run.

  • Unfinalized

    An object that has a finalize method, but was not finalized, and is not yet on the finalizer queue.

  • Unreachable

    An object that is unreachable from any other root, but was marked as a root by Memory Analyzer so that the object can be included in an analysis.

    Unreachable objects are often the result of optimizations in the garbage collection algorithm. For example, an object might be a candidate for garbage collection, but be so small that the garbage collection process would be too expensive. In this case, the object might not be garbage collected, and might remain as an unreachable object.

    By default, unreachable objects are excluded when Memory Analyzer parses the heap dump. These objects are therefore not shown in the histogram, dominator tree, or query results. You can change this behavior by clicking File > Preferences... > IBM Diagnostic Tools for Java - Memory Analyzer, then selecting the Keep unreachable objects check box.

  • Java stack frame

    A Java stack frame, which holds local variables. This type of garbage collection root is only generated if you set the Preferences to treat Java stack frames as objects. For more information, see Java Basics: Threads and thread stack queries.

  • Unknown

    An object of unknown root type. Some dumps, such as IBM Portable Heap Dump (.phd) files, do not have root information. In this case, the Memory Analyzer parser marks objects that have no inbound references, or are unreachable from any other root, as unknown. This action ensures that Memory Analyzer retains all the objects in the dump.

Shorthorn answered 20/1, 2015 at 23:35 Comment(0)
P
3

In java I would say threads are the root objects. Every live object can be back traced to a live thread. For example, a static object is referenced by a class, which is referenced by a class loader, which is referenced by another class, which is referenced by an instance of that class, ... which is referenced by a Runnable, which is referenced by a live thread. (Note, classes can be GC'ed, they can't be roots)

We can also consider a "real" root for all threads, however that is out of the realm of standard Java. We can't say what it is, and how it references all the threads.

Pyxidium answered 16/6, 2011 at 2:3 Comment(2)
Loaded classes are also roots (as they may contain global/static variables).Iq
Classes can be GC'd only if their class loaded becomes unreachable; classes loaded by the system loader cannot be GC'd.Shorthorn

© 2022 - 2024 — McMap. All rights reserved.