GC.Collect on only generation 2 & large object heap

4

10

In my application there is a specific time when a number of large objects are all released at once. At that time I would like to do a garbage collection on specifically the large object heap (LOH).

I'm aware that you cannot do that directly: you must call GC.Collect(2), because the GC only collects the LOH when it is doing a generation 2 collection. However, I've read in the documentation that calling GC.Collect(2) also runs a GC on generations 1 and 0.

Is it possible to force the GC to only collect gen 2, and not include gen 1 or gen 0?

If it is not possible, is there a reason for the GC to be designed that way?

Corneous answered 23/9, 2009 at 22:23 Comment(2)
Why would you want to do this, i.e. NOT collect from gen 0 or 1? The .NET GC runs best when left to its own devices. - Creswell
I'm aware of that. Basically, you NEVER want to manually force a GC, because collections are intensive operations. Since that is the case, when I do see the need to run a GC, I want it to run only against the specific generation rather than do a full GC. I'm trying to be more particular about my use of the GC, and it isn't letting me. - Corneous
14

It's not possible. The GC is designed so that a generation 2 collection always also collects generations 0 and 1.

Edit: Found you a source for this on a GC developer's blog:

Gen2 GC requires a full collection (Gen0, Gen1, Gen2 and LOH! Large objects are GC’ed at every Gen2 GC even when the GC was not triggered by lack of space in LOH. Note that there isn’t a GC that only collects large objects.) which takes much longer than younger generation collections.

Edit 2: According to the same blog's "Using GC Efficiently" Part 1 and Part 2, Gen0 and Gen1 collections are fast compared to a Gen2 collection, so it seems reasonable to me that collecting only Gen2 wouldn't offer much of a performance benefit. There might be a more fundamental reason, but I'm not sure. Maybe the answer is in some article on that blog.
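
A quick way to see this for yourself is to watch the per-generation collection counters around a GC.Collect(2) call. This is only a minimal sketch: the class and method names (FullCollectionDemo, CountsForGen2Collect) are made up for illustration, while GC.Collect and GC.CollectionCount are the real framework calls.

```csharp
using System;

public static class FullCollectionDemo
{
    // Returns how far each generation's collection counter advanced
    // across a single GC.Collect(2) call: [gen0, gen1, gen2].
    public static int[] CountsForGen2Collect()
    {
        int gen0Before = GC.CollectionCount(0);
        int gen1Before = GC.CollectionCount(1);
        int gen2Before = GC.CollectionCount(2);

        // Asking for generation 2 means "generation 2 and everything younger".
        GC.Collect(2);

        return new[]
        {
            GC.CollectionCount(0) - gen0Before,
            GC.CollectionCount(1) - gen1Before,
            GC.CollectionCount(2) - gen2Before,
        };
    }

    public static void Main()
    {
        int[] delta = CountsForGen2Collect();
        Console.WriteLine("gen0 +{0}, gen1 +{1}, gen2 +{2}", delta[0], delta[1], delta[2]);
    }
}
```

All three counters should advance by at least one, because a generation 2 collection counts as a collection of the younger generations as well. They can advance by more than one if the runtime happens to trigger its own collections at the same time.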

Hardison answered 23/9, 2009 at 22:25 Comment(1)
Thanks! Please note, I've edited my question to also ask why it is constrained in this way. - Corneous
6

Since all new allocations (other than for large objects) always go in Gen0, the GC is designed to always collect from the specified generation and below. When you call GC.Collect(2), you are telling the GC to collect from Gen0, Gen1, and Gen2.

If you are certain you are dealing with a lot of large objects (objects that at allocation time are large enough to be placed on the LOH) the best option is to ensure that you set them to null (Nothing in VB) when you are done with them. LOH allocation attempts to be smart and reuse blocks. For example, if you allocated a 1MB object on the LOH and then disposed of it and set it to null, you would be left with a 1MB "hole". The next time you allocate anything on the LOH that is 1MB or smaller in size, it will fill in that hole (and keep filling it in until the next allocation is too large to fit in the remaining space, at which point it will allocate a new block.)

Keep in mind that generations in .NET are not physical things, but logical separations that help GC performance. Since all new allocations go in Gen0, that is always the first generation to be collected. On each collection cycle, anything in a collected generation that survives is "promoted" to the next higher generation (until it reaches Gen2).

In most cases, the GC doesn't need to go beyond collecting Gen0. The current implementation of the GC is able to collect Gen0 and Gen1 at the same time, but it can't collect Gen2 while Gen0 or Gen1 are being collected. (.NET 4.0 relaxes this constraint a great deal with background GC: for the most part, Gen0 and Gen1 collections can proceed while a Gen2 collection is in progress.)
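
To make the promotion rule concrete, here is a small sketch that prints the generation of a small object and a large array as collections happen. The names GenerationDemo and LargeArrayGeneration are hypothetical; GC.GetGeneration, GC.Collect and GC.KeepAlive are real. The 100,000-byte size is my own choice, picked to sit above the 85,000-byte threshold at which arrays go to the LOH.

```csharp
using System;

public static class GenerationDemo
{
    // Large arrays live on the LOH, which the runtime reports as the
    // oldest generation even though the array was only just allocated.
    public static int LargeArrayGeneration()
    {
        byte[] large = new byte[100000]; // assumed to be above the LOH threshold
        int generation = GC.GetGeneration(large);
        GC.KeepAlive(large);
        return generation;
    }

    public static void Main()
    {
        object small = new object();

        Console.WriteLine("small, just allocated:        gen {0}", GC.GetGeneration(small));
        Console.WriteLine("large, just allocated:        gen {0}", LargeArrayGeneration());

        GC.Collect(0); // collects Gen0 only; survivors are promoted
        Console.WriteLine("small, after GC.Collect(0):   gen {0}", GC.GetGeneration(small));

        GC.Collect(1); // collects Gen0 and Gen1
        Console.WriteLine("small, after GC.Collect(1):   gen {0}", GC.GetGeneration(small));

        GC.KeepAlive(small); // keep the reference alive until here
    }
}
```

On the runtimes I have tried, the small object usually moves from generation 0 to 1 to 2, while the large array reports the maximum generation from the start. The exact promotion behaviour is up to the GC, so treat the printed numbers as typical rather than guaranteed.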

Apomorphine answered 23/9, 2009 at 22:58 Comment(3)
Your explanation is a great overview of how the GC works, but it doesn't clarify why there is a constraint that precludes a strictly gen 1 or gen 2 collection. - Corneous
Setting myVar = null doesn't accomplish anything. See the bottom of bryancook.net/2008/05/net-garbage-collection-behavior-for.html - Corneous
I believe the reason is that promotion of objects to higher generations is only evaluated (and effected) during GC, so perhaps there's nothing (new) to do if you don't collect the lower generations... - Lobule
0

To answer the question "why": physically, there is no such thing as Gen0, Gen1 or Gen2. They all use the same memory block(s) in the virtual address space. The distinction between them is made only logically, by moving an imaginary boundary.

Every (small) object is allocated from the Gen0 heap area. If it survives a collection, it is moved "downwards" into the area of the managed heap block that was just freed of garbage. This is done by compacting the heap. After the full collection finishes, the new "border" for Gen1 is set to the space right after those surviving objects.

So if you tried to clear only Gen1 and/or Gen2, you would open up holes in the heap that would have to be closed by compacting the "full" heap, including the objects in Gen0. Obviously this would not make any sense, since most of those Gen0 objects would be garbage anyway. There is no point in moving them around, and no point in creating and leaving large holes in the (otherwise compacting) heap.

Carborundum answered 27/1, 2011 at 8:41 Comment(2)
Conceptually, it's simplest to think of gen2 stuff as being at the bottom of the heap, with gen1 immediately on top, and gen0 at the very top. Older objects are always below younger objects. When compaction occurs, the system goes through all live objects in the generation(s) to be compacted, starting with the oldest objects, and moves each object to the lowest available spot. The "top of gen2" marker is set just above the topmost newly-copied gen1 object (now in gen2). Likewise the "top of gen1" marker is set just above the topmost newly-copied gen0 object. - Fionafionna
The goal of all this is to free the contiguous available space above gen0. A big part of the reason for compacting gen2 is to allow gen1 objects to move down; likewise compacting gen1 allows gen0 objects to move down. There are some weird tricks .NET uses to speed up determinations of whether objects are "live"; in practice, they mean that if a gen2 object holds a reference to a gen0 object, the gen0 object won't be collectible until either the reference is destroyed or the gen2 object itself can be collected. - Fionafionna
0

Whenever the system performs a garbage-collection of a particular generation, it must examine every single object that might hold a reference to any object of that generation. In many cases, old objects will only hold references to other old objects; if the system is doing a Gen0 collection it can ignore any objects which only hold references to those of Gen1 and/or Gen2. Likewise if it's doing a Gen1 collection it can ignore any objects which only hold references to Gen2. Since examination and tagging of objects represents a large portion of the time required for garbage collection, being able to skip older objects entirely represents a considerable time savings.

Incidentally, if you're wondering how the system "knows" whether an object might hold references to newer objects, the system has special code to set a couple of bits in each object's descriptor if the object is written. The first bit is reset at each garbage collection, and if it's still reset at the next garbage collection the system will know the object can't contain any references to Gen0 objects (since any objects that existed when the object was last written and weren't cleared out by the previous collection will be Gen1 or Gen2). The second bit is reset at each Gen1 garbage collection, and if it's still reset at the next Gen1 garbage collection, the system will know the object can't contain any references to Gen0 or Gen1 objects (any objects to which it holds references are now Gen2).

Note that the system doesn't know or care whether the information that was written to an object included a Gen0 or Gen1 reference. The trap required when writing to an untagged object is expensive, and would greatly impede performance if it had to be handled every time an object is written. To avoid this, objects are tagged whenever any write occurs, so that any additional writes before the next garbage collection can proceed without interruption.
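
The tagging idea can be sketched as a toy model. Everything below (Node, WrittenSinceLastGc, ToyGen0Scan, FindLiveGen0) is invented for illustration and is not how the CLR lays out objects; as far as I know the real runtime tracks writes per memory range (a card table) rather than per object. The sketch only shows why a Gen0 pass can skip old objects that have not been written since the last collection.

```csharp
using System;
using System.Collections.Generic;

// Toy object: one reference field, a generation number and a "written" tag.
public sealed class Node
{
    public int Generation;          // 0 = young, 2 = old (toy values)
    public bool WrittenSinceLastGc; // hypothetical per-object tag bit
    private Node _child;

    public Node Child
    {
        get { return _child; }
        set
        {
            _child = value;
            WrittenSinceLastGc = true; // tag on every write, whatever was stored
        }
    }
}

public static class ToyGen0Scan
{
    // Finds the Gen0 nodes that must be kept, looking inside an old object
    // only if it was written since the previous collection.
    public static HashSet<Node> FindLiveGen0(
        IEnumerable<Node> roots, IEnumerable<Node> oldObjects, out int oldObjectsVisited)
    {
        var live = new HashSet<Node>();
        oldObjectsVisited = 0;

        foreach (Node root in roots)
            MarkYoung(root, live);

        foreach (Node old in oldObjects)
        {
            if (!old.WrittenSinceLastGc)
                continue; // untouched since last GC: cannot point into Gen0

            oldObjectsVisited++;
            MarkYoung(old.Child, live);
            old.WrittenSinceLastGc = false; // reset the tag for the next cycle
        }
        return live;
    }

    // Follows the chain of references for as long as it stays inside Gen0.
    private static void MarkYoung(Node node, HashSet<Node> live)
    {
        while (node != null && node.Generation == 0 && live.Add(node))
            node = node.Child;
    }

    public static void Main()
    {
        var oldObjects = new List<Node>();
        for (int i = 0; i < 1000; i++)
            oldObjects.Add(new Node { Generation = 2 });

        oldObjects[7].Child = new Node(); // the only old object written to

        int visited;
        HashSet<Node> live = FindLiveGen0(new Node[0], oldObjects, out visited);
        Console.WriteLine("live young objects: {0}, old objects examined: {1}", live.Count, visited);
    }
}
```

In this toy run only one of the thousand old objects is examined, which is the saving described above: the cost of a young-generation pass depends on how many old objects were written to, not on how many exist.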

Fionafionna answered 24/6, 2011 at 15:6 Comment(0)
