A smart garbage collector for sharing subranges of arrays?

Asked 28/7, 2011 at 8:6 Answered 19/6, 2012 at 14:26

Solved arrays garbage-collection theory language-implementation

In this popular question about why substring takes O(n) in C#, one of the main answers provided argued that if a large array were allocated and substrings computed by having the new strings just reference a small slice of the array, the garbage collector would not be able to reclaim the array of characters containing the larger string even if the original string were no longer being referenced.

This seems like a perfectly valid answer, but it seems like in theory one could construct a garbage collector for arrays that allowed for most of an array to be garbage collected while leaving behind some small subarray that's still in use. In other words, if there were a 50,000-element array of which only a small 100-element slice was still in use, the garbage collector could split the array into three pieces - the elements before the 100-element slice, the 100-element slice itself, and the elements after the 100-element slice - and then garbage collect the first and last of these pieces.

My question is whether any language implementations actually use this sort of garbage collector, or whether it exists only in theory. Does anyone know of an example of a language implementation that has an garbage collector like this?

Matins answered 28/7, 2011 at 8:6 Comment(1)

The GC fiddling with the internals of objects seems way to complex - abstraction is king. A way for objects to tell the GC if and how much memory they share and a callback to remove that sharing by one copying the data of the other sounds somewhat more practical... although I don't think it would pay off. The substrings for which this matters are, as you say, tiny (so O(n) doesn't mean much), and the GC being this smart has some performance and storage overhead. – Headrail 28/7, 2011 at 9:2

In theory, yes... it is possible. But there is a problem with GCs: to collect the garbage, it needs to know the layout of the data being stored in memory, and it must also store data to indicate if a memory block is in use or not... but the layout information is shared with the runtime, because the runtime needs to know object types (i.e. memory layout) in order to do type-casts.

How does GC works?

The GC starts reading the root objects it knows. It gets all the references and mark them as being in-use. For each of these referenced objects, it gets the layout and reads more references from theses ones, and marks them as in-use... and this process continues, until no more references remain.

Notes: I have used type information and layout information, with the same meaning.

Example:

Imagine we have some object layouts:
====================================
A: { int, object, double, object }
B: { object, object }
C: { int }


Memory Data:
============
Now we need to describe what is in memory.
The following list is composed of memory positions,
and the data contained in each position.
There are 2 kinds of notation I will use:
  - Address: ROOT[ list-of-root-references ]
  - Address: type-identifier { object-data }
     Note that each object can span multiple memory
     positions from the first one.
     e.g. 90: B { 0, 0 } -- this reads as
       "stored at address 90: type-identifier of B followed by raw data { 0, 0 }"
  - A reference is represented by a number,
    that point to the memory address of the pointed object.

 1: ROOT[ 90, 20, 30 ]
20: A { 1236, 30, 0.5, 60 }
30: B { 60, 70 }
40: C { 1237 }
50: A { 1238, 20, 0.8, 50 }    There is a self-reference here!
60: C { 1234 }
70: A { 1234, 0, 0.7, 80 }
80: C { 1235 }
90: B { 0, 0 }                 Note that 0 is a null reference!

   The GC need to know the layout of each object.
   Otherwise it wouldn't be abled to tell
   what knid of information is stored in each memory position.

Running the GC:
===============
Garbage collecting steps, to clean the memory described above:
Step 1: Get ROOT references: 2, 3, 9 and mark as 'in-use'
Step 2: Get references from 2, 3 and 9: 3, 6, 7. Note that 0 is a null reference!
Step 3: Get references from 3, 6 and 7: 6, 7, 8, and mark them.
Step 4: Get references from 6, 7, 8, and mark them: only 8!
Step 5: Get references from 8... it has none! We are finished with marking objects.
Step 6: Delete unmarked objects.


      This shows what happened in each step with each object stored in the memory.

Step ->  1  2  3  4  5
Memory                    
20       x           
30       x  x        
40                   DELETE
50                   DELETE
60          x  x     
70          x  x     
80             x  x  
90       x

What I described is a very basic GC algorithm.

Take a look at Tri-color marking... that is really awesome! This is how real modern GC are made.

Garbage collection (computer science) - describes some modern GC methodologies.

But... where is the information about layout stored?

This question is important, because it impacts both the GC and the runtime. To do fast type casting the type information must be placed near the reference, or near the allocated memory. We could think to store the type information in the catalog of allocated memory blocks, but then... type-casts would need to access the catalog, just like the new operator and the GC when it needs to delete the object.

If we store the layout information near the reference, then every reference to the same object would have the same information repeated, along with the pointer itself.

Example:

To represent the memory data I will introduce the following notation:
 - Address: { object-data } -- this time object type is not at the begining!
 - A reference is represented by a type-identifier and an address-number,
   that point to the memory address of the pointed object,
   in the following format: type/number...
   e.g. A/20 -- this reads as: "at address 20, there is an object of type A"
Note that: 0/0 is a null reference,
      but it still requires space to store the type.

The memory would look like this:
 1: ROOT[ B/90, A/20, B/30 ]
20: { 1236, B/30, 0.5, C/60 }
30: { C/60, A/70 }
40: { 1237 }
50: { 1238, A/20, 0.8, A/50 }
60: { 1234 }
70: { 1234, 0/0, 0.7, C/80 }
80: { 1235 }
90: { 0/0, 0/0 }

If we store the layout information near the allocated memory block, then it is nice! It is fast, and avoids repeated layout information.

Example:

The memory looks like the first sample:
 *This is the same notation used at the begining of this answer.
 1: ROOT[ 90, 20, 30 ]
20: A { 1236, 30, 0.5, 60 }
30: B { 60, 70 }
40: C { 1237 }
50: A { 1238, 20, 0.8, 50 }
60: C { 1234 }
70: A { 1234, 0, 0.7, 80 }
80: C { 1235 }
90: B { 0, 0 }

So far, so nice... but now I want shared memory.

The first thing we notice is that we cannot store the layout information near the allocated memory anymore.

Imagine an array with shared memory:

Example:

 I'll introduce a new notation for arrays:
    type-identifier < array-length >

 1: ROOT[ 20, 31 ] -- pointer 31 is invalid, destination has no type-identifier.
20: INT<5>  -- info about layout of the next memory data (spans by 10 memory units)
30:  2
31:  3 -- should have type-identifier, because someone
32:  5              is pointing here!! Invalid memory!!
33:  7
34: 11

We can still try to place the layout information next to the pointer, instead. The shared memory array is now possible:

Example:

 1: ROOT[ INT<5>/30, INT<2>/31 ]
30:  2
31:  3
32:  5
33:  7
34: 11

Remember that this aproach makes us repeat the layout information everywhere... but the point here is to use less memory isn't it??? But to share memory, we need more memory to store the layout-data/pointers. No donuts for us yet. =(

There is only one hope: lets degrade the runtime features!

THIS IS MY ANSWER - How I think it could be possible =>

What about using the memory allocation catalog, to store type informations?

This could be done, but then, dynamic casting would suffer, and also GC would suffer itself. Remember I told that GC need to access the memory catalog, just to delete objects... well, now it would need to access the catalog everytime it finds a reference, not just to delete. OMG!! We are about to kill GC performance with this, and also the runtime performance. Too high cost I think!

<= THIS IS MY ANSWER

But... and if the runtime does not support dynamic casting? if the compiler knows everything about a type at compile time... then GC would not even exist... it NEEDs the information about the type, because this information tells it what is the layout of the memory used by that type.

No easy, smart solution, in sight.

Maybe I am just plain wrong. But I cannot imagine a way to that better than it is already. Modern GCs are even more complicated than this... I have covered only the basics here, I think that, modern GCs are optimizing in other ways, that is, other more reliable ways.

Other references:

http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)

http://www.memorymanagement.org/glossary/t.html

http://www.cs.purdue.edu/homes/sbarakat/cs456/gc-2.pdf

Tri-Color Incremental Updating GC: Does it need to scan each stack twice?

Aphorism answered 12/8, 2011 at 4:31 Comment(0)

D language supports array slices with exactly this kind of GC behavior. Look at http://www.dsource.org/projects/dcollections/wiki/ArrayArticle#WhosResponsible for more information.

Custommade answered 13/8, 2011 at 23:42 Comment(4)

This looks a lot like what I was looking for! I had a suspicion that D might do something like this, but I couldn't find any hard docs backing that claim. – Matins 14/8, 2011 at 0:4

Er, is this actually answering @templatetypedef's question? The link plainly says that The garbage collector is responsible for cleaning up dynamic arrays that no longer are referenced by any slices. --> clearly, the cleanup is an all-or-nothing operation, no? To me it looks like it's exactly not the behavior we're talking about. – Penumbra 14/8, 2011 at 19:0

@Mehrdad- Upon rereading this link I have to agree with you. I had interpreted this discussion to mean that there was a cool allocator that would deallocate just pieces of the array, but it looks like it more accurately says that it's just smart enough to know what blocks to allocate or deallocate as a whole. – Matins 14/8, 2011 at 23:16

I found the article about D's dynamic array quite confusing. I don't know if they are saying it can or cannot garbage collect like this question proposal... it would be nice if it could, with all performance they claim! =) This piece quite tells it can, but I cannot be sure:

What happens when no slices reference that data?Enter D's garbage collector. The garbage collector is responsible for cleaning up dynamic arrays that no longer are referenced by any slices.

, but they don't tell us how they implement this. It could be... or not. It'd be so nice if someone could clarify this. – Aphorism 15/8, 2011 at 3:8

Does anyone know of an example of a language implementation that has an garbage collector like this?

No. I don't think any language implementations currently do this.

What's the nearest thing that real languages do? Functional languages often replace flat arrays with hierarchical trees. So large strings are represented by a data structure called a rope. If a program takes substrings from a rope then most of the string can be collected without having to copy all of the data that is still reachable within the rope. However, such functional data structures are a lot slower than a flat array.

Why don't we do this? Probably because there is a perception that these kinds of approaches introduce a lot of complexity and solve a relatively unimportant problem (I've never had a problem with slices keeping too much space reachable). Also, there is a tradition of using GCs that prohibit interior pointers and, consequently, GCs cannot comprehend slices. In particular, production GCs all put per-object metadata in a header in front of the start of the heap-allocated block of memory.

I actually wrote a VM (HLVM) that avoids headers by using quadword references instead, i.e. the metadata is carried around with the reference rather than in the object it points to. From reference counting research we know that the vast majority of objects have a reference count of one so the potential duplication of per-reference headers is actually cheaper than you might imagine. The lack of a header means HLVM's internal representation of an array is C-compatible so interoperability is much easier and faster. Similarly, a slice could be a reference into the middle of an array. A suitable GC algorithm, such as mark region, could then free the parts of a heap-allocated block of memory that are no longer reachable and reuse them whilst retaining the reachable slices.

Another solution would be to use virtual memory. You can "fake" copying my mapping logical pages onto the same physical page and the GC could collect unreachable pages. However, this is very coarse-grained and, consequently, even more niche.

Delacourt answered 19/6, 2012 at 14:26 Comment(0)

Surely the danger of making a 'smarter' garbage collector is always that you cripple the user in some way, either preventing code from working or a hacky work-around for an overzealous garbage collector.

Diarthrosis answered 8/8, 2011 at 8:31 Comment(2)

I don't see how this answers the question. Can you elaborate on how this would break existing code? – Matins 8/8, 2011 at 8:42

I believe it is theory I can't think of an existing language but in answer to your question in the comment above: It's not common but just something I have heard mentioned as a bit of an issue and worth considering when implementing something new in garbage collection, a few result here - goo.gl/rg4If – Diarthrosis 8/8, 2011 at 9:10

I think all of the associative array implementations (PERL, PHP, javascript..) must be done this way. You call this "garbage collecting" but this means that the particular items must be first unset (removed, deleted) for the garbage collector to know they are unused. So it is normal deletion/removal/unsetting, which is for sure reflected not only in the associative array, but also in the data structure used by the particular language implementation. Otherwise, the memory could be exhausted by almost empty array...

Musetta answered 8/8, 2011 at 16:42 Comment(0)

How does GC works?

But... where is the information about layout stored?

So far, so nice... but now I want shared memory.

There is only one hope: lets degrade the runtime features!

No easy, smart solution, in sight.

Recommended topics

Hot tags