Finding roots for garbage collection in C

Asked 27/11, 2012 at 3:45 Answered 27/11, 2012 at 4:57

Solved c garbage-collection boehm-gc gc-roots

I'm trying to implement a simple mark and sweep garbage collector in C. The first step of the algorithm is finding the roots. So my question is how can I find the roots in a C program?

In the programs using malloc, I'll be using the custom allocator. This custom allocator is all that will be called from the C program, and may be a custom init().

How does garbage collector knows what all the pointers(roots) are in the program? Also, given a pointer of a custom type how does it get all pointers inside that?

For example, if there's a pointer p pointing to a class list, which has another pointer inside it.. say q. How does garbage collector knows about it, so that it can mark it?

Update: How about if I send all the pointer names and types to GC when I init it? Similarly, the structure of different types can also be sent so that GC can traverse the tree. Is this even a sane idea or am I just going crazy?

Borer answered 27/11, 2012 at 3:45 Comment(4)

Might be worth having a look at hpl.hp.com/personal/Hans_Boehm/gc and references therein. – Newlin 27/11, 2012 at 4:34

@kallikak- I tried looking into it, but it was way over my head :( – Borer 28/11, 2012 at 10:51

Not sure how you're going to "send" pointer locations to GC, but if you can attach/embed a structure to the object you can store pointer location info there. (Note that it need only be a bit array, to indicate which word is or isn't a pointer.) – Stipitate 28/11, 2012 at 13:20

If you follow the link from @Newlin you'll find a descritpion of the algorithm that is pretty good, and which seems to be quite readable (once you tackle a little jargon). I recommend it. – Stipitate 28/11, 2012 at 13:23

First off, garbage collectors in C, without extensive compiler and OS support, have to be conservative, because you cannot distinguish between a legitimate pointer and an integer that happens to have a value that looks like a pointer. And even conservative garbage collectors are hard to implement. Like, really hard. And often, you will need to constrain the language in order to get something acceptable: for instance, it might be impossible to correctly collect memory if pointers are hidden or obfuscated. If you allocate 100 bytes and only keep a pointer to the tenth byte of the allocation, your GC is unlikely to figure out that you still need the block since it will see no reference to the beginning. Another very important constraint to control is the memory alignment: if pointers can be on unaligned memory, your collector can be slowed down by a factor of 10x or worse.

To find roots, you need to know where your stacks start, and where your stacks end. Notice the plural form: each thread has its own stack, and you might need to account for that, depending on your objectives. To know where a stack starts, without entering into platform-specific details (that I probably wouldn't be able to provide anyways), you can use assembly code inside the main function of the current thread (just main in a non-threaded executable) to query the stack register (esp on x86, rsp on x86_64 to name those two only). Gcc and clang support a language extension that lets you assign a variable permanently to a register, which should make it easy for you:

register void* stack asm("esp"); // replace esp with the name of your stack reg

(register is a standard language keyword that is most of the time ignored by today's compilers, but coupled with asm("register_name"), it lets you do some nasty stuff.)

To ensure you don't forget important roots, you should defer the actual work of the main function to another one. (On x86 platforms, you can also query ebp/rbp, the stack frame base pointers, instead, and still do your actual work in the main function.)

int main(int argc, const char** argv, const char** envp)
{
    register void* stack asm("esp");
    // put stack somewhere
    return do_main(argc, argv, envp);
}

Once you enter your GC to do collection, you need to query the current stack pointer for the thread you've interrupted. You will need design-specific and/or platform-specific calls for that (though if you get something to execute on the same thread, the technique above will still work).

The actual hunt for roots starts now. Good news: most ABIs will require stack frames to be aligned on a boundary greater than the size of a pointer, which means that if you trust every pointer to be on aligned memory, you can treat your whole stack as a intptr_t* and check if any pattern inside looks like any of your managed pointers.

Obviously, there are other roots. Global variables can (theoretically) be roots, and fields inside structures can be roots too. Registers can also have pointers to objects. You need to separately account for global variables that can be roots (or forbid that altogether, which isn't a bad idea in my opinion) because automatic discovery of those would be hard (at least, I wouldn't know how to do it on any platform).

These roots can lead to references on the heap, where things can go awry if you don't take care.

Since not all platforms provide malloc introspection (as far as I know), you need to implement the concept of scanned memory--that is, memory that your GC knows about. It needs to know at least the address and the size of each of such allocation. When you get a reference to one of these, you simply scan them for pointers, just like you did for the stack. (This means that you should take care that your pointers are aligned. This is normally the case if you let your compiler do its job, but you still need to be careful when you use third-party APIs).

This also means that you cannot put references to collectable memory to places where the GC can't reach it. And this is where it hurts the most and where you need to be extra-careful. Otherwise, if your platform supports malloc introspection, you can easily tell the size of each allocation you get a pointer to and make sure you don't overrun them.

This just scratches the surface of the topic. Garbage collectors are extremely complex, even when single-threaded. When you add threads to the mix, you enter a whole new world of hurt.

Apple has implemented such a conservative GC for the Objective-C language and dubbed it libauto. They have open-sourced it, along with a good part of the low-level technologies of Mac OS X, and you can find the source here.

I can only quote Hot Licks here: good luck!

Okay, before I go even further, I forgot something very important: compiler optimizations can break the GC. If your compiler is not aware of your GC, it can very well never put certain roots on the stack (only dealing with them in registers), and you're going to miss them. This is not too problematic for single-threaded programs if you can inspect registers, but again, a huge mess for multithreaded programs.

Also be very careful about the interruptibility of allocations: you must make sure that your GC cannot kick in while you're returning a new pointer because it could collect it right before it is assigned to a root, and when your program resumes it would assign that new dangling pointer to your program.

And here's an update to address the edit:

Update: How about if I send all the pointer names and types to GC when I init it? Similarly, the structure of different types can also be sent so that GC can traverse the tree. Is this even a sane idea or am I just going crazy?

I guess you could allocate our memory then register it with the GC to tell it that it should be a managed resource. That would solve the interruptability problem. But then, be careful about what you send to third-party libraries, because if they keep a reference to it, your GC might not be able to detect it since they won't register their data structures with your GC.

And you likely won't be able to do that with roots on the stack.

Centrum answered 27/11, 2012 at 4:57 Comment(6)

Great post, thanks. Please read the update I have one weird idea. – Borer 28/11, 2012 at 10:45

Thank you for updating your answer @zneak, that surely helped me better understand. – Borer 28/11, 2012 at 16:53

One more question, if while scanning I encounter a data in stack say "123123123" which also belongs to the region of memory allocated. That doesn't mean its a pointer, right? It can be very well an integer with that value. I believe that is what you meant by indistinguishable and conservative gc. – Borer 28/11, 2012 at 16:54

Yes, this is what I mean: without knowledge of structure layouts, your GC cannot distinguish between pointers and integers. That's why you have to assume that any integer that looks like a pointer is, in fact, a pointer. The kind of garbage collector that does this is called "conservative". – Centrum 28/11, 2012 at 17:13

Yeah, they use the term "conservative" because it sounds better than "promiscuous". ;) – Stipitate 28/11, 2012 at 18:54

"If you allocate 100 bytes and only keep a pointer to the tenth byte of the allocation" That is called a derived pointer and the gc can deal with it by recording the start address and size of every object it allocates. When it encounters such a pointer it just makes a lookup into a relatively fast data structure holding object start addresses and it is business as usual. – Distilled 10/9, 2016 at 13:16

The roots are basically all static and automatic object pointers. Static pointers would be linked inside the load modules. Automatic pointers must be found by scanning stack frames. Of course, you have no idea where in the stack frames the automatic pointers are.

Once you have the roots you need to scan objects and find all the pointers inside them. (This would include pointer arrays.) For that you need to identify the class object and somehow extract from it information about pointer locations. Of course, in C many objects are not virtual and do not have a class pointer within them.

Good luck!!

Added: One technique that could vaguely make your quest possible is "conservative" garbage collection. Since you intend to have your own allocator, you can (somehow) keep track of allocation sizes and locations, so you can pick any pointer-sized chunk out of storage and ask "Might this possibly be a pointer to one of my objects?" You can, of course, never know for sure, since random data might "look like" a pointer to one of your objects, but still you can, through this mechanism, scan a chunk of storage (like a frame in the call stack, or an individual object) and identify all the possible objects it might address.

With a conservative collector you cannot safely do object relocation/compaction (where you modify pointers to objects as you move them) since you might accidentally modify "random" data that looks like an object pointer but is in fact meaningful data to some application. But you can identify unused objects and free up the space they occupy for reuse. With proper design it's possible to have a very effective non-compacting GC.

(However, if your version of C allows unaligned pointers scanning could be very slow, since you'd have to try every variation on byte alignment.)

Stipitate answered 27/11, 2012 at 3:57 Comment(12)

Isn't there any easier way? Extracting pointers from stack frames and extracting pointer objects from C structs sounds like hard problem.. – Borer 27/11, 2012 at 4:3

Yes, and this would be why there are no garbage collectors for standard C. – Stipitate 27/11, 2012 at 4:4

(It's hard enough in Java, where everything is designed around garbage collection.) – Stipitate 27/11, 2012 at 4:5

Thanks, I'll try that. Please read the update I have one weird idea. – Borer 28/11, 2012 at 10:43

Also, I understand scanning stack frames and the data section to identify pointer like variables but how can I extract pointer information from custom objects like struct? – Borer 28/11, 2012 at 10:47

You can use "conservative" scanning on plain structs, where you don't know the pointer locations. You do need to know the size of the struct, however. – Stipitate 28/11, 2012 at 13:25

Thanks again. I'll surely go through that algorithm. Also, by scanning the struct you mean iterating over all the data in it and trying to guess if it's a pointer, right? – Borer 28/11, 2012 at 16:52

One more question, if while scanning I encounter a data in stack say "123123123" which also belongs to the region of memory allocations I have done. That doesn't mean its a pointer, right? It can be very well an integer with that value. I believe that is what meant by conservative gc, right? – Borer 28/11, 2012 at 16:56

Right on both counts. In Java you have the advantage that a pointer must point to the START of an object, so you can check that specifically, to weed out "look-alike" pointers. But in the general case for C you must check if the pointer falls anywhere INSIDE the object. – Stipitate 28/11, 2012 at 18:47

@HotLicks, the only conservative GC that I know about doesn't even do that. You absolutely need a reference to the beginning of an allocation (and that makes it unsuitable for certain situations). – Centrum 28/11, 2012 at 19:5

@Centrum -- Like I said, "the general case". There are going to be so many damn exceptions and special cases with C (and even worse with optimized C) that it's hard to pare it down to some "clean" rules. – Stipitate 28/11, 2012 at 19:11

I'm trying to think -- I think we went back and forth on the GC for iSeries Classic JDK. It depended on what the JITC was generating and they could then couldn't then could ... guarantee that there was always a reference to the start of the object. But it's no harder to check for a pointer inside an object than at the beginning, if you have the object boundary stuff worked out. – Stipitate 30/11, 2012 at 2:23

Recommended topics

Hot tags