What's unsafe/legacy about brk/sbrk?
Asked Answered
F

1

11

I've heard in a lot of places (musl mailing list, macOS forums, etc.) that brk() and sbrk() are unsafe. Many of these places either don't give explanations at all, or give very vague explanations. For example, this link states that "these functions are fundamentally broken", and goes on to say that the malloc and sbrk subsystems are utterly broken, that they ruin the heap, et al.

My question is: Why is this so? If malloc is used in such a way that it allocates a block of memory with sbrk large enough to quell or substantially decrease the need for further allocations, then shouldn't sbrk and brk be perfectly safe to use?

Here are my implementations of sbrk and brk:

sbrk:

#include <unistd.h>
#include <stddef.h>

void *sbrk(intptr_t inc)
{
        intptr_t curbrk = syscall(SYS_brk, NULL);

        if( inc == 0 ) goto ret;

        if( curbrk < 0 ) return (void *)-1;

        curbrk((void *)(curbrk+inc));
ret:
        return (void *)curbrk;
}

brk:

#include <unistd.h>

intptr_t brk(void *ptr)
{
        if( (void *)syscall(SYS_brk, ptr) != ptr )
                return -1;
        else
                return 0;
}
Free answered 26/3, 2019 at 23:50 Comment(10)
There is explenation right there in the text For an application to use them correctly, it must depend on the malloc subsystem never being used, but this is impossible to guarantee since malloc may be used internally by libc functions without documenting this to the application.. If your app calls sbrk and your app calls malloc and malloc calls sbrk and sbrk function uses global context, the result will be a mess. They want to eliminate the sbrk function, to make users not use it, not because it is a bad design per se.Gerdagerdeen
What does "global context" mean? A quick Google search doesn't reveal anything.Free
sbrk usually operates on a single global variable, ex __curbrk in glibc or static char *heap_end; in newlib. That means that all "contexts" (threads, functions, malloc, printf and user app) share the same ("global" to the process) data segment.Gerdagerdeen
That's odd, my implementation just uses the return code from syscall(SYS_brk, 0).Free
My guess is that certain malloc implementations are written to assume that nobody else is calling sbrk behind their back. For instance, suppose you call p = malloc(N) where N is some large number (maybe a multiple of page size). malloc does sbrk(N) to get memory from the OS and returns this pointer to you. Later (without any intervening calls to malloc) you do free(p). Now malloc "knows" that it can do sbrk(-N) to return the memory to the OS. If you have called sbrk in between, then you have a problem.Resolute
Of an alternative which would be a lot safer is to sbrk a multiple of N and then implement a free(...) which only marks the memory as available for other uses. You really have a lot of control when you write a standard C library from scratch :).Free
That would alleviate the possibility of an sbrk screwing anything up, because malloc's memory has already been allocated.Free
Then, of course, at the end of program execution all freed memory could be cleaned up and deallocated.Free
awkwardly written, but good basic malloc introduction for writing your own memory manager with brk/sbrk (intro level only)Stoneham
Thanks, I'll look into it.Free
S
6

The reality highly depends on the implementation but here are some elements:

unsafe

brk/sbrk were invented to allow a process to request more memory from the system, and release it in a single contiguous segment. As such, they were used by many malloc and free implementations. The problem was that, as it returned a unique segment, things would go wrong as multiple modules (of the same process) use it directly. It became even worse in a multi-threaded process because of race conditions. Suppose 2 threads want to add new memory. They will look at the current top address with sbrk(0), see the same address, request new memory with either brk or sbrk, and because of the race condition, will both use the same memory.

Even in a single threaded process, some malloc and free implementations assume that they only are allowed to use the low level s/brk interface, and that any other code should use them. In that case things will go wrong if the image of the break segment that they internally maintain is no longer the assumed value. They should have to guess that some parts of the segment are "reserved" for other uses, possibly breaking the ability to release any memory.

For that reason, user code should never directly use brk/sbrk and only rely on malloc/free. If, and only if, you are writing an implementation of the standard library including malloc/realloc/calloc/free, you can safely use brk/sbrk

legacy

On modern system, mmap can make a much nicer usage of virtual memory management. You can use as many dynamic memory segments as you need with no interaction between them. So, on a modern system, unless you have a specific need for memory allocation using brk/sbrk, you should use mmap.

portability

The FreeBSD reference for brk and sbrk states this:

The brk() and sbrk() functions are legacy interfaces from before the advent of modern virtual memory management.

and later:

BUGS: Mixing brk() or sbrk() with malloc(3), free(3), or similar functions will result in non-portable program behavior.

Snooperscope answered 27/3, 2019 at 15:5 Comment(2)
That's about as good an explanation I've seen. Don't mix malloc with your owns calls to brk/sbrk. I think you are right. If malloc can't find the break where it left it, then it likely won't release any of the memory it originally allocated. I haven't looked at the malloc source, but that makes sense.Stoneham
This is definitely a better explanation than I've seen so far. However, it begs the question: If I can make malloc() and free() check for program break changes, is it possible to safely use sbrk() and brk() with my C library? I'll also give you a +1 for mentioning threads. I haven't implemented them yet, so I hadn't thought about it. Thanks!Free

© 2022 - 2024 — McMap. All rights reserved.