Minimal runnable example
What does brk( ) system call do?
Asks the kernel to let you read and write to a contiguous chunk of memory called the heap.
If you don't ask, it might segfault you when you try to read and write from that area.
Without brk:
#define _GNU_SOURCE
#include <unistd.h>

int main(void) {
    /* Get the first address beyond the end of the heap. */
    void *b = sbrk(0);
    int *p = (int *)b;
    /* May segfault because it is outside of the heap. */
    *p = 1;
    return 0;
}
With brk:
#define _GNU_SOURCE
#include <assert.h>
#include <unistd.h>

int main(void) {
    void *b = sbrk(0);
    int *p = (int *)b;
    /* Move it 2 ints forward */
    brk(p + 2);
    /* Use the ints. */
    *p = 1;
    *(p + 1) = 2;
    assert(*p == 1);
    assert(*(p + 1) == 2);
    /* Deallocate back. */
    brk(b);
    return 0;
}
GitHub upstream.
The above might not hit a new page, and so might not segfault even without the brk. Here is a more aggressive version that allocates 16 MiB and is very likely to segfault without the brk:
#define _GNU_SOURCE
#include <assert.h>
#include <unistd.h>

int main(void) {
    void *b;
    char *p, *end;

    b = sbrk(0);
    p = (char *)b;
    end = p + 0x1000000;
    brk(end);
    while (p < end) {
        *(p++) = 1;
    }
    brk(b);
    return 0;
}
Tested on Ubuntu 18.04.
Virtual address space visualization
Before brk:
+------+ <-- Heap Start == Heap End
After brk(p + 2):
+------+ <-- Heap Start + 2 * sizeof(int) == Heap End
|      |
| You can now write your ints
| in this memory area.
|      |
+------+ <-- Heap Start
After brk(b):
+------+ <-- Heap Start == Heap End
To better understand address spaces, you should make yourself familiar with paging: How does x86 paging work?
Why do we need both brk and sbrk?
brk could of course be implemented with sbrk plus offset calculations. Both exist just for convenience.
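As a rough illustration of that point, here is a minimal sketch of brk written in terms of sbrk. The name my_brk is hypothetical and not part of any library, and the sketch assumes a glibc-style sbrk that actually moves the break (some other C libraries refuse any nonzero increment).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: emulate brk(addr) using only sbrk.
 * Returns 0 on success and -1 on failure, like brk. */
int my_brk(void *addr) {
    /* sbrk(0) returns the current program break without moving it. */
    void *cur = sbrk(0);
    if (cur == (void *)-1)
        return -1;
    /* Signed distance from the current break to the requested address. */
    intptr_t delta = (intptr_t)((uintptr_t)addr - (uintptr_t)cur);
    /* sbrk(delta) moves the break by that many bytes, up or down. */
    return sbrk(delta) == (void *)-1 ? -1 : 0;
}
```

Calling my_brk((char *)sbrk(0) + 4096) should then behave like the brk calls in the examples above, and my_brk with the saved original break should give the memory back.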
Under the hood, Linux kernel v5.0 has a single system call, brk, that is used to implement both: https://github.com/torvalds/linux/blob/v5.0/arch/x86/entry/syscalls/syscall_64.tbl#L23
12 common brk __x64_sys_brk
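To see that single system call directly, the sketch below bypasses the glibc wrappers and invokes it through syscall(2). It assumes Linux, where the raw call returns the resulting program break (and simply the unchanged break when the request cannot be satisfied), which differs from the 0 / -1 convention of the glibc brk wrapper. The helper names raw_brk and raw_current_break are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: invoke the raw Linux brk system call.
 * Returns the program break after the call. Moving the break this way
 * leaves glibc's cached copy stale, so only use it for querying here. */
uintptr_t raw_brk(uintptr_t addr) {
    return (uintptr_t)syscall(SYS_brk, addr);
}

/* Address 0 can never be satisfied, so the kernel just reports the
 * current break. This is how sbrk(0) can be built on top of brk. */
uintptr_t raw_current_break(void) {
    return raw_brk(0);
}
```

With glibc, raw_current_break() is expected to match sbrk(0) as long as nothing moves the break between the two calls.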
Is brk POSIX?
brk used to be POSIX, but it was removed in POSIX 2001, thus the need for _GNU_SOURCE to access the glibc wrapper.
The removal is likely due to the introduction of mmap, which is a superset that allows multiple ranges to be allocated and offers more allocation options.
I think there is no valid case where you should use brk instead of malloc or mmap nowadays.
brk vs malloc
brk is one old way of implementing malloc.
mmap is the newer, strictly more powerful mechanism, which likely all current POSIX systems use, at least in part, to implement malloc. Here is a minimal runnable mmap memory allocation example.
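For comparison with the brk examples, this is a minimal sketch of allocating memory with an anonymous private mmap. The helper names map_alloc and map_free are hypothetical. Unlike brk, each call returns an independent range, so several can be live at once and freed in any order.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: get `size` bytes of zero-filled, private,
 * readable and writable memory from the kernel. NULL on failure. */
void *map_alloc(size_t size) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release a range obtained from map_alloc.
 * Returns 0 on success, -1 on failure, like munmap. */
int map_free(void *p, size_t size) {
    return munmap(p, size);
}
```

A caller would write through the returned pointer as with malloc, then pass the same pointer and size to map_free.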
Can I mix brk and malloc?
If your malloc is implemented with brk, I have no idea how mixing the two could possibly not blow things up, since brk only manages a single range of memory.
I could not, however, find anything about it in the glibc docs, e.g.:
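Things will likely just work when malloc is backed by mmap. Note, however, that glibc's malloc still uses brk for small allocations in the main arena and only switches to mmap for large ones, so this is not guaranteed there.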
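One way to get a hint about which mechanism your malloc uses for a given request size is to watch the program break around the call. The sketch below does that; break_delta_during_malloc is a hypothetical helper, and the result is only suggestive, since malloc may also satisfy a request from memory it already holds.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: report how many bytes the program break moved
 * while malloc(size) ran, and hand back the allocation through `out`.
 * Nonzero suggests malloc grew the heap with brk for this request;
 * zero suggests it reused memory it already had or used mmap. */
intptr_t break_delta_during_malloc(size_t size, void **out) {
    uintptr_t before = (uintptr_t)sbrk(0);
    *out = malloc(size);
    uintptr_t after = (uintptr_t)sbrk(0);
    return (intptr_t)(after - before);
}
```

Trying it with a small size and then a size of several megabytes, as the first allocations in a fresh process, should show whether the two are handled differently on your system.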
See also:
More info
Internally, the kernel decides if the process can have that much memory, and earmarks memory pages for that usage.
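The kernel's decision is visible from user space as a failure return. The sketch below asks for an impossibly large increment and reports the errno value; try_sbrk is a hypothetical helper, and the sketch assumes the usual behavior where a refused request makes sbrk return (void *)-1 with errno set to ENOMEM.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: try to move the break by `increment` bytes.
 * Returns 0 on success, or the errno value reported on failure. */
int try_sbrk(intptr_t increment) {
    errno = 0;
    if (sbrk(increment) == (void *)-1)
        return errno;
    return 0;
}
```

For example, try_sbrk(INTPTR_MAX) is expected to report ENOMEM because no process can be granted that much heap, while try_sbrk(0) only queries the break and succeeds.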
This explains how the stack compares to the heap: What is the function of the push / pop instructions used on registers in x86 assembly?
brk() system call is more useful in assembly language than in C. In C, malloc() should be used instead of brk() for any data-allocation purposes -- but this does not invalidate the proposed question in any way. – Dreg
brk() and sbrk()? The stacks are managed by the page allocator, at a much lower level. – Joshi