The answer to both of them is the same and very simple indeed.
As you might know, Glibc malloc
will use mmap
to directly allocate a block larger than 128 KiB. However, it will need to write bookkeeping information below the pointer - because how else would free
know what it should be done when just given a pointer. If you print the pointer that malloc
returned, you'll see that it is not page aligned.
Here's a program that demonstrates all this:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/resource.h>
#define ALLOCATION_SIZE (1 << 20)
int main(void) {
struct rusage usage = {0};
getrusage(RUSAGE_SELF, &usage);
printf("1st before malloc: %lu\n", usage.ru_minflt);
getrusage(RUSAGE_SELF, &usage);
printf("2nd before malloc: %lu\n", usage.ru_minflt);
char *p = malloc(ALLOCATION_SIZE);
printf("pointer returned from malloc: %p\n", p);
getrusage(RUSAGE_SELF, &usage);
printf("after malloc: %lu\n", usage.ru_minflt);
p[0] = 42;
getrusage(RUSAGE_SELF, &usage);
printf("after writing to the beginning of the allocation: %lu\n", usage.ru_minflt);
for (size_t i = 0; i < ALLOCATION_SIZE; i++) {
p[i] = 42;
}
getrusage(RUSAGE_SELF, &usage);
printf("after writing to every byte of the allocation: %lu\n", usage.ru_minflt);
}
outputs something like
1st before malloc: 108
2nd before malloc: 118
pointer returned from malloc: 0x7fbcb32aa010
after malloc: 119
after writing to the beginning of the allocation: 119
after writing to every byte of the allocation: 375
i.e. getrusage
and printf
cause page faults the first time around, so we call it twice - now the fault count is 118 before the malloc
call, and after malloc
it is 119. If you look at the pointer, 0x010 is not 0x000 i.e. the allocation is not page-aligned - those first 16 bytes contain bookkeeping information for free
so that it knows that it needs to use munmap
to release the memory block, and the size of the allocated block!
Now naturally this explains why the size increase was 1028 Ki instead of 1024 Ki - one extra page had to be reserved so that there would be enough space for those 16 bytes! It also explains the source of the page fault - because malloc
had to write the bookkeeping information to the copy-on-write zeroed page. This can be proved by writing to the first byte of the allocation - it doesn't cause a page fault any longer.
Finally the for loop will modify the pages and touch the remaining 256 pages out of those 257 mapped in.
And if you change ALLOCATION_SIZE
to ((1 << 20) - 16)
i.e. allocate just 16 bytes less, you'd see that the both virtual size and the number of page faults would match the values you were expecting.
malloc
will read/write to memory to maintain its data structures (e.g. used and free lists). – Mohlermalloc
actually write to memory. This is new to me. Thank you. – Pinot