On Linux, will a zeroed page take a page fault on first read or on first write?

My question is Linux-specific and needs an understanding of the kernel, virtual memory, mmap, and page faults. I have a C program with large static arrays, which go into the bss section (memory initialized to zero). When the program starts, this memory is not physically allocated; there is only virtual memory, and every page of it is mapped to the special zero page (a page of all zeroes). When the application accesses such a page, a page fault is generated and a physical page is allocated.

The question is: will this page fault be generated on the first read access or on the first write access to a page from the bss section?

Boxboard answered 24/8, 2012 at 19:19

Linux maps a zero page into this memory (one shared zero page for the whole region) and then essentially applies COW (copy-on-write) behavior to the page once you change its contents. So a read does not allocate anything: you will not get read faults beyond a minor page fault, which means the page was already in memory but not yet mapped (or, for a page that was written earlier, a fault because it was swapped out).

So only a write fault causes a new physical page to be allocated in place of the zero page.
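To see this for yourself, here is a minimal sketch that touches every page of a static array, first by reading and then by writing. The names `big`, `ARRAY_BYTES`, `resident_pages`, `read_every_page` and `write_every_page` are made up for illustration, and the RSS figure is taken from the second field of `/proc/self/statm`, so this assumes procfs is mounted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

#define ARRAY_BYTES (64UL * 1024 * 1024)

/* Large static array: it lands in .bss, so it starts out as all zeroes. */
static unsigned char big[ARRAY_BYTES];

/* Resident set size in pages, taken from the second field of
 * /proc/self/statm. Returns -1 if the file cannot be read. */
static long resident_pages(void)
{
    long size = 0, resident = -1;
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld %ld", &size, &resident) != 2)
        resident = -1;
    fclose(f);
    return resident;
}

/* Read one byte from every page; the sum is 0 for untouched .bss. */
static unsigned long read_every_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    volatile const unsigned char *p = big;
    unsigned long sum = 0;
    for (size_t i = 0; i < ARRAY_BYTES; i += (size_t)page)
        sum += p[i];
    return sum;
}

/* Write one byte to every page; returns the number of pages touched. */
static size_t write_every_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    volatile unsigned char *p = big;
    size_t touched = 0;
    for (size_t i = 0; i < ARRAY_BYTES; i += (size_t)page) {
        p[i] = 0;   /* the value does not matter, only that it is a write */
        touched++;
    }
    return touched;
}
```

Call `resident_pages()` before the read pass, after it, and again after the write pass. On a typical setup I would expect RSS to stay roughly flat across the read pass and to grow by about the size of the array after the write pass, but the exact numbers depend on kernel configuration (transparent huge pages, for instance), so treat it as an experiment rather than a guarantee.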

Pelagi answered 24/8, 2012 at 19:23 Comment(6)
Jesus, thanks for the quick answer. Is there some way to disable this COW on zero pages? There is MAP_POPULATE for mmaps, but what about static arrays (.bss)? - Boxboard
@Boxboard Oh OK, I see what you mean. Just iterate over the array and write zeroes to it again; that will cause a write fault (a write fault doesn't care what you're writing, it will just COW the page). This way all your pages will be in memory and mapped to separate physical pages. - Pelagi
@Boxboard The above is essentially what MAP_POPULATE does for mmap: it pre-faults the region so that later accesses to the pages do not cause a minor fault. - Pelagi
@Boxboard You can use mlock(const void *addr, size_t len) with the address of your first array and the size of your .bss section. - Barnet
mlock does something extra: it also prevents those pages from being paged out. - Boxboard
Yeah, I wouldn't recommend using mlock unless you're writing a real-time application where regions of memory being swapped out could cause a missed deadline. - Pelagi
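The two pre-faulting approaches from the comments could be sketched roughly like this. `prefault_by_writing` and `map_populated` are hypothetical helper names, not an existing API; the only real calls are `sysconf`, `mmap` and the `MAP_POPULATE` flag. The write loop stores each byte back to itself, so it is also safe on memory that already holds data.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Force a write fault on every page of [addr, addr + len) by writing
 * one byte per page back to itself. Contents are left unchanged. */
static void prefault_by_writing(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    volatile unsigned char *p = addr;
    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = p[off];
    /* addr may not be page-aligned, so touch the last byte as well. */
    if (len > 0)
        p[len - 1] = p[len - 1];
}

/* For memory you map yourself: ask the kernel to pre-fault the whole
 * mapping at mmap() time. Returns NULL on failure. */
static void *map_populated(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

For a static array you would call something like `prefault_by_writing(my_array, sizeof my_array)` once at startup. Unlike mlock, neither approach pins the pages, so they can still be swapped out later.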
