How does copy-on-write in fork() handle multiple forks?

2

10

According to Wikipedia (which could be wrong):

When a fork() system call is issued, a copy of all the pages corresponding to the parent process is created, loaded into a separate memory location by the OS for the child process. But this is not needed in certain cases. Consider the case when a child executes an "exec" system call (which is used to execute any executable file from within a C program) or exits very soon after the fork(). When the child is needed just to execute a command for the parent process, there is no need for copying the parent process' pages, since exec replaces the address space of the process which invoked it with the command to be executed.

In such cases, a technique called copy-on-write (COW) is used. With this technique, when a fork occurs, the parent process's pages are not copied for the child process. Instead, the pages are shared between the child and the parent process. Whenever a process (parent or child) modifies a page, a separate copy of that particular page alone is made for that process (parent or child) which performed the modification. This process will then use the newly copied page rather than the shared one in all future references. The other process (the one which did not modify the shared page) continues to use the original copy of the page (which is now no longer shared). This technique is called copy-on-write since the page is copied when some process writes to it.
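The observable effect of the behaviour quoted above can be checked with a small C sketch. It does not prove that the kernel shares pages (that is an implementation detail a user program cannot easily see); it only shows that a write in the child never becomes visible in the parent. The function name `parent_value_after_child_write` is made up for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once, lets the child overwrite a variable, and returns the value
 * the parent still sees afterwards (or -1 on any error).
 * Hypothetical helper, written only for this illustration. */
int parent_value_after_child_write(void)
{
    /* volatile forces real loads/stores instead of a cached register value */
    volatile int value = 42;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the first write to the page holding `value`. With
         * copy-on-write, this is where the child would get its own copy. */
        value = 99;
        _exit(value == 99 ? 0 : 1);
    }

    /* Parent: wait for the child, then read our own copy of the variable. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return value;   /* unchanged by the child's write */
}
```

Calling the function from `main` and printing the result should show the parent's original value, since the child's store landed in the child's own address space.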

It seems that when either process tries to write to the page, a new copy of the page is allocated and assigned to the process that generated the page fault, and the original page is marked writable afterwards.

My question is: what happens if fork() is called multiple times before any of the processes attempts to write to a shared page?

Mapes answered 11/12, 2012 at 4:26 Comment(1)
"which could be wrong" is not true; this is how it happens in at least one large-scale, high-availability deployment that I know of. Large scale here means about 30K CPUs forking about 20K children. Deglutinate
12

If fork is called multiple times from the original parent process, then the parent and each of the children will have their pages marked as read-only. When a child process attempts to write data, the page is copied into that child's address space, and the copy is marked as writeable in the child; the original page is not made writeable in the parent.

If fork is called from the child process and the grandchild attempts to write, the shared page is copied once, for the grandchild only, and that copy is marked as writeable. The original parent and the first child keep sharing the original page read-only until one of them writes to it.
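The multi-fork case can be exercised from user space with a sketch like the following. It forks two children from the parent and a grandchild from the first child, all before anyone writes, then has each process write a different value. It only checks the visible semantics (every process keeps its own value); it cannot observe how many physical copies the kernel makes. All function names here are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define ORIGINAL 7

/* Runs in a forked process: checks the inherited value, writes a private one,
 * and exits 0 only if both the old and the new value looked as expected. */
static void write_and_exit(volatile int *value, int new_value)
{
    int ok = (*value == ORIGINAL);   /* no other process's write is visible */
    *value = new_value;              /* first write by this process */
    ok = ok && (*value == new_value);
    _exit(ok ? 0 : 1);
}

/* Returns 1 if the given child exited normally with status 0. */
static int exited_ok(pid_t pid)
{
    int status = 0;
    return waitpid(pid, &status, 0) == pid
        && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Returns the parent's value after all descendants have written,
 * or -1 if anything went wrong. */
int multi_fork_demo(void)
{
    volatile int value = ORIGINAL;

    pid_t first = fork();
    if (first < 0)
        return -1;
    if (first == 0) {
        pid_t grandchild = fork();        /* second fork, still no writes */
        if (grandchild < 0)
            _exit(1);
        if (grandchild == 0)
            write_and_exit(&value, 300);
        /* Wait until the grandchild has written, then verify that the
         * first child still sees ORIGINAL before writing its own value. */
        if (!exited_ok(grandchild))
            _exit(1);
        write_and_exit(&value, 100);
    }

    pid_t second = fork();                /* parent forks again before writing */
    if (second < 0) {
        exited_ok(first);                 /* reap the first child */
        return -1;
    }
    if (second == 0)
        write_and_exit(&value, 200);

    int ok_first = exited_ok(first);
    int ok_second = exited_ok(second);
    return (ok_first && ok_second) ? value : -1;
}
```

If every descendant exits with status 0, each of them saw the original value before its own write and its own value afterwards, and the parent still reads `ORIGINAL`.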

Belia answered 11/12, 2012 at 4:47 Comment(2)
Thanks perreal, I think the first part of the answer looks the same as rici's. I have the same question here: where is the reference counter stored in Linux? The best way I can come up with is to store it in the PTE, but obviously there's not enough room. Mapes
It is in the task->mm->map_count. Belia
2

The original page is only marked writeable if it belongs to a single process, which might not be the case if there were multiple forks. The new page is always marked as writeable because it belongs only to the process which attempted to write to it.
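One way to picture this rule is a toy user-space model, assuming each page carries a count of how many mappings share it. This is not kernel code: `toy_page`, `toy_mapping`, `toy_fork`, `toy_write` and `toy_scenario` are invented names, and the real fault handler is considerably more involved.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 16   /* tiny on purpose; real pages are much larger */

struct toy_page    { int refcount; char data[TOY_PAGE_SIZE]; };
struct toy_mapping { struct toy_page *page; int writable; };

/* Model of fork(): the new mapping shares the page, both sides go read-only. */
struct toy_mapping toy_fork(struct toy_mapping *parent)
{
    struct toy_mapping child = { parent->page, 0 };
    parent->writable = 0;
    parent->page->refcount++;
    return child;
}

/* Model of a write, including the "fault" on a read-only mapping.
 * Returns 0 on success, -1 on a bad offset or allocation failure. */
int toy_write(struct toy_mapping *m, size_t off, char byte)
{
    if (off >= TOY_PAGE_SIZE)
        return -1;
    if (!m->writable) {
        if (m->page->refcount > 1) {
            /* Still shared: the writer gets a private copy. */
            struct toy_page *copy = malloc(sizeof *copy);
            if (!copy)
                return -1;
            memcpy(copy->data, m->page->data, TOY_PAGE_SIZE);
            copy->refcount = 1;
            m->page->refcount--;
            m->page = copy;
        }
        /* Either a fresh copy or the sole remaining owner: writes allowed. */
        m->writable = 1;
    }
    m->page->data[off] = byte;
    return 0;
}

/* Two forks before any write, then all three mappings write.
 * Returns 1 if the model behaved as described, 0 otherwise. */
int toy_scenario(void)
{
    struct toy_page *orig = calloc(1, sizeof *orig);
    if (!orig)
        return 0;
    orig->refcount = 1;

    struct toy_mapping parent = { orig, 1 };
    struct toy_mapping c1 = toy_fork(&parent);
    struct toy_mapping c2 = toy_fork(&parent);   /* refcount is now 3 */

    int ok = toy_write(&c1, 0, 'a') == 0         /* copies; refcount 2 */
          && toy_write(&c2, 0, 'b') == 0         /* copies; refcount 1 */
          && toy_write(&parent, 0, 'p') == 0;    /* sole owner; no copy */

    ok = ok && c1.page != orig && c2.page != orig && parent.page == orig
            && orig->refcount == 1
            && c1.page->data[0] == 'a' && c2.page->data[0] == 'b'
            && orig->data[0] == 'p';

    if (c1.page != orig) free(c1.page);
    if (c2.page != orig) free(c2.page);
    free(orig);
    return ok;
}
```

In this model the first two writers each receive a writeable copy, while the last remaining owner simply has its original page made writeable again, which matches the rule stated above.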

Klotz answered 11/12, 2012 at 4:42 Comment(4)
Hi rici, thanks for the answer. So how does Linux track how many virtual addresses are mapped to a shared page? Mapes
It keeps a reference count. After all, it's in complete charge of page allocations, so it's not difficult. Anyway, it needs to know when the page's last consumer terminates. Klotz
Sorry for asking for specifics, but may I ask where that reference count is stored in Linux? Mapes
@hydrology It's stored in the struct page, see lxr.linux.no/linux+v3.6.10/include/linux/mm_types.h Klotz
