Get the copy-on-write behaviour of fork()ing, without fork()

I have a large buffer:

char *buf = malloc(1000000000); // 1GB

If I forked a new process, it would have a buf which shared memory with the parent's buf until one or the other wrote to it. Even then, only one new 4KiB block would need to be allocated by the kernel, the rest would continue to be shared.

I'd like to make a copy of buf, but I'm only going to change a little of the copy. I'd like copy-on-write behaviour without forking. (Like you get for free when forking.)

Is this possible?

Linkboy answered 12/6, 2012 at 14:38 Comment(9)
Sure, but it won't be 'for free': you'll have to do your own memory management and keep track of changes.Sharmainesharman
Yes, I want 'for free'. I was wondering whether there were any mmap based solutions, or maybe something I hadn't even imagined.Linkboy
Perhaps mmap with MAP_ANONYMOUS and MAP_PRIVATE would do the job?Linkboy
possible duplicate of Can I do a copy-on-write memcpy in Linux?Wurster
1000000000 Byte is not 1 GB. It should be 1073741824 (1024 * 1024 * 1024).Mayolamayon
@wabepper it depends on whose definition you go by. According to microsoft windows, yes, that's the case, but, according to more recent versions of Mac OSX, it IS 1 GB. There isn't that big of a difference, and in most situations the difference is marginal, at the point which you are storing that amount of information.Anthropophagite
Also, I just wanted to say MOO!Anthropophagite
@wabepper (feeding the troll) 10^9 bytes == 1 GB, 2^30 bytes == 1 GiB. Giga is an SI prefix.Linkboy
See also: #16966005 tl;dr: it's difficult on Linux.Gregorygregrory

You'll want to create a file on disk or a POSIX shared memory segment (shm_open) for the block. The first time, map it with MAP_SHARED. When you're ready to make a copy and switch to COW, call mmap twice more: once with MAP_FIXED and MAP_PRIVATE, passing the original address, to map over the top of your original mapping, and once with MAP_PRIVATE alone (no MAP_FIXED) to make the second copy at a new address. This should get you the effect you want.
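
A minimal sketch of those steps in C, assuming a temporary file from mkstemp as the backing object (a shm_open segment would be mapped the same way once it has been given a size). The function name cow_copy_demo, the /tmp path and the four-page size are illustrative choices, not part of the original answer:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if buf and buf2 behave as independent
 * copy-on-write views, 1 if they do not, -1 if setup fails. */
int cow_copy_demo(void)
{
    const size_t len = 4 * (size_t)sysconf(_SC_PAGESIZE);
    char path[] = "/tmp/cow-demo-XXXXXX";      /* illustrative location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                               /* storage stays until fd and maps are gone */
    if (ftruncate(fd, (off_t)len) == -1) {      /* a new file has size zero */
        close(fd);
        return -1;
    }

    /* 1. Fill the block through a shared mapping so the data reaches the file. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memset(buf, 'a', len);

    /* 2. Switch buf to copy-on-write in place: same address, MAP_FIXED. */
    if (mmap(buf, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_FIXED, fd, 0) == MAP_FAILED) {
        munmap(buf, len);
        close(fd);
        return -1;
    }

    /* 3. Make the second copy at a fresh address: no MAP_FIXED here. */
    char *buf2 = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (buf2 == MAP_FAILED) {
        munmap(buf, len);
        close(fd);
        return -1;
    }

    /* Each write now lands on a private page of the mapping it went through. */
    buf[0] = 'b';
    buf2[1] = 'c';
    int ok = buf[0] == 'b' && buf[1] == 'a' &&
             buf2[0] == 'a' && buf2[1] == 'c';

    munmap(buf2, len);
    munmap(buf, len);
    close(fd);
    return ok ? 0 : 1;
}
```

Calling cow_copy_demo() from main and checking for a zero return is enough to try it. Every mmap result is compared against MAP_FAILED, since a failed call would otherwise hand back an unusable pointer.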

Kerge answered 12/6, 2012 at 15:18 Comment(12)
That looks very encouraging, but I can't get it to work. I get a bus error (on line 13). fd == 3. Could you point out my stupid mistake? gist.github.com/2924412Linkboy
You need ftruncate to give the shared memory segment a size. The initial size is zero.Kerge
Thanks, I added an ftruncate and now have a segfault instead of a bus error, still at line 14.Linkboy
I suspect the crash is actually at line 17 where you write to the buffer. Your debug printfs are useless because you're not flushing output and they don't even end in \n.Kerge
Or it could even be way down. Your use of MAP_FIXED is wrong. The second call to mmap should have MAP_FIXED, if you intend on making further modifications through buf, but you need to pass buf instead of NULL as the first argument in that case. As it stands, both of your mmap calls with MAP_FIXED are trying to map at the 0 address, which probably fails (you're not checking any return values!) and then you end up trying to use MAP_FAILED (usually defined as (void *)-1) as if it were a valid pointer.Kerge
The final mmap call should NOT have MAP_FIXED since you want a new virtual address for it.Kerge
Thanks for all your help. This will take me a few minutes to get right.Linkboy
It works! gist.github.com/2924412 What was the point of the commented out remapping of buf? I don't seem to need it. Many thanks.Linkboy
If you don't do the re-mapping of buf and modify buf after making the new private mapping but before modifying the corresponding page in buf2, it's possible that your changes to buf will wrongly show up in buf2. If you never intend to modify buf after making buf2, you can skip re-mapping buf private. But if you want to be able to modify them both without making your program randomly misbehave, you need the private re-mapping of buf.Kerge
Thanks, that explains it. I just need a series of 'frozen' buffers and one which I can mutate.Linkboy
AFAIU, it does not work. It might work on some OSes but on Linux, when you modify the original MAP_SHARED mapping, the MAP_PRIVATE will still see the pages backing the real file. The MAP_PRIVATE pages will only "fork" from the file when the data is modified via the MAP_PRIVATE VMA.Louls
Not if you write to the private map first.Kerge
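
The point argued in the last few comments can be checked with a small experiment. This is a sketch under assumptions: buf is deliberately left MAP_SHARED, the name stale_view_demo and its two output parameters are hypothetical, and a temporary file stands in for the shared memory segment. POSIX leaves it unspecified whether a private mapping sees later changes to the file; on Linux an untouched private page typically does, while a page already written through the private mapping does not:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: reports what buf2 sees on an untouched page and on a
 * page it wrote to first, after buf (still MAP_SHARED) is modified.
 * Returns 0 on success, -1 if setup fails. */
int stale_view_demo(char *untouched_sees, char *touched_sees)
{
    const size_t page = (size_t)sysconf(_SC_PAGESIZE);
    const size_t len = 2 * page;
    char path[] = "/tmp/cow-stale-XXXXXX";     /* illustrative location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return -1;
    }

    char *buf  = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,  fd, 0);
    char *buf2 = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (buf == MAP_FAILED || buf2 == MAP_FAILED) {
        if (buf != MAP_FAILED)  munmap(buf, len);
        if (buf2 != MAP_FAILED) munmap(buf2, len);
        close(fd);
        return -1;
    }
    memset(buf, 'a', len);

    /* Write to the second page through buf2 first; storing the same value
     * back is still a write, so that page gets its own private copy. */
    volatile char *p = buf2 + page;
    *p = *p;

    /* Now modify both pages through the shared mapping. */
    buf[0] = 'z';
    buf[page] = 'z';

    *untouched_sees = buf2[0];     /* unspecified by POSIX; typically 'z' on Linux */
    *touched_sees   = buf2[page];  /* 'a': this page was privately copied earlier */

    munmap(buf2, len);
    munmap(buf, len);
    close(fd);
    return 0;
}
```

If the untouched page reports the new value, that is the leak described above, and it is the reason for re-mapping buf with MAP_PRIVATE before modifying it.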
