I have a memory segment which was obtained via mmap
with MAP_ANONYMOUS
.
How can I allocate a second memory segment of the same size which references the first one and make both copy-on write in Linux (Working Linux 2.6.36 at the moment)?
I want to have exactly the same effect as fork
, just without creating a new process. I want the new mapping to stay in the same process.
The whole process has to be repeatable on both the origin and copy pages (as if parent and child would continue to fork
).
The reason why I don't want to allocate a straight copy of the whole segment is because they are multiple gigabytes large and I don't want to use memory which could be copy-on-write shared.
What I have tried:
mmap
the segment shared, anonymous.
On duplication mprotect
it to read-only and create a second mapping with remap_file_pages
also read-only.
Then use libsigsegv
to intercept write attempts, manually make a copy of the page and then mprotect
both to read-write.
Does the trick, but is very dirty. I am essentially implementing my own VM.
Sadly mmap
ing /proc/self/mem
is not supported on current Linux, otherwise a MAP_PRIVATE
mapping there could do the trick.
Copy-on-write mechanics are part of the Linux VM, there has to be a way to make use of them without creating a new process.
As a note: I have found the appropriate mechanics in the Mach VM.
The following code compiles on my OS X 10.7.5 and has the expected behaviour:
Darwin 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64 i386
gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
#include <sys/mman.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#ifdef __MACH__
#include <mach/mach.h>
#endif
int main() {
mach_port_t this_task = mach_task_self();
struct {
size_t rss;
size_t vms;
void * a1;
void * a2;
char p1;
char p2;
} results[3];
size_t length = sysconf(_SC_PAGE_SIZE);
vm_address_t first_address;
kern_return_t result = vm_allocate(this_task, &first_address, length, VM_FLAGS_ANYWHERE);
if ( result != ERR_SUCCESS ) {
fprintf(stderr, "Error allocating initial 0x%zu memory.\n", length);
return -1;
}
char * first_address_p = first_address;
char * mirror_address_p;
*first_address_p = 'a';
struct task_basic_info t_info;
mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;
task_info(this_task, TASK_BASIC_INFO, (task_info_t)&t_info, &t_info_count);
task_info(this_task, TASK_BASIC_INFO, (task_info_t)&t_info, &t_info_count);
results[0].rss = t_info.resident_size;
results[0].vms = t_info.virtual_size;
results[0].a1 = first_address_p;
results[0].p1 = *first_address_p;
vm_address_t mirrorAddress;
vm_prot_t cur_prot, max_prot;
result = vm_remap(this_task,
&mirrorAddress, // mirror target
length, // size of mirror
0, // auto alignment
1, // remap anywhere
this_task, // same task
first_address, // mirror source
1, // Copy
&cur_prot, // unused protection struct
&max_prot, // unused protection struct
VM_INHERIT_COPY);
if ( result != ERR_SUCCESS ) {
perror("vm_remap");
fprintf(stderr, "Error remapping pages.\n");
return -1;
}
mirror_address_p = mirrorAddress;
task_info(this_task, TASK_BASIC_INFO, (task_info_t)&t_info, &t_info_count);
results[1].rss = t_info.resident_size;
results[1].vms = t_info.virtual_size;
results[1].a1 = first_address_p;
results[1].p1 = *first_address_p;
results[1].a2 = mirror_address_p;
results[1].p2 = *mirror_address_p;
*mirror_address_p = 'b';
task_info(this_task, TASK_BASIC_INFO, (task_info_t)&t_info, &t_info_count);
results[2].rss = t_info.resident_size;
results[2].vms = t_info.virtual_size;
results[2].a1 = first_address_p;
results[2].p1 = *first_address_p;
results[2].a2 = mirror_address_p;
results[2].p2 = *mirror_address_p;
printf("Allocated one page of memory and wrote to it.\n");
printf("*%p = '%c'\nRSS: %zu\tVMS: %zu\n",results[0].a1, results[0].p1, results[0].rss, results[0].vms);
printf("Cloned that page copy-on-write.\n");
printf("*%p = '%c'\n*%p = '%c'\nRSS: %zu\tVMS: %zu\n",results[1].a1, results[1].p1,results[1].a2, results[1].p2, results[1].rss, results[1].vms);
printf("Wrote to the new cloned page.\n");
printf("*%p = '%c'\n*%p = '%c'\nRSS: %zu\tVMS: %zu\n",results[2].a1, results[2].p1,results[2].a2, results[2].p2, results[2].rss, results[2].vms);
return 0;
}
I want the same effect in Linux.
/dev/shm
(tmpfs) is as far as I am willing to go with file-backed memory. – Raymervm_remap
call in the mach code. This is exactly the semantics that I want - just in Linux. – Raymer