I'm assuming if this is possible it would most likely be via mmap
My original question was: can mmap() return the same pointer for each user of the same piece of shared memory?
Ultimately an address in each process's address space maps to the same physical memory.
The question is whether the address provided to each process can be made the same somehow.
Why does this matter?
Consider the case of sharing a data structure which contains pointers. If the address mapping is shared, those pointers are valid for both processes. If the address mapping is not shared, a stored pointer is only meaningful in the address space of the process that wrote it. If the other process dereferences it, it will read unrelated memory or trigger an access violation.
For shared libraries we rely on position independent code but there can still be pointers that the loader needs to adjust on load. (I am unclear whether this is done exactly once or whether each process gets its own private copy of pages containing such pointers adjusted to fit its own address space).
If the address mapping is not shared then for data we either need to create position independent structures (unfortunate acronym) using offsets instead of absolute pointers or limit ourselves to data structures not containing any pointers.
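To make the "offsets instead of absolute pointers" option concrete, here is a minimal sketch of a self-relative pointer. The names OffsetPtr, Record and offset_survives_relocation are hypothetical, invented for this illustration. The byte copy between two buffers merely stands in for the same block being mapped at two different addresses, and like the rest of this topic it leans on behaviour the C++ standard does not strictly bless.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical self-relative pointer: stores the distance from its own
// address to the target, so it stays valid wherever the block is mapped.
template <typename T>
class OffsetPtr {
public:
    OffsetPtr() = default;
    // A copy lives at a different address, so the offset is recomputed.
    OffsetPtr(const OffsetPtr& other) { set(other.get()); }
    OffsetPtr& operator=(const OffsetPtr& other) {
        set(other.get());
        return *this;
    }

    void set(T* target) {
        offset_ = target ? reinterpret_cast<std::intptr_t>(target) -
                               reinterpret_cast<std::intptr_t>(this)
                         : 0;
    }

    T* get() const {
        if (offset_ == 0) return nullptr;
        return reinterpret_cast<T*>(
            reinterpret_cast<std::intptr_t>(this) + offset_);
    }

private:
    // 0 encodes null, so this sketch cannot represent a pointer to itself.
    std::intptr_t offset_ = 0;
};

// Hypothetical record whose pointer refers to data in the same block.
struct Record {
    int value;
    OffsetPtr<int> to_value;
};

// Copies the raw bytes to a second buffer (a stand-in for a second mapping
// at a different address) and checks the pointer resolves inside the copy.
bool offset_survives_relocation() {
    alignas(Record) unsigned char first[sizeof(Record)];
    alignas(Record) unsigned char second[sizeof(Record)];

    Record* a = new (first) Record{};
    a->value = 42;
    a->to_value.set(&a->value);
    assert(a->to_value.get() == &a->value);

    std::memcpy(second, first, sizeof(Record));
    Record* b = reinterpret_cast<Record*>(second);
    return b->to_value.get() == &b->value && *b->to_value.get() == 42;
}
```

The cost is that every dereference does an addition, and standard containers only cooperate if their allocator's pointer type is such a fancy pointer.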
In my application I have several data structures I wish to share which are essentially vectors. I do not wish to persist or serialise/deserialise them. I want the same physical memory to be used by each process. Basically it's a search problem: the shared data is the haystack and each process searches for its own set of needles. For various reasons we do not want them to be threads, though that would eliminate this question entirely because threads do share the address space.
I can create vectors inside shared memory using a custom allocator. Say:
SharedMemoryAllocator sharedAlloc(somePointerFromMmap);
class Foo
{
    std::vector<Bar, SharedMemoryAllocator> A;
    std::vector<Snafu, SharedMemoryAllocator> B;
public:
    // Both vectors take their storage from the shared-memory allocator.
    Foo(SharedMemoryAllocator& sharedAlloc):
        A(sharedAlloc),
        B(sharedAlloc)
    {
    }
};
If the address mapping is shared it is safe to use such an allocator for the complete structure:
unique_ptr<Foo> fubar = sharedAlloc.new(Foo(sharedAlloc));
because the pointers to the storage for A and B held within *fubar will refer to the same addresses in both processes.
If on the other hand the address mapping is not shared I can only safely share the underlying memory blocks to which A and B point.
That is each process must have its own local instance of Foo where each vector is constructed by attaching to a shared memory block provided by the allocator. This is more portable but uglier.
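A minimal sketch of that second option, assuming a block layout of a small header followed by trivially copyable elements. BlockHeader, SharedArrayView and demo_view are hypothetical names invented for this illustration. Each process would construct its own view from whatever base address its own mmap() call returned; a plain local buffer stands in for the mapping here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical header stored at the start of the shared block.
struct BlockHeader {
    std::size_t count;  // number of elements that follow the header
};

// Process-local, non-owning view. Only the header and the elements live in
// the shared block; the two pointers below are private to each process.
template <typename T>
class SharedArrayView {
    static_assert(std::is_trivially_copyable<T>::value,
                  "elements must be trivially copyable");

public:
    // Header size rounded up to the alignment of T.
    static constexpr std::size_t data_offset() {
        return (sizeof(BlockHeader) + alignof(T) - 1) / alignof(T) * alignof(T);
    }

    explicit SharedArrayView(void* base)
        : header_(static_cast<BlockHeader*>(base)),
          data_(reinterpret_cast<T*>(static_cast<unsigned char*>(base) +
                                     data_offset())) {}

    std::size_t size() const { return header_->count; }
    const T* begin() const { return data_; }
    const T* end() const { return data_ + header_->count; }

private:
    BlockHeader* header_;
    T* data_;
};

// Fills a local buffer the way a writer process might, then searches it
// through a view the way a reader process might.
bool demo_view() {
    alignas(std::max_align_t) unsigned char block[256];
    BlockHeader* header = new (block) BlockHeader{3};
    assert(header->count == 3);

    int* elems =
        reinterpret_cast<int*>(block + SharedArrayView<int>::data_offset());
    elems[0] = 7;
    elems[1] = 11;
    elems[2] = 13;

    SharedArrayView<int> haystack(block);
    return haystack.size() == 3 &&
           std::find(haystack.begin(), haystack.end(), 11) != haystack.end() &&
           std::find(haystack.begin(), haystack.end(), 5) == haystack.end();
}
```

Because no absolute address is ever written into the block, this works whatever address each process gets, at the price of not sharing the container object itself.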
Technically neither conforms to the C++ spec without the addition of std::bless() and other wizardry.
So we are in undefined (or rather platform defined) behaviour anyway. But then so is 'safe' use of malloc (see link above).
As far as I know, nothing about the address returned by mmap() is guaranteed by POSIX or Linux except for the case where you ask for a specific address and the call does not fail.
So if it happens to work it is probably at best undefined behaviour. It's generally a bad idea to rely on undefined behaviour for all the usual reasons. But perhaps it is actually platform defined rather than undefined?
As mentioned, the exception is a fixed address mapping, but how can you get two independent processes to agree in advance on the same safe address to use? One possibility is given here.
This suggests you need a second shared memory segment or some other IPC mechanism to share the address assigned to the creator of the shared memory block you wish to use.
Is this the correct way? Is it the only way?
Looking at "When would one use mmap MAP_FIXED?", the main legitimate use of MAP_FIXED is to remap different kinds of memory segments which need to be at the same relative address to each other when loading a library.
However other people are using this the way I suggest. The other answer to that question mentions:
- Shared memory may contain pointers.
I found a few other references to people using MAP_FIXED to do this but I have not yet found a working example.
Has use of MAP_FIXED this way been rendered non-functional by ASLR?
It is actually the case that you can only share address space in very specific circumstances:
For example only if:
You use MAP_ANONYMOUS and
pass the FD to an unrelated process via sendmsg/recvmsg
Apparently Linux does not support I_SENDFD/I_RECVFD.
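For reference, passing a descriptor with sendmsg/recvmsg is done with an SCM_RIGHTS control message on a Unix domain socket. The sketch below shows only that mechanism. send_fd, recv_fd and demo_fd_passing are hypothetical helper names, and the demo keeps both socket ends in one process via socketpair() and passes a pipe descriptor purely so that it is self-contained. Unrelated processes would instead connect over a named AF_UNIX socket.

```cpp
#include <cassert>
#include <cstring>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

// Sends one descriptor over a Unix domain socket as SCM_RIGHTS ancillary data.
bool send_fd(int sock, int fd) {
    char byte = 'F';  // at least one byte of ordinary data must accompany it
    iovec iov{&byte, 1};
    alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    std::memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1;
}

// Receives one descriptor; returns -1 if none arrived.
int recv_fd(int sock) {
    char byte = 0;
    iovec iov{&byte, 1};
    alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    if (recvmsg(sock, &msg, 0) != 1) return -1;

    cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == nullptr || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS) {
        return -1;
    }
    int fd = -1;
    std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

// Passes the write end of a pipe across a socketpair, then writes through
// the received duplicate and reads the byte back from the pipe.
bool demo_fd_passing() {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return false;
    int pipefd[2];
    if (pipe(pipefd) != 0) {
        close(sv[0]);
        close(sv[1]);
        return false;
    }

    int received = send_fd(sv[0], pipefd[1]) ? recv_fd(sv[1]) : -1;
    char out = 0;
    bool ok = received >= 0 && write(received, "x", 1) == 1 &&
              read(pipefd[0], &out, 1) == 1 && out == 'x';

    if (received >= 0) close(received);
    close(pipefd[0]);
    close(pipefd[1]);
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

The receiver gets a new descriptor number referring to the same open file description. This shares the underlying object only; it says nothing about the virtual address at which each process maps it.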
Will MAP_FIXED_NOREPLACE then work in this case?
Does it make a difference whether you use shm_open() or open()?
I've looked into the code for boost interprocess and it does not seem to use either MAP_FIXED or sendmsg. It seems this is not supported.
The approach I am currently trying, simplified, is:
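process A:
// length is the size of the mapping (not shown in this simplified snippet)
int fd = open("foobar", O_RDWR);
void* address = mmap(nullptr, length, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
*static_cast<void**>(address) = address;  // record the mapped address at the start of the block
process B:
int fd = open("foobar", O_RDONLY);
void* addressRequested = readAddressFromFoobar();
void* address = mmap(addressRequested, length, PROT_READ, MAP_SHARED|MAP_FIXED, fd, 0);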
This fails with ENOMEM, whereas if I replace addressRequested with nullptr it succeeds and I can access the same data, but I can't rely on any pointers.
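A minimal single-process sketch of that record-then-remap sequence, written with the full six-argument mmap() signature. map_exactly_at and demo_remap_at_recorded_address are hypothetical names, the temporary file stands in for "foobar", and the fallback value for MAP_FIXED_NOREPLACE is an assumption taken from the generic Linux headers for toolchains that do not define it. It shows only the call sequence: the range is known to be free here because the same process has just unmapped it, which is exactly what a genuinely separate process cannot assume.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000  // assumed asm-generic value, Linux 4.17+
#endif

// Maps `length` bytes of `fd` at exactly `wanted`, or returns nullptr.
// Unlike MAP_FIXED, MAP_FIXED_NOREPLACE never clobbers an existing mapping.
void* map_exactly_at(void* wanted, std::size_t length, int fd) {
    void* p = mmap(wanted, length, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED_NOREPLACE, fd, 0);
    if (p == MAP_FAILED) return nullptr;  // e.g. EEXIST: range already in use
    if (p != wanted) {
        // Older kernels ignore the flag and treat the address as a hint.
        munmap(p, length);
        return nullptr;
    }
    return p;
}

bool demo_remap_at_recorded_address() {
    char path[] = "/tmp/fixed_map_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return false;
    unlink(path);  // the open descriptor keeps the file alive

    const std::size_t length = 4096;
    if (ftruncate(fd, static_cast<off_t>(length)) != 0) {
        close(fd);
        return false;
    }

    // "Process A": let the kernel choose, then record the address in the block.
    void* first = mmap(nullptr, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (first == MAP_FAILED) {
        close(fd);
        return false;
    }
    *static_cast<void**>(first) = first;
    msync(first, length, MS_SYNC);
    munmap(first, length);

    // "Process B": read the recorded address from the file, then request it.
    void* recorded = nullptr;
    bool ok = pread(fd, &recorded, sizeof recorded, 0) ==
              static_cast<ssize_t>(sizeof recorded);
    void* second = ok ? map_exactly_at(recorded, length, fd) : nullptr;
    ok = second != nullptr && *static_cast<void**>(second) == second;

    if (second != nullptr) munmap(second, length);
    close(fd);
    return ok;
}
```

Checking the returned pointer against the requested one matters either way, since a kernel that does not know the flag silently falls back to hint behaviour.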
Can somebody demonstrate or link to a way to do this that works, or explain definitively why this cannot ever work in current Linux?
I am quite aware that we can share objects using internal offsets instead of pointers, but that loses a lot of convenience. STL types generally assume they can use pointers. I do not wish to write my own versions of every container I wish to use if I can possibly avoid it.
Apparently boost works around this issue by providing containers that use smart pointers instead of raw pointers. I had not realised that. This is a good solution to the general problem but a different question from this one. Locating a canonical explanation of that would be useful as a better answer to Does boost interprocess support sharing objects containing pointers between processes? or indeed a better question.
[...] mmap system call? It answers your question. – Honeyman
[...] mmap() at that address could fail due to something else in the second process already using virtual memory addresses within that range, no? – Sorption
[...] 0x1000000 through 0x2000000 as part of its heap, and you try to mmap() your shared memory-region to start at virtual-address 0x1000005, then (hopefully) mmap() will fail because the virtual-addresses inside process B that it wants to redefine (to point to the shared-memory region) are already in use for another purpose. And with ASLR, there's (deliberately) no way to know what virtual-address ranges are going to be in use or not. – Sorption
[...] MAP_FIXED the mapping will succeed. Address space layout randomization, a non-negotiable, mandatory, security requirement of all modern operating systems (including Linux) makes any kind of a fixed mapping a non-starter and completely impossible. The End. – Honeyman
[...] MAP_FIXED_NOREPLACE looks like it will keep the process from accidentally lobotomizing itself, but in the event of a would-be collision you're still left with the problem that your mmap() call has failed and so your process can't access the shared memory region. – Sorption
[...] malloc.c which empirically works for each pointer size/platform of interest. Whether this is always safe, or only works when you're replacing malloc and no other shared libraries are calling mmap, I don't know. Anyway, my answer is just seeking to discover a suitable base at runtime, instead of hardcoding it and hoping it never changes. – Byerly