Sharing executable memory pages in Linux?

Is it possible to share executable pages on Linux for the sake of preserving space? I know that there are shared memory APIs that can be used to share memory between different processes but I don't think that is meant to be used for that.

Basically, I want to have a shared memory region where some commonly used shared libraries can be loaded into. I want to get the dynamic linker to link against the preloaded (read only) images instead of having to load all of the shared library images into every single process (which seems like a waste).

Is this possible on the Linux kernel? The Darwin kernel implements this using a feature of Mach VM known as commpages (the dyld shared cache is stored there). The commpages are accessible to and shared between every process.

Just to clarify, I know what shared objects (libraries) are. Currently, what the dynamic linker does on Linux is it loads all the required libraries into the program's address space, which means that each application that links against libc (for example) will have an image of libc somewhere in its address space. On Darwin, this issue can be eliminated by having the executable (and other read only) sections of libc on a set of shared memory pages. The writable sections of the shared images are still separate.

Edit: I know that the ELF format does not support separating DATA and TEXT segments of the shared libraries. I'm not using ELF, I'm using a different binary format (with my own binfmt kernel module and my own dynamic linker). I'm interested if the Linux kernel supports a commpage-like feature.

Edit 2: The only way I can think of doing this would be to allocate a big slab of memory in the kernel and map it into every binary that gets executed. The first time any binary is executed, the dynamic linker could unprotect it, fill it with the desired data and protect it. Then somehow, the kernel would have to make sure that the memory segment is not modified by anything else as it would open a massive security hole. Another

Thrash answered 11/4, 2012 at 0:52 Comment(0)

As geekosaur said, Linux already does this.

At application startup the dynamic linker (ld.so) mmap()s the shared libraries. It performs several calls to mmap() for each library:

  • mmap(PROT_READ|PROT_EXEC) for the executable section (i.e. .text)
  • mmap(PROT_READ|PROT_WRITE) for the data (i.e. .data and .bss)

(You can check this for yourself using strace.)
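If you would rather look at the result than trace the calls, the mappings are also listed in /proc/<pid>/maps. Here is a minimal sketch that groups the lines by backing file and prints the permissions of each mapping. The function names (parse_maps, shared_object_mappings) are made up for illustration, and it assumes a Linux system with /proc mounted. On recent toolchains you may see additional r--p mappings next to the r-xp and rw-p ones.

```python
import os
from collections import defaultdict


def parse_maps(text):
    """Group /proc/<pid>/maps lines by backing file: {path: [perms, ...]}."""
    by_path = defaultdict(list)
    for line in text.splitlines():
        # Line format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        perms, path = fields[1], fields[5]
        if path.startswith("/"):  # skip pseudo-entries such as [heap] or [stack]
            by_path[path].append(perms)
    return dict(by_path)


def shared_object_mappings(pid="self"):
    """Return the mappings of files that look like shared objects (hypothetical helper)."""
    maps_file = f"/proc/{pid}/maps"
    if not os.path.exists(maps_file):  # not Linux, or /proc is not mounted
        return {}
    with open(maps_file) as f:
        mappings = parse_maps(f.read())
    return {p: perms for p, perms in mappings.items()
            if ".so" in os.path.basename(p)}


if __name__ == "__main__":
    for path, perms in sorted(shared_object_mappings().items()):
        print(path, " ".join(perms))
```

Each library should show up several times: once per mmap() call, with the executable mapping marked r-xp and the data mapping marked rw-p.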

The kernel, being a clever little bit of code, realises that the executable section, identified by its file offset and inode (known through the fd), is already mapped. As the mapping is read-only there's no point in allocating more memory for it, so the same physical pages are simply mapped into the new process as well.

This also means that if you mmap() any other file read-only from several applications, its memory will also be consumed only once.
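A rough way to convince yourself that read-only mappings are views of the same page-cache pages rather than private copies: map a scratch file twice, change the file through the descriptor, and check that both views see the change. This is only a sketch. The function name is made up, it uses two mappings in one process instead of two processes to keep it short, and it relies on Linux keeping write() and shared mappings coherent.

```python
import mmap
import os
import tempfile


def two_views_share_pages():
    """Map one page of a scratch file twice (read-only) and modify the file."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"a" * mmap.PAGESIZE)
        tmp.flush()
        fd = tmp.fileno()
        # Two independent read-only, shared mappings of the same file page.
        view1 = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED, prot=mmap.PROT_READ)
        view2 = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED, prot=mmap.PROT_READ)
        try:
            before = (view1[0:1], view2[0:1])
            os.pwrite(fd, b"b", 0)  # modify the file, not the mappings
            after = (view1[0:1], view2[0:1])
        finally:
            view1.close()
            view2.close()
    return before, after


if __name__ == "__main__":
    print(two_views_share_pages())
```

If each mapping had its own copy of the data, the write through the descriptor would not show up in either view.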

Thelmathem answered 11/4, 2012 at 19:48 Comment(1)
Kristof's answer is typically glossed over in discussions of linking and loading. There are two notions of sharing: referring to the same code object (in a file) and accessing the same page (of code) in physical memory. The kernel's mmap() implementation understands the latter, and the linker only understands it enough to let it happen. Even two processes running the same statically linked program may share physical code pages, which is the latter notion. - Peculiar

Linux already does this; in fact, that's what a shared object is about/for.

Plumose answered 11/4, 2012 at 0:55 Comment(4)
Shared objects are duplicated into every application's address space. - Thrash
They are mapped into every application's address space, but they're only physically present in memory once. - Thelmathem
Which part of ld-linux does that? - Thrash
@NickBrooks, I don't understand your confusion here; what do you believe the distinction between shared and static libraries to be, if not that the pages are shared between applications? - Plumose
