How does the Linux kernel determine ld.so's load address?

I know that the dynamic linker uses mmap() to load libraries. I guess it is the kernel that loads both the executable and its ELF interpreter into the same address space, but how does it determine where? I noticed that ld.so's load address with ASLR disabled is 0x555555554000 (on x86_64). Where does this address come from? I tried following do_execve()'s code path, but it branches too much for me to follow without getting thoroughly confused.

Coggins answered 24/4, 2015 at 19:38 Comment(3)
Imagining from scratch, it seems like the address must either be chosen by the kernel or encoded in the file. You looked for evidence in the kernel; have you tried looking in the file? (Subtangent)
@ChrisStratton I dissected both the executable and the dynamic linker; there's nothing that looks remotely like this address. (Coggins)
Why do you ask? Why do you care? Please edit your question to motivate it. Are you coding a new dynamic linker? (Dehisce)

Read more about ELF, in particular elf(5), and about the execve(2) syscall.

An ELF file may name an interpreter in its program headers. elf(5) mentions:

PT_INTERP The array element specifies the location and size of a null-terminated pathname to invoke as an interpreter. This segment type is meaningful only for executable files (though it may occur for shared objects). However it may not occur more than once in a file. If it is present, it must precede any loadable segment entry.

In practice that interpreter is almost always ld-linux(8) (e.g. with GNU glibc); more precisely, on my Debian/Sid, it is /lib64/ld-linux-x86-64.so.2. If you compile musl-libc and then build some software with it, you'll get a different interpreter, /lib/ld-musl-x86_64.so.1. That ELF interpreter is the dynamic linker.
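To see which interpreter a given binary requests, you can read the PT_INTERP entry yourself. The following is only a minimal sketch: it assumes a 64-bit little-endian ELF file (the usual x86_64 case), the field offsets are those of Elf64_Ehdr and Elf64_Phdr as described in elf(5), and read_interp is a name I made up for illustration.

```python
import struct

PT_INTERP = 3  # program header type holding the interpreter path (elf(5))


def read_interp(data):
    """Return the PT_INTERP path of an ELF64 little-endian image, or None."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is EI_CLASS (2 = 64-bit), e_ident[5] is EI_DATA (1 = LSB).
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian")
    # Elf64_Ehdr: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_type at 0, p_offset at 8, p_filesz at 32.
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            (p_offset,) = struct.unpack_from("<Q", data, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # statically linked, or not an executable with an interpreter

# Example use (the path is illustrative):
#   with open("/bin/ls", "rb") as f:
#       print(read_interp(f.read()))
```

Note that PT_INTERP only carries a pathname; it says nothing about where the interpreter will be mapped.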

The execve(2) syscall uses that interpreter:

If the executable is a dynamically linked ELF executable, the interpreter named in the PT_INTERP segment is used to load the needed shared libraries. This interpreter is typically /lib/ld-linux.so.2 for binaries linked with glibc.

See also Levine's book Linkers and Loaders, and Drepper's paper How To Write Shared Libraries.

Notice that execve also handles the shebang (i.e. a first line starting with #!); see the Interpreter scripts section of execve(2). BTW, for ELF binaries, execve does the equivalent of mmap(2) on some segments.

Read also about vdso(7), proc(5) & ASLR. Type cat /proc/self/maps in your shell.
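If you want to extract the dynamic linker's base address from that output programmatically, something like the sketch below could do. It assumes the line format documented in proc(5) (address range, permissions, offset, device, inode, optional pathname); the function names and the "ld-" substring used to spot the dynamic linker are my own assumptions, not anything standard.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, path)."""
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    # Anonymous mappings have no sixth field.
    path = fields[5].strip() if len(fields) > 5 else ""
    return start, end, fields[1], path


def find_mappings(maps_text, name_part):
    """Return the parsed mappings whose backing path contains name_part."""
    result = []
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        entry = parse_maps_line(line)
        if name_part in entry[3]:
            result.append(entry)
    return result


def lowest_address(maps_text, name_part):
    """Lowest start address among matching mappings, or None if none match."""
    starts = [start for start, _, _, _ in find_mappings(maps_text, name_part)]
    return min(starts) if starts else None

# Example use on Linux (heuristic: glibc and musl linkers both contain "ld-"):
#   with open("/proc/self/maps") as f:
#       base = lowest_address(f.read(), "ld-")
#   print(hex(base) if base is not None else "no dynamic linker mapped")
```

Run this way it reports the mappings of the Python process itself; to inspect another process, read /proc/<pid>/maps for its pid instead.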

(I guess, but I am not sure, that the 0x555555554000 address comes from the ELF program header of your executable, or perhaps of ld-linux.so; it might also come from the kernel, since 0x55555555 seems to appear in the kernel source code.)
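The kernel hypothesis can at least be checked arithmetically. The sketch below rests on two assumptions that you should verify against the kernel source for your version: that x86_64 defines ELF_ET_DYN_BASE as TASK_SIZE / 3 * 2 (in arch/x86/include/asm/elf.h, as far as I recall), and that TASK_SIZE is one page below 2^47 (a 47-bit user address space with 4 KiB pages). The function name et_dyn_base is mine.

```python
PAGE_SIZE = 4096

# Assumption: 47-bit user address space, top page excluded.
TASK_SIZE = (1 << 47) - PAGE_SIZE  # 0x7ffffffff000


def et_dyn_base(task_size=TASK_SIZE, page_size=PAGE_SIZE):
    """Two thirds of the address space, rounded down to a page boundary."""
    base = task_size // 3 * 2          # integer division first, as in C
    return base & ~(page_size - 1)     # round down to the start of the page


print(hex(et_dyn_base()))  # prints 0x555555554000
```

Under those assumptions the value matches the observed address exactly, which suggests it is a kernel default base for position-independent (ET_DYN) objects rather than anything stored in the file. Which object ends up at that base (the executable or the interpreter) depends on the kernel version and on how the program was started, so check fs/binfmt_elf.c for your kernel before relying on this.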

Dehisce answered 24/4, 2015 at 20:11 Comment(0)
