Is Dynamic Linker part of Kernel or GCC Library on Linux Systems?

Is the dynamic linker (aka program interpreter, link loader) part of the kernel or of the GCC library?

UPDATE (28-08-16):

I have found that the default dynamic linker path used by every binary that is linked against a shared library, /lib64/ld-linux-x86-64.so.2, is a symlink to the shared library /lib/x86_64-linux-gnu/ld-2.23.so, which is the actual dynamic linker.

It is part of the libc6 (2.23-0ubuntu3) package, i.e. "GNU C Library: Shared libraries" in Ubuntu for the AMD64 architecture.

My actual question was

What would happen to all the applications that are dynamically linked (nearly all of them, nowadays) if this helper program (ld-2.23.so) didn't exist?

And the answer to that is: "no application would run, not even the shell". I've tried it on a virtual machine.

Shock answered 9/8, 2016 at 18:12 Comment(3)
ld.so is not a part of the kernel, but is loaded by it. Terpineol
It's actually part of GNU libc on Linux; see cs.virginia.edu/~dww4s/articles/ld_linux.html Crinkly
@Crinkly That link doesn't work; 403 error. Stonyhearted

In an ELF executable, this is referred to as the "ELF interpreter". On Linux on x86-64, for example, it is /lib64/ld-linux-x86-64.so.2

It is not part of the kernel; it [generally] ships with glibc et al.

When the kernel executes an ELF executable, it must map the executable into userspace memory. It then looks inside for a special program header entry known as PT_INTERP [which points to a string that is the full path of the interpreter; readelf shows it as INTERP].

The kernel then maps the interpreter into userspace memory and transfers control to it. Then, the interpreter does the necessary linking/loading and starts the program.
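The lookup described above can be sketched in userspace. This is a minimal illustration, not what the kernel actually runs: it assumes a 64-bit little-endian ELF file, and the function name read_interp is made up for this example.

```python
import struct

PT_INTERP = 3  # program header type that points at the interpreter path


def read_interp(data: bytes):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # ELF64 header: e_phoff is at byte 32, e_phentsize and e_phnum at 54 and 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr starts with p_type (u32), p_flags (u32), p_offset (u64).
        p_type, _flags, p_offset = struct.unpack_from("<IIQ", data, base)
        if p_type == PT_INTERP:
            # p_filesz sits 32 bytes into the program header entry.
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # e.g. a statically linked executable
```

Calling read_interp(open("/bin/ls", "rb").read()) on a typical x86-64 glibc system should return the interpreter path mentioned above; a statically linked binary should give None.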

ELF stands for "Executable and Linkable Format", and the format is extensible: it allows many different sections within an ELF file.

Rather than burdening the kernel with having to know about the myriad extensions, that knowledge lives in the ELF interpreter that is paired with the file.

Although usually only one format is used on a given system, there can be several different variants of ELF files on a system, each with its own ELF interpreter.

This would allow [say] a BSD ELF file to be run on a Linux system [with other adjustments/support] because the ELF file would point to the BSD ELF interpreter rather than the Linux one.


UPDATE:

every process(vlc player, chrome) had the shared library ld.so as part of their address space.

Yes. I assume you're looking at /proc/<pid>/maps. These are mappings (e.g. like using mmap) to the files. That is somewhat different than "loading", which can imply [symbol] linking.
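To see those mappings programmatically, something like the following could be used. It is only a sketch: the helper names are invented here, and matching file names that start with "ld-" is a heuristic that happens to fit names such as ld-2.23.so and ld-linux-x86-64.so.2, not a rule.

```python
def parse_maps(text: str):
    """Yield (start, end, perms, path) for each line of /proc/<pid>/maps text."""
    for line in text.splitlines():
        # Fields: address-range perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        yield start, end, fields[1], path


def loader_mappings(text: str):
    """Return the mappings whose file name looks like a dynamic linker."""
    return [m for m in parse_maps(text)
            if m[3].rsplit("/", 1)[-1].startswith("ld-")]
```

For the current process, loader_mappings(open("/proc/self/maps").read()) would list the ranges backed by the dynamic linker file, typically several of them with different permissions.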

So primarily loader after loading the executable(code & data) onto memory , It loads& maps dynamic linker (.so) to its address space

The best way to understand this is to rephrase what you just said:

So primarily the kernel after mapping the executable(code & data) onto memory, the kernel maps dynamic linker (.so) to the program address space

That is essentially correct. The kernel also maps other things, such as the bss segment and the stack. It then "pushes" argc, argv, and envp [the space for environment variables] onto the stack.

Then, having determined the start address of ld.so [by reading the entry point field of its ELF header], it sets that as the resume address and starts the thread.

Up until now, it has been the kernel doing things. The kernel does little to no symbol linking.

Now, ld.so takes over ...

which further Loads shared Libraries , map & resolve references to libraries. It then calls entry function (_start)

Because the original executable (e.g. vlc) has been mapped into memory, ld.so can examine it for the list of shared libraries that it needs. It maps these into memory, but does not necessarily link the symbols right away.

Mapping is easy and quick--just an mmap call.
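The list of needed libraries lives in the executable's dynamic section as DT_NEEDED entries, whose values are offsets into the string table named by DT_STRTAB. A rough sketch of reading them from the file follows; it assumes a 64-bit little-endian ELF, the function name needed_libraries is made up for this example, and ld.so itself works on the mapped image rather than on the file.

```python
import struct

PT_LOAD, PT_DYNAMIC = 1, 2
DT_NULL, DT_NEEDED, DT_STRTAB = 0, 1, 5


def needed_libraries(data: bytes):
    """Return the DT_NEEDED names of a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    loads, dynamic = [], None
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = (
            struct.unpack_from("<IIQQQQ", data, base))
        if p_type == PT_LOAD:
            loads.append((p_vaddr, p_offset, p_filesz))
        elif p_type == PT_DYNAMIC:
            dynamic = (p_offset, p_filesz)
    if dynamic is None:
        return []  # no dynamic section, e.g. statically linked
    name_offsets, strtab_vaddr = [], None
    # Each Elf64_Dyn entry is a signed 64-bit tag plus a 64-bit value.
    for pos in range(dynamic[0], dynamic[0] + dynamic[1], 16):
        d_tag, d_val = struct.unpack_from("<qQ", data, pos)
        if d_tag == DT_NULL:
            break
        if d_tag == DT_NEEDED:
            name_offsets.append(d_val)
        elif d_tag == DT_STRTAB:
            strtab_vaddr = d_val
    if strtab_vaddr is None:
        return []
    # DT_STRTAB is a virtual address; translate it to a file offset
    # through the PT_LOAD segment that contains it.
    strtab_off = None
    for vaddr, offset, filesz in loads:
        if vaddr <= strtab_vaddr < vaddr + filesz:
            strtab_off = strtab_vaddr - vaddr + offset
            break
    if strtab_off is None:
        raise ValueError("string table is not inside any PT_LOAD segment")
    names = []
    for off in name_offsets:
        start = strtab_off + off
        end = data.index(b"\0", start)  # names are NUL-terminated
        names.append(data[start:end].decode())
    return names
```

The names returned are the bare library names recorded in the file (the same ones readelf -d marks as NEEDED); resolving them to full paths is the job of ld.so's search rules.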

The start address of the executable [not to be confused with the start address of ld.so] is taken from the entry point field (e_entry) of the executable's ELF header. Although the symbol associated with this start address has traditionally been called _start, it could actually be named anything (e.g. __my_start), because it is the value stored in the header that determines the start address, not the address of the symbol _start.
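As a small illustration of where that start address is stored, the sketch below reads the e_entry field directly from the ELF header. It assumes a 64-bit little-endian file, and entry_point is a name chosen only for this example.

```python
import struct


def entry_point(data: bytes) -> int:
    """Return e_entry from the header of a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # In the ELF64 header, e_entry is the 8-byte field at offset 24.
    (e_entry,) = struct.unpack_from("<Q", data, 24)
    return e_entry
```

The value should match the "Entry point address" line printed by readelf -h. For a position-independent executable it is relative to wherever the image ends up being loaded, so it will not equal the run-time address.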

Linking symbol references to symbol definitions is a time-consuming process. So, by default, this is deferred for function calls until the symbol is actually used. That is, if a program has references to printf, the linker doesn't actually try to link in printf until the first time the program actually calls printf.

This is sometimes called "link-on-demand" or "on-demand-linking" (the usual term is "lazy binding"). See my answer here: Which segments are affected by a copy-on-write? for a more detailed explanation of that and what actually happens when an executable is mapped into userspace.

If you're interested, you could do ldd /usr/bin/vlc to get a list of the shared libraries it uses. If you looked at the output of readelf -a /usr/bin/vlc, you'll see these same shared libraries. Also, you'd get the full path of the ELF interpreter and could do readelf -a <full_path_to_interpreter> and note some of the differences. You could repeat the process for any .so files that vlc wanted.

Combining all that with /proc/<pid>/maps et al. might help with your understanding.
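On glibc systems, the listing that ldd prints comes from the ELF interpreter itself, which reacts to the LD_TRACE_LOADED_OBJECTS environment variable (see man ld.so). The sketch below reproduces that idea; the helper names are invented for this example, and it assumes the target program uses glibc's ld.so.

```python
import os
import subprocess


def trace_env(base_env=None):
    """Return a copy of the environment with LD_TRACE_LOADED_OBJECTS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_TRACE_LOADED_OBJECTS"] = "1"
    return env


def list_shared_objects(program: str) -> str:
    """Run `program` so that its ELF interpreter prints the dependencies.

    With the variable set, glibc's ld.so lists the shared objects and exits
    instead of starting the program. Other interpreters may ignore the
    variable and simply run the program, so only use this on trusted binaries.
    """
    result = subprocess.run([program], env=trace_env(),
                            capture_output=True, text=True, check=False)
    return result.stdout
```

For example, print(list_shared_objects("/usr/bin/vlc")) should produce output in the same format as ldd /usr/bin/vlc on such a system.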

Rms answered 9/8, 2016 at 18:22 Comment(4)
@craig. As I have noticed, every process (vlc player, chrome) had the shared library ld.so as part of its address space. So primarily the loader, after loading the executable (code & data) into memory, loads and maps the dynamic linker (.so) into its address space, which further loads the shared libraries, and maps and resolves references to them. It then calls the entry function (_start). Is this correct? Shock
@craig. I have used the program ldd already which prints the shared dependencies .Shock
I would have guessed that you were familiar with ldd; I mentioned it [and other things] to produce a complete answer. Actually, ldd is a shell script. It essentially sets the environment variable LD_TRACE_LOADED_OBJECTS and then runs /usr/bin/vlc. The ELF interpreter [for vlc] notices this variable and does the printing (and terminates rather than doing a full execution). When ld.so runs as a program's interpreter, the command-line arguments belong to the program, so options are passed to ld.so through environment variables. The manpage has a fairly complete list (e.g. man ld.so) Rms
@CraigEstey , ok, i got it now.Shock
