Why does a standalone C hello program crash when used as a dynamic linker?

The following program:

#include <stdio.h>

int main(int argc, char *argv[])
{
  for (int j = 0; j < argc; j++)
    printf("%d: %s\n", j, argv[j]);
  return 0;
}

built into a statically linked PIE:

gcc -g -fpie main.c -static-pie -o ld.so

works fine:

$ ./ld.so foo bar
0: ./ld.so
1: foo
2: bar

But when I use that program as an ELF interpreter for another program:

$ gcc -g main.c -Wl,-I./ld.so -o a.out

it crashes like so:

gdb -q ./a.out
(gdb) run
Starting program: /tmp/a.out 

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7da84e2 in __ctype_init () at ctype-info.c:31
31    *bp = (const uint16_t *) _NL_CURRENT (LC_CTYPE, _NL_CTYPE_CLASS) + 128;
(gdb) bt
#0  0x00007ffff7da84e2 in __ctype_init () at ctype-info.c:31
#1  0x00007ffff7d9e3bf in __libc_init_first (argc=argc@entry=1, argv=argv@entry=0x7fffffffd728, envp=0x7fffffffd738) at ../csu/init-first.c:84
#2  0x00007ffff7d575cd in __libc_start_main (main=0x7ffff7d56e29 <main>, argc=1, argv=0x7fffffffd728, init=0x7ffff7d57ce0 <__libc_csu_init>, fini=0x7ffff7d57d70 <__libc_csu_fini>, rtld_fini=0x0, 
    stack_end=0x7fffffffd718) at ../csu/libc-start.c:244
#3  0x00007ffff7d56d6a in _start () at ../sysdeps/x86_64/start.S:120

Why is that?

All the addresses above are within ./ld.so itself, so it crashes during its own initialization. Control would never reach a.out in any case, since ld.so simply exits once its main returns.

Billet answered 14/4, 2019 at 20:10 Comment(0)

This took a bit longer to debug than I expected.

The crash is in:

Dump of assembler code for function __ctype_init:
   0x00007ffff7da84d0 <+0>:     mov    $0xffffffffffffffa0,%rax
   0x00007ffff7da84d7 <+7>:     mov    $0xfffffffffffffff0,%rcx
   0x00007ffff7da84de <+14>:    mov    %fs:(%rax),%rax
=> 0x00007ffff7da84e2 <+18>:    mov    (%rax),%rax
   0x00007ffff7da84e5 <+21>:    mov    0x40(%rax),%rsi

with $rax == 0 at the faulting instruction. When ld.so is run directly and goes through the same code, $rax is non-NULL at that point. Clearly something went wrong during TLS setup, but what?

It turns out that GLIBC initializes its _dl_phdr from the AT_PHDR in the auxiliary vector, then iterates over all Phdrs to look for one with PT_TLS type.

If there isn't one, GLIBC assumes that no TLS setup is necessary.

When ld.so runs directly, the kernel-supplied aux vector points to Phdrs for ld.so, PT_TLS is present, and everything works.

But when ld.so runs indirectly as the interpreter for a.out, the aux vector points to the Phdrs for a.out (and not for ld.so; this is as designed). Since a.out doesn't have any thread-local variables, it doesn't have a PT_TLS segment either.

Conclusion: it is currently not possible to build an ELF interpreter with -static-pie and GLIBC, unless one is very careful to avoid thread-local storage. And avoiding thread-local storage does not currently appear to be an option either: a trivial int main() { return 0; } still has a TLS segment, despite not using anything at all from GLIBC.

Billet answered 14/4, 2019 at 20:37 Comment(2)
The name ld.so looks suspect. Why do you use that name? - Crosspollinate
@Crosspollinate man7.org/linux/man-pages/man8/ld.so.8.html may help you understand why I chose that name. - Billet
