Call `atexit` when linking to libc dynamically on Linux

If I have the following program written in C (compiled with GCC on Debian 8.7), I am able to call atexit() as you would expect:

#include <stdlib.h>

void exit_handler(void) {
    return;
}

int main () {
    atexit(exit_handler);
    return 0;
}

And when I compile and run it:

$ gcc test.c
$ ./a.out

It outputs nothing, as expected. When I run ldd on the result, I get:

$ ldd a.out
    linux-vdso.so.1 (0x00007fffbe592000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe07d3a8000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe07d753000)

However, libc does not seem to have any symbol for atexit, and only has __cxa_atexit and __cxa_thread_atexit_impl:

$ nm --dynamic /lib/x86_64-linux-gnu/libc.so.6 | grep 'atexit'
0000000000037d90 T __cxa_atexit
0000000000037fa0 T __cxa_thread_atexit_impl

As you would then expect, if I try to look up atexit() in libc dynamically, I cannot find it. For example, the following Racket program loads libc and tries to find atexit:

#lang racket

(require ffi/unsafe)

(get-ffi-obj 'atexit (ffi-lib "libc" '("6")) (_fun (_fun -> _void) -> _int))

Giving the output:

$ racket findatexit.rkt
ffi-obj: couldn't get "atexit" from "libc.so.6" (/lib/x86_64-linux-gnu/libc.so.6: undefined symbol: atexit)

What I want to know here is:

  1. If libc does not have any symbol for atexit on Linux, why can I still call it from a C program?
  2. Is there any way I can call atexit or a similar function dynamically on Linux?

(I should note that atexit does appear to be a symbol on OS X, so it's just Linux that seems unusual here.)

Edit:

At the suggestion of @Jonathan, I also ran:

$ gcc -c test.c
$ nm test.o
                 U atexit
0000000000000000 T exit_handler
0000000000000007 T main

This shows that test.o references atexit as an undefined symbol, so it must be resolved from somewhere at link time, yet it does not appear in any of the libraries that ldd shows.

Fenugreek answered 6/5, 2017 at 22:30 Comment(4)
Try gcc -c test.c; nm test.o and see what symbols are referenced there. - Burkey
Good idea:
```
$ nm test.o
                 U atexit
0000000000000000 T exit_handler
0000000000000007 T main
```
- Fenugreek
OK; that means it calls atexit() somehow. Have you looked in ld.so.1 (or, for you, perhaps /lib64/ld-linux-x86-64.so.2) for the symbol? Or perhaps crt0.o, or whatever is linked? You may need to run gcc -v test.c to see exactly what libraries and object files are linked. - Burkey
Hmm... it doesn't appear to be there, as determined by: $ nm --dynamic /lib64/ld-linux-x86-64.so.2 | grep 'atexit' - Fenugreek

I did some poking around on a CentOS 7 virtual machine, and I think I found it, but it was anything but obvious!

Found it!

In /usr/lib64/libc_nonshared.a:

$ nm /usr/lib64/libc_nonshared.a | grep -i atexit
atexit.oS:
0000000000000000 T atexit
                 U __cxa_atexit
$

Why look in that library? Good question — and a long story. Are you sitting comfortably? Then I'll begin…

Steps taken to get there

  1. Use the test.c code from the question.
  2. Compile it with gcc -v test.c:

    $ gcc -v test.c
    Using built-in specs.
    COLLECT_GCC=gcc
    COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
    Target: x86_64-redhat-linux
    Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
    Thread model: posix
    gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) 
    COLLECT_GCC_OPTIONS='-v' '-mtune=generic' '-march=x86-64'
     /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/cc1 -quiet -v test.c -quiet -dumpbase test.c -mtune=generic -march=x86-64 -auxbase test -version -o /tmp/ccPHTer7.s
    GNU C (GCC) version 4.8.5 20150623 (Red Hat 4.8.5-11) (x86_64-redhat-linux)
        compiled by GNU C version 4.8.5 20150623 (Red Hat 4.8.5-11), GMP version 6.0.0, MPFR version 3.1.1, MPC version 1.0.1
    GGC heuristics: --param ggc-min-expand=96 --param ggc-min-heapsize=124992
    ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/4.8.5/include-fixed"
    ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../x86_64-redhat-linux/include"
    #include "..." search starts here:
    #include <...> search starts here:
     /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include
     /usr/local/include
     /usr/include
    End of search list.
    GNU C (GCC) version 4.8.5 20150623 (Red Hat 4.8.5-11) (x86_64-redhat-linux)
        compiled by GNU C version 4.8.5 20150623 (Red Hat 4.8.5-11), GMP version 6.0.0, MPFR version 3.1.1, MPC version 1.0.1
    GGC heuristics: --param ggc-min-expand=96 --param ggc-min-heapsize=124992
    Compiler executable checksum: 356f86e67978d665416e07d560c8ba0d
    COLLECT_GCC_OPTIONS='-v' '-mtune=generic' '-march=x86-64'
     as -v --64 -o /tmp/cc5WHEA4.o /tmp/ccPHTer7.s
    GNU assembler version 2.25.1 (x86_64-redhat-linux) using BFD version version 2.25.1-22.base.el7 
    COMPILER_PATH=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/:/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.8.5/:/usr/lib/gcc/x86_64-redhat-linux/
    LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/4.8.5/:/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../:/lib/:/usr/lib/
    COLLECT_GCC_OPTIONS='-v' '-mtune=generic' '-march=x86-64'
     /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/collect2 --build-id --no-add-needed --eh-frame-hdr --hash-style=gnu -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crt1.o /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtbegin.o -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../.. /tmp/cc5WHEA4.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtend.o /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crtn.o
    $
    
  3. The interesting part is the collect2 command line at the end. Written with one argument per line, that is:

    /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/collect2
    --build-id
    --no-add-needed
    --eh-frame-hdr
    --hash-style=gnu
    -m
    elf_x86_64
    -dynamic-linker
    /lib64/ld-linux-x86-64.so.2
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crt1.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crti.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtbegin.o
    -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5
    -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64
    -L/lib/../lib64
    -L/usr/lib/../lib64
    -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../..
    /tmp/cc5WHEA4.o
    -lgcc
    --as-needed
    -lgcc_s
    --no-as-needed
    -lc
    -lgcc
    --as-needed
    -lgcc_s
    --no-as-needed
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtend.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crtn.o
    
  4. So there are a bunch of crt*.o files, plus three libraries to look for (-lc, -lgcc and -lgcc_s) and a bunch of directories to look in: -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5, -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64, -L/lib/../lib64, -L/usr/lib/../lib64, and -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../.. (that last path really does end in two dots). The /tmp/cc5WHEA4.o file is the object file created from test.c.

  5. Applying some clean-up code to the path names, and then using ls to help find the libraries yields a list of files to examine further:

    /lib64/ld-linux-x86-64.so.2
    /usr/lib64/crt1.o
    /usr/lib64/crti.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtbegin.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtend.o
    /usr/lib64/crtn.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgcc.a
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgcc_s.so
    /usr/lib64/libgcc_s.so.1
    /lib64/libgcc_s.so.1
    /usr/lib64/libgcc_s.so.1
    /usr/lib64/libc.so
    /usr/lib64/libc.so.6
    /lib64/libc.so
    /lib64/libc.so.6
    /usr/lib64/libc.so
    /usr/lib64/libc.so.6
    
  6. That list of files was saved in a file yy (unimaginative name), and then used in:

    $ nm -o $(<yy) | tee nm.log | grep -i atexit
    nm: _trampoline.o: no symbols
    nm: __main.o: no symbols
    nm: _ctors.o: no symbols
    nm: /usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgcc_s.so: no symbols
    nm: /usr/lib64/libgcc_s.so.1: no symbols
    nm: /lib64/libgcc_s.so.1: no symbols
    nm: /usr/lib64/libgcc_s.so.1: no symbols
    nm: /usr/lib64/libc.so: File format not recognized
    /usr/lib64/libc.so.6:00000000003bcc00 b added_atexit_handler.9157
    /usr/lib64/libc.so.6:0000000000038c90 T __cxa_atexit
    /usr/lib64/libc.so.6:0000000000038c90 t __cxa_atexit_internal
    /usr/lib64/libc.so.6:00000000003b6838 d __elf_set___libc_atexit_element__IO_cleanup__
    /usr/lib64/libc.so.6:0000000000038c40 t __internal_atexit
    /usr/lib64/libc.so.6:00000000003b6838 d __start___libc_atexit
    /usr/lib64/libc.so.6:00000000003b6840 d __stop___libc_atexit
    nm: /lib64/libc.so: File format not recognized
    /lib64/libc.so.6:00000000003bcc00 b added_atexit_handler.9157
    /lib64/libc.so.6:0000000000038c90 T __cxa_atexit
    /lib64/libc.so.6:0000000000038c90 t __cxa_atexit_internal
    /lib64/libc.so.6:00000000003b6838 d __elf_set___libc_atexit_element__IO_cleanup__
    /lib64/libc.so.6:0000000000038c40 t __internal_atexit
    nm: /usr/lib64/libc.so: File format not recognized
    /lib64/libc.so.6:00000000003b6838 d __start___libc_atexit
    /lib64/libc.so.6:00000000003b6840 d __stop___libc_atexit
    /usr/lib64/libc.so.6:00000000003bcc00 b added_atexit_handler.9157
    /usr/lib64/libc.so.6:0000000000038c90 T __cxa_atexit
    /usr/lib64/libc.so.6:0000000000038c90 t __cxa_atexit_internal
    /usr/lib64/libc.so.6:00000000003b6838 d __elf_set___libc_atexit_element__IO_cleanup__
    /usr/lib64/libc.so.6:0000000000038c40 t __internal_atexit
    /usr/lib64/libc.so.6:00000000003b6838 d __start___libc_atexit
    /usr/lib64/libc.so.6:00000000003b6840 d __stop___libc_atexit
    $
    
  7. There's no evidence of a plain atexit function there. Where's it hiding, and what's with those 'File format not recognized' messages?

    $ file /usr/lib64/libc.so
    /usr/lib64/libc.so: ASCII text
    $
    
  8. ASCII text? What?

    $ cat /usr/lib64/libc.so
    /* GNU ld script
       Use the shared library, but some functions are only in
       the static library, so try that secondarily.  */
    OUTPUT_FORMAT(elf64-x86-64)
    GROUP ( /lib64/libc.so.6 /usr/lib64/libc_nonshared.a  AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )
    $
    
  9. OK; what's in /usr/lib64/libc_nonshared.a?

    $  nm /usr/lib64/libc_nonshared.a | grep -i atexit
    atexit.oS:
    0000000000000000 T atexit
                     U __cxa_atexit
    $
    

    Bingo! Found it!

So the linker that GCC invokes via collect2 loads files that are not named directly on the command line: -lc resolves to /usr/lib64/libc.so, which is a linker script rather than a shared object, and its GROUP directive pulls in /usr/lib64/libc_nonshared.a, the archive that contains atexit(). Consequently, you can call atexit() from C because a small wrapper around __cxa_atexit() is statically linked into the executable, unless there's some more black magic hidden away here that I've not sussed out.
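
For the second part of the question (calling something like atexit dynamically), here is a minimal sketch, assuming glibc on Linux. Since `__cxa_atexit` is the symbol that libc.so.6 actually exports, code can call it directly. The helper names `print_message`, `register_exit_message`, `my_atexit` and `say_goodbye` are hypothetical, and passing `NULL` as the third argument is a simplification: judging by the nm output, the real wrapper in libc_nonshared.a forwards to `__cxa_atexit`, and it presumably supplies a per-object handle rather than `NULL`.

```c
#include <stddef.h>
#include <stdio.h>

/* Exported by glibc's libc.so.6, as the nm output shows. It comes from the
   Itanium C++ ABI and has no prototype in <stdlib.h>, so declare it here. */
extern int __cxa_atexit(void (*func)(void *), void *arg, void *dso_handle);

/* Hypothetical handler: prints the string given at registration time. */
void print_message(void *arg) {
    puts((const char *)arg);
}

/* Hypothetical helper: register print_message with a message argument.
   Assumption: a NULL dso_handle ties the handler to process exit rather
   than to the unloading of one shared object. Returns 0 on success. */
int register_exit_message(const char *msg) {
    return __cxa_atexit(print_message, (void *)msg, NULL);
}

/* Hypothetical stand-in for the atexit() wrapper. The cast relies on the
   platform calling convention tolerating an ignored extra argument; it is
   not guaranteed by the C standard. */
int my_atexit(void (*func)(void)) {
    return __cxa_atexit((void (*)(void *))func, NULL, NULL);
}

/* Hypothetical void(void) handler for use with my_atexit(). */
void say_goodbye(void) {
    puts("goodbye");
}
```

Calling `register_exit_message("handler ran")` or `my_atexit(say_goodbye)` from `main` and then returning normally should run the handlers during exit, in reverse order of registration. An FFI should be able to take the same route by looking up `__cxa_atexit` by name in libc.so.6 with a three-argument signature, though that is untested here.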

Burkey answered 7/5, 2017 at 0:23 Comment(2)
Oh my, that is fantastic, thank you for digging that out, you are awesome. - Fenugreek
Great work Jonathan. In case anyone is wondering "why the gymnastics?", here is why. It is so the __dso_handle (shared library handle) is available at every call to atexit(). This was implemented as an obscure compatibility feature for shared libraries. When a dlclose() is done to a shared library, any atexit() handlers registered by that specific shared lib get called before the lib is unmapped. This is accomplished by atexit() calling __cxa_atexit() with the extra __dso_handle. This is how glibc pulls it off. Other systems (BSD, Apple) use different techniques to achieve the same goal. - Quadrature
