What does (.eh) mean in nm output?

When I look at the symbols in my library, nm mylib.a, I see some duplicate entries that look like this:

000000000002d130 S __ZN7quadmat11SpAddLeavesC1EPNS_14BlockContainerEPy
00000000000628a8 S __ZN7quadmat11SpAddLeavesC1EPNS_14BlockContainerEPy.eh

When piped through c++filt:

000000000002d130 S quadmat::SpAddLeaves::SpAddLeaves(quadmat::BlockContainer*, unsigned long long*)
00000000000628a8 S quadmat::SpAddLeaves::SpAddLeaves(quadmat::BlockContainer*, unsigned long long*) (.eh)

What does that .eh mean, and what is this extra symbol used for?

I see it has something to do with exception handling. But why does that use an extra symbol?

(I'm noticing this with clang)

Lapidate answered 15/10, 2013 at 8:26 Comment(4)
https://mcmap.net/q/1778300/-avoid-linking-in-unused-symbols-when-linking-against-static-libs , also refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/… - Falsework
So something to do with exceptions? What is its purpose? - Lapidate
@Adam, what compiler and platform are you using? - Treadwell
@Treadwell I'm using clang on OSX. - Lapidate

Here's some simple code:

bool extenrnal_variable;

int f(...)
{
    if (extenrnal_variable)
        throw 0;

    return 42;
}

int g()
{
    return f(1, 2, 3);
}

I added extenrnal_variable to prevent the compiler from optimizing all the branches away. f has ... to prevent inlining.

When compiled with:

$ clang++ -S -O3 -m32 -o - eh.cpp | c++filt

it emits the following code for g() (the rest is omitted):

g():                                 ## @_Z1gv
    .cfi_startproc
## BB#0:
    pushl   %ebp
Ltmp9:
    .cfi_def_cfa_offset 8
Ltmp10:
    .cfi_offset %ebp, -8
    movl    %esp, %ebp
Ltmp11:
    .cfi_def_cfa_register %ebp
    subl    $24, %esp
    movl    $3, 8(%esp)
    movl    $2, 4(%esp)
    movl    $1, (%esp)
    calll   f(...)
    movl    $42, %eax
    addl    $24, %esp
    popl    %ebp
    ret
    .cfi_endproc

All these .cfi_* directives are there for stack unwinding in case an exception is thrown. They are all compiled into an FDE (Frame Description Entry) block, which is saved under the name g().eh (__Z1gv.eh mangled). These directives specify where on the stack the CPU registers are saved. When an exception is thrown and the stack is being unwound, the remaining code in the function should not be executed (except for the destructors of locals), but the registers that were saved earlier should be restored. These tables store exactly that information.

These tables can be dumped with the dwarfdump tool:

$ dwarfdump --eh-frame --english eh.o | c++filt

The output:

0x00000018: FDE
        length: 0x00000018
   CIE_pointer: 0x00000000
    start_addr: 0x00000000 f(...)
    range_size: 0x0000004d (end_addr = 0x0000004d)
  Instructions: 0x00000000: CFA=esp+4     eip=[esp]
                0x00000001: CFA=esp+8     ebp=[esp]  eip=[esp+4]
                0x00000003: CFA=ebp+8     ebp=[ebp]  eip=[ebp+4]
                0x00000007: CFA=ebp+8     ebp=[ebp]  esi=[ebp-4]  eip=[ebp+4]

0x00000034: FDE
        length: 0x00000018
   CIE_pointer: 0x00000000
    start_addr: 0x00000050 g()
    range_size: 0x0000002c (end_addr = 0x0000007c)
  Instructions: 0x00000050: CFA=esp+4     eip=[esp]
                0x00000051: CFA=esp+8     ebp=[esp]  eip=[esp+4]
                0x00000053: CFA=ebp+8     ebp=[ebp]  eip=[ebp+4]

Here you can read about the format of this block. Here is a bit more, along with an alternative, more compact way of representing the same information. Basically, this block describes which registers to restore during stack unwinding and where on the stack to find their saved values.
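To make the rows in the dump concrete, here is a minimal sketch of applying one row to recover the caller's registers. This is not how a real unwinder is implemented; the names Regs, Row, Memory and unwind_one_frame are made up for illustration, only esp, ebp and eip are tracked, and the stack is modelled as a plain map from address to 32-bit word.

```cpp
#include <cstdint>
#include <map>

// Toy register file: only the registers that appear in the dump.
struct Regs {
    std::uint32_t esp;
    std::uint32_t ebp;
    std::uint32_t eip;
};

enum class Base { Esp, Ebp };

// One row of the dump, e.g. "CFA=ebp+8  ebp=[ebp]  eip=[ebp+4]".
// Saved-register slots are expressed as offsets from the CFA
// (hypothetical layout chosen for this sketch).
struct Row {
    Base cfa_base;           // register the CFA is computed from
    std::int32_t cfa_offset; // CFA = base register + cfa_offset
    bool ebp_saved;          // false for rows where ebp was not pushed yet
    std::int32_t ebp_slot;   // CFA-relative slot of the saved ebp
    std::int32_t eip_slot;   // CFA-relative slot of the return address
};

// Fake stack memory: address -> 32-bit word.
using Memory = std::map<std::uint32_t, std::uint32_t>;

// Recover the caller's registers without executing any callee code.
Regs unwind_one_frame(const Regs& callee, const Row& row, const Memory& stack) {
    const std::uint32_t base =
        (row.cfa_base == Base::Esp) ? callee.esp : callee.ebp;
    const std::uint32_t cfa = base + static_cast<std::uint32_t>(row.cfa_offset);

    Regs caller = callee;
    caller.esp = cfa;  // the CFA is the caller's esp just before the call
    if (row.ebp_saved) {
        caller.ebp = stack.at(cfa + static_cast<std::uint32_t>(row.ebp_slot));
    }
    caller.eip = stack.at(cfa + static_cast<std::uint32_t>(row.eip_slot));
    return caller;
}
```

For g's last row (CFA=ebp+8, ebp=[ebp], eip=[ebp+4]) the saved slots sit at CFA-8 and CFA-4, so Row{Base::Ebp, 8, true, -8, -4} restores ebp and eip from the two words just below the CFA and sets esp to the CFA itself.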

To see the raw content of these symbols you can list all the symbols with their offsets:

$ nm -n eh.o

00000000 T __Z1fz
         U __ZTIi
         U ___cxa_allocate_exception
         U ___cxa_throw
00000050 T __Z1gv
000000a8 s EH_frame0
000000c0 S __Z1fz.eh
000000dc S __Z1gv.eh
000000f8 S _extenrnal_variable

And then dump the (__TEXT,__eh_frame) section:

$ otool -s __TEXT __eh_frame eh.o

eh.o:
Contents of (__TEXT,__eh_frame) section
000000a8    14 00 00 00 00 00 00 00 01 7a 52 00 01 7c 08 01
000000b8    10 0c 05 04 88 01 00 00 18 00 00 00 1c 00 00 00
000000c8    38 ff ff ff 4d 00 00 00 00 41 0e 08 84 02 42 0d
000000d8    04 44 86 03 18 00 00 00 38 00 00 00 6c ff ff ff
000000e8    2c 00 00 00 00 41 0e 08 84 02 42 0d 04 00 00 00

By matching the offsets, you can see how each symbol is encoded.
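The matching can also be done mechanically. Below is a rough sketch that walks the bytes of the dump and reports where each entry starts, which CIE an FDE points back to, and which code range it covers. It assumes what this particular dump shows: a 32-bit little-endian object, 4-byte lengths, and 4-byte pc-relative code pointers in the FDEs. The names EhEntry, read_u32le, walk_eh_frame and kDumpBytes are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// The 80 bytes of (__TEXT,__eh_frame) from the otool dump, starting at 0xa8.
const std::vector<std::uint8_t> kDumpBytes = {
    0x14, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x7a, 0x52, 0x00, 0x01, 0x7c, 0x08, 0x01,
    0x10, 0x0c, 0x05, 0x04, 0x88, 0x01, 0x00, 0x00, 0x18, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00,
    0x38, 0xff, 0xff, 0xff, 0x4d, 0x00, 0x00, 0x00, 0x00, 0x41, 0x0e, 0x08, 0x84, 0x02, 0x42, 0x0d,
    0x04, 0x44, 0x86, 0x03, 0x18, 0x00, 0x00, 0x00, 0x38, 0x00, 0x00, 0x00, 0x6c, 0xff, 0xff, 0xff,
    0x2c, 0x00, 0x00, 0x00, 0x00, 0x41, 0x0e, 0x08, 0x84, 0x02, 0x42, 0x0d, 0x04, 0x00, 0x00, 0x00,
};

struct EhEntry {
    bool is_cie;
    std::uint32_t addr;      // where the entry starts (where nm puts the symbol)
    std::uint32_t size;      // total size, including the 4-byte length field
    std::uint32_t cie_addr;  // FDE only: address of the CIE it refers to
    std::uint32_t pc_begin;  // FDE only: first code address covered
    std::uint32_t pc_range;  // FDE only: number of code bytes covered
};

static std::uint32_t read_u32le(const std::vector<std::uint8_t>& b, std::size_t off) {
    return std::uint32_t(b[off]) | std::uint32_t(b[off + 1]) << 8 |
           std::uint32_t(b[off + 2]) << 16 | std::uint32_t(b[off + 3]) << 24;
}

// Walk the section; `base` is the address of its first byte.
std::vector<EhEntry> walk_eh_frame(const std::vector<std::uint8_t>& bytes, std::uint32_t base) {
    std::vector<EhEntry> out;
    std::size_t pos = 0;
    while (pos + 8 <= bytes.size()) {
        const std::uint32_t length = read_u32le(bytes, pos);
        if (length == 0 || length > bytes.size() - pos - 4) break;  // terminator or truncated
        const std::uint32_t id = read_u32le(bytes, pos + 4);
        const std::uint32_t here = base + static_cast<std::uint32_t>(pos);

        EhEntry e{};
        e.addr = here;
        e.size = length + 4;
        e.is_cie = (id == 0);  // in .eh_frame a zero id marks a CIE
        if (!e.is_cie) {
            if (length < 12) break;  // too short to hold an FDE header
            // The CIE pointer is a distance back from its own field.
            e.cie_addr = here + 4 - id;
            // Assumed pc-relative encoding: the value is relative to its own field.
            e.pc_begin = here + 8 + read_u32le(bytes, pos + 8);
            e.pc_range = read_u32le(bytes, pos + 12);
        }
        out.push_back(e);
        pos += e.size;
    }
    return out;
}
```

Calling walk_eh_frame(kDumpBytes, 0xa8) gives a CIE at 0xa8 (EH_frame0), an FDE at 0xc0 covering 0x00 to 0x4d (__Z1fz.eh), and an FDE at 0xdc covering 0x50 to 0x7c (__Z1gv.eh), which lines up with the nm and dwarfdump output.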

When there are local variables present, they have to be destroyed during stack unwinding. For that, more code is usually embedded in the functions themselves, and some additional, larger tables are created. You can explore that yourself by adding a local variable with a non-trivial destructor to g, compiling, and looking at the assembly output.
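A possible starting point for that experiment is sketched below. Guard, events, should_throw and call_g_and_catch are hypothetical names added for illustration; f and g follow the example at the top of this answer. The events log only makes the order of operations observable; the interesting part is the extra cleanup code the compiler emits for g.

```cpp
#include <string>
#include <vector>

// Hypothetical log so the order of events can be inspected.
std::vector<std::string> events;

// A type with a non-trivial destructor: the unwinder must run it
// when an exception propagates through a frame that holds one.
struct Guard {
    ~Guard() { events.push_back("~Guard"); }
};

// Plays the role of the external flag in the original example.
bool should_throw = true;

int f(...)
{
    if (should_throw)
        throw 0;

    return 42;
}

int g()
{
    Guard guard;  // forces the compiler to emit cleanup code for g
    return f(1, 2, 3);
}

// Calls g and reports what was thrown, or g's result if nothing was.
int call_g_and_catch()
{
    try {
        return g();
    } catch (int value) {
        events.push_back("caught");
        return value;
    }
}
```

When the exception propagates out of g, the destructor of guard runs before the catch block is entered. Compiling this with -S and comparing g against the earlier version should show the additional cleanup code and tables.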

Treadwell answered 25/10, 2013 at 18:4 Comment(0)

It stands for exception handler and is usually associated with the info below:

If you are using an exports list and building either a shared library, or an executable that will be used with ld's -bundle_loader flag, you need to include the symbols for exception frame information in the exports list for your exported C++ symbols. Otherwise, they may be stripped. These symbols end with .eh; you can view them with the nm tool.

  • from XcodeUserGuide20
Entomo answered 21/10, 2013 at 6:10 Comment(2)
Yes, @BoBTFish's links contain this paragraph. The purpose of this question (and bounty) is to explain what those .eh symbols are for, which this paragraph doesn't do. For instance, what is an exception frame information symbol? Why does the -bundle_loader flag matter? This paragraph assumes you already know all that. - Lapidate
Sorry, it stands for exception handler. I have a 2 day old new born and it's been a bit hectic as of late so I was a bit distracted when I was putting the answer together. - Entomo
