How does C++ linking work in practice? [duplicate]

Asked 25/8, 2012 at 13:35 Answered 28/5, 2015 at 13:17

How does C++ linking work in practice? What I am looking for is a detailed explanation about how the linking happens, and not what commands do the linking.

There's already a similar question about compilation which doesn't go into too much detail: How does the compilation/linking process work?

Brutish answered 25/8, 2012 at 13:35 Comment(1)

Try a book, for example Linkers and Loaders – Hassanhassell 25/8, 2012 at 13:48

EDIT: I have moved this answer to the duplicate: https://mcmap.net/q/14442/-what-do-linkers-do

This answer focuses on address relocation, which is one of the crucial functions of linking.

A minimal example will be used to clarify the concept.

0) Introduction

Summary: relocation edits the .text section of object files to translate each object-file address into its final address in the executable.

This must be done by the linker because the compiler only sees one input file at a time, but we must know about all object files at once to decide how to:

- resolve undefined symbols, such as functions that are declared but not defined in the current file
- lay out the .text and .data sections of multiple object files so that they do not overlap
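
As a rough illustration of those two decisions, here is a toy Python model. It is not how ld is implemented: the object file names, section sizes, symbol names and base addresses are all invented for the example. It places same-named sections back to back and then checks that every undefined symbol is defined by some object file.

```python
# Toy model of what a linker decides; every name and size here is made up.
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    name: str
    section_sizes: dict   # section name -> size in bytes
    defined: dict         # symbol -> (section, offset inside that section)
    undefined: list = field(default_factory=list)


def link(objects, bases):
    """Place same-named sections back to back and compute symbol addresses."""
    cursor = dict(bases)  # next free address for each section name
    placement = {}        # (object name, section) -> final start address
    for obj in objects:
        for section, size in obj.section_sizes.items():
            placement[(obj.name, section)] = cursor[section]
            cursor[section] += size

    symbols = {}
    for obj in objects:
        for sym, (section, offset) in obj.defined.items():
            symbols[sym] = placement[(obj.name, section)] + offset

    # Every undefined symbol must be defined by one of the object files.
    for obj in objects:
        for sym in obj.undefined:
            if sym not in symbols:
                raise KeyError(f"undefined reference to {sym!r} in {obj.name}")
    return placement, symbols


main_o = ObjectFile("main.o", {".text": 0x20, ".data": 0x08},
                    {"_start": (".text", 0)}, undefined=["print_hello"])
print_o = ObjectFile("print.o", {".text": 0x30, ".data": 0x10},
                     {"print_hello": (".text", 0), "msg": (".data", 0)})

placement, symbols = link([main_o, print_o],
                          {".text": 0x400000, ".data": 0x600000})
for name, addr in symbols.items():
    print(f"{name:12s} {addr:#x}")
```

Neither object file could have computed the address of print_hello or msg alone, because each depends on the size of the other file's sections.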

Prerequisites: minimal understanding of:

x86-64 or IA-32 assembly
global structure of an ELF file. I have made a tutorial for that

Linking has nothing to do with C or C++ specifically: compilers just generate the object files. The linker then takes them as input without ever knowing what language compiled them. It might as well be Fortran.

So to reduce the cruft, let's study a NASM x86-64 ELF Linux hello world:

section .data
    hello_world db "Hello world!", 10
section .text
    global _start
    _start:

        ; sys_write
        mov rax, 1
        mov rdi, 1
        mov rsi, hello_world
        mov rdx, 13
        syscall

        ; sys_exit
        mov rax, 60
        mov rdi, 0
        syscall

assembled and linked with:

nasm -felf64 hello_world.asm            # creates hello_world.o
ld -o hello_world.out hello_world.o     # static ELF executable with no libraries

with NASM 2.10.09.

1) .text of .o

First we disassemble the .text section of the object file:

objdump -d hello_world.o

which gives:

0000000000000000 <_start>:
   0:   b8 01 00 00 00          mov    $0x1,%eax
   5:   bf 01 00 00 00          mov    $0x1,%edi
   a:   48 be 00 00 00 00 00    movabs $0x0,%rsi
  11:   00 00 00
  14:   ba 0d 00 00 00          mov    $0xd,%edx
  19:   0f 05                   syscall
  1b:   b8 3c 00 00 00          mov    $0x3c,%eax
  20:   bf 00 00 00 00          mov    $0x0,%edi
  25:   0f 05                   syscall

the crucial lines are:

   a:   48 be 00 00 00 00 00    movabs $0x0,%rsi
  11:   00 00 00

which should move the address of the hello world string into the rsi register, which is passed to the write system call.

But wait! How can the compiler (here, the assembler) possibly know where "Hello world!" will end up in memory when the program is loaded?

Well, it can't, especially after we link a bunch of .o files together with multiple .data sections.

Only the linker can do that, since only it has all those object files.

So the compiler just:

- puts a placeholder value 0x0 in the compiled output
- gives the linker some extra information on how to patch the compiled code with the correct addresses

This "extra information" is contained in the .rela.text section of the object file.

2) .rela.text

.rela.text stands for "relocation of the .text section".

The word relocation is used because the linker will have to relocate the address from the object into the executable.

We can dump the .rela.text section with:

readelf -r hello_world.o

which prints:

Relocation section '.rela.text' at offset 0x340 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000000c  000200000001 R_X86_64_64       0000000000000000 .data + 0

The format of this section is fixed and documented at: http://www.sco.com/developers/gabi/2003-12-17/ch4.reloc.html

Each entry tells the linker about one address that needs to be relocated; here we have only one, for the string.
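
To make the Info column less opaque, the following Python sketch decodes the entry shown above. It assumes the standard Elf64_Rela layout (three 64-bit little-endian fields: r_offset, r_info, r_addend) and the usual split of r_info into a symbol index (upper 32 bits) and a relocation type (lower 32 bits). The variable names are mine, and the entry is rebuilt from the readelf output instead of being read from the file.

```python
import struct

# One Elf64_Rela entry, rebuilt from the readelf -r output:
# r_offset and r_info are unsigned 64-bit, r_addend is signed 64-bit.
entry = struct.pack("<QQq", 0x0C, 0x000200000001, 0)

r_offset, r_info, r_addend = struct.unpack("<QQq", entry)
sym_index = r_info >> 32         # what the ELF64_R_SYM macro computes
rel_type = r_info & 0xFFFFFFFF   # what the ELF64_R_TYPE macro computes

R_X86_64_64 = 1  # relocation type number from the x86-64 psABI

print(f"offset={r_offset:#x} symbol index={sym_index} "
      f"type={rel_type} addend={r_addend}")
```

So Info = 000200000001 means "symbol table entry 2, relocation type 1", which readelf prints as .data and R_X86_64_64.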

Simplifying a bit, for this particular line we have the following information:

Offset = C: the first byte of .text that this entry changes.

If we look back at the disassembled text, it lies exactly inside the critical movabs $0x0,%rsi, and those who know x86-64 instruction encoding will notice that this is where the 64-bit address operand of the instruction starts.
Name = .data: the address points into the .data section.
Type = R_X86_64_64, which specifies exactly what calculation has to be done to translate the address.

This field is actually processor dependent, and thus documented in the AMD64 System V ABI extension, section 4.4 "Relocation".

That document says that R_X86_64_64 does:
- Field = word64: 8 bytes, thus the 00 00 00 00 00 00 00 00 at address 0xC
- Calculation = S + A
  - S is the value of the symbol that the relocation entry refers to, here the final address of the .data section
  - A is the addend, which is 0 here. This is a field of the relocation entry.
  So S + A is the address of .data itself, and the operand gets relocated to the very first address of the .data section.

3) .text of .out

Now let's look at the .text section of the executable that ld generated for us:

objdump -d hello_world.out

gives:

00000000004000b0 <_start>:
  4000b0:   b8 01 00 00 00          mov    $0x1,%eax
  4000b5:   bf 01 00 00 00          mov    $0x1,%edi
  4000ba:   48 be d8 00 60 00 00    movabs $0x6000d8,%rsi
  4000c1:   00 00 00
  4000c4:   ba 0d 00 00 00          mov    $0xd,%edx
  4000c9:   0f 05                   syscall
  4000cb:   b8 3c 00 00 00          mov    $0x3c,%eax
  4000d0:   bf 00 00 00 00          mov    $0x0,%edi
  4000d5:   0f 05                   syscall

So the only lines that changed from the object file are the critical ones:

  4000ba:   48 be d8 00 60 00 00    movabs $0x6000d8,%rsi
  4000c1:   00 00 00

which now point to the address 0x6000d8 (d8 00 60 00 00 00 00 00 in little-endian) instead of 0x0.
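
The patch can be reproduced by hand. The sketch below is Python, with the .text bytes copied from the objdump listing of hello_world.o and the final .data address 0x6000d8 taken from the executable. apply_r_x86_64_64 is a hypothetical helper name, and S is taken to be the symbol value as the ABI defines it; this only illustrates the arithmetic and says nothing about ld's internals.

```python
# .text bytes of hello_world.o, copied from the objdump -d listing.
text_o = bytes.fromhex(
    "b801000000" "bf01000000" "48be0000000000000000"
    "ba0d000000" "0f05" "b83c000000" "bf00000000" "0f05"
)


def apply_r_x86_64_64(section, offset, symbol_value, addend):
    """Write S + A as a 64-bit little-endian word at `offset`."""
    patched = bytearray(section)
    value = (symbol_value + addend) & 0xFFFFFFFFFFFFFFFF
    patched[offset:offset + 8] = value.to_bytes(8, "little")
    return bytes(patched)


data_address = 0x6000D8  # final address of .data, read from the executable
text_out = apply_r_x86_64_64(text_o, 0x0C, data_address, 0)

# The movabs instruction occupies bytes 0x0a..0x13 of .text.
print(text_out[0x0A:0x14].hex(" "))
```

The printed bytes should match the movabs line of the executable, and every byte outside that 8-byte field is left untouched.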

Is this the right location for the hello_world string?

To decide, we have to check the program headers, which tell Linux where to load each segment.

We display them with:

readelf -l hello_world.out

which gives:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000000d7 0x00000000000000d7  R E    200000
  LOAD           0x00000000000000d8 0x00000000006000d8 0x00000000006000d8
                 0x000000000000000d 0x000000000000000d  RW     200000

 Section to Segment mapping:
  Segment Sections...
   00     .text
   01     .data

This tells us that the .data section, which lives in the second segment, starts at VirtAddr = 0x6000d8.

And the only thing in the data section is our hello world string.
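
The same check can be written down as a small Python sketch. The segment tuples are copied from the readelf -l output above; file_offset_of is a made-up helper, and a real loader does considerably more (page alignment, permissions and so on).

```python
# (file offset, virtual address, file size, flags) of the two LOAD segments,
# copied from the readelf -l output.
segments = [
    (0x00, 0x400000, 0xD7, "R E"),
    (0xD8, 0x6000D8, 0x0D, "RW"),
]


def file_offset_of(vaddr):
    """Translate a virtual address into an offset inside the ELF file."""
    for offset, start, size, flags in segments:
        if start <= vaddr < start + size:
            return offset + (vaddr - start), flags
    raise ValueError(f"{vaddr:#x} is not inside any LOAD segment")


offset, flags = file_offset_of(0x6000D8)
print(hex(offset), flags)

# The second segment is exactly as long as the string plus its newline.
print(len(b"Hello world!\n") == 0x0D)
```

The relocated address lands at the first byte of the read-write segment, whose 13 bytes are exactly the length of "Hello world!" followed by a newline.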

Ebonize answered 28/5, 2015 at 13:17 Comment(11)

"in big-endian" -- surely you mean "in little-endian" ? Nice writeup, but could probably benefit from looking at objdump -dr and using two strings, two functions and two compilation units, so different types of relocations and different offsets can be seen (at which point it becomes a blog post worthy). – Astolat 30/5, 2015 at 16:11

@EmployedRussian thanks for the correction! Glad a hardcore ELF guy like you took a look at it :-) I didn't know about -dr, it rocks. I might do the two compilation units analysis some time to see if I understood things correctly. This question is so "Too broad", I love it. – Ebonize 30/5, 2015 at 16:30

What about simply saying that if an object file has a function call to "print", then another object file provide the function "print", the linker replace the call to print with the actual offset to print? – Homestretch 30/5, 2015 at 16:31

How can you tell from the program headers which represents the data section? The section->segment mapping below tells us that the data section is the second segment, so do we just count (starting from 0) through the program headers? – Starinsky 29/6, 2016 at 22:58

First review the global structure of an ELF file: cirosantilli.com/elf-hello-world/#global-file-structure Everything is offset based, so order is not required. Then I don't know exactly how the mapping is stored, it was asked at: stackoverflow.com/questions/23018496/… – Ebonize 30/6, 2016 at 6:15

Great answer. Sorry if this is a dumb off-topic question; why does the output separate the last three bytes of the address 4000c1: 00 00 00 if they clearly belong to the previous instruction? – Hooghly 2/11, 2016 at 16:6

@Hooghly no question is dumb. I guess it is just to limit the column width. – Ebonize 2/11, 2016 at 16:38

Couldn't you have linked anywhere but SCO? You know, those guys who went to war on Linux? – Palomo 29/11, 2016 at 18:27

@Palomo unfortunately, the LSB links to them as the official ELF specification (so I guess they played an important role in that) :-) – Ebonize 29/11, 2016 at 19:1

The Linux Standard Base links to SCO? Maybe they had a truce... – Palomo 29/11, 2016 at 20:26

Actually, one could say linking is relatively simple.

In the simplest sense, it's just about bundling together object files¹, as those already contain the emitted machine code for each of the functions/globals/data... contained in their respective source. The linker can be extremely dumb here and just treat everything as a symbol (name) and its definition (or content).

Obviously, the linker needs to produce a file that respects a certain format (generally the ELF format on Unix) and will separate the various categories of code/data into different sections of the file, but that is just dispatching.

The two complications I know of are:

- the need to de-duplicate symbols: some symbols are present in several object files and only one should make it into the resulting library/executable; it is the linker's job to include only one of the definitions
- link-time optimization: in this case the object files contain not the emitted machine code but an intermediate representation, and the linker merges all the object files together, applies optimization passes (inlining, for example), compiles the result down to machine code and finally emits it.

¹: the result of the compilation of the different translation units (roughly, preprocessed source files)
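
A minimal Python sketch of the de-duplication step, assuming the simplified model in which each definition is either "strong" or "weak". The file and symbol names are invented, merge_symbols is a hypothetical helper, and real linkers have more machinery for this than two bindings.

```python
def merge_symbols(object_files):
    """Pick one definition per symbol.

    object_files: list of (file name, {symbol: binding}),
    where binding is "strong" or "weak".
    """
    chosen = {}  # symbol -> (file name, binding)
    for file_name, symbols in object_files:
        for symbol, binding in symbols.items():
            if symbol not in chosen:
                chosen[symbol] = (file_name, binding)
            elif binding == "strong" and chosen[symbol][1] == "strong":
                raise ValueError(f"multiple definition of {symbol!r}")
            elif binding == "strong":
                chosen[symbol] = (file_name, binding)  # strong replaces weak
            # Otherwise this is a duplicate weak definition: drop it.
    return chosen


objects = [
    ("a.o", {"main": "strong", "max_int": "weak"}),
    ("b.o", {"helper": "strong", "max_int": "weak"}),
]
result = merge_symbols(objects)
for symbol, (file_name, binding) in result.items():
    print(f"{symbol:8s} from {file_name} ({binding})")
```

Here max_int stands in for something like an inline function emitted in both translation units: only the copy from a.o survives, while two strong definitions of the same name would be reported as an error.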

Firman answered 25/8, 2012 at 13:47 Comment(7)

+1 good answer. The term to search for wrt the first point is "name mangling" btw – Biotechnology 25/8, 2012 at 15:52

@MarcovandeVoort: not really. For example, there is no mangling in C, and yet there is still a linker. Duplicate symbols are generally weak symbols, like functions marked inline, instances of template functions, instances of template static symbols etc... Those get generated for every translation unit, but only one should be thrown into the final library and it is the linker's job. – Firman 25/8, 2012 at 16:14

linking is only relatively simple when you ignore all the complicated parts of it (which, of course, is a tautology). If you wanted to know what complicates the real linker, you could start here: airs.com/blog/archives/38 – Astolat 25/8, 2012 at 23:40

@Matthieu It depends what you mean... C does have mangling on some platforms (notably Windoze). See Raymond Chen. – Hydrangea 26/8, 2012 at 0:2

@EmployedRussian: It gets really more complicated when you suddenly throw in shared libraries, I agree, but I don't know enough to comment on that. – Firman 26/8, 2012 at 10:49

Matthieu: the question is about C++, not C. While strictly mangling is something the compiler does (and not the linker) it is part of linking strategy, which is why I mentioned it. – Biotechnology 27/8, 2012 at 12:42

@MarcovandeVoort: It's relevant in forming the names of the symbols, but whether you mangle or not is independent from whether you have duplicate (in different objects) or not. – Firman 27/8, 2012 at 16:7

Besides the already mentioned "Linkers and Loaders", if you wanted to know how a real and modern linker works, you could start here: airs.com/blog/archives/38

Astolat answered 25/8, 2012 at 23:42 Comment(2)

Thanks for the blog link, really interesting series indeed. Normally answers containing only links are frowned upon though (what if the blog suddenly shuts down?) so I would encourage you to complete this answer. For example, by providing an overview of the series (link + summary of each part)? – Firman 26/8, 2012 at 10:51

I've always thought this rule of SO is stupid, it's a waste of time to try and summarise an entire series of posts by a leading expert on linkers in a brief SO answer. The question is not suitable for SO precisely because the answer cannot be summarised in a few paragraphs and the right way for the OP to get the answer is to read detailed information elsewhere, not a summary here, and anyway what if SO shuts down? Ian has been using airs.com for longer than SO has existed, maybe he'll outlive it too ;) – Shennashensi 30/5, 2015 at 15:59
