Why doesn't the linker complain of duplicate symbols?

I have a dummy.hpp

#ifndef DUMMY
#define DUMMY
void dummy();
#endif

and a dummy.cpp

#include <iostream>
void dummy() {
      std::cerr << "dummy" << std::endl;
}

and a main.cpp which uses dummy()

#include "dummy.hpp"
int main(){

    dummy();
    return 0;
}

Then I compiled dummy.cpp to three libraries, libdummy1.a, libdummy2.a, libdummy.so:

g++ -c -fPIC dummy.cpp
ar rvs libdummy1.a dummy.o
ar rvs libdummy2.a dummy.o
g++ -shared -fPIC -o libdummy.so dummy.cpp

  1. When I try to compile main and link the dummy libs

    g++ -o main main.cpp -L. -ldummy1 -ldummy2
    

    The linker produces no duplicate symbol error. Why not, given that I link two identical static libraries?

  2. When I try

    g++ -o main main.cpp -L. -ldummy1 -ldummy
    

    There is also no duplicate symbol error. Why?

The loader seems always to choose dynamic libs and not the code compiled in the .o files.

Does it mean the same symbol is always loaded from the .so file if it is both in a .a and a .so file?

Does it mean symbols in the static symbol table in static library never conflict with those in the dynamic symbol table in a .so file?

Intercommunicate answered 24/12, 2015 at 14:35 Comment(2)
@HansPassant I have personally met a situation in which I linked two different .a libs to create a .so lib; if the same .o files are in both libs, the linker complains. But that does not happen in the example above, which is really weird. - Intercommunicate
Please show the exact command that produces the error. Normally there should be no error, it's just the way the linker works, so you are doing something unusual here. - Chigoe

There's no error in either Scenario 1 (two static libraries) or Scenario 2 (a static and a shared library) because the linker takes the first object file from a static library, or the first shared library, that it encounters that provides a definition of a symbol it does not yet have a definition for. It simply ignores any later definitions of the same symbol because it already has a good one. In general, the linker only takes what it needs from a library. With static libraries, that's strictly true. With shared libraries, all the symbols in the shared library become available if it satisfies any missing symbol; with some linkers, the symbols of the shared library may be available regardless, but others record the use of a shared library only if that shared library provides at least one needed definition.

It's also why you need to link libraries after object files. You could add dummy.o to your linking commands and as long as that appears before the libraries, there'll be no trouble. Add the dummy.o file after libraries and you'll get doubly-defined symbol errors.

The only time you run into problems with double definitions like this is if there's an object file in Library 1 that defines both dummy and extra, and there's an object file in Library 2 that defines both dummy and alternative, and the code needs the definitions of both extra and alternative. Then both object files are linked in, and the duplicate definitions of dummy cause trouble. Indeed, the two object files could be in a single library and would still cause trouble.

Consider:

/* file1.h */
extern void dummy();
extern int extra(int);

/* file1.cpp */
#include "file1.h"
#include <iostream>
void dummy() { std::cerr << "dummy() from " << __FILE__ << '\n'; }
int extra(int i) { return i + 37; }

/* file2.h */
extern void dummy();
extern int alternative(int);

/* file2.cpp */
#include "file2.h"
#include <iostream>
void dummy() { std::cerr << "dummy() from " << __FILE__ << '\n'; }
int alternative(int i) { return -i; }

/* main.cpp */
#include "file1.h"
#include "file2.h"
int main()
{
    return extra(alternative(54));
}

You won't be able to link the object files from the three source files shown because of the double-definition of dummy, even though the main code does not call dummy().
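
As a rough illustration of the scanning rule described above, here is a toy model in C++. It is only a sketch and not how any real linker is implemented; the names ObjectFile, Archive, LinkState, add_object and scan_archive are invented for this example. Objects named on the command line are added unconditionally, an archive member is pulled in only if it defines a symbol that is still undefined, and a pulled member that redefines an existing symbol is reported as a duplicate.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model: an object file is just the symbols it defines and references.
struct ObjectFile {
    std::string name;
    std::set<std::string> defines;
    std::set<std::string> references;
};

// A static library is modelled as an ordered list of member objects.
using Archive = std::vector<ObjectFile>;

struct LinkState {
    std::map<std::string, std::string> defined;  // symbol -> object that defined it
    std::set<std::string> undefined;             // referenced but not yet defined
};

// Add an object unconditionally (command-line objects and pulled archive members).
// Throws if the object defines a symbol that already has a definition.
void add_object(LinkState& st, const ObjectFile& obj) {
    for (const std::string& sym : obj.defines) {
        auto result = st.defined.emplace(sym, obj.name);
        if (!result.second) {
            throw std::runtime_error("duplicate symbol " + sym + ": " +
                                     result.first->second + " and " + obj.name);
        }
        st.undefined.erase(sym);
    }
    for (const std::string& sym : obj.references) {
        if (st.defined.count(sym) == 0) st.undefined.insert(sym);
    }
}

// Scan one archive: pull a member only if it defines a still-undefined symbol.
// The scan repeats because a pulled member may introduce new references.
void scan_archive(LinkState& st, const Archive& lib) {
    bool pulled = true;
    while (pulled) {
        pulled = false;
        for (const ObjectFile& member : lib) {
            bool needed = false;
            for (const std::string& sym : member.defines) {
                if (st.undefined.count(sym) != 0) { needed = true; break; }
            }
            if (needed) {
                add_object(st, member);
                pulled = true;
            }
        }
    }
}
```

In this model, if main.o references only dummy, scanning libdummy1.a and then libdummy2.a pulls dummy.o from the first archive and ignores the second, so nothing is reported. If main.o references extra and alternative, scanning a library that holds file1.o and then one that holds file2.o pulls both members, and the second one trips the duplicate check on dummy.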

Regarding:

The loader seems always to choose dynamic libs and not compiled in the .o files.

No; the linker always attempts to load object files unconditionally. It scans libraries as it encounters them on the command line, collecting definitions it needs. If the object files precede the libraries, there's not a problem unless two of the object files define the same symbol (does 'one definition rule' ring any bells?). If some of the object files follow libraries, you can run into conflicts if the libraries define symbols that the later object files also define. Note that when it starts out, the linker is looking for a definition of main. It collects the defined symbols and referenced symbols from each object file it is told about, and keeps adding code (from libraries) until all the referenced symbols are defined.

Does it mean the same symbol is always loaded from the .so file, if it is both in a .a and a .so file?

No; it depends which was encountered first. If the .a was encountered first, the .o file is effectively copied from the library into the executable (and the symbol in the shared library is ignored because there's already a definition for it in the executable). If the .so was encountered first, the definition in the .a is ignored because the linker is no longer looking for a definition of that symbol — it's already got one.

Does it mean that symbols in static symbol table in a static library are never in conflict with those in dynamic symbol table in .so file?

You can have conflicts, but the first definition encountered resolves the symbol for the linker. It only runs into conflicts if the code that satisfies the reference causes a conflict by defining other symbols that are needed.

If I link 2 shared libs, can I get conflicts that make the link phase fail?

As I noted in a comment:

My immediate reaction is "Yes, you can". It would depend on the content of the two shared libraries, but you could run into problems, I believe. […cogitation…] How would you show this problem? … It's not as easy as it seems at first sight. What is required to demonstrate such a problem? … Or am I overthinking this? … […time to go play with some sample code…]

After some experimentation, my provisional, empirical answer is "No, you can't" (or "No, on at least some systems, you don't run into a conflict"). I'm glad I prevaricated.

Taking the code shown above (2 headers, 3 source files), and running with GCC 5.3.0 on Mac OS X 10.10.5 (Yosemite), I can run:

$ g++ -O -c main.cpp
$ g++ -O -c file1.cpp
$ g++ -O -c file2.cpp
$ g++ -shared -o libfile2.so file2.o
$ g++ -shared -o libfile1.so file1.o
$ g++ -o test2 main.o -L. -lfile1 -lfile2
$ ./test2
$ echo $?
239
$ otool -L test2
test2:
    libfile2.so (compatibility version 0.0.0, current version 0.0.0)
    libfile1.so (compatibility version 0.0.0, current version 0.0.0)
    /opt/gcc/v5.3.0/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.21.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1213.0.0)
    /opt/gcc/v5.3.0/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
$
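
The 239 printed by echo $? can be checked by hand. The sketch below reuses the bodies of extra() and alternative() from the code above; the helper shell_status is invented for this illustration, and it assumes a POSIX-style shell that reports only the low 8 bits of the value returned from main().

```cpp
// Same bodies as in file1.cpp and file2.cpp above.
int extra(int i) { return i + 37; }
int alternative(int i) { return -i; }

// Hypothetical helper: models a shell that keeps only the low 8 bits
// of the value returned from main() (an assumption about the platform).
int shell_status(int main_result) {
    return static_cast<int>(static_cast<unsigned>(main_result) & 0xFFu);
}
```

Here alternative(54) is -54, extra(-54) is -17, and the low 8 bits of -17 are 256 - 17 = 239, which matches the transcript.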

It is unconventional to use .so as the extension on Mac OS X (it's usually .dylib), but it seems to work.

Then I revised the code in the .cpp files so that extra() calls dummy() before the return, and so do alternative() and main(). After recompiling and rebuilding the shared libraries, I ran the program with both link orders. The first line of output is from the dummy() called by main(). Then you get the other two lines produced by alternative() and extra() in that order because the calling sequence for return extra(alternative(54)); demands that.

$ g++ -o test2 main.o -L. -lfile1 -lfile2
$ ./test2
dummy() from file1.cpp
dummy() from file2.cpp
dummy() from file1.cpp
$ g++ -o test2 main.o -L. -lfile2 -lfile1
$ ./test2
dummy() from file2.cpp
dummy() from file2.cpp
dummy() from file1.cpp
$

Note that the function called by main() is the first one that appears in the libraries it is linked with. But (on Mac OS X 10.10.5 at least) the linker does not run into a conflict. Note, though, that the code in each shared object calls 'its own' version of dummy() — there is disagreement between the two shared libraries about which function is dummy(). (It would be interesting to have the dummy() function in separate object files in the shared libraries; then which version of dummy() gets called?) But in the extremely simple scenario shown, the main() function manages to call just one of the dummy() functions. (Note that I'd not be surprised to find differences between platforms for this behaviour. I've identified where I tested the code. Please let me know if you find different behaviour on some platform.)

Heterochromous answered 24/12, 2015 at 15:47 Comment(8)
If I link 2 shared libs, can I get conflicts that make the link phase fail? - Intercommunicate
My immediate reaction is "Yes, you can". It would depend on the content of the two shared libraries, but you could run into problems, I believe. […cogitation…] How would you show this problem? … It's not as easy as it seems at first sight. What is required to demonstrate such a problem? … Or am I overthinking this? … […time to go play with some sample code…] - Heterochromous
@Will: See updated answer. Apparently not, but experiment on your systems. - Heterochromous
For the scenarios with one .so and one .a, and with two .so files, the code in my post cannot demonstrate the conflict problem. - Intercommunicate
@Will: I think that with one or more shared libraries, you don't run into the link time conflict. (Your second comment arrived 6 seconds after my 'see updated answer' comment; I don't suppose you had a chance to review the updates by then.) - Heterochromous
Thanks for your explanation. In my environment, in the scenario with one .a and one .so, the executable always needs the .so to run, whatever the order in which the libs are passed to the linker. - Intercommunicate
There are differences in the way the loaders manage these things. One question would be "if you replace the original libwhichever.so with another file that contains just a single function int SomethingCompletelyDifferent() { return 0; }, does the program still run OK"? My suspicion is that the answer's "Yes" if the symbol used by the program was taken from the static library (because you linked it before the shared library), but "No" if the shared library was listed first. You can play a lot of similar games to experiment with this. Be aware that what works on O/S A may fail on O/S B. - Heterochromous
You are right: the first lib passed on the command line resolves the symbols. The loader always needs the .so to run the executable if I link a .so into the executable, whether or not it uses a symbol from it. - Intercommunicate
