Linking libstdc++ statically: any gotchas?

I need to deploy a C++ application built on Ubuntu 12.10 with GCC 4.7's libstdc++ to systems running Ubuntu 10.04, which comes with a considerably older version of libstdc++.

Currently, I'm compiling with -static-libstdc++ -static-libgcc, as suggested by this blog post: Linking libstdc++ statically. The author warns against using any dynamically-loaded C++ code when compiling libstdc++ statically, which is something I haven't yet checked. Still, everything seems to be going smoothly so far: I can make use of C++11 features on Ubuntu 10.04, which is what I was after.

I note that this article is from 2005, and perhaps much has changed since then. Is its advice still current? Are there any lurking issues I should be aware of?

Heads answered 29/11, 2012 at 23:12 Comment(6)
No, linking statically to libstdc++ does not imply that. If it did, there would be no point to the -static-libstdc++ option; you would just use -static. - Cesar
@JonathanWakely -static can produce a "kernel too old" error on some Ubuntu 14.04 systems. glibc.so is like kernel32.dll on Windows: it is part of the operating system interface, so we should not embed it in our binary. You can use objdump -T [binary path] to see whether the binary dynamically loads libstdc++.so. For Go programmers: you can add #cgo linux LDFLAGS: -static-libstdc++ -static-libgcc before import "C". - Atrophy
@bronzeman, but we're talking about -static-libstdc++, not -static, so libc.so will not be statically linked. - Cesar
@NickHutchinson the linked-to blog post is gone. This SO question is a popular search hit for the relevant terms here. Can you reproduce the critical info from that blog post in your question, or offer a new link if you know where it's moved to? - Primp
@BrianCain The internet archive has it: web.archive.org/web/20160313071116/http://www.trilithium.com/… - Offshore
@bronzeman glibc is an OS interface, not the OS interface, on Linux distros. Consider dietlibc, musl-libc and others ... it's entirely possible to make use of syscalls without having glibc. The choice of glibc is down to the distro, and users have a lot of leeway in working around that if need be. - Snuff
C
168

That blog post is pretty inaccurate.

"As far as I know C++ ABI changes have been introduced with every major release of GCC (i.e. those with different first or second version number components)."

Not true. The only C++ ABI changes introduced since GCC 3.4 have been backward-compatible, meaning the C++ ABI has been stable for nearly nine years.

"To make matters worse, most major Linux distributions use GCC snapshots and/or patch their GCC versions, making it virtually impossible to know exactly what GCC versions you might be dealing with when you distribute binaries."

The differences between distributions' patched versions of GCC are minor, and not ABI changing, e.g. Fedora's 4.6.3 20120306 (Red Hat 4.6.3-2) is ABI compatible with the upstream FSF 4.6.x releases and almost certainly with any 4.6.x from any other distro.

On GNU/Linux, GCC's runtime libraries use ELF symbol versioning, so it's easy to check the symbol versions needed by objects and libraries. If you have a libstdc++.so that provides those symbols it will work; it doesn't matter if it's a slightly different patched version from another version of your distro.
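As a rough illustration of that check: once you have the version names a binary needs (for example from objdump -T output) and the newest version name the target libstdc++.so defines, the comparison is a numeric one on the dotted suffix. A minimal sketch, assuming both names are in the same version namespace and that a library defining a given version also defines all older ones in that namespace. version_components and provides are hypothetical helper names, and the version strings in the usage note are illustrative inputs only:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split "GLIBCXX_3.4.17" into its numeric components {3, 4, 17}.
// Returns an empty vector if the text after the last '_' is not a
// dotted sequence of decimal numbers.
std::vector<int> version_components(const std::string& symver) {
    const std::size_t underscore = symver.rfind('_');
    if (underscore == std::string::npos) return {};
    std::vector<int> parts;
    int current = 0;
    bool have_digit = false;
    for (std::size_t i = underscore + 1; i < symver.size(); ++i) {
        const char c = symver[i];
        if (c >= '0' && c <= '9') {
            current = current * 10 + (c - '0');
            have_digit = true;
        } else if (c == '.' && have_digit) {
            parts.push_back(current);
            current = 0;
            have_digit = false;
        } else {
            return {};  // unexpected character, e.g. "GLIBC_PRIVATE"
        }
    }
    if (!have_digit) return {};
    parts.push_back(current);
    return parts;
}

// True if a library whose newest version node is `newest_provided` can
// satisfy a reference to `required`.
// ASSUMPTION: versions within one namespace are cumulative.
bool provides(const std::string& newest_provided, const std::string& required) {
    const std::vector<int> have = version_components(newest_provided);
    const std::vector<int> need = version_components(required);
    if (have.empty() || need.empty()) return false;
    const std::string have_ns = newest_provided.substr(0, newest_provided.rfind('_'));
    const std::string need_ns = required.substr(0, required.rfind('_'));
    // std::vector comparison is lexicographic, so 3.4.9 < 3.4.10 as intended.
    return have_ns == need_ns && !(have < need);
}
```

For example, provides("GLIBCXX_3.4.13", "GLIBCXX_3.4.17") is false while the reverse is true. Comparing the components numerically matters: a plain string comparison would order 3.4.9 after 3.4.10.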

"but no C++ code (or any code using the C++ runtime support) may be linked dynamically if this is to work."

This is not true either.

That said, statically linking to libstdc++.a is one option for you.

The reason it might not work if you dynamically load a library (using dlopen) is that libstdc++ symbols it depends on might not have been needed by your application when you (statically) linked it, so those symbols will not be present in your executable. That can be solved by dynamically-linking the shared library to libstdc++.so (which is the right thing to do anyway if it depends on it.) ELF symbol interposition means symbols that are present in your executable will be used by the shared library, but others not present in your executable will be found in whichever libstdc++.so it links to. If your application doesn't use dlopen you don't need to care about that.
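If you want to find out early whether a plugin has unresolved libstdc++ symbols, one approach is to load it with RTLD_NOW, which asks the dynamic linker to resolve all of its symbols at load time rather than lazily. A minimal sketch; try_load_plugin is a hypothetical helper name and the plugin path in the usage note is made up:

```cpp
#include <dlfcn.h>  // dlopen, dlerror, dlclose (POSIX)
#include <string>

// Try to load a shared object, resolving all of its symbols immediately.
// Returns an empty string on success, or the dynamic linker's error text
// (for example an "undefined symbol" message) on failure.
std::string try_load_plugin(const std::string& path) {
    dlerror();  // clear any stale error state
    void* handle = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char* msg = dlerror();
        return msg ? std::string(msg) : std::string("unknown dlopen error");
    }
    dlclose(handle);  // this is only a probe, so unload again
    return std::string();
}
```

Calling try_load_plugin("./plugins/libexample.so") and printing a non-empty result shows the symbol or file the linker could not find. On older glibc versions you need to link with -ldl for dlopen.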

Another option (and the one I prefer) is to deploy the newer libstdc++.so alongside your application and ensure it is found before the default system libstdc++.so. That can be done by forcing the dynamic linker to look in the right place, either using the $LD_LIBRARY_PATH environment variable at run-time or by setting an RPATH in the executable at link-time. I prefer RPATH as it doesn't rely on the environment being set correctly for the application to work. If you link your application with '-Wl,-rpath,$ORIGIN' (note the single quotes to prevent the shell trying to expand $ORIGIN) then the executable will have an RPATH of $ORIGIN, which tells the dynamic linker to look for shared libraries in the same directory as the executable itself. If you put the newer libstdc++.so in the same directory as the executable it will be found at run-time, problem solved. (Another option is to put the executable in /some/path/bin/ and the newer libstdc++.so in /some/path/lib/ and link with '-Wl,-rpath,$ORIGIN/../lib', or to use any other fixed location relative to the executable and set the RPATH relative to $ORIGIN.)
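To confirm at run time which libstdc++ the process actually picked up (the bundled copy found via RPATH, the system one, or none at all because it was linked statically), one option on glibc systems is dl_iterate_phdr, which walks the shared objects loaded into the current process. A minimal sketch; loaded_objects_matching is a hypothetical helper name, not part of any library:

```cpp
#include <link.h>   // dl_iterate_phdr, struct dl_phdr_info (glibc extension)
#include <cstddef>
#include <string>
#include <vector>

// Callback for dl_iterate_phdr: record the path of each loaded object.
// The main executable is reported with an empty name and is skipped.
static int collect_name(struct dl_phdr_info* info, std::size_t, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    if (info->dlpi_name != nullptr && info->dlpi_name[0] != '\0') {
        names->emplace_back(info->dlpi_name);
    }
    return 0;  // returning 0 continues the iteration
}

// Paths of all currently loaded shared objects whose path contains `needle`.
std::vector<std::string> loaded_objects_matching(const std::string& needle) {
    std::vector<std::string> all;
    dl_iterate_phdr(collect_name, &all);
    std::vector<std::string> hits;
    for (const std::string& path : all) {
        if (path.find(needle) != std::string::npos) hits.push_back(path);
    }
    return hits;
}
```

Printing the result of loaded_objects_matching("libstdc++") from main shows the full path of the copy in use. An empty result is what you would expect from a binary built with -static-libstdc++ that has loaded no plugins.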

Cesar answered 29/12, 2012 at 14:22 Comment(33)
Thanks, that's put my mind at ease. I'll investigate setting an RPATH. - Heads
The article was written around the time of GCC 3.4 so may have been accurate back then... maybe? - Council
This explanation, especially about RPATH, is glorious. - Infare
Shipping libstdc++ with your app on Linux is bad advice. Google for "steam libstdc++" to see all the drama that this brings. In short, if your exe loads external libs (like, opengl) that want to dlopen libstdc++ again (like, radeon drivers), those libs will be using your libstdc++ because it's already loaded, instead of their own, which is what they need and expect. So you're back to square one. - Kor
@cap, the OP is specifically asking about deploying to a distro where the system libstdc++ is older. The Steam problem is that they bundled a libstdc++.so that was older than the system one (presumably it was newer at the time they bundled it, but distros moved on to even newer ones). That can be solved by having the RPATH point to a directory containing a libstdc++.so.6 symlink that is set at installation time to point to the bundled lib or to the system one if it's newer. There are more complicated mixed-linkage models, as used by Red Hat DTS, but they're hard to do yourself. - Cesar
But if you ship a newer one you get into the same trouble, if the 3rd party library actually requires an older one. Now you shifted your bet on to libstdc++ being backwards compatible, which is out of your control and not as true as they make it out to be. - Kor
@cap, no, because the newer one is guaranteed to be backward compatible, and I know I can rely on that because it's my job to maintain it. If you have evidence for your claim please follow the instructions at gcc.gnu.org/bugs - Cesar
hey man, I'm sorry if I don't want my model for shipping backwards-compat binaries to include "trusting other people to keep libstdc++ ABI compat" or "conditionally linking libstdc++ at runtime"... if that ruffles some feathers here and there, what can I do, I mean no disrespect. And if you remember the memcpy@GLIBC_2.14 drama, you can't really fault me for having trust issues with this :) - Kor
@cap, as I'm sure you know, only code with undefined behaviour was affected by the change to memcpy, and the whole point of the memcpy@GLIBC_2.14 symbol is that now both versions are present i.e. glibc offers backwards compatibility even for programs with undefined behaviour. But that's not libstdc++ anyway, is it? - Cesar
you're right, it's not libstdc++, but the attitude you just expressed regarding the implications of that change doesn't help me with my trust issues :) - Kor
I had to use '-Wl,-rpath,$ORIGIN' (note the '-' in front of rpath). I can't edit the answer because edits must be at least 6 characters .... - Antoninaantonino
This answer suggests that we need only be shipping libstdc++.so, not libgcc_s.so too (as the OP was doing). Is that right? Or will that depend on various things? - Vanhoose
@LightnessRacesinOrbit libgcc_s.so rarely changes, so it's unlikely you need a newer one. - Cesar
@JonathanWakely: In the unlikely event that (for whatever reason) I do, will it be immediately evident (assertion failure or somesuch) or will I just be happily UBing along until one day everything goes up in smoke? - Vanhoose
Come to think of it, I guess it just wouldn't link. SONAME there for a reason innit - Vanhoose
The soname doesn't change. More likely you'll get an undefined reference when the dynamic linker tries to start your program. Any problems will be noisy, not silent. - Cesar
Jonathan, regarding the assertion that the versions are backwards compatible, can you reference a doc that backs that up? - Duna
@TimothyJohnLaird gcc.gnu.org/onlinedocs/libstdc++/manual/manual/abi.html (and the fact the SONAME is unchanged, which means it provides the same interface, at least it does if you version your libraries properly, which GCC does) - Cesar
@JonathanWakely I may be wrong, but I think you missed one point about dynamically loading when libstdc++ is statically linked. And it's sort of a biggie when we're talking about an extension/plugin mechanism for a program. It's generally possible that the statically linked libstdc++ symbols perform some task (e.g. allocation) and the dynamically loaded libstdc++ is then tasked to free said resource. This is asking for trouble and could roughly be compared to mixing msvcrtd and msvcrt on Windows, but passing the resources around. This makes for a class of rather subtle defects. - Snuff
... so what I am trying to say is that, while technically possible, using a mix of code statically and dynamically linking to libstdc++ is probably a bad idea as it requires that the developer is always aware which "instance" of the libstdc++ owns a particular resource. - Snuff
@JonathanWakely the point about memcpy is a particularly weak one. If you look at the history of why memcpy and memmove exist, you'll notice that at least since 1990 or so, there is literally no reason for not granting memcpy the semantics of memmove (safe to copy overlapping buffers). And incidentally everybody but the glibc maintainers decided to do that eventually, because it's a near-zero overhead for a library to fix (programmer) mistakes galore. Ask C devs what's the difference between memcpy and memmove, often they don't know. Also works with malloc/calloc ... - Snuff
@Snuff "it requires that the developer is always aware which "instance" if the libstdc++ owns a particular resource" no it doesn't, because (apart from on Windows) there is no concept of a particular instance owning it. There is only one C++ runtime, and it can free any resources created by the C++ runtime. e.g. there is only one heap, shared by the whole application. That's not true on Windows, but that model doesn't conform to the requirements of the C++ standard. - Cesar
Furthermore, on GNU/Linux (which OP was asking about) the STB_GNU_UNIQUE binding ensures that there is only one instance of each global variable, even if you load the library multiple times. That's not to say that you won't run into other problems, but not the ones you are referring to. - Cesar
@JonathanWakely so you're telling me that the statically linked libstdc++ which is interposed with the application code at build time is identical to the libstdc++ dynamically loaded as .so (i.e. they're same instance)? Well, I beg to differ! It arguably may well be the same version of the library, but that still doesn't ensure that one instance is aware of how to free resources of the other. The "one heap" point has nothing to do with that, because how the library internally implements new is up to it and potentially the user/developer. And that's then other problems ... - Snuff
@Snuff No, not identical, but compatible, and equivalent. And there's only one copy of a given symbol (such as operator new(size_t)) in the process. And there is only one source of memory for that operator new defined in libstdc++, no matter how many copies of libstdc++ you link to, same version or not. Re "aware of how to free resources of the other" I still dispute that there are any resources that belong to one or the other. Do you have concrete examples of the resources you're concerned about? - Cesar
For example, why do you think it matters which "instance" of libstdc++ you call operator new on and which instance you call operator delete on? Neither of those functions "owns" the memory they deal with, they just pass it through to libc's malloc and free. There's nothing tied to an instance of libstdc++. - Cesar
@JonathanWakely I confess I am rather ignorant on the fine points of how .so's work at a comprehensive level. If I use '-Wl,-rpath,$ORIGIN' to load a shared object, and a shared object with the same name has already been loaded by another process (perhaps from /usr/local/lib or another system folder), does the .so loaded by the ORIGIN linker step replace the .so for the whole system? Or does it only apply to that process? Is there a good resource for how dynamic linking behaves in Linux? - Duna
This didn't work for me. After copying libstdc++ I got /lib/x86_64-linux-gnu/libc.so.6: version 'GLIBC_2.33' not found and after copying libc I then got symbol lookup error: /home/charles/libc.so.6: undefined symbol: _dl_fatal_printf, version GLIBC_PRIVATE, and now I don't think I've got any more options. - Dualistic
"Not true. The only C++ ABI changes introduced since GCC 3.4 have been backward-compatible, meaning the C++ ABI has been stable for nearly nine years." This isn't true. I recently diagnosed an ABI bug in some commercial software that boiled down to breaking changes in libstdc++. There are sixteen versions of the ABI, including changes in recent releases of GCC (10 and 11). - Rifleman
@Rifleman please report them to GCC's bugzilla then - Cesar
Quite a few years later, but anyway. I use Jonathan Wakely's solution. The only problem is that sometimes you also have to specify -Wl,-dynamic-linker,[path here], and I have not found a way to put a relative path in the [path here]. Has anybody else? - Otic
"please report them [abi breaks] to GCC's bugzilla then": the breaks are in "experimental" implementations. Unfortunately, it is really hard for coders to keep track of which things are experimental and which are not (like std::span, which "broke" between 10 and 11), because GCC doesn't warn about usage of such "experimental" constructs. - Tufa
@Tufa the release notes are quite explicit that C++20 support is experimental in those releases: gcc.gnu.org/gcc-11/changes.html#libstdcxx - Cesar
D
14

One addition to Jonathan Wakely's excellent answer, on why dlopen() is problematic:

Due to the new exception handling pool in GCC 5 (see PR 64535 and PR 65434), if you dlopen and dlclose a library that is statically linked to libstdc++, you will get a memory leak (of the pool object) each time. So if there's any chance that you'll ever use dlopen, it seems like a really bad idea to statically link libstdc++. Note that this is a real leak as opposed to the benign one mentioned in PR 65434.

Doubling answered 26/1, 2016 at 13:52 Comment(1)
The function __gnu_cxx::__freeres() seems to provide at least some help with this issue, since it frees the internal buffer of the pool object. But it is rather unclear to me what implications a call to this function has for exceptions accidentally thrown afterwards. - Simulcast
E
5

Add-on to Jonathan Wakely's answer regarding the RPATH:

RPATH will only work if the RPATH in question is the RPATH of the running application. If you have a library which dynamically links to any library through its own RPATH, the library's RPATH will be overwritten by the RPATH of the application which loads it. This is a problem when you cannot guarantee that the RPATH of the application is the same as that of your library, e.g. if you expect your dependencies to be in a particular directory, but that directory is not part of the application's RPATH.

For example, let us say you have an application App.exe which has a dynamically-linked dependency on libstdc++.so.x for GCC 4.9. The App.exe has this dependency resolved through the RPATH, i.e.

App.exe (RPATH=.:./gcc4_9/libstdc++.so.x)

Now let's say there is another library Dependency.so, which has a dynamically-linked dependency on libstdc++.so.y for GCC 5.5. The dependency here is resolved through the RPATH of the library, i.e.

Dependency.so (RPATH=.:./gcc5_5/libstdc++.so.y)

When App.exe loads Dependency.so, it neither appends nor prepends the RPATH of the library. It doesn't consult it at all. The only RPATH which is considered will be that of the running application, or App.exe in this example. That means that if the library relies on symbols which are in gcc5_5/libstdc++.so.y but not in gcc4_9/libstdc++.so.x, then the library will fail to load.

This is just as a word of warning, since I've run into these issues myself in the past. RPATH is a very useful tool but its implementation still has some gotchas.

Endocarditis answered 19/6, 2019 at 17:24 Comment(1)
So RPATH for shared libraries is kind of pointless! And I was hoping that they had improved Linux a bit in this respect in the last 2 decades... - Humidistat
R
3

You might also need to make sure that you don't depend on the dynamic glibc. Run ldd on your resulting executable and note any dynamic dependencies (libc/libm/libpthread are the usual suspects).

An additional exercise would be building a bunch of involved C++11 examples using this methodology and actually trying the resulting binaries on a real 10.04 system. In most cases, unless you do something weird with dynamic loading, you'll know right away whether the program works or crashes.
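When trying binaries on the target system, it can help to have the program report which glibc it was built against and which one it is actually running with. A minimal sketch for glibc-based systems only; gnu_get_libc_version() and the __GLIBC__ / __GLIBC_MINOR__ macros come from glibc itself, while the two function names are hypothetical:

```cpp
#include <gnu/libc-version.h>  // gnu_get_libc_version (glibc-specific)
#include <string>

// glibc version the program was compiled against, taken from the
// macros defined by glibc's own headers at build time.
std::string glibc_build_version() {
    return std::to_string(__GLIBC__) + "." + std::to_string(__GLIBC_MINOR__);
}

// glibc version that is actually loaded on the machine running the program.
std::string glibc_runtime_version() {
    return gnu_get_libc_version();
}
```

Printing both from main on the build machine and on the 10.04 target makes the gap visible. A newer build-time version is not a failure in itself: what matters is whether the binary references versioned glibc symbols that the target's glibc lacks, in which case the dynamic linker refuses to start the program at all.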

Reckoning answered 29/11, 2012 at 23:34 Comment(3)
What is the problem with depending on the dynamic glibc? - Heads
I believe at least some time ago libstdc++ implied a dependency on glibc. Not sure where things stand today. - Reckoning
libstdc++ does depend on glibc (e.g. iostreams are implemented in terms of printf) but as long as the glibc on Ubuntu 10.04 provides all the features needed by the newer libstdc++ there's no problem with depending on the dynamic glibc. In fact it's highly recommended never to link statically to glibc. - Cesar
H
1

I'd like to add to Jonathan Wakely's answer the following.

Playing around with -static-libstdc++ on Linux, I ran into a problem with dlclose(). Suppose we have an application 'A' that is statically linked to libstdc++, and at runtime it loads a plugin 'P' that is dynamically linked to libstdc++. That's fine. But when 'A' unloads 'P', a segmentation fault occurs. My assumption is that after libstdc++.so is unloaded, 'A' can no longer use symbols related to libstdc++. Note that if both 'A' and 'P' are statically linked to libstdc++, or if 'A' is linked dynamically and 'P' statically, the problem does not occur.

Summary: if your application loads/unloads plugins that may dynamically link to libstdc++, the app must also be linked to it dynamically. This is just my observation and I'd like to get your comments.

Hardej answered 27/2, 2019 at 19:16 Comment(3)
This is probably akin to mixing libc implementations (say dynamically linking to a plugin that in turn dynamically links glibc, whereas the application itself is statically linked to musl-libc). Rich Felker, author of musl-libc, claims that the issue in such a scenario is that the glibc memory management (using sbrk) makes certain assumptions and pretty much expects to be alone within one process ... not sure if this is limited to a particular glibc version or whatever, though. - Snuff
and people still don't see the advantages of the Windows heap interface, which is able to deal with multiple independent copies of libc++/libc inside a single process. Such people should not design software. - Humidistat
@FrankPuck having a decent amount of both Windows and Linux experience I can tell you that the way "Windows" does it won't help you when MSVC is the party that decides what allocator gets used and how. The main advantage I see with heaps on Windows is that you can hand out bits and pieces and then free them in one fell swoop. But with MSVC you will still run into pretty much the problem described above, e.g. when passing around pointers allocated by another VC runtime (release vs. debug or statically vs. dynamically linked). So "Windows" isn't immune. Care has to be taken on both systems. - Snuff
