TL;DR
You're compiling with GCC 8.4.0, therefore you need to link explicitly against -lstdc++fs
.
Since you're using GCC 8.4.0, you're using the GNU C++ Standard Library aka libstdc++ headers for version GCC 8.4.0. But your system (Ubuntu 20.04) only contains libstdc++.so.6.0.28
from GCC 9. If you don't explicitly link against -lstdc++fs
, then you're accidentally consuming a std::filesystem
symbol from GCC 9 (via libstdc++.so
) instead of from GCC 8 (via libstdc++fs.a
).
GCC 8 and GCC 9 have incompatible std::filesystem
types. More specifically, their binary layout is different. This is basically a very hidden ODR-violation. Your object is allocated for GCC 8 layout but constructed using GCC 9 layout. When you then attempt to destroy it, the destructor uses GCC 8 layout and crashes because the data is not what it expects.
There are two pieces of code which use different, incompatible layouts of the path
type.
The first piece of code is from libstdc++.so.6.0.28
: It contains a definition of path::_M_split_cmpts()
, called via the inline constructor path::path(string_type&&, format)
. Since the constructor is inline, code for the constructor itself is generated into your executable. Your executable therefore contains a call to path::_M_split_cmpts
.
The second piece of code is in your own executable: It generates instructions for the inline (defaulted) destructor path::~path()
, and the inline functions it calls; all the way up to std::filesystem::__cxx11::path::path<char [21], std::filesystem::__cxx11::path>(char const (&) [21], std::filesystem::__cxx11::path::path>(char const (&) [21], std::filesystem::__cxx11::path::format)
.
How can we find this?
using a debugger:
Stepping through suspicious functions in the ctor reveals:
0x5569716498ed <std::filesystem::__cxx11::path::path<char [21], std::filesystem::__cxx11::path>(char const (&) [21], std::filesystem::__cxx11::path::path>(char const (&) [21], std::filesystem::__cxx11::path::format)+112> callq 0x5569716491e0 <_ZNSt10filesystem7__cxx114path14_M_split_cmptsEv@plt>
That's a call through the PLT (so, potentially from a shared object, and definitely not inlined). We step into it and:
(gdb) bt
#0 0x00007f102c60f260 in std::filesystem::__cxx11::path::_M_split_cmpts() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00005569716498ed in std::filesystem::__cxx11::path::path<char [21], std::filesystem::__cxx11::path> (this=0x7ffe1a07ad60, __source=...)
at /usr/include/c++/8/bits/fs_path.h:185
#2 0x00005569716493fd in main () at blub.cpp:6
So, we can see that it comes indeed from /lib/x86_64-linux-gnu/libstdc++.so.6
, which is a symlink to /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28
.
The dtor we can see e.g. in the Valgrind output in the OP:
==30078== Invalid read of size 8
==30078== at 0x13AD9B: std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::~vector() (in /home/(me)/src/tomato/build-src-Desktop-Release/TomatoLauncher)
It's inline and therefore in the executable.
Now, the actually interesting part is that both the header which contains the inlined function for path
and the path::_M_split_cmpts
function are from the GNU C++ Standard library (libstdc++).
How can they be incompatible?
To answer this, let's take a look at the exact version. We're compiling with GCC 8.4.0. It has baked in include paths, and they refer to standard library headers shipped in the gcc-8 package of Ubuntu 20.04. Those match perfectly, and you have to change default settings to make GCC consume different, unmatching standard library headers. The headers are therefore those of GCC 8.4.0.
What about the shared object libstdc++.so
? We're running with libstdc++.so.6.0.28
according to ldd
and the debugger. According to libstdc++ ABI Policy and Guidelines, that's GCC >= 9.3.
libstdc++.so.6.0.28 does contain a definition of _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv
:
$ objdump -T /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28 | grep _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv
000000000016a260 g DF .text 00000000000005f3 GLIBCXX_3.4.26 _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv
According to the ABI doc, this is
GCC 9.1.0: GLIBCXX_3.4.26, CXXABI_1.3.12
So that's a symbol which was NOT available in GCC 8.4.0.
Why doesn't the compiler/linker complain?
When we compile with gcc-8, why doesn't the compiler or linker complain about us consuming a symbol from GCC 9?
If we compile with -v
, we see the linker invocation:
COLLECT_GCC_OPTIONS='-v' '-std=c++17' '-g' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-linux-gnu/8/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/8/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper -plugin-opt=-fresolution=/tmp/cceJgWPt.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -z now -z relro /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/Scrt1.o /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/8/crtbeginS.o -L/usr/lib/gcc/x86_64-linux-gnu/8 -L/usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/8/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/8/../../.. /tmp/ccTNph3u.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/8/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crtn.o COLLECT_GCC_OPTIONS='-v' '-std=c++17' '-g' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
In there, we have -L/usr/lib/gcc/x86_64-linux-gnu/8
and other paths to find the standard library. There, we find libstdc++.so -> ../../../x86_64-linux-gnu/libstdc++.so.6
, which finally points to /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28
(!!!).
So the linker is given GCC 9's libstdc++.so
, and it does NOT receive any version information on the symbol from the compiler (*). The compiler only knows the source code, and the source code does not contain a symbol version in this case (filesystem headers of GCC 8.4.0). The symbol version is however present in the ELF binary libstdc++.so
. The linker sees GLIBCXX_3.4.26
for the symbol requested by the compiler _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv
and is satisfied with that. Makes you wonder if there's a linker switch to tell the linker "don't consume a versioned symbol if I requested an unversioned symbol".
(*) The linker does not receive any symbol information on that unresolved symbol from the compiler because the compiler has no such information from the source code. You can add info to your source code. I don't know how libstdc++ usually does it - or its policy on symbol versions in header files. It looks like it is not done at all for filesystem
.
The ELF symbol versioning mechanism should usually prevent such incompatibilities: If there is a layout-incompatible change, you create a new symbol with the same name but a different version, and add it to libstdc++.so
, which then contains both the old and the new version.
A binary compiled against libstdc++.so
specifies which version of a symbol it wants, and the dynamic loader properly resolves the undefined symbols against symbols of matching name and version. Note that the dynamic linker does not know which shared library to search (on Windows/PE, this is different). Any "symbol request" is merely an undefined symbol, and there's a completely separate list of required libraries which shall provide those undefined symbols. But there's no mapping in the binary which symbol should come from which library.
Because the ELF symbol versioning mechanism allows backwards-compatible additions of symbols, we can maintain a single libstdc++.so
for multiple versions of the compiler. That's why you see symlinks all over the place, leading all to the same file. The suffix .6.0.28
is another, orthogonal versioning scheme which allows backwards-incompatible changes: You binary can specify that it needs libstdc++.so.6
and you can add an incompatible libstdc++.so.7
for other binaries.
Fun fact: If you linked your library against a pure GCC 8 version of libstdc++.so
, you would have seen a linker error. Linking against a shared library doesn't do much to the binary; it does however fix the symbol versions of unresolved symbols and can check that no unresolved symbols are left after looking though all libraries. We can see that your binary actually requests _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv@GLIBCXX_3.4.26
when you link it against libstdc++.so.6.0.28
.
Fun fact 2: If you run your library against a pure GCC 8 version of libstdc++.so
, you would have received a dynamic linker error, because it cannot find _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv@GLIBCXX_3.4.26
.
What should actually happen?
You should actually link to libstdc++fs.a
. It also provides a definition of _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv
, and it's not a symlink but specific to this GCC version: /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++fs.a
.
When you link against -lstdc++fs
, you get its symbols included directly into the executable (since it's a static library). Symbols in the executable take priority over the symbols in shared objects. Therefore, the _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv
from libstdc++fs.a
is used.
What's actually the incompatibility in layout in path
?
GCC 9 introduced a different type to hold the components of the path. Using clang++ -cc1 -fdump-record-layouts
, we can see the offset at the left side, and the member and type names at the right side:
GCC 8.4.0:
0 | class std::filesystem::__cxx11::path
0 | class std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> > _M_pathname
0 | struct std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::_Alloc_hider _M_dataplus
0 | class std::allocator<char> (base) (empty)
0 | class __gnu_cxx::new_allocator<char> (base) (empty)
0 | std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::pointer _M_p
8 | std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::size_type _M_string_length
16 | union std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::(anonymous at /usr/include/c++/8/bits/basic_string.h:160:7)
16 | char [16] _M_local_buf
16 | std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::size_type _M_allocated_capacity
32 | class std::vector<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> > _M_cmpts
32 | struct std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> > (base)
32 | struct std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> >::_Vector_impl _M_impl
32 | class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> (base) (empty)
32 | class __gnu_cxx::new_allocator<struct std::filesystem::__cxx11::path::_Cmpt> (base) (empty)
32 | std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> >::pointer _M_start
40 | std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> >::pointer _M_finish
48 | std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> >::pointer _M_end_of_storage
56 | enum std::filesystem::__cxx11::path::_Type _M_type
| [sizeof=64, dsize=57, align=8,
| nvsize=57, nvalign=8]
GCC 9.3.0:
0 | class std::filesystem::__cxx11::path
0 | class std::__cxx11::basic_string<char> _M_pathname
0 | struct std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::_Alloc_hider _M_dataplus
0 | class std::allocator<char> (base) (empty)
0 | class __gnu_cxx::new_allocator<char> (base) (empty)
0 | std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::pointer _M_p
8 | std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::size_type _M_string_length
16 | union std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::(anonymous at /usr/include/c++/9/bits/basic_string.h:171:7)
16 | char [16] _M_local_buf
16 | std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::size_type _M_allocated_capacity
32 | struct std::filesystem::__cxx11::path::_List _M_cmpts
32 | class std::unique_ptr<struct std::filesystem::__cxx11::path::_List::_Impl, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> _M_impl
32 | class std::__uniq_ptr_impl<struct std::filesystem::__cxx11::path::_List::_Impl, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> _M_t
32 | class std::tuple<struct std::filesystem::__cxx11::path::_List::_Impl *, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> _M_t
32 | struct std::_Tuple_impl<0, struct std::filesystem::__cxx11::path::_List::_Impl *, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> (base)
32 | struct std::_Tuple_impl<1, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> (base) (empty)
32 | struct std::_Head_base<1, struct std::filesystem::__cxx11::path::_List::_Impl_deleter, true> (base) (empty)
32 | struct std::filesystem::__cxx11::path::_List::_Impl_deleter (base) (empty)
32 | struct std::_Head_base<0, struct std::filesystem::__cxx11::path::_List::_Impl *, false> (base)
32 | struct std::filesystem::__cxx11::path::_List::_Impl * _M_head_impl
| [sizeof=40, dsize=40, align=8,
| nvsize=40, nvalign=8]
The difference is in path::_M_cmpts
:
// GCC 8
class std::vector<
struct std::filesystem::__cxx11::path::_Cmpt,
class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt>
> _M_cmpts
// GCC 9
struct std::filesystem::__cxx11::path::_List _M_cmpts
You can also see the structure of path::_List
in the record dump above. It's very much not compatible to a GCC 8 vector
.
Remember that we're calling path::_M_split_cmpts
via libstdc++.so from GCC 9, and we're crashing in the vector
destructor for this _M_cmpts
data member.
Here's the commit that changed from vector
to _List
:
commit 4f87bb8d6e8dec21a07f1fba641a78a127281349
Author: Jonathan Wakely <[email protected]>
Date: Thu Dec 13 20:33:55 2018 +0000
PR libstdc++/71044 optimize std::filesystem::path construction
This new implementation has a smaller footprint than the previous
implementation, due to replacing std::vector<_Cmpt> with a custom pimpl
type that only needs a single pointer. The _M_type enumeration is also
combined with the pimpl type, by using a tagged pointer, reducing
sizeof(path) further still.
Construction and modification of paths is now done more efficiently, by
splitting the input into a stack-based buffer of string_view objects
instead of a dynamically-allocated vector containing strings. Once the
final size is known only a single allocation is needed to reserve space
for it. The append and concat operations no longer require constructing
temporary path objects, nor re-parsing the entire native pathname.
This results in algorithmic improvements to path construction, and
working with large paths is much faster.
exists
is implemented as it appears to be the only thing here that could contain an error. – Brahmanismstd::filesystem::exists
can throwstd::filesystem::filesystem_error
. – Fanciedexists
is part of the stl. I tried it without even calling it and just returning true, which didn't change anything. If I remove the initialization of the path variable and have nothing in the function but areturn true;
, it doesn't crash. – Disillusionizeexception
but it doesn't help. – Disillusionizeg++ --version
say? – Jarrodsudo apt list --installed 'g++*'
and thensudo apt reinstall <packages>
could perhaps help. – Jarrodldd
ona.out
. See which version oflibstdc++
is linked. Is it the one you expect? – Carboxylaselibstdc++.so.6
. I need to keep v6 for CUDA but I guess I have to install a more recent one as well? – Disillusionize_Cmpt
which is derived frompath
. – Physicochemical-L
does not set the rpath of the binary IIRC. Please checkldd a.out
to see what you're actually running with. – Physicochemicalg++ -v -std=c++17 pathExistsTest.cpp
. It will show the include paths at least. Even better, use-M
to print the locations of every included header. From those files, we can determine (bydiff
for the lack of a better option) which version they come from. Regarding libstdc++.so, read the highest version usingreadelf --version-info
– Physicochemicallibstdc++.so.6
is using a different version scheme from the 8 in the path/usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
, see gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html – Physicochemical-v
:compiled by GNU C version 8.4.0, GMP version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version isl-0.22.1-GMP
(full output: pastebin.com/upZBH4E9 ) – Disillusionizepath
, or B) GCC 8.4'spath
is simply broken. – Physicochemicalpath
and found a comment that I should compile with-lstdc++fs
and it worked! Apparently there's some instability. If you would summarize your hints in an actual answer, I'd issue you the bounty, thanks a lot. – Disillusionize