Program crashes when filesystem::path is destroyed
Asked Answered
D

1

8

The following program crashes:

#include <iostream>
#include <filesystem>
namespace fs = std::filesystem;
int main()
{
    fs::path p1 = "/usr/lib/sendmail.cf";
 
    std::cout << "p1 = " << p1 << '\n';
}

Compilation:

$ g++ -std=c++17 pathExistsTest.cpp
$ ./a.out 
p1 = "/usr/lib/sendmail.cf"
[1]    35688 segmentation fault (core dumped)  ./a.out

Tested on Ubuntu 20.04, compiler is GCC 8.4.0.

Valgrind, here is the cut output:

==30078==    by 0x4AE5034: QAbstractButton::mouseReleaseEvent(QMouseEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==30078==    by 0x4A312B5: QWidget::event(QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==30078==  Address 0x2b is not stack'd, malloc'd or (recently) free'd
==30078== 
==30078== 
==30078== Process terminating with default action of signal 11 (SIGSEGV)
==30078==  Access not within mapped region at address 0x2B
==30078==    at 0x13AD9B: std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::~vector() (in /home/(me)/src/tomato/build-src-Desktop-Release/TomatoLauncher)

Full Output
I don't even know why the vector dtor is called? I only create a path variable, no vector<path>.

Disillusionize answered 15/9, 2020 at 13:8 Comment(28)
Looks like you may have Undefined Behavior somewhere in your code which manifests here. It doesn't seem like it is in the code shown. Undefined Behavior can occur very far from where the symptoms occur. Edit : Maybe share how exists is implemented as it appears to be the only thing here that could contain an error.Brahmanism
Do you have try catch block? Program will crash if you do not have it and std::filesystem::exists can throw std::filesystem::filesystem_error.Fancied
Have you tried to build with verbose warnings enabled? There's no warnings? Have you tried to use a memory debugger such as Valgrind? It doesn't report any problems?Charade
exists is part of the stl. I tried it without even calling it and just returning true, which didn't change anything. If I remove the initialization of the path variable and have nothing in the function but a return true;, it doesn't crash.Disillusionize
There are no warnings, but I'll have a look at enabling verbose warnings. I do have a catch block that catches exception but it doesn't help.Disillusionize
You have to show a complete program, the bug is elsewhereSpradlin
... works for me...Pines
Yeah, it seems to be a problem with my installation. No idea how to approach it, though.Disillusionize
What does g++ --version say?Jarrod
To repair an installation, sudo apt list --installed 'g++*' and then sudo apt reinstall <packages> could perhaps help.Jarrod
Try running ldd on a.out. See which version of libstdc++ is linked. Is it the one you expect?Carboxylase
@Carboxylase thanks, it shows it's libstdc++.so.6. I need to keep v6 for CUDA but I guess I have to install a more recent one as well?Disillusionize
The vector is from here: code.woboq.org/gcc/libstdc++-v3/include/bits/… -- It's the path components, stored as a vector of a type _Cmpt which is derived from path.Physicochemical
If you compile against the filesystem headers from version A of libstdc++, and run the executable which loads version B of libstdc++, where A != B, that could explain such a result.Physicochemical
-L does not set the rpath of the binary IIRC. Please check ldd a.out to see what you're actually running with.Physicochemical
@Physicochemical I forgot to include the ldd output, it's running with v6. Will include it nowDisillusionize
How can it happen that I run with a version different from the one I compiled against?Disillusionize
"How can it happen" Well that is an interesting question. But first, let's try if the issue disappears if you make everything consistent. That is, we need to find the version of the headers you're using during compilation. You can get those with g++ -v -std=c++17 pathExistsTest.cpp. It will show the include paths at least. Even better, use -M to print the locations of every included header. From those files, we can determine (by diff for the lack of a better option) which version they come from. Regarding libstdc++.so, read the highest version using readelf --version-infoPhysicochemical
The 6 in libstdc++.so.6 is using a different version scheme from the 8 in the path /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so, see gcc.gnu.org/onlinedocs/libstdc++/manual/abi.htmlPhysicochemical
Thanks a lot. g++ output with -v: compiled by GNU C version 8.4.0, GMP version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version isl-0.22.1-GMP (full output: pastebin.com/upZBH4E9 )Disillusionize
Can't really draw conclusions from that yet. However GCC 8.3 behaves fishy: godbolt.org/z/fjhGs7 -- it has issues exactly with that vector as well. My working hypotheses are: Either A) your header and .so have an incompatibility in the layout of path, or B) GCC 8.4's path is simply broken.Physicochemical
I checked for possible bugs 8.4's path and found a comment that I should compile with -lstdc++fs and it worked! Apparently there's some instability. If you would summarize your hints in an actual answer, I'd issue you the bounty, thanks a lot.Disillusionize
Here's where I found the flag, it has to be used until GCC 9 if you want to use filesystem. #33150378Disillusionize
Woah, umm, it finds the symbol somwhere else?? And that definition is incompatible? o.OPhysicochemical
Hah, I can reproduce it in a Ubuntu 20.04 container. Neat. Will write an answer as soon as I figured out where this symbol comes from. Don't care about the bounty.Physicochemical
Maybe I removed a bit too much from the question. Feel free to add back. The idea is that this question is useful for future readers, basically people searching for the same problem.Physicochemical
No no, I considered removing the better part myself but wasn't sure if something was needed for confirmability. It's better to understand now.Disillusionize
I suggest you kindly ask the GCC developers about their take on symbol versioning. This could have been avoided if they added symbol versions to their header files. However, I have no experience with this, there might be good arguments against.Physicochemical
P
40

TL;DR

You're compiling with GCC 8.4.0, therefore you need to link explicitly against -lstdc++fs.

Since you're using GCC 8.4.0, you're using the GNU C++ Standard Library aka libstdc++ headers for version GCC 8.4.0. But your system (Ubuntu 20.04) only contains libstdc++.so.6.0.28 from GCC 9. If you don't explicitly link against -lstdc++fs, then you're accidentally consuming a std::filesystem symbol from GCC 9 (via libstdc++.so) instead of from GCC 8 (via libstdc++fs.a).

GCC 8 and GCC 9 have incompatible std::filesystem types. More specifically, their binary layout is different. This is basically a very hidden ODR-violation. Your object is allocated for GCC 8 layout but constructed using GCC 9 layout. When you then attempt to destroy it, the destructor uses GCC 8 layout and crashes because the data is not what it expects.


There are two pieces of code which use different, incompatible layouts of the path type.

The first piece of code is from libstdc++.so.6.0.28: It contains a definition of path::_M_split_cmpts(), called via the inline constructor path::path(string_type&&, format). Since the constructor is inline, code for the constructor itself is generated into your executable. Your executable therefore contains a call to path::_M_split_cmpts.

The second piece of code is in your own executable: It generates instructions for the inline (defaulted) destructor path::~path(), and the inline functions it calls; all the way up to std::filesystem::__cxx11::path::path<char [21], std::filesystem::__cxx11::path>(char const (&) [21], std::filesystem::__cxx11::path::path>(char const (&) [21], std::filesystem::__cxx11::path::format).


How can we find this?

using a debugger: Stepping through suspicious functions in the ctor reveals:

0x5569716498ed <std::filesystem::__cxx11::path::path<char [21], std::filesystem::__cxx11::path>(char const (&) [21], std::filesystem::__cxx11::path::path>(char const (&) [21], std::filesystem::__cxx11::path::format)+112>       callq  0x5569716491e0 <_ZNSt10filesystem7__cxx114path14_M_split_cmptsEv@plt>

That's a call through the PLT (so, potentially from a shared object, and definitely not inlined). We step into it and:

(gdb) bt
#0  0x00007f102c60f260 in std::filesystem::__cxx11::path::_M_split_cmpts() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00005569716498ed in std::filesystem::__cxx11::path::path<char [21], std::filesystem::__cxx11::path> (this=0x7ffe1a07ad60, __source=...)
    at /usr/include/c++/8/bits/fs_path.h:185
#2  0x00005569716493fd in main () at blub.cpp:6

So, we can see that it comes indeed from /lib/x86_64-linux-gnu/libstdc++.so.6, which is a symlink to /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28.

The dtor we can see e.g. in the Valgrind output in the OP:

==30078== Invalid read of size 8
==30078==    at 0x13AD9B: std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::~vector() (in /home/(me)/src/tomato/build-src-Desktop-Release/TomatoLauncher)

It's inline and therefore in the executable.


Now, the actually interesting part is that both the header which contains the inlined function for path and the path::_M_split_cmpts function are from the GNU C++ Standard library (libstdc++).

How can they be incompatible?

To answer this, let's take a look at the exact version. We're compiling with GCC 8.4.0. It has baked in include paths, and they refer to standard library headers shipped in the gcc-8 package of Ubuntu 20.04. Those match perfectly, and you have to change default settings to make GCC consume different, unmatching standard library headers. The headers are therefore those of GCC 8.4.0.

What about the shared object libstdc++.so? We're running with libstdc++.so.6.0.28 according to ldd and the debugger. According to libstdc++ ABI Policy and Guidelines, that's GCC >= 9.3.

libstdc++.so.6.0.28 does contain a definition of _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv:

$ objdump -T /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28 | grep _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv
000000000016a260 g    DF .text  00000000000005f3  GLIBCXX_3.4.26 _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv

According to the ABI doc, this is

GCC 9.1.0: GLIBCXX_3.4.26, CXXABI_1.3.12

So that's a symbol which was NOT available in GCC 8.4.0.


Why doesn't the compiler/linker complain?

When we compile with gcc-8, why doesn't the compiler or linker complain about us consuming a symbol from GCC 9?

If we compile with -v, we see the linker invocation:

COLLECT_GCC_OPTIONS='-v' '-std=c++17' '-g' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-linux-gnu/8/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/8/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper -plugin-opt=-fresolution=/tmp/cceJgWPt.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -z now -z relro /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/Scrt1.o /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/8/crtbeginS.o -L/usr/lib/gcc/x86_64-linux-gnu/8 -L/usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/8/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/8/../../.. /tmp/ccTNph3u.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/8/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crtn.o                           COLLECT_GCC_OPTIONS='-v' '-std=c++17' '-g' '-shared-libgcc' '-mtune=generic' '-march=x86-64'

In there, we have -L/usr/lib/gcc/x86_64-linux-gnu/8 and other paths to find the standard library. There, we find libstdc++.so -> ../../../x86_64-linux-gnu/libstdc++.so.6, which finally points to /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28 (!!!).

So the linker is given GCC 9's libstdc++.so, and it does NOT receive any version information on the symbol from the compiler (*). The compiler only knows the source code, and the source code does not contain a symbol version in this case (filesystem headers of GCC 8.4.0). The symbol version is however present in the ELF binary libstdc++.so. The linker sees GLIBCXX_3.4.26 for the symbol requested by the compiler _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv and is satisfied with that. Makes you wonder if there's a linker switch to tell the linker "don't consume a versioned symbol if I requested an unversioned symbol".

(*) The linker does not receive any symbol information on that unresolved symbol from the compiler because the compiler has no such information from the source code. You can add info to your source code. I don't know how libstdc++ usually does it - or its policy on symbol versions in header files. It looks like it is not done at all for filesystem.

The ELF symbol versioning mechanism should usually prevent such incompatibilities: If there is a layout-incompatible change, you create a new symbol with the same name but a different version, and add it to libstdc++.so, which then contains both the old and the new version.

A binary compiled against libstdc++.so specifies which version of a symbol it wants, and the dynamic loader properly resolves the undefined symbols against symbols of matching name and version. Note that the dynamic linker does not know which shared library to search (on Windows/PE, this is different). Any "symbol request" is merely an undefined symbol, and there's a completely separate list of required libraries which shall provide those undefined symbols. But there's no mapping in the binary which symbol should come from which library.

Because the ELF symbol versioning mechanism allows backwards-compatible additions of symbols, we can maintain a single libstdc++.so for multiple versions of the compiler. That's why you see symlinks all over the place, leading all to the same file. The suffix .6.0.28 is another, orthogonal versioning scheme which allows backwards-incompatible changes: You binary can specify that it needs libstdc++.so.6 and you can add an incompatible libstdc++.so.7 for other binaries.

Fun fact: If you linked your library against a pure GCC 8 version of libstdc++.so, you would have seen a linker error. Linking against a shared library doesn't do much to the binary; it does however fix the symbol versions of unresolved symbols and can check that no unresolved symbols are left after looking though all libraries. We can see that your binary actually requests _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv@GLIBCXX_3.4.26 when you link it against libstdc++.so.6.0.28.

Fun fact 2: If you run your library against a pure GCC 8 version of libstdc++.so, you would have received a dynamic linker error, because it cannot find _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv@GLIBCXX_3.4.26.


What should actually happen?

You should actually link to libstdc++fs.a. It also provides a definition of _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv, and it's not a symlink but specific to this GCC version: /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++fs.a.

When you link against -lstdc++fs, you get its symbols included directly into the executable (since it's a static library). Symbols in the executable take priority over the symbols in shared objects. Therefore, the _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv from libstdc++fs.a is used.


What's actually the incompatibility in layout in path?

GCC 9 introduced a different type to hold the components of the path. Using clang++ -cc1 -fdump-record-layouts, we can see the offset at the left side, and the member and type names at the right side:

GCC 8.4.0:

 0 | class std::filesystem::__cxx11::path
 0 |   class std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> > _M_pathname
 0 |     struct std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::_Alloc_hider _M_dataplus
 0 |       class std::allocator<char> (base) (empty)
 0 |         class __gnu_cxx::new_allocator<char> (base) (empty)
 0 |       std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::pointer _M_p
 8 |     std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::size_type _M_string_length
16 |     union std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::(anonymous at /usr/include/c++/8/bits/basic_string.h:160:7) 
16 |       char [16] _M_local_buf
16 |       std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::size_type _M_allocated_capacity
32 |   class std::vector<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> > _M_cmpts
32 |     struct std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> > (base)
32 |       struct std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> >::_Vector_impl _M_impl
32 |         class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> (base) (empty)
32 |           class __gnu_cxx::new_allocator<struct std::filesystem::__cxx11::path::_Cmpt> (base) (empty)
32 |         std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> >::pointer _M_start
40 |         std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> >::pointer _M_finish
48 |         std::_Vector_base<struct std::filesystem::__cxx11::path::_Cmpt, class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt> >::pointer _M_end_of_storage
56 |   enum std::filesystem::__cxx11::path::_Type _M_type
   | [sizeof=64, dsize=57, align=8,
   |  nvsize=57, nvalign=8]

GCC 9.3.0:

 0 | class std::filesystem::__cxx11::path
 0 |   class std::__cxx11::basic_string<char> _M_pathname
 0 |     struct std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::_Alloc_hider _M_dataplus
 0 |       class std::allocator<char> (base) (empty)
 0 |         class __gnu_cxx::new_allocator<char> (base) (empty)
 0 |       std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::pointer _M_p
 8 |     std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::size_type _M_string_length
16 |     union std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::(anonymous at /usr/include/c++/9/bits/basic_string.h:171:7) 
16 |       char [16] _M_local_buf
16 |       std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::size_type _M_allocated_capacity
32 |   struct std::filesystem::__cxx11::path::_List _M_cmpts
32 |     class std::unique_ptr<struct std::filesystem::__cxx11::path::_List::_Impl, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> _M_impl
32 |       class std::__uniq_ptr_impl<struct std::filesystem::__cxx11::path::_List::_Impl, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> _M_t
32 |         class std::tuple<struct std::filesystem::__cxx11::path::_List::_Impl *, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> _M_t
32 |           struct std::_Tuple_impl<0, struct std::filesystem::__cxx11::path::_List::_Impl *, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> (base)
32 |             struct std::_Tuple_impl<1, struct std::filesystem::__cxx11::path::_List::_Impl_deleter> (base) (empty)
32 |               struct std::_Head_base<1, struct std::filesystem::__cxx11::path::_List::_Impl_deleter, true> (base) (empty)
32 |                 struct std::filesystem::__cxx11::path::_List::_Impl_deleter (base) (empty)
32 |             struct std::_Head_base<0, struct std::filesystem::__cxx11::path::_List::_Impl *, false> (base)
32 |               struct std::filesystem::__cxx11::path::_List::_Impl * _M_head_impl
   | [sizeof=40, dsize=40, align=8,
   |  nvsize=40, nvalign=8]

The difference is in path::_M_cmpts:

// GCC 8
class std::vector<
  struct std::filesystem::__cxx11::path::_Cmpt,
  class std::allocator<struct std::filesystem::__cxx11::path::_Cmpt>
> _M_cmpts

// GCC 9
struct std::filesystem::__cxx11::path::_List _M_cmpts

You can also see the structure of path::_List in the record dump above. It's very much not compatible to a GCC 8 vector.

Remember that we're calling path::_M_split_cmpts via libstdc++.so from GCC 9, and we're crashing in the vector destructor for this _M_cmpts data member.

Here's the commit that changed from vector to _List:

commit 4f87bb8d6e8dec21a07f1fba641a78a127281349
Author: Jonathan Wakely <[email protected]>
Date:   Thu Dec 13 20:33:55 2018 +0000

PR libstdc++/71044 optimize std::filesystem::path construction

This new implementation has a smaller footprint than the previous
implementation, due to replacing std::vector<_Cmpt> with a custom pimpl
type that only needs a single pointer. The _M_type enumeration is also
combined with the pimpl type, by using a tagged pointer, reducing
sizeof(path) further still.

Construction and modification of paths is now done more efficiently, by
splitting the input into a stack-based buffer of string_view objects
instead of a dynamically-allocated vector containing strings. Once the
final size is known only a single allocation is needed to reserve space
for it.  The append and concat operations no longer require constructing
temporary path objects, nor re-parsing the entire native pathname.
This results in algorithmic improvements to path construction, and
working with large paths is much faster.
Physicochemical answered 24/9, 2020 at 17:44 Comment(5)
I haven't had that much fun in ages! xDPhysicochemical
Thanks for putting in the effort! Glad my question could spark some fun for you. :DDisillusionize
media0.giphy.com/media/7JEGCK4E0W8SPSk84m/…Cabriolet
Now that I think about it, the function(s) could have been marked in the header as ELF-invisible. This would have enforced fetching them from a static library.Physicochemical
Maybe one of the best / most detailed posts that was exactly the problem I just had. I wish I could upvote by more.Resurge

© 2022 - 2024 — McMap. All rights reserved.