Double free in the C++ standard library using only std::function and std::shared_pointer
Asked Answered
G

1

32

I recently came across a weird double-free bug in a program when capturing a shared_ptr in a lambda. I was able to reduce it this the following minimal example:

#include <memory>
#include <functional>

struct foo {
    std::function<void(void)> fun;
};

foo& get() {
    auto f = std::make_shared<foo>();
    // Create a circular reference by capturing the shared pointer by value
    f->fun = [f]() {};
    return *f;

}

int main(void) {
    get().fun = nullptr;
    return 0;
}

Compiling this with GCC 12.2.0 and the address sanitizer and running it, produces a double-free in std::function:

$ g++ -fsanitize=address -g -Wall -Wextra -o main main.cpp && ./main
=================================================================
==2401674==ERROR: AddressSanitizer: attempting double-free on 0x602000000010 in thread T0:
    #0 0x7f7064ac178a in operator delete(void*, unsigned long) /usr/src/debug/gcc/libsanitizer/asan/asan_new_delete.cpp:164
    #1 0x556a00865b9d in _M_destroy /usr/include/c++/12.2.0/bits/std_function.h:175
    #2 0x556a00865abe in _M_manager /usr/include/c++/12.2.0/bits/std_function.h:203
    #3 0x556a008658b9 in _M_manager /usr/include/c++/12.2.0/bits/std_function.h:282
    #4 0x556a00866623 in std::function<void ()>::operator=(decltype(nullptr)) /usr/include/c++/12.2.0/bits/std_function.h:505
    #5 0x556a008654b5 in main /tmp/cpp/main.cpp:16
    #6 0x7f706443c28f  (/usr/lib/libc.so.6+0x2328f)
    #7 0x7f706443c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
    #8 0x556a008651b4 in _start ../sysdeps/x86_64/start.S:115

0x602000000010 is located 0 bytes inside of 16-byte region [0x602000000010,0x602000000020)
freed by thread T0 here:
    #0 0x7f7064ac178a in operator delete(void*, unsigned long) /usr/src/debug/gcc/libsanitizer/asan/asan_new_delete.cpp:164
    #1 0x556a00865b9d in _M_destroy /usr/include/c++/12.2.0/bits/std_function.h:175
    #2 0x556a00865abe in _M_manager /usr/include/c++/12.2.0/bits/std_function.h:203
    #3 0x556a008658b9 in _M_manager /usr/include/c++/12.2.0/bits/std_function.h:282
    #4 0x556a00866215 in std::_Function_base::~_Function_base() /usr/include/c++/12.2.0/bits/std_function.h:244
    #5 0x556a00866579 in std::function<void ()>::~function() /usr/include/c++/12.2.0/bits/std_function.h:334
    #6 0x556a00868337 in foo::~foo() /tmp/cpp/main.cpp:4
    #7 0x556a00868352 in void std::_Destroy<foo>(foo*) /usr/include/c++/12.2.0/bits/stl_construct.h:151
    #8 0x556a0086830d in void std::allocator_traits<std::allocator<void> >::destroy<foo>(std::allocator<void>&, foo*) /usr/include/c++/12.2.0/bits/alloc_traits.h:648
    #9 0x556a008680fa in std::_Sp_counted_ptr_inplace<foo, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() /usr/include/c++/12.2.0/bits/shared_ptr_base.h:613
    #10 0x556a00866005 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/include/c++/12.2.0/bits/shared_ptr_base.h:346
    #11 0x556a008664c5 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/include/c++/12.2.0/bits/shared_ptr_base.h:1071
    #12 0x556a00866235 in std::__shared_ptr<foo, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/include/c++/12.2.0/bits/shared_ptr_base.h:1524
    #13 0x556a00866251 in std::shared_ptr<foo>::~shared_ptr() /usr/include/c++/12.2.0/bits/shared_ptr.h:175
    #14 0x556a008652ad in ~<lambda> /tmp/cpp/main.cpp:10
    #15 0x556a00865b90 in _M_destroy /usr/include/c++/12.2.0/bits/std_function.h:175
    #16 0x556a00865abe in _M_manager /usr/include/c++/12.2.0/bits/std_function.h:203
    #17 0x556a008658b9 in _M_manager /usr/include/c++/12.2.0/bits/std_function.h:282
    #18 0x556a00866623 in std::function<void ()>::operator=(decltype(nullptr)) /usr/include/c++/12.2.0/bits/std_function.h:505
    #19 0x556a008654b5 in main /tmp/cpp/main.cpp:16
    #20 0x7f706443c28f  (/usr/lib/libc.so.6+0x2328f)

previously allocated by thread T0 here:
    #0 0x7f7064ac0672 in operator new(unsigned long) /usr/src/debug/gcc/libsanitizer/asan/asan_new_delete.cpp:95
    #1 0x556a00865906 in _M_create<get()::<lambda()> > /usr/include/c++/12.2.0/bits/std_function.h:161
    #2 0x556a008657e3 in _M_init_functor<get()::<lambda()> > /usr/include/c++/12.2.0/bits/std_function.h:215
    #3 0x556a00865719 in function<get()::<lambda()> > /usr/include/c++/12.2.0/bits/std_function.h:449
    #4 0x556a00865578 in operator=<get()::<lambda()> > /usr/include/c++/12.2.0/bits/std_function.h:534
    #5 0x556a008653aa in get() /tmp/cpp/main.cpp:10
    #6 0x556a008654a8 in main /tmp/cpp/main.cpp:16
    #7 0x7f706443c28f  (/usr/lib/libc.so.6+0x2328f)

SUMMARY: AddressSanitizer: double-free /usr/src/debug/gcc/libsanitizer/asan/asan_new_delete.cpp:164 in operator delete(void*, unsigned long)
==2401674==ABORTING

Once the get function returns, the std::function inside the foo struct owns the only shared_ptr that owns the enclosing foo object. This means, assigning nullptr to it should destroy the shared_ptr which in turn should free the foo object.

What seems to be happening here is that the delete call in std_function.h:175 first runs the destructor of the lambda, which destroys the shared_ptr, the foo object, and its enclosed std::function object before freeing memory. However, the destruction of the std::function object has now already freed that memory location, leading to the double-free.

I'm now trying to figure out whether this is a bug in the standard library implementation (libstdc++) or whether the program is triggering undefined behavior somewhere.

An indicator that this might be a libstdc++ bug is that with clang++ and libc++ 14.0.6, there is no double-free (or at least none is detected), but clang++ with libstdc++ does have the double-free issue as well.

Is this program breaking any rules/triggering undefined behavior according to any C++ standard?


I reproduced all this on an x86-64 linux machine.

Gertrudgertruda answered 15/10, 2022 at 14:12 Comment(3)
Your foo& get() is broken, returning a reference to an object that is deleted. It should returned std::shared_ptr<foo>.Rogerson
No, the object is still alive since there is a shared_ptr pointing to it (in the lambda capture)Gertrudgertruda
There is no double free if one replaces get().fun = nullptr with std::function<void(void)>{}.swap( get().fun ), which safely deletes foo object. Online demo: gcc.godbolt.org/z/a1168f65TMeggs
K
33

I believe the relevant part of the standard is [res.on.objects], which states

If an object of a standard library type is accessed, and the beginning of the object's lifetime does not happen before the access, or the access does not happen before the end of the object's lifetime, the behavior is undefined unless otherwise specified.

In your example, you access the std::function by assigning to it. During this access, the destructor of the std::function is called, ending the lifetime of the std::function. But the access has not finished, so it is not the case that the access happens before the end of the object's lifetime.

Therefore, the code has undefined behavior.

Kirtle answered 15/10, 2022 at 15:17 Comment(7)
That definitely seems to be it, thanks for the reference to the standard. That's what I get for capturing shared_ptrs in a lambda ;)Gertrudgertruda
Not necessarily: access means something very specific in the standard; see [defns.access], including the note: invoking an assignment operator does not in itself constitute an access. However, std::function's assignment operator is specified to return *this; that's still not an access, but it is indirection, and this is an invalid pointer value when the return is executed, and that is undefined behavior ([basic.stc.general]/4). cc @PatrickZieglerMichaella
@Michaella That definition of access doesn't make sense in context. Almost all the types in the standard library are class types, not scalar types. So the paragraph about lifetimes has very little force if we interpret it using that definition. Also, consider the note which follows the lifetime rule: "This applies even to objects such as mutexes intended for thread synchronization." Since the mutex types in the standard library are not scalar types, it cannot be intended that the lifetime rule applies only to scalar types.Kirtle
That definition of access is the only one in the standard, and it applies to the entire document; the library specification does not introduce another one. It does make perfect sense: the paragraph in the library specification refers to accesses to scalar subobjects of a library object of class type, usually (but not necessarily) done as part of a function call. It does have quite a lot of force, too, and it does apply to mutexes just as well.Michaella
Interestingly, the current wording in [res.on.objects] was introduced by LWG2224. Previously, there was only a (non-normative) note, which was replaced by this paragraph. The old note was specifically saying that any member function invocation must complete before destruction, so it seems that the intent is still to prohibit such invocations, which means that your interpretation is what LWG actually wanted to say there. They used the wrong term, which makes this an issue with the wording.Michaella
@Michaella That is indeed interesting! Will you be submitting a Defect Report? If not, I might. (One of the notes in LWG2224 says the word "access" is magic. Perhaps, but it seems to be the wrong magic.)Kirtle
"magic" - yeah, exactly, were they intending for it to do its magic according to the core language definition, or were they assuming the wrong magic? :-) There was also some discussion about this on std-discussion, but with no definite answer from someone from LWG (Ville is on the committee, but not LWG as far as I know). I think submitting an issue would be worth it. How about we discuss it here?Michaella

© 2022 - 2024 — McMap. All rights reserved.