Why do move constructors and move assignment operators of Standard Library leave the object moved-from in unspecified state?
H

2

8

There is a special description for move constructors and move assignment operators in the C++ Standard Library that says that the object the data is moved from is left in a valid but unspecified state after the call. Why? I frankly don't understand it. It is something I intuitively don't expect. Really, if I move something from one place to another in the real world, the place I move from is left empty (and yep, valid), until I move there something new. Why in the C++ world should it be different?

For example, depending on the implementation the following code:

std::vector<int> a {1, 2, 3};
std::vector<int> b {4, 5, 6};
a = std::move(b);

may be equivalent to the next code:

std::vector<int> a {1, 2, 3};
std::vector<int> b {4, 5, 6};
a.swap(b);

It is really what I don't expect. If I move the data from one vector to another, I expect the vector I move data from to be empty (zero size).

As far as I know, the GCC implementation of the C++ Standard Library leaves the vector in an empty state after the move. Why not make this behavior a part of the Standard?

What are the reasons for leaving an object in an unspecified state? If it is for optimization, that is kind of strange too. The only reasonable thing I can do with an object in an unspecified state is to clear it (OK, I can get the size of the vector and I can print its content, but since the content is unspecified I don't need it). So the object will be cleared anyway, either by me manually or by a call to the assignment operator or the destructor. I prefer to clear it myself, because I expect it to be cleared. But that's a double call to clear. Where is the optimization?
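The pattern described in the question (move, then bring the source back to a known state before reusing it) can be sketched as follows. This is a minimal illustration, not a recommendation from the thread; the function name move_then_reuse is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: move-assigns `b` into `a`, then reuses `b`.
void move_then_reuse(std::vector<int>& a, std::vector<int>& b) {
    a = std::move(b);   // `b` is now valid, but its contents are unspecified

    // Operations without preconditions are always allowed on `b`.
    // clear() puts it into a known (empty) state regardless of what the
    // implementation left behind.
    b.clear();
    assert(b.empty());

    b.push_back(7);     // `b` is an ordinary, usable vector again
}
```

Called with a = {1, 2, 3} and b = {4, 5, 6}, this leaves a holding {4, 5, 6} and b holding {7} on any conforming implementation, whether or not the move itself emptied b.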

Haleigh answered 26/2, 2018 at 14:2 Comment(23)
its maybe a bit subjective, but when I move from A to B, then .. it depends (aka unspecified) whether I leave some garbage at A or not. The owner of my flat may not be happy about it (maybe not what he expects), but I wont clean every dust particle that I ever left in the apartment :PBoult
Because after you "move from" an object you are not expected to use that object as an R-value anymore. If you do, it behaves like an uninitialized variable, that is, "unspecified". This leaves room for an optimization that is actually implemented: moving an object can be implemented as aliasing it. The compiler has the option not to implement the object you move from at all.Roe
btw just to stay with that analogy: once you moved from A to B, you dont have the permission anymore to enter ABoult
My suspicion is that the standard designers simply didn’t want to have to force every c++ type to support the concept of an ‘empty’ state. While the meaning of ‘empty’ is straightforward for a container type, it’s not so clear what eg an ‘empty’ floating point value would mean.Webbed
i guess the answer is simply: because thats what "moving" was intended to mean. If you care about the state of the moved-from object, you simply dont moveBoult
@Sigismondo Amendment: uninitialized is not a valid state.Mandi
For SSO string, moving may not make the moved-from small string "empty".Cagey
The move constructor of std::shared_ptr is guaranteed to leave the pointer moved-from in empty state. If I move the vector of shared_ptr to another vector, the first vector still may contain original shared_ptrs. That may cause a memory leak. The same with std::function. The lambda inside it can capture the shared_ptr which may be left in the function move-from. It's better to clear it.Haleigh
@Haleigh For the new and the original vector to both contain the shared pointers, the pointers would either have to have been copied (which does not result in a memory leak, as the shared_ptr takes care of that) or both vectors would have to contain to same junk of memory, which would not be a valid state (for at least one of them). So this isn't a problem. Same thing for the lambda.Smollett
From cppreference: "After the move, other is guaranteed to be empty()" -- so your question is based on a false premise. A moved-from object is left in an unspecified but valid state in the general case, then standard classes (or your own classes) can add sensible specifications. Same thing for std::unique_ptr and std::shared_ptr.Trotyl
@Knoep, no, you are wrong. shared_ptr can be copied. std::function may have internal buffer to store data, and it is not required to move data, it may copy it (at least cppreference says that). So it may just copy the lambda with shared_ptrs from one std::function to another.Haleigh
@Quentin, yes, not all move constructors/operators leave an object in unspecified state. shared_ptrs are left empty. vectors are not.Haleigh
@Haleigh Sorry, the moment I sent the comment, I noticed that I was thinking of unique_ptr. The comment is correct now. The gist is that copying something in a vector or a lambda is no different from copying it manually. It is either not possible (in the case of unique_ptr) or will not result in a memory leak (in the case of shared_ptr).Smollett
To continue the moving analogy, think of it this way. You move in and the walls are white. While living there you decide to paint the walls blue. When you move out you take your stuff with you, but you don't repaint, because that is something you just don't want/need to do. Now this is still valid, but you didn't return it to the original empty state. The same thing happens with containers. With vector we set the size to 0 and the pointer to null, but changing the capacity doesn't matter.Rosner
@Smollett you can still get a memory leak with shared_ptrs (in the case of a cyclic dependency). And I have actually had problems with std::function and shared_ptr capture in practice. The lambda captures the shared_ptr to the object and is then passed to the object itself (cyclic dependency). The object starts processing in another thread. When processing completes, the object runs the std::function to notify about the completion.Haleigh
Then the object destroys the std::function. The destructor of the std::function destroys the shared_ptr, which destroys the object, which calls the destructor of the std::function a second time, which results in a crash (double destructor call). So I had to move the std::function to a local variable in the thread before destroying it. After that, when the object destroys the std::function, it is empty, so this doesn't result in destroying the shared_ptr. The shared_ptr in the local variable still holds the reference to the object. When the local variable is destroyed on thread exit, the object itself, with its empty std::function, is destroyed.Haleigh
If the move constructor didn't move lambda, I would end up with memory leak.Haleigh
No, you caused undefined behaviour by creating a cyclic dependency. You simply relied on a side effect of the move implementation to make it work anyway.Smollett
@Knoep, cyclic dependency is not undefined behavior, but you have to handle it manually to avoid memory leak. I don't rely on move anymore, I use std::swap (which is usually noexcept unlike move) and clear the second object manually.Haleigh
@Haleigh "Then the object destroys the std::function.Destructor of the std::function destroys the shared_ptr that destroys the object that calls destructor of the std::function second time that results in crash (double destructor call)." How is it possible? shared_ptr's ref count prevents this from happening. In fact, when dealing with circular references and shared_ptr the problem I routinely meet is that they won't get released, and I often approach it with weak_ptr's... what about pasting a code example in the question?Roe
@Sigismondo, wrong example: coliru.stacked-crooked.com/a/8698a44f63084d68, fixed example: coliru.stacked-crooked.com/a/b6e680c8f24b8123, UB version that works fine on gcc: coliru.stacked-crooked.com/a/44f9ab54257e25ecHaleigh
worth linking: What can I do with a moved-from object?Hygro
The name "move" is not so great. It is more like a copy of an object that you won't care about afterwards. Moving is the most useful case and gave its name to the thing, but conceptually it is still closer to copying (with permission to trash the source).Xenomorphic
S
9

There is a special description for move constructors and move assignment operators in the C++ Standard Library that says that the object the data is moved from is left in a valid but unspecified state after the call. Why? I frankly don't understand it. It is something I intuitively don't expect. Really, if I move something from one place to another in the real world, the place I move from is left empty (and yep, valid), until I move there something new. Why in the C++ world should it be different?

It isn't.

But you're failing to consider that a "move" cannot always be a move. What happens when you move data from a std::array, for example? Not much. Since an array stores its data in-place, there's no pointers to swap, and a move becomes a copy. As such, although the library could destroy the original data, there's not really any point in doing so, and so the standard won't go any further than saying "we don't guarantee what you get".
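A small sketch of what this means for std::array, assuming C++14 for std::make_unique (the function names are hypothetical). Moving the array moves each element in turn, so the moved-from state is whatever the element type's move leaves behind: an int is simply copied, while a std::unique_ptr is guaranteed to be null afterwards.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <utility>

// For int elements, "moving" the array copies every element, so the
// source still compares equal to the destination afterwards.
bool int_array_source_unchanged() {
    std::array<int, 3> src{1, 2, 3};
    std::array<int, 3> dst = std::move(src);
    return src == dst;
}

// For unique_ptr elements, each element is moved individually, and
// unique_ptr guarantees that a moved-from pointer is null.
bool unique_ptr_array_source_nulled() {
    std::array<std::unique_ptr<int>, 2> src{std::make_unique<int>(1),
                                            std::make_unique<int>(2)};
    std::array<std::unique_ptr<int>, 2> dst = std::move(src);
    return src[0] == nullptr && src[1] == nullptr &&
           *dst[0] == 1 && *dst[1] == 2;
}
```

Both functions return true; the difference in the source's final state comes entirely from the element type, not from std::array itself.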

A real example is a std::string which is currently storing its contents not in a dynamically-allocated block of memory, but in a small automatically allocated block of memory (this is commonly referred to as the small string optimisation). Like an array, there is no way to actually "move" this information; it must be copied. The string could zero it out afterwards, and it could reduce its length counter to zero, but why force that runtime cost on its users?
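To illustrate what "valid" still buys you in the string case, here is a minimal sketch (the helper name steal is hypothetical). It makes no assumption about whether the moved-from string is empty; it only relies on the class invariants and on operations that have no preconditions.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: steals the contents of `src` and checks only what
// "valid" promises about the moved-from string.
std::string steal(std::string& src) {
    std::string dst = std::move(src);

    // No assumption about src.empty() or src.size() here: for a short,
    // in-place string an implementation may or may not reset the source.
    // The class invariants hold whatever the contents are.
    assert(src.size() <= src.capacity());
    assert(src.c_str()[src.size()] == '\0');

    src = "reused";  // assignment has no preconditions, so this is always fine
    return dst;
}
```

Code written this way behaves the same for short and long strings and across standard library implementations, which is the portability the "unspecified" wording asks for.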

So, it would be possible to make stronger guarantees about the state of a post-moved container, on a case-by-case basis, but only by artificially constraining implementations (and reducing optimisation opportunities) for frankly no good reason.

Real-world analogies can be fun as a thought experiment, but using them to actually reason about the behaviour of a programming language is folly.

Snell answered 26/2, 2018 at 14:56 Comment(4)
Too many UB/US. std::array may move all elements to new place one by one. So the state of the moved-from array is defined by the specification of move assignment of its element. You are right, std::string can reduce its length to zero even when using internal buffer. And yes, there are types (like ints) that cannot be moved. But you may specify that there is no move assignment for them and just copy is used.Haleigh
Anyway, the moved-from object will be cleared (either by assigning a new value to it or by the destructor when the object is destroyed). So there is no way to avoid clearing. Sooner or later it will happen. I don't see any optimization... You can't do anything with the moved-from object but clear it.Haleigh
I would like more if it was disallowed to use std::move on int (and so on all classes that contain int) than having problems like leaving a copy of shared_ptr in the old place (and thus causing a possible memory leak) or problems like when copying is used instead of moving without reporting to a programmer (https://mcmap.net/q/1326568/-is-declaring-explicitly-defaulted-move-constructor-in-every-class-that-doesn-39-t-provide-user-defined-one-a-good-practice/5447906).Haleigh
@anton_rh: The real problem is that std::move doesn't move. But if you wish to mandate that a class not be copied, there's an easy way to do thatSnell
R
3

What are the reasons to leave an object in unspecified state.

Each class can have a different state that is reasonable to leave behind. "Unspecified" here means "determined case by case". This state can be just the old value (so the implementation can perform just a cheap swap) if this has no side effects, but for a vector or a shared_ptr that has been move-constructed from, this state must be empty (see the specifications of their move constructors).

https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c64-a-move-operation-should-move-and-leave-its-source-in-a-valid-state
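As a sketch of one of the cases where the library does specify the moved-from state: std::shared_ptr's move constructor leaves the source empty and transfers ownership without touching the reference count, whereas a copy adds an owner. The function names below are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Moving a shared_ptr transfers ownership: the source becomes empty
// and the number of owners stays at one.
long use_count_after_move() {
    auto a = std::make_shared<int>(42);
    std::shared_ptr<int> b = std::move(a);
    assert(a == nullptr);        // guaranteed by shared_ptr's move constructor
    assert(a.use_count() == 0);  // an empty shared_ptr owns nothing
    return b.use_count();
}

// Copying a shared_ptr adds a second owner instead.
long use_count_after_copy() {
    auto a = std::make_shared<int>(42);
    std::shared_ptr<int> b = a;
    return b.use_count();
}
```

Here use_count_after_move() returns 1 and use_count_after_copy() returns 2. A wrapper that is allowed to copy rather than move its contents can therefore leave an extra owner behind, which is the situation discussed in the comments on the question.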

When you applied it in your case, memory corruption arose. This is explained below.

The OP reported his code in a comment at the following links: "wrong example: coliru.stacked-crooked.com/a/8698a44f63084d68, fixed example: coliru.stacked-crooked.com/a/b6e680c8f24b8123, UB version that works fine on gcc: coliru.stacked-crooked.com/a/44f9ab54257e25ec"

The real problem you are facing is that you must never mix shared_ptrs and raw pointers. In fact you are declaring

auto p_processor = std::make_shared<BackGroundProcessor>();

and then copying the shared pointer into the function object that is passed to Run:

Event ev_done;
p_processor->Run([p_processor, &ev_done]() { ev_done.Set(); });

and then launching the thread by capturing this, that is, you are using the object through a raw pointer:

void Run(std::function<void()> on_done)
{
    m_on_done.swap(on_done);
    std::thread([this]()
    {
        // Doing some processing
        ...
        m_on_done = nullptr;

Since the thread outlives the point where main() resets its shared_ptr, the ref count drops to 1 at that point. Then, in the thread, as soon as m_on_done is reset, the object the thread is running on (that is, this itself) gets deleted before the thread terminates. I believe this is the origin of all the non-reproducible behavior you have seen.

One common approach to deal with this is to use shared_from_this(), declaring:

class BackGroundProcessor : public std::enable_shared_from_this<BackGroundProcessor>
{
   ...

(find the full fix here http://coliru.stacked-crooked.com/a/1f5c425696c29011)

Then create a shared_ptr and copy it into the thread lambda, so that it keeps the object alive while the thread is running:

void Run(std::function<void()> on_done)
{
    auto self = shared_from_this();
    m_on_done.swap(on_done);
    std::thread([this,self]()
    {
        // Doing some processing
        ...

Specifying it in the capture arguments should be enough.
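For reference, a self-contained sketch of this approach. It is not the code behind the coliru links: the class name Worker, the function run_and_wait, and the choice to return a joinable std::thread instead of detaching are all assumptions made to keep the example short and deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical stand-in for the processor class discussed above.
class Worker : public std::enable_shared_from_this<Worker> {
public:
    // Starts the background work. The object must already be owned by a
    // shared_ptr, otherwise shared_from_this() cannot be used.
    std::thread Run(std::function<void()> on_done) {
        m_on_done = std::move(on_done);
        auto self = shared_from_this();  // extra owner held by the thread
        return std::thread([self]() {
            // ... processing would go here ...
            auto callback = std::move(self->m_on_done);
            // The moved-from std::function is in an unspecified state,
            // so it is reset explicitly rather than assumed to be empty.
            self->m_on_done = nullptr;
            if (callback) callback();
        });
    }

private:
    std::function<void()> m_on_done;
};

// Hypothetical driver: drops the caller's reference while the thread
// may still be running, then waits for it to finish.
bool run_and_wait() {
    std::atomic<bool> done{false};
    std::thread t;
    {
        auto worker = std::make_shared<Worker>();
        t = worker->Run([&done] { done = true; });
    }  // caller's shared_ptr is gone; `self` keeps the Worker alive
    t.join();
    return done.load();
}
```

Because the lambda owns a shared_ptr to the object, the object cannot be destroyed while the thread body is still using it, regardless of what the caller does with its own reference.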

Roe answered 2/3, 2018 at 12:37 Comment(8)
Yo, @Sigismondo - did you mistakenly put this answer on the wrong question? (Or is SO messing up?)Tribe
Actually no. I believe that the non-reproducible behavior the OP is observing is just a bug, as deduced from his last comment. IMHO it just makes no sense that these different behaviors arise from the standard.Roe
I don't want force users to use shared_ptrs. My last example doesn't have a bug.Haleigh
The problem is that you are making wrong assumptions, and so the whole question is biased. The standard works fine; your example doesn't work, not because of "standard defects", but because the object gets destroyed too early. Your fix works just because swapping into another variable that goes out of scope at thread termination keeps the shared_ptr ref count > 0 until the thread exits.Roe
@Sigismondo, you are confusing things a little bit. First of all, my original example, your are replying to in your answer, is not directly related to my question. It is just an example of a problem that I wrongly tried to fix using std::move.Haleigh
The problem with the Standard in this specific case is that it is counterintuitive (IMO) and it may be a reason for bugs (with compiler-dependent reproducibility).Haleigh
My fix works not just because, it just works. It complies with the Standard and will work on any compiler. There is nothing wrong with my fix. The wrong assumption is not that the object is destroyed too early, the actual wrong assumption was that std::move moves the object whereas it doesn't have to move (thanks to the Standard).Haleigh
An example that depends on whether "std::move should/shouldn't move" is a symptom of an underlying bug. The wrong assumptions that I have seen are "I don't want the users to care that a pointer is a shared_ptr" (you cannot guarantee the correct lifecycle, as in the example); "vectors are not left empty by std::move" (they are, C++11 or their contents are, C++17); "The lambda captures the shared_ptr to the object and then it is passed to the object itself (cyclic dependency)" (the lambda captures the POINTER to the object, you don't have a cyclic dependency, you are mixing shared_ptr and raw pointers).Roe
