Sink arguments and move semantics for functions that can fail (strong exception safety)

I have a function that operates on a big chunk of data passed in as a sink argument. My BigData type is already C++11-aware and comes with fully functional move constructor and move assignment implementations, so I can get away without having to copy the damn thing:

Result processBigData(BigData);

[...]

BigData b = retrieveData();
Result r = processBigData(std::move(b));

This all works perfectly fine. However, my processing function may fail occasionally at runtime resulting in an exception. This is not really a problem, since I can just fix stuff and retry:

BigData b = retrieveData();
Result r;
try {
    r = processBigData(std::move(b));
} catch(std::runtime_error&) {
    r = fixEnvironmnentAndTryAgain(b);
    // wait, something isn't right here...
}

Of course, this won't work.

Since I moved my data into the processing function, by the time I arrive in the exception handler, b will not be usable anymore.
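To make the failure mode concrete, here is a minimal sketch. It assumes that BigData is just an alias for std::vector<int>, that Result is a plain long, and that processing fails whenever the input has more than three elements. All three are stand-ins for illustration and not part of the original code.

```cpp
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <utility>
#include <vector>

// Assumption: stand-ins for the real types, chosen only for illustration.
using BigData = std::vector<int>;
using Result = long;

// By-value sink: `data` is move-constructed from the caller's object before
// the function body starts, so a later throw cannot undo the move.
Result processBigData(BigData data) {
    if (data.size() > 3) {  // hypothetical failure condition
        throw std::runtime_error("environment not ready");
    }
    return std::accumulate(data.begin(), data.end(), Result{0});
}

// Returns true if the caller's object is empty by the time the handler runs.
bool callerLostDataAfterThrow() {
    BigData b{1, 2, 3, 4, 5};
    assert(b.size() == 5);
    try {
        processBigData(std::move(b));
    } catch (const std::runtime_error&) {
        // std::vector's move constructor leaves its source empty.
        return b.empty();
    }
    return false;  // not reached with the input above
}
```

The move happens while the argument is being initialized, which is before processBigData gets a chance to check anything. For a general type the moved-from object is only in a valid but unspecified state, so the handler cannot rely on its contents at all.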

This threatens to drastically reduce my enthusiasm for passing sink arguments by-value.

So here is the question: How to deal with a situation like this in modern C++ code? How to retrieve access to data that was previously moved into a function that failed to execute?

You may change the implementation and interfaces for both BigData and processBigData as you please. The final solution however should try to minimize drawbacks over the original code regarding efficiency and usability.

Red answered 4/9, 2014 at 13:53 Comment(9)
Important question, does Result contain the moved resources of b or is just based on it?Flagellant
@MadScienceDreams The Result is just calculated from b, it does not contain a reference to, or copy of the original b.Red
@Red But does it contain the moved (as opposed to copied) contents?Aparri
Then there is no reason to pass it by rvalue reference. Whenever you call std::move, you are agreeing that the value is gone after the function call. While there may be tricks to get around this (the value is not guaranteed to be gone; you have only agreed that it could be), the correct way to pass a value that you don't want side effects on (even in modern C++) is by const reference.Flagellant
@Aparri If that helps you to solve the problem, feel free to assume that it does. In my code as it stands now however, b is discarded by the processing function as soon as it returns.Red
@MadScienceDreams In my particular situation, passing by const& would require me to copy the whole data (I actually ran into this problem in an asynchronous processing function, so ownership of the data really needs to be moved to the function). The best solution I could think of to avoid a copy is to use a shared_ptr and go through the heap, but I'd like to avoid that if possible.Red
@Red So "result" doesn't consume BigData's resources, but a side-effect of the function does? I don't think you can get around using a shared-pointer for such ambiguous ownership...(Since BigData has resources that can be sped up by the move, I assume that it is already using heap resources).Flagellant
@MadScienceDreams Pretty much. Although I don't see how the fact who eventually consumes the resource changes the outcome. Assuming that it would be consumed by the result, would that allow for a nicer solution?Red
@Red Nope, it just makes it clearer what is going on. One possible (but insane) solution would be to have an output parameter that receives the (potentially) filled-in object: Result processBigData(BigData&& in_ref, BigData* out_ref = nullptr) { BigData whereitmoves(std::move(in_ref)); try { /*old method*/ } catch (...) { if (out_ref) { *out_ref = std::move(whereitmoves); } throw; } }Flagellant
R
2

Apparently this issue was the subject of lively discussion at the recent CppCon 2014. Herb Sutter summarized the latest state of things in his closing talk, Back to the Basics! Essentials of Modern C++ Style (slides).

His conclusion is quite simply: Don't use pass-by-value for sink arguments.

The arguments for using this technique in the first place (as popularized by Eric Niebler's Meeting C++ 2013 keynote C++11 Library design (slides)) seem to be outweighed by the disadvantages. The initial motivation for passing sink arguments by-value was to get rid of the combinatorial explosion for function overloads that results from using const&/&&.

Unfortunately, it seems that this brings a number of unintended consequences. One is a potential loss of efficiency (mainly due to unnecessary buffer allocations). The other is the exception-safety problem from this question. Both are discussed in Herb's talk.

Herb's conclusion is to not use pass-by-value for sink arguments, but instead rely on separate const&/&& (with const& being the default and && reserved for those few cases where optimization is required).

This also matches with what @Potatoswatter's answer suggested. By passing the sink argument via && we might be able to defer the actual moving of the data from the argument to a point where we can give a noexcept guarantee.
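A minimal sketch of that deferred move could look like the following. It assumes BigData is an alias for std::vector<int> and uses a hypothetical validate() step to stand in for everything that can throw; neither comes from the original code.

```cpp
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <utility>
#include <vector>

// Assumption: stand-ins for the real types, chosen only for illustration.
using BigData = std::vector<int>;
using Result = long;

// Hypothetical precondition check: all throwing work is done here,
// while the data still lives in the caller's object.
void validate(const BigData& data) {
    if (data.size() > 3) {
        throw std::runtime_error("environment not ready");
    }
}

// Rvalue overload: binding the reference moves nothing by itself.
Result processBigData(BigData&& in) {
    validate(in);                   // may throw; `in` is still untouched
    BigData owned = std::move(in);  // std::vector's move constructor is noexcept
    return std::accumulate(owned.begin(), owned.end(), Result{0});
}

// Lvalue overload: make the copy explicit and forward it as an rvalue.
Result processBigData(const BigData& in) {
    return processBigData(BigData(in));
}

// Returns true if the caller's object is intact when the handler runs.
bool callerKeepsDataAfterThrow() {
    BigData b{1, 2, 3, 4, 5};
    assert(b.size() == 5);
    try {
        processBigData(std::move(b));
    } catch (const std::runtime_error&) {
        return b.size() == 5;
    }
    return false;  // not reached with the input above
}
```

This only helps if the function can do all of its throwing work before it commits to the move, and the caller is still relying on an implementation detail unless the function documents that guarantee.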

I kind of liked the idea of passing sink arguments by-value, but it seems that it does not hold up as well in practice as everyone hoped.

Update after thinking about this for 5 years:

I am now convinced that my motivating example is a misuse of move semantics. After the invocation of processBigData(std::move(b));, I should never be allowed to assume what the state of b is, even if the function exits with an exception. Doing so leads to code that is hard to follow and to maintain.

Instead, if the contents of b should be recoverable in the error case, this needs to be made explicit in the code. For example:

class BigDataException : public std::runtime_error {
private:
    BigData b;
public:
    BigData retrieveDataAfterError() &&;

    // [...]
};


BigData b = retrieveData();
Result r;
try {
    r = processBigData(std::move(b));
} catch(BigDataException& e) {
    b = std::move(e).retrieveDataAfterError();
    r = fixEnvironmnentAndTryAgain(std::move(b));
}

If I want to recover the contents of b, I need to explicitly pass them out along the error path (in this case wrapped inside the BigDataException). This approach requires a bit of additional boilerplate, but it is more idiomatic in that it does not require making assumptions about the state of a moved-from object.
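For completeness, here is one way the elided parts might be filled in. This is only a sketch under assumptions: BigData is an alias for std::vector<int>, the constructor signature of BigDataException is my own choice, and processWithRetry and the trivial retry logic are hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Assumption: stand-ins for the real types, chosen only for illustration.
using BigData = std::vector<int>;
using Result = long;

class BigDataException : public std::runtime_error {
public:
    // Hypothetical constructor: takes ownership of the data that failed.
    BigDataException(const std::string& what, BigData data)
        : std::runtime_error(what), b(std::move(data)) {}

    // Rvalue-qualified, so callers have to write std::move(e) and thereby
    // spell out that the exception object gives up the data.
    BigData retrieveDataAfterError() && { return std::move(b); }

private:
    BigData b;
};

Result processBigData(BigData data) {
    if (data.size() > 3) {  // hypothetical failure condition
        throw BigDataException("environment not ready", std::move(data));
    }
    return std::accumulate(data.begin(), data.end(), Result{0});
}

// Hypothetical retry: pretend the environment is fixed and just sum the data.
Result fixEnvironmnentAndTryAgain(BigData data) {
    return std::accumulate(data.begin(), data.end(), Result{0});
}

// Hypothetical wrapper around the try/catch shown above.
Result processWithRetry(BigData b) {
    try {
        return processBigData(std::move(b));
    } catch (BigDataException& e) {
        b = std::move(e).retrieveDataAfterError();
        return fixEnvironmnentAndTryAgain(std::move(b));
    }
}
```

The data travels by move on both the success and the error path, so no copy of BigData is made here. Note that an exception type carrying a large member can still be copied by code that catches it by value, so catching by reference matters.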

Red answered 17/9, 2014 at 12:6 Comment(3)
If, say, BigData were a container, e.g. a std::vector or something similar, then you could let your processBigData function consume items from the container until you get an exception, and thus leave the remainder of the data for further processing. I have used this pattern for exactly the reasons you mention. You can also do the same thing with iterators or ranges, rather than passing in the container itself. Yet another alternative is to give the data away as you have done, but have a class take ownership of it; you can then "resume" or "return" it on error.Mccloud
@SpacenJasset How do you signify on the function signature that parts of the container will be consumed? You cannot use a && parameter, as that signifies the vector will be consumed entirely ("...will be left in a valid but unspecified state"). So you will probably want to take the container via & instead. I agree that this is a valid solution. For this question I was focusing on sensible semantics for the && signature in particular.Red
Either, as you say, pass in BigData& to process what you can from the container, or move the data in with && and, on failure (exception), move the unprocessed data back out again. The latter requires a class object for temporary storage so you can fish the data back out later. Example: processor->process(BigData&&); BigData processor->getUnprocessed(); Essentially, rvalue moves don't seem to be a good fit for (some) sink data, as Herb Sutter observed.Mccloud
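The processor idea from the last comment might be sketched roughly like this. The Processor class, its member names, and the rule that a negative item causes a failure are all assumptions for illustration; BigData is again just an alias for std::vector<int>.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>
#include <vector>

// Assumption: stand-ins for the real types, chosen only for illustration.
using BigData = std::vector<int>;
using Result = long;

// Hypothetical processor that owns the data while it is being consumed.
class Processor {
public:
    Result process(BigData&& in) {
        pending_ = std::move(in);  // take ownership up front
        Result total = 0;
        while (!pending_.empty()) {
            int item = pending_.back();
            if (item < 0) {  // hypothetical failure condition
                // The offending item and everything before it stay in pending_.
                throw std::runtime_error("cannot process item");
            }
            total += item;
            pending_.pop_back();  // consume only after the item succeeded
        }
        return total;
    }

    // Hands the unconsumed remainder back and leaves the processor empty.
    BigData getUnprocessed() { return std::exchange(pending_, BigData{}); }

private:
    BigData pending_;
};

// Returns the number of items left over after a failed run.
std::size_t leftoverAfterFailure() {
    Processor processor;
    BigData b{1, -2, 3};
    try {
        processor.process(std::move(b));
    } catch (const std::runtime_error&) {
        return processor.getUnprocessed().size();
    }
    return 0;  // not reached with the input above
}
```

Items are consumed from the back here only because that is cheap for a vector. The caller never looks at the moved-from b again; whatever is recoverable is requested explicitly from the processor.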
A
3

I'm similarly nonplussed by this issue.

As far as I can tell, the best current idiom is to divide the pass-by-value into a pair of pass-by-references.

template< typename t >
std::decay_t< t >
val( t && o ) // Given an object, return a new object "val"ue by move or copy
    { return std::forward< t >( o ); }

Result processBigData(BigData && in_rref) {
    // implementation
}

Result processBigData(BigData const & in_cref ) {
    return processBigData( val( in_cref ) );
}

Of course, bits and pieces of the argument might have been moved before the exception. The problem propagates out to whatever processBigData calls.

I've had an inspiration to develop an object that moves itself back to its source upon certain exceptions, but that's a solution to a particular problem on the horizon in one of my projects. It might end up too specialized, or it might not be feasible at all.
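One possible reading of that idea is a guard object that moves the data back in its destructor when it is destroyed by stack unwinding. The sketch below is only an illustration under assumptions: MoveBackOnThrow is a hypothetical name, BigData is an alias for std::vector<int>, and it relies on std::uncaught_exceptions() from C++17. It reacts to any exception rather than only to certain ones.

```cpp
#include <cassert>
#include <exception>
#include <numeric>
#include <stdexcept>
#include <utility>
#include <vector>

// Assumption: stand-ins for the real types, chosen only for illustration.
using BigData = std::vector<int>;
using Result = long;

// Hypothetical guard: takes the data from `source` on construction and gives
// it back if the guard is destroyed because an exception is propagating.
class MoveBackOnThrow {
public:
    explicit MoveBackOnThrow(BigData& source)
        : source_(source),
          data_(std::move(source)),
          exceptionsAtEntry_(std::uncaught_exceptions()) {}

    MoveBackOnThrow(const MoveBackOnThrow&) = delete;
    MoveBackOnThrow& operator=(const MoveBackOnThrow&) = delete;

    ~MoveBackOnThrow() {
        // More uncaught exceptions than at construction means unwinding.
        if (std::uncaught_exceptions() > exceptionsAtEntry_) {
            source_ = std::move(data_);  // noexcept for std::vector<int>
        }
    }

    BigData& get() { return data_; }

private:
    BigData& source_;
    BigData data_;
    int exceptionsAtEntry_;
};

Result processBigData(BigData&& in) {
    MoveBackOnThrow guard(in);
    BigData& data = guard.get();
    if (data.size() > 3) {  // hypothetical failure condition
        throw std::runtime_error("environment not ready");
    }
    return std::accumulate(data.begin(), data.end(), Result{0});
}

// Returns true if the caller's object holds its old contents in the handler.
bool dataRestoredAfterThrow() {
    BigData b{1, 2, 3, 4, 5};
    assert(b.size() == 5);
    try {
        processBigData(std::move(b));
    } catch (const std::runtime_error&) {
        return b == BigData{1, 2, 3, 4, 5};
    }
    return false;  // not reached with the input above
}
```

The guard only restores what is still held in data_; anything the function has already moved further down the call chain is gone, which is the propagation problem mentioned above.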

Aparri answered 4/9, 2014 at 14:5 Comment(3)
I'm still not sure why you want to pass an rvalue reference to the function if it doesn't actually consume BigData... I guess to allow it to consume it in the future? Also, what do you do in the case of a non-copyable type (like std::unique_ptr)?Flagellant
Yeah, adding an overload for rvalue refs would indeed help here. You could defer moving from BigData to a point at which you can guarantee that no exception will occur. Of course this suffers from the usual drawbacks of having to introduce rvalue-ref overloads on function interfaces (hence I'm not 100% convinced it satisfies my usability constraint), but it might actually work out fine in certain situations. +1 either way.Red
@MadScienceDreams 1. Yes, given the clarification comments under the question, I'm not sure why it's not a const & in the first place, but I just answered the question conceptually. 2. Non-copyable types wouldn't need the const & overload, that's all.Aparri
