What are the evaluation order guarantees introduced by C++17?

What are the implications of the evaluation order guarantees voted into C++17 (P0145) for typical C++ code?

What does it change about things like the following?

i = 1;
f(i++, i)

and

std::cout << f() << f() << f();

or

f(g(), h(), j());
Ibis answered 21/7, 2016 at 10:21 Comment(3)
Related to Order of evaluation of assignment statement in C++ and Does this code from “The C++ Programming Language” 4th edition section 36.3.6 have well-defined behavior? which are both covered by the paper. The first one might make a nice additional example in your answer below.Limousin
Also somewhat related: c++17 evaluation order with operator overloading functions.Multiplication
operator-precedence is not relevant.Aimeeaimil

Some common cases in which the evaluation order was previously unspecified are now specified and valid in C++17. Some behaviour that was undefined is now merely unspecified.

i = 1;
f(i++, i)

was undefined, but it is now unspecified. Specifically, what is not specified is the order in which each argument to f is evaluated relative to the others. i++ might be evaluated before i, or vice versa. Indeed, the same compiler might even evaluate a second such call in a different order.

However, the evaluation of each argument is required to execute completely, with all side-effects, before the execution of any other argument. So you might get f(1, 1) (second argument evaluated first) or f(1, 2) (first argument evaluated first). But you will never get f(2, 2) or anything else of that nature.

std::cout << f() << f() << f();

was unspecified, but in C++17 the operands of << are evaluated from left to right, so the first call to f is evaluated first and its result comes first in the stream (examples below).

f(g(), h(), j());

still has unspecified evaluation order of g, h, and j. Note that for getf()(g(),h(),j()), the rules state that getf() will be evaluated before g, h, j.

Also note the following example from the proposal text:

 std::string s = "but I have heard it works even if you don't believe in it";
 s.replace(0, 4, "").replace(s.find("even"), 4, "only")
  .replace(s.find(" don't"), 6, "");

The example comes from The C++ Programming Language, 4th edition, Stroustrup, and used to be unspecified behaviour, but with C++17 it will work as expected. There were similar issues with resumable functions (.then( . . . )).

As another example, consider the following:

#include <iostream>
#include <string>
#include <vector>
#include <cassert>

struct Speaker{
    int i =0;
    Speaker(std::vector<std::string> words) :words(words) {}
    std::vector<std::string> words;
    std::string operator()(){
        assert(words.size()>0);
        if(i==words.size()) i=0;
        // Pre-C++17 version:
        auto word = words[i] + (i+1==words.size()?"\n":",");
        ++i;
        return word;
        // Still not possible with C++17:
        // return words[i++] + (i==words.size()?"\n":",");

    }
};

int main() {
    auto spk = Speaker{{"All", "Work", "and", "no", "play"}};
    std::cout << spk() << spk() << spk() << spk() << spk() ;
}

With C++14 and before we may (and will) get results such as

play
no,and,Work,All,

instead of

All,Work,and,no,play

Note that the above is in effect the same as

(((((std::cout << spk()) << spk()) << spk()) << spk()) << spk()) ;

But still, before C++17 there was no guarantee that the first calls would come first into the stream.

References: From the accepted proposal:

Postfix expressions are evaluated from left to right. This includes functions calls and member selection expressions.

Assignment expressions are evaluated from right to left. This includes compound assignments.

Operands to shift operators are evaluated from left to right. In summary, the following expressions are evaluated in the order a, then b, then c, then d:

  1. a.b
  2. a->b
  3. a->*b
  4. a(b1, b2, b3)
  5. b @= a
  6. a[b]
  7. a << b
  8. a >> b

Furthermore, we suggest the following additional rule: the order of evaluation of an expression involving an overloaded operator is determined by the order associated with the corresponding built-in operator, not the rules for function calls.

Edit note: My original answer misinterpreted a(b1, b2, b3). The order of b1, b2, b3 is still unspecified. (thank you @KABoissonneault, all commenters.)

However (as @Yakk points out), and this is important: even when b1, b2, b3 are non-trivial expressions, each of them is completely evaluated and tied to its respective function parameter before evaluation of any of the others begins. The standard states it like this:

§5.2.2 - Function call 5.2.2.4:

. . . The postfix-expression is sequenced before each expression in the expression-list and any default argument. Every value computation and side effect associated with the initialization of a parameter, and the initialization itself, is sequenced before every value computation and side effect associated with the initialization of any subsequent parameter.

However, one of these new sentences is missing from the GitHub draft:

Every value computation and side effect associated with the initialization of a parameter, and the initialization itself, is sequenced before every value computation and side effect associated with the initialization of any subsequent parameter.

The example is there. It solves a decades-old problem (as explained by Herb Sutter) with exception safety, where things like

f(std::unique_ptr<A> a, std::unique_ptr<B> b);

f(get_raw_a(), get_raw_a());

would leak if one of the get_raw_a() calls threw before the other raw pointer was tied to its smart pointer parameter.

As pointed out by T.C., the example is flawed since unique_ptr construction from a raw pointer is explicit, preventing this from compiling.

Also note this classical question (tagged C, not C++):

int x=0;
x++ + ++x;

is still undefined.

Ibis answered 21/7, 2016 at 10:22 Comment(16)
Some of that reads as if it was about an earlier not accepted draft. Are you sure you are reading the accepted one?Sociable
"A second, subsidiary proposal replaces the evaluation order of function calls as follows: the function is evaluated before all its arguments, but any pair of arguments (from the argument list) is indeterminately sequenced; meaning that one is evaluated before the other but the order is not specified; it is guaranteed that the function is evaluated before the arguments. This reflects a suggestion made by some members of the Core Working Group."Sociable
As far as I know, in the actual accepted proposal, only the example with std::cout will actually have a defined order; the others will remain undefined as before. I mean, I might be wrong, but that's what I understood from the proposal that actually made it to C++17.Recruit
I'm getting that impression from the paper saying that "the following expressions are evaluated in the order a, then b, then c, then d" and then showing a(b1, b2, b3), suggesting that all b expressions are not necessarily evaluated in any order (otherwise, it would be a(b, c, d))Recruit
@KABoissoneault, You are correct and I have updated the answer accordingly. Also, all: the quotes are from version 3, which is the voted-in version as far as I understand.Ibis
@Yakk. I quoted the correct text, but your conclusions are still true, note my comment to KABoissonneault.Ibis
@JohanLundberg There is another thing from the paper that I believe is important. a(b1()(), b2()()) can order b1()() and b2()() in any order, but it cannot do b1() then b2()() then b1()(): it may no longer interleave their executions. In short, "8. ALTERNATE EVALUATION ORDER FOR FUNCTION CALLS" was part of the approved change.Sociable
f(i++, i) was undefined. It's now unspecified. Stroustrup's string example was probably unspecified, not undefined. f(get_raw_a(), get_raw_a()); won't compile since the relevant unique_ptr constructor is explicit. Finally, x++ + ++x is undefined, period.Hyperploid
@Yakk, I trust you (updated) but I seem to not have access to what was actually concluded then? What new wording guarantees this? Also what wording does the change T.C. mentions with respect to undefined/unspecified. Note that the current draft (github #834) does not contain any of the words from point 8, and also misses one sentence from the r3 paper.Ibis
@Yakk in P0145R3 Section 8 it states "At the Spring 2016 meeting in Jacksonville, FL, EWG decided (via a poll) not to pursue this alternative evaluation order." -- yet the latest draft from github I've seen does have wording closely resembling the "indeterminately sequenced" part. Do you know how/when this got voted back in again?Maccarone
@Maccarone Jun 24th 2016?Sociable
Can this answer be updated now that C++17 is published?Stearne
In the statement *foo() += x;, is the read of x sequenced before the call to foo, or merely before the read of the address that foo returned? Having to perform the read of x before the call would often necessitate pushing an extra word on the stack compared with reading it after.Cite
Does the uniform initialization a{b1, b2, b3} follow the function calls, i.e. "the evaluation of each argument is required to execute completely"?Bonanza
I'm a bit confused. What would int x, *y; y = &x; cout << *y << x++; return?Transpire
But you will never get f(2, 2) or anything else of that nature. Yeah, but was it even possible to get it before C++17? I'd say no.Ferment

Interleaving is prohibited in C++17

In C++14, the following was unsafe:

void foo(std::unique_ptr<A>, std::unique_ptr<B>);

foo(std::unique_ptr<A>(new A), std::unique_ptr<B>(new B));

There are four operations that happen here during the function call

  1. new A
  2. unique_ptr<A> constructor
  3. new B
  4. unique_ptr<B> constructor

The ordering of these was completely unspecified, and so a perfectly valid ordering is (1), (3), (2), (4). If this ordering was selected and (3) throws, then the memory from (1) leaks - we haven't run (2) yet, which would've prevented the leak.


In C++17, the new rules prohibit interleaving. From [intro.execution]:

For each function invocation F, for every evaluation A that occurs within F and every evaluation B that does not occur within F but is evaluated on the same thread and as part of the same signal handler (if any), either A is sequenced before B or B is sequenced before A.

There is a footnote to that sentence which reads:

In other words, function executions do not interleave with each other.

This leaves us with two valid orderings: (1), (2), (3), (4) or (3), (4), (1), (2). It is unspecified which ordering is taken, but both of these are safe. All the orderings where (1) and (3) both happen before (2) and (4) are now prohibited.

Hemocyte answered 28/9, 2017 at 15:5 Comment(4)
A slight aside, but this was one of the reasons for boost::make_shared, and later std::make_shared (other reason being fewer allocations + better locality). Sounds like the exception-safety/resource-leak motivation no longer applies. See Code Example 3, boost.org/doc/libs/1_67_0/libs/smart_ptr/doc/html/… Edit and stackoverflow.com/a/48844115 , herbsutter.com/2013/05/29/gotw-89-solution-smart-pointersVandalize
I wonder how this change affects optimization. The compiler now has severely reduced number of options as to how to combine and interleave CPU instructions related to arguments computation, so it may lead to poorer CPU utilization?Tessellation
What about the case of obj.modify().f(obj.access()): Is it well-defined if obj.modify() comes before or after obj.access()? (It sounds like at least in obj.modify().f(obj.access().foo()) all of obj.access().foo() would happen "together" rather than obj.modify() being sequenced after obj.access() before .foo().)Mclaren
@Mclaren That's handled by the top answer. In a.b, a is evaluated before b.Hemocyte

I've found some notes about expression evaluation order:

  • Quick Q: Why doesn’t c++ have a specified order for evaluating function arguments?

    Some order of evaluation guarantees surrounding overloaded operators and complete-argument rules were added in C++17. But it remains that which argument goes first is left unspecified. In C++17, it is now specified that the expression giving what to call (the code on the left of the ( of the function call) goes before the arguments, and whichever argument is evaluated first is evaluated fully before the next one is started, and in the case of an object method the value of the object is evaluated before the arguments to the method are.

  • Order of evaluation

    21) Every expression in a comma-separated list of expressions in a parenthesized initializer is evaluated as if for a function call (indeterminately-sequenced)

  • Ambiguous expressions

    The C++ language does not guarantee the order in which arguments to a function call are evaluated.

In P0145R3, Refining Expression Evaluation Order for Idiomatic C++, I found:

The value computation and associated side-effect of the postfix-expression are sequenced before those of the expressions in the expression-list. The initializations of the declared parameters are indeterminately sequenced with no interleaving.

But I didn't find this wording in the standard. Instead, the standard says:

6.8.1.8 Sequential execution [intro.execution] An expression X is said to be sequenced before an expression Y if every value computation and every side effect associated with the expression X is sequenced before every value computation and every side effect associated with the expression Y.

6.8.1.9 Sequential execution [intro.execution] Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

7.6.19.1 Comma operator [expr.comma] A pair of expressions separated by a comma is evaluated left-to-right;...

So I compared the corresponding behavior of three compilers under the C++14 and C++17 standards. The code I tested is:

#include <iostream>

struct A
{
    A& addInt(int i)
    {
        std::cout << "add int: " << i << "\n";
        return *this;
    }

    A& addFloat(float i)
    {
        std::cout << "add float: " << i << "\n";
        return *this;
    }
};

int computeInt()
{
    std::cout << "compute int\n";
    return 0;
}

float computeFloat()
{
    std::cout << "compute float\n";
    return 1.0f;
}

void compute(float, int)
{
    std::cout << "compute\n";
}

int main()
{
    A a;
    a.addFloat(computeFloat()).addInt(computeInt());
    std::cout << "Function call:\n";
    compute(computeFloat(), computeInt());
}

Results (clang is the most consistent):

gcc 9.0.1 (C++14 and C++17 give the same output):

compute float
add float: 1
compute int
add int: 0
Function call:
compute int
compute float
compute

clang 9 (C++14 and C++17 give the same output):

compute float
add float: 1
compute int
add int: 0
Function call:
compute float
compute int
compute

msvs 2017, C++14:

compute int
compute float
add float: 1
add int: 0
Function call:
compute int
compute float
compute

msvs 2017, C++17:

compute float
add float: 1
compute int
add int: 0
Function call:
compute int
compute float
compute

Pulver answered 7/2, 2019 at 8:38 Comment(0)
