What is the relation between operator precedence and order of evaluation? [duplicate]
Asked Answered
J

6

51

The terms 'operator precedence' and 'order of evaluation' are very commonly used terms in programming and extremely important for a programmer to know. And, as far as I understand them, the two concepts are tightly bound; one cannot do without the other when talking about expressions.

Let us take a simple example:

int a=1;  // Line 1
a = a++ + ++a;  // Line 2
printf("%d",a);  // Line 3

Now, it is evident that Line 2 leads to Undefined Behavior, since Sequence points in C and C++ include:

  1. Between evaluation of the left and right operands of the && (logical AND), || (logical OR), and comma operators. For example, in the expression *p++ != 0 && *q++ != 0, all side effects of the sub-expression *p++ != 0 are completed before any attempt to access q.

  2. Between the evaluation of the first operand of the ternary "question-mark" operator and the second or third operand. For example, in the expression a = (*p++) ? (*p++) : 0 there is a sequence point after the first *p++, meaning it has already been incremented by the time the second instance is executed.

  3. At the end of a full expression. This category includes expression statements (such as the assignment a=b;), return statements, the controlling expressions of if, switch, while, or do-while statements, and all three expressions in a for statement.

  4. Before a function is entered in a function call. The order in which the arguments are evaluated is not specified, but this sequence point means that all of their side effects are complete before the function is entered. In the expression f(i++) + g(j++) + h(k++), f is called with a parameter of the original value of i, but i is incremented before entering the body of f. Similarly, j and k are updated before entering g and h respectively. However, it is not specified in which order f(), g(), h() are executed, nor in which order i, j, k are incremented. The values of j and k in the body of f are therefore undefined.3 Note that a function call f(a,b,c) is not a use of the comma operator and the order of evaluation for a, b, and c is unspecified.

  5. At a function return, after the return value is copied into the calling context. (This sequence point is only specified in the C++ standard; it is present only implicitly in C.)

  6. At the end of an initializer; for example, after the evaluation of 5 in the declaration int a = 5;.

Thus, going by Point # 3:

At the end of a full expression. This category includes expression statements (such as the assignment a=b;), return statements, the controlling expressions of if, switch, while, or do-while statements, and all three expressions in a for statement.

Line 2 clearly leads to Undefined Behavior. This shows how Undefined Behaviour is tightly coupled with Sequence Points.

Now let us take another example:

int x=10,y=1,z=2; // Line 4
int result = x<y<z; // Line 5

Now its evident that Line 5 will make the variable result store 1.

Now the expression x<y<z in Line 5 can be evaluated as either:

x<(y<z) or (x<y)<z. In the first case the value of result will be 0 and in the second case result will be 1. But we know, when the Operator Precedence is Equal/Same - Associativity comes into play, hence, is evaluated as (x<y)<z.

This is what is said in this MSDN Article:

The precedence and associativity of C operators affect the grouping and evaluation of operands in expressions. An operator's precedence is meaningful only if other operators with higher or lower precedence are present. Expressions with higher-precedence operators are evaluated first. Precedence can also be described by the word "binding." Operators with a higher precedence are said to have tighter binding.

Now, about the above article; it mentions:

Expressions with higher-precedence operators are evaluated first.

It may sound incorrect. But, I think the article is not saying something wrong if we consider that () is also an operator x<y<z is same as (x<y)<z. My reasoning is if associativity does not come into play, then the complete expressions evaluation would become ambiguous since < is not a Sequence Point.

Also, another link I found says this on Operator Precedence and Associativity:

This page lists C operators in order of precedence (highest to lowest). Their associativity indicates in what order operators of equal precedence in an expression are applied.

So taking, the second example of int result=x<y<z, we can see here that there are in all 3 expressions, x, y and z, since, the simplest form of an expression consists of a single literal constant or object. Hence the result of the expressions x, y, z would be there rvalues, i.e., 10, 1 and 2 respectively. Hence, now we may interpret x<y<z as 10<1<2.

Now, doesn't Associativity come into play since now we have 2 expressions to be evaluated, either 10<1 or 1<2 and since the precedence of operator is same, they are evaluated from left to right?

Taking this last example as my argument:

int myval = ( printf("Operator\n"), printf("Precedence\n"), printf("vs\n"),
printf("Order of Evaluation\n") );

Now in the above example, since the comma operator has same precedence, the expressions are evaluated left-to-right and the return value of the last printf() is stored in myval.

In SO/IEC 9899:201x under J.1 Unspecified behavior it mentions:

The order in which subexpressions are evaluated and the order in which side effects take place, except as specified for the function-call (), &&, ||, ?:, and comma operators (6.5).

Now I would like to know, would it be wrong to say:

Order of Evaluation depends on the precedence of operators, leaving cases of Unspecified Behavior.

I would like to be corrected if any mistakes were made in something I said in my question. The reason I posted this question is because of the confusion created in my mind by the MSDN Article. Is it in Error or not?

Jail answered 29/3, 2011 at 13:16 Comment(14)
I'm with you all the way up to the conclusion, but don't see where you get the unspecified behavior.Withal
As I read the standard, (x<y)<z only clarifies associativity/precedence. The two subexpressions (x<y) and z could be evaluated in either order.Incompetent
@Bo Persson:: The reason i state unspecified behavior is because in situations like:obj + obj++ the behaviour is unspecified even though the precedence is the same.Jail
I don't know about C++. The C Standard does not define a "precedence"; it defines a grammar. The grammar for expression evaluation imposes a certain grouping of sub-expressions.Mikiso
@pmg:: The precedence of operators is not directly specified, but it can be derived from the syntax.Jail
@Erik:: Precisely my point!! So should that MSDN article be considered flawed or not?Jail
@Acme: Depends. That statement may well be true for MSVC - but it's not the case for c++ in general as far as I can tell.Incompetent
@Acme The behavior of obj + obj++ is undefined, not just unspecified. It's undefined because the standard explicitly says so; precedence has nothing to do with the question.Oralle
@Erik:: I would like to know from a Standard C point of view.Jail
@Acme: No idea on C. For C++ I'm pretty certain my initial comment is correct.Incompetent
@Acme Read my answer. The precedence and associativity rules only define the shape of the expression tree, not the order in which it will be evaluated.Thenceforward
@James Kanze:: Sorry that was a typo. i meant ++obj + obj++ where obj is an object of some class type and where ++ is overloaded, hence the evaluation is unspecified and similar to f1() + f2(), .Jail
@James @Mikiso @Incompetent @Let_Me_Be: The confusion has arised because of my comments here.Rebroadcast
@Acme : Read Jerry Coffin's answer. It will clear all your confusions.Rebroadcast
B
55

Yes, the MSDN article is in error, at least with respect to standard C and C++1.

Having said that, let me start with a note about terminology: in the C++ standard, they (mostly--there are a few slip-ups) use "evaluation" to refer to evaluating an operand, and "value computation" to refer to carrying out an operation. So, when (for example) you do a + b, each of a and b is evaluated, then the value computation is carried out to determine the result.

It's clear that the order of value computations is (mostly) controlled by precedence and associativity--controlling value computations is basically the definition of what precedence and associativity are. The remainder of this answer uses "evaluation" to refer to evaluation of operands, not to value computations.

Now, as to evaluation order being determined by precedence, no it's not! It's as simple as that. Just for example, let's consider your example of x<y<z. According to the associativity rules, this parses as (x<y)<z. Now, consider evaluating this expression on a stack machine. It's perfectly allowable for it to do something like this:

 push(z);    // Evaluates its argument and pushes value on stack
 push(y);
 push(x);
 test_less();  // compares TOS to TOS(1), pushes result on stack
 test_less();

This evaluates z before x or y, but still evaluates (x<y), then compares the result of that comparison to z, just as it's supposed to.

Summary: Order of evaluation is independent of associativity.

Precedence is the same way. We can change the expression to x*y+z, and still evaluate z before x or y:

push(z);
push(y);
push(x);
mul();
add();

Summary: Order of evaluation is independent of precedence.

When/if we add in side effects, this remains the same. I think it's educational to think of side effects as being carried out by a separate thread of execution, with a join at the next sequence point (e.g., the end of the expression). So something like a=b++ + ++c; could be executed something like this:

push(a);
push(b);
push(c+1);
side_effects_thread.queue(inc, b);
side_effects_thread.queue(inc, c);
add();
assign();
join(side_effects_thread);

This also shows why an apparent dependency doesn't necessarily affect order of evaluation either. Even though a is the target of the assignment, this still evaluates a before evaluating either b or c. Also note that although I've written it as "thread" above, this could also just as well be a pool of threads, all executing in parallel, so you don't get any guarantee about the order of one increment versus another either.

Unless the hardware had direct (and cheap) support for thread-safe queuing, this probably wouldn't be used in in a real implementation (and even then it's not very likely). Putting something into a thread-safe queue will normally have quite a bit more overhead than doing a single increment, so it's hard to imagine anybody ever doing this in reality. Conceptually, however, the idea is fits the requirements of the standard: when you use a pre/post increment/decrement operation, you're specifying an operation that will happen sometime after that part of the expression is evaluated, and will be complete at the next sequence point.

Edit: though it's not exactly threading, some architectures do allow such parallel execution. For a couple of examples, the Intel Itanium and VLIW processors such as some DSPs, allow a compiler to designate a number of instructions to be executed in parallel. Most VLIW machines have a specific instruction "packet" size that limits the number of instructions executed in parallel. The Itanium also uses packets of instructions, but designates a bit in an instruction packet to say that the instructions in the current packet can be executed in parallel with those in the next packet. Using mechanisms like this, you get instructions executing in parallel, just like if you used multiple threads on architectures with which most of us are more familiar.

Summary: Order of evaluation is independent of apparent dependencies

Any attempt at using the value before the next sequence point gives undefined behavior -- in particular, the "other thread" is (potentially) modifying that data during that time, and you have no way of synchronizing access with the other thread. Any attempt at using it leads to undefined behavior.

Just for a (admittedly, now rather far-fetched) example, think of your code running on a 64-bit virtual machine, but the real hardware is an 8-bit processor. When you increment a 64-bit variable, it executes a sequence something like:

load variable[0]
increment
store variable[0]
for (int i=1; i<8; i++) {
    load variable[i]
    add_with_carry 0
    store variable[i]
}

If you read the value somewhere in the middle of that sequence, you could get something with only some of the bytes modified, so what you get is neither the old value nor the new one.

This exact example may be pretty far-fetched, but a less extreme version (e.g., a 64-bit variable on a 32-bit machine) is actually fairly common.

Conclusion

Order of evaluation does not depend on precedence, associativity, or (necessarily) on apparent dependencies. Attempting to use a variable to which a pre/post increment/decrement has been applied in any other part of an expression really does give completely undefined behavior. While an actual crash is unlikely, you're definitely not guaranteed to get either the old value or the new one -- you could get something else entirely.


1 I haven't checked this particular article, but quite a few MSDN articles talk about Microsoft's Managed C++ and/or C++/CLI (or are specific to their implementation of C++) but do little or nothing to point out that they don't apply to standard C or C++. This can give the false appearance that they're claiming the rules they have decided to apply to their own languages actually apply to the standard languages. In these cases, the articles aren't technically false -- they just don't have anything to do with standard C or C++. If you attempt to apply those statements to standard C or C++, the result is false.

Bicuspid answered 29/3, 2011 at 15:45 Comment(18)
Since the precedence and associativity rules define the shape of the expression tree, they also define partial ordering, since the tree has to be evaluated from leaves to root. There is of course no ordering between the leaves themselves, but claiming that there is none at all is just weird.Thenceforward
@Let_Me_Be: The compiler is free to transform something like ab+ac, into something like a*(b+c) (assuming a, b, and c are all integers). In the original, two multiplications precede addition, but the transformed version has one multiplication subsequent to the addition. Ultimately, yes, it has to evaluate some tree from leaves to root, but the tree it evaluates often won't directly reflect what you wrote.Bicuspid
precedence surely affects order of evaluation to some degree. If you say (a++ + b) * c, then the addition is necessarily evaluated before the multiplication, because the result of the multiplication depends on the result of the addition. That's just how the expression tree turns out to look. That's the whole reason f().g() will first execute f and then g, because f() is evaluated before g is called, as there is an apparent dependency.Outstanding
@Johannes: I don't think that's really true. Though it seems unlikely to happen in reality, the compiler could evaluate that as (a++*c)+(bc). The usual caveats apply though -- especially that it's much less free to do such reordering in the case of floating point. My point, however had less to do with order of carrying out the operations, than the order in which the *operands were evaluated.Bicuspid
@Jerry I agree the compiler could evaluate it as (a++*c)+(b*c) if it doesn't violate the as-if rule. But it can do any and everything if something doesn't violate the as if rule. For example, if c would be a volatile variable, it couldn't evaluate it in that transformed way. Note that I don't say that a has been incremented at the time of multiplication. That would require a sequence point, which isn't there. Basically, I'm expressing what @JamesKanze said in his answer.Outstanding
The order of sequence points is partial because the order of evaluation is partial. If there were no evaluation order without sequence points, there would be no order for sequence points: The order of the latter is defined as the order of the former, not the other way around - "At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place".Outstanding
@Johannes: No, but it could still do temp=c, then use temp in the altered expression. What we're left with is pretty simple: depending on order of evaluation beyond what's defined by sequence points is a really poor idea at best. Yes, you might be able to deduce part of what's likely to happen part of the time, but depending on it is still a really poor idea -- almost anything you think you've determined stands at least some chance of being wrong -- and even if you are right, depending on it leads to fragility (i.e., somebody who knows less than you do is likely to break it).Bicuspid
I don't see any problems with doing temp=c and then using temp in the altered expression. I think that's fine. "depending on order of evaluation beyond what's defined by sequence points is a really poor idea at best". Yes I agree, in that only depending on the order of evaluation is almost useless if there are side effects in the game, because side effects are only sequenced by sequence points.Outstanding
Is it not true that in your a=b++ + ++c; example that the assignment to a is also a side effect? This is why (at least in C++11) i=i++; is undefined: because the side-effects of incrementing and assigning to i are unsequenced with respect to one another.Antinomian
@GarrettGutierrez: Yes, if memory serves, the assignment is considered a side effect.Bicuspid
−1 The general claim "Order of evaluation is independent of precedence." is nonsense. E.g., in a+b*c, the multiplication has to be evaluated first. The values of the variables a, b and c can be evaluated in any order. C++17 added some further constraints on order of evaluationMalonis
@Cheersandhth.-Alf: The C++ standard doesn't talk about carrying out a multiplication as "evaluation". It talks about operands being evaluated. The operation itself is talked about as a "value computation", not an evaluation (e.g., N4659, §[intro.execution]/16, /17)Bicuspid
@JerryCoffin: First of all the formal-speak wouldn't matter even if it were as you describe, and secondly, it isn't as you describe. There are numerous instances where the standard talks of evaluations of operations, e.g. (just the one in front of me right now) " the operation of postfix ++ is a single evaluation".Malonis
@Cheersandhth.-Alf: Yes, it does matter. And yes, I suppose the editors aren't perfect, but you don't have to read through much to realize that they at least tried to use "value computation" to the act of carrying out the operation itself, and "evaluation" to refer to evaluation of operands. But as your side-stepping SO's rules to put in a "-1" shows, you care more about your own opinion than anybody else's rules, so I doubt any amount of evidence will affect your opinion either.Bicuspid
@JerryCoffin: No need to get personal to communicate you realize now that this answer is wrong. Other Stack Overflowers do that enough. We don't have to.Malonis
@Cheersandhth.-Alf: If you think that was what I was communicating, I apologize--I obviously wasn't clear. I have, however, added a note to the beginning to prevent others from misunderstanding the answer the same way you did.Bicuspid
@JerryCoffin: To find defined formal terms you can, as a rule, look for the term in italics. The terminology note you added is an improvement. It's an incorrect perception on your part, as evidenced by the numerous counter examples in the standard, which include at top level, "evaluation" of an expression. But it's better. Thanks.Malonis
By the way, I think the most important fallacy to learn about here, now, is the No True Scotsman fallacy. ;-)Malonis
O
12

The only way precedence influences order of evaluation is that it creates dependencies; otherwise the two are orthogonal. You've carefully chosen trivial examples where the dependencies created by precedence do end up fully defining order of evaluation, but this isn't generally true. And don't forget, either, that many expressions have two effects: they result in a value, and they have side effects. These two are no required to occur together, so even when dependencies force a specific order of evaluation, this is only the order of evaluation of the values; it has no effect on side effects.

Oralle answered 29/3, 2011 at 13:46 Comment(3)
Effectively meaning that in foo(a) + foo(b), foo(b) could have been evaluated much earlier than foo(a).Guideboard
I like the way you describe this.Outstanding
@Guideboard How would one proceed to enforce order of evaluation in your example?Spillar
F
7

A good way to look at this is to take the expression tree.

If you have an expression, lets say x+y*z you can rewrite that into an expression tree:

Applying the priority and associativity rules:

x + ( y * z )

After applying the priority and associativity rules, you can safely forget about them.

In tree form:

  x
+
    y
  *
    z

Now the leaves of this expression are x, y and z. What this means is that you can evaluate x, y and z in any order you want, and also it means that you can evaluate the result of * and x in any order.

Now since these expressions don't have side effects you don't really care. But if they do, the ordering can change the result, and since the ordering can be anything the compiler decides, you have a problem.

Now, sequence points bring a bit of order into this chaos. They effectively cut the tree into sections.

x + y * z, z = 10, x + y * z

after priority and associativity

x + ( y * z ) , z = 10, x + ( y * z)

the tree:

      x
    +
        y
      *
        z
  , ------------
      z
    =
      10     
  , ------------
      x
    +
        y
      *
        z   

The top part of the tree will be evaluated before the middle, and middle before bottom.

Fundy answered 29/3, 2011 at 13:44 Comment(16)
@Acme Well, the article is essentially correct. Or to put it even more diplomatically, I know what the author meant. And it might actually be completely true for MSVC (compilers may impose additional orderings). But again, the precedence and associativity rules only define the shape of the expression tree, not the order of evaluation.Thenceforward
@Acme Depends on how you look at it. First, precedence defines partial order. x+y*z means that for evaluating + you need the result of *. That is the ordering that is defined by the operator precedence. Now the order of evaluation needs to follow this partial ordering, but apart from this it can do what it wants. So no you can't really say that they are completely unrelated. But they aren't defined by each other either.Thenceforward
@Let_Me_Be : Partial ordering of what? First, precedence defines partial order. x+y*z means that for evaluating + you need the result of *. . This is plain wrong. Operators don't get evaluated, operands do.Rebroadcast
@Prasoon No, expressions are evaluated.Thenceforward
@Let_Me_Be : Yes! So what's your point? Expressions are evaluated. Operands are part of expressions. I don't see a difference. What do you mean by that for evaluating + ? How could someone possibly evaluate + ?Rebroadcast
@Prasoon No, operands are separate expressions. Operator with its operands forms another expression. And how could someone evaluate + in x + y * z you ask? By evaluating its operands first and when they are evaluated, by evaluating the expression itself.Thenceforward
@Let_Me_Be : My only point is that the order of evaluation and precedence of operators are unrelated. Check out the extended discussion on usenet where former ISO C Committee member Peter Seebach clearly mentions that order of evaluation and precendence of operators are two independent things.Rebroadcast
@Prasoon I'm sorry, but you obviously have trouble understanding what he meant. He is talking about the order of the expression on the leaves of the expression tree.Thenceforward
@Let_Me_Be : I have no problem understanding what he meant. He is talking about the order of the evaluation of expressions on the leaves of the expression tree and that order being independent of the precedence of operators. Infact the order is unspecified.Rebroadcast
@Let_Me_Be: Also check out Jerry Coffin's answer.Rebroadcast
@Acme : Read that. I still believe Jerry Coffin's answer is 100% correct. :)Rebroadcast
@Prasoon Yes, but you are not, that's the problem.Thenceforward
@Let_Me_Be : There is just one problem and that is in your understanding. :) How's "Order of evaluation is independent of precedence." mentioned in Jerry Coffin's post different from what I said?Rebroadcast
@Prasoon You are misinterpreting the text and making bad conclusions based on your misinterpretation. The "Order of evaluation is independent of precedence" makes sense in the context of the answer, since it is a completely different thing than what you are interpreting it to be. The reason why I object to that answer is that it can be misinterpreted as your are misinterpreting right now.Thenceforward
@Let_Me_Be : I am certainly having trouble understanding your understanding about my understanding. :-)Rebroadcast
@Prasoon The "Order of evaluation" only refers to leaves in the expression tree. The partial ordering still exists and is defined by the shape of the expression tree. This is what you are misinterpreting. You insist on claiming that there is no ordering whatsoever.Thenceforward
R
5

It mentions "Expressions with higher-precedence operators are evaluated first."

I am just going to repeat what I said here. As far as standard C and C++ are concerned that article is flawed. Precedence only affects which tokens are considered to be the operands of each operator, but it does not affect in any way the order of evaluation.

So, the link only explains how Microsoft implemented things, not how the language itself works.

Rebroadcast answered 29/3, 2011 at 14:43 Comment(3)
The order of evaluation of 1+2*3 is defined solely by operator precedence.Shahaptian
@EJP that is a special case. Look into 1 + 2 + 3 * 4, where 1 + 2 can be evaluated before 3 * 4 while * has higher precedence than +Crispy
@user207421: 1+2*3 will be evaluated at compile time anyway... But a() + b() * c() is a better example: * will probably be evaluated before + but the order of evaluation of the operands is unspecified. I am using probably because the context in which this expression is used might allow the compiler to elide the evaluation of + and/or * in this expression.Broadminded
K
5

Precedence has nothing to do with order of evaluation and vice-versa.

Precedence rules describe how an underparenthesized expression should be parenthesized when the expression mixes different kinds of operators. For example, multiplication is of higher precedence than addition, so 2 + 3 x 4 is equivalent to 2 + (3 x 4), not (2 + 3) x 4.

Order of evaluation rules describe the order in which each operand in an expression is evaluated.

Take an example

y = ++x || --y;   

By operator precedence rule, it will be parenthesize as (++/-- has higher precedence than || which has higher precedence than =):

y = ( (++x) || (--y) )   

The order of evaluation of logical OR || states that (C11 6.5.14)

the || operator guarantees left-to-right evaluation.

This means that the left operand, i.e the sub-expression (x++) will be evaluated first. Due to short circuiting behavior; If the first operand compares unequal to 0, the second operand is not evaluated, right operand --y will not be evaluated although it is parenthesize prior than (++x) || (--y).

Killifish answered 28/8, 2014 at 18:51 Comment(0)
S
-1

I think it's only the

a++ + ++a

epxression problematic, because

a = a++ + ++a;

fits first in 3. but then in the 6. rule: complete evaluation before assignment.

So,

a++ + ++a

gets for a=1 fully evaluated to:

1 + 3   // left to right, or
2 + 2   // right to left

The result is the same = 4.

An

a++ * ++a    // or
a++ == ++a

would have undefined results. Isn't it?

Synonymy answered 5/11, 2013 at 15:12 Comment(1)
No. Because a++ + ++a modifies a twice between sequence points, the behaviour is undefined; there is no correct result (or every result is correct, even if the answer is -2,000,000,000).Achievement

© 2022 - 2024 — McMap. All rights reserved.