Does the definition int a = 0, b = a++, c = a++; have defined behavior in C?

Does the definition int a = 0, b = a++, c = a++; have defined behavior in C?

Or almost equivalently, does the comma in an object definition introduce a sequence point as for the comma operator in expressions?

Similar questions have been asked for C++:

The widely accepted answer for C++ is Yes, it is fully defined per paragraph 8/3 of the C++11 Standard:

Each init-declarator in a declaration is analyzed separately as if it was in a declaration by itself

However, this paragraph only refers to the syntax analysis phase and is not precise enough regarding the sequencing of operations at runtime.

What is the situation for the C language? Does the C Standard define the behavior?

A similar question was asked before:

Does the comma in a declaration for multiple objects introduce a sequence point like the comma operator?

Yet the answer seems to refer specifically to the C11 draft and may not hold for more recent versions of the C Standard: the wording of the informative Annex C has changed since the C11 draft and does not seem fully consistent with the Standard text either.

EDIT: of course such an initializer seems uselessly contorted. I definitely do not condone such programming style and constructions. The question arose from a discussion regarding a trivial definition: int res = 0, a = res; for which the behavior did not seem fully defined (!). Initializers with side effects are not so uncommon, consider for example this one: int arg1 = pop(), arg2 = pop();

Schaper answered 9/8, 2023 at 12:10 Comment(17)
C11 draft has Annex C, which enumerates sequence points. It is not hard to find one covering your questionUnchaste
@LanguageLawyer Please note that Annex C is not normative and some things it claims to be sequence points cannot actually be found in normative text. So it is not as trivial as just looking it up in the standard, because the standard is inconsistent and poorly edited.Gracia
The duplicate question is only answered for C11. The corresponding clause has been changed in C17. Appendix C in C17 still lists "an initializer that is not part of a compound literal" as a full expression which introduces a sequence point, but even though section 6.7.9 is given as reference, I fail to find a supporting normative clause in that section.Antiserum
The accepted answer in the referenced duplicate question includes some ambiguity wrt the source information used to assert the answer. In comments, the author of the accepted answer says "I quoted C11 because C17 wasn't finalized when I wrote the answer. You're right that neither C17 nor C23 (N2176 and N3054 from the WG14 Document Log) seems to include the sentence "The end of a full declarator is a sequence point".". If you would like to expand this question to explicitly include the newer standards, I will remove the closed tag. Ping me.Wits
@Antiserum - good point. I had already noticed the ambiguities and was drafting my comment as you posted yours :)Wits
@ryyker: I amended the question in this direction, thank you for your support trying to solve the issue formally.Schaper
You're welcome. Expanding the question was a good move in light of the age of the previously linked duplicate, and the ambiguity caused by the gap in time between C11 and C17 (which was not yet ratified), which was evidently showing some gaps wrt sequence points in declarators at that time.Wits
Did someone run this code? Does it give a = 0, b = 1 and c = 2 or a = b = c = 0?Rianon
@A.L: it should give a = 2, b = 0 and c = 1.Schaper
@Schaper Thanks, my knowledge of C was rusty but I was able to run it: onlinegdb.com/7ePsNJsaqRianon
Does this answer your question? Does the comma in a declaration for multiple objects introduce a sequence point like the comma operator?Pavla
@ChristosLytras This has been discussed already and the question itself contains a link to that very question (see the paragraph under "EDIT").Antiserum
@Antiserum I'm not here for a discussion. I just marked this as a dup as it is for moderation attention.Pavla
@ChristosLytras: no it does not. The question you refer to only addresses the issue partially and the answers' arguments are incomplete and inconsistent. That's why I asked a more precise question, that received 2 well documented answers and more than 2 thousand views in less than 48hrs.Schaper
I just responded no to the moderator question and got this remark: Thanks! Edit your question to explain how it’s different from the suggested questions. This will help prevent your question from getting closed and will remove the suggested questions notification from your post. Well I am going to do that but the explanation was there already.Schaper
@Schaper I don't care about the stats and, as I said, I am not here to discuss or argue with anyone; it seems like a dup to me and I just flagged it for the mods' attention; the review will determine whether it's a dup or not, that's all.Pavla
I have taken the discussion about duplicates of questions with outdated answers to MetaAntiserum
Y
18

Does the definition int a = 0, b = a++, c = a++; have defined behavior in C?

Yes, because C 2018 6.8 3 says these initializations (not all, see bottom) are evaluated in the order they appear:

… The initializers of objects that have automatic storage duration, and the variable length array declarators of ordinary identifiers with block scope, are evaluated and the values are stored in the objects (including storing an indeterminate value in objects without an initializer) each time the declaration is reached in the order of execution, as if it were a statement, and within each declaration in the order that declarators appear. [Emphasis added.]

Also, 6.8 4 tells us that each initializer is a full expression and that there is a sequence point between the evaluation of a full expression and the evaluation of the next:

A full expression is an expression that is not part of another expression, nor part of a declarator or abstract declarator. There is also an implicit full expression in which the non-constant size expressions for a variably modified type are evaluated; within that full expression, the evaluation of different size expressions are unsequenced with respect to one another. There is a sequence point between the evaluation of a full expression and the evaluation of the next full expression to be evaluated.

Given both the above, the initializers are sequenced in the order they appear. a is initialized first and so has a value when a++ is evaluated for b, and the side effects for that are completed before the a++ for c begins, so the whole declaration is safe from the “unsequenced effects” rule in 6.5 2.

6.8 3 is a bit lacking for two reasons:

  • Initializers are not part of the grammar token declarator (they are part of the init-declarator, a containing token of declarator). However, this seems like a wording issue, and we can take the initializers to be associated with their declarators.
  • It does not specify ordering between the expressions in a declarator (such as sizes for variable length arrays) and its initializer(s).

Also note that not all initializers are evaluated in the order they appear in a declaration. 6.7.9 23 discusses initializers for aggregates and unions and says:

The evaluations of the initialization list expressions are indeterminately sequenced with respect to one another and thus the order in which any side effects occur is unspecified.

History

The wording in 6.8 3 quoted above goes back to C 1999. In C 1990, it had this form in 6.6.2, which is about compound statements:

… The initializers of objects that have automatic storage duration are evaluated and the values are stored in the objects in the order their declarators appear in the translation unit.

Yvor answered 9/8, 2023 at 20:52 Comment(6)
IMHO this is the correct answer since it answers the question without depending on informative text in the standard.Antiserum
It may be mentioned that 6.8-5 (as a NOTE) explicitly specifies "an initializer that is not part of a compound literal" as being a full expression.Antiserum
"It does not specify ordering between the expressions in a declarator (such as sizes for variable length arrays) and its initializer(s)." VLAs cannot have an initializer, so an example of a definition that poses a problem requires an extra level of indirection (e.g. int a = 9, b[100][9], (*p)[a++] = b[a++];)Schaper
@nielsen: I agree. The answer is more concise and its citations from the C Standard are more pertinent.Schaper
@chqrlie: Updated. It goes back to C 1990.Yvor
Unfortunately the English prose here is ambiguous. The intent is almost certainly what you claim but.... "the initializers are evaluated in no particular order" AND "the values are stored in the objects in the order their declarators appear in the translation unit" is a valid parse as well.Rennin
W
16

"Does the definition int a = 0, b = a++, c = a++; have defined behavior in C?"...

In the current C standard, ISO/IEC 9899:2018 (C17), program execution is covered in §5.1.2.3, whose paragraph 3 discusses sequencing and side effects. The source text is reproduced below for reference.

Summarizing from the sections of text below, initializers in a declaration statement are sequenced, guaranteeing that the initializer expressions in the declaration posted...

 int a = 0, b = a++, c = a++;

which describes an "...init-declarator-list [which] is a comma-separated sequence of declarators," (section 6.7 Declarations)
...will not invoke undefined behavior, or even produce indeterminate results. Each comma-separated initializer is guaranteed to be sequenced starting from the left, without moving to the right until all evaluations and side effects of the current expression are complete. In this way the result of each expression is fully defined.

From §5.1.2.3

"Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread, which induces a partial order among those evaluations. Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. (Conversely, if A is sequenced before B, then B is sequenced after A.) If A is not sequenced before or after B, then A and B are unsequenced. Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which. The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B. (A summary of the sequence points is given in annex C.)"

The relevant paragraph provided in Annex C:

"The following are the sequence points described in 5.1.2.3:"+ (3) ...

"Between the evaluation of a full expression and the next full expression to be evaluated. The following are full expressions: a full declarator for a variably modified type; an initializer that is not part of a compound literal (6.7.9); the expression in an expression statement (6.8.3); the controlling expression of a selection statement (if or switch) (6.8.4); the controlling expression of a while or do statement (6.8.5); each of the (optional) expressions of a for statement (6.8.5.3); the (optional) expression in a return statement (6.8.6.4)".
(emphasis mine)

Wits answered 9/8, 2023 at 13:57 Comment(19)
Nothing in the passages quoted in this answer says the initializers are evaluated in the order they appear in the source code. Where the text says “The init-declarator-list is a comma-separated sequence of declarators,” that just means they are a sequence of source code. It does not speak to sequencing of evaluation.Yvor
@EricPostpischil I don't think the quote you've quoted is what the answer is relying on to establish that their evaluation is sequenced? The paragraph quotes from Annex C (at the bottom of the answer) says that there is a sequence point between each full expression, and that declarators and initializers are full expressions. The quote you quoted just establishes that these things are declarators (and I suppose that they have a sequence so that "between" has meaning, so that there can be sequence points between them).Jinx
@Ben: There is nothing else in the quoted passages that mentions any sequence or sequence order of initializer evaluation.Yvor
@EricPostpischil Doesn't the bottom paragraph say there are sequence points between them, and the paragraph above that say that if there are sequence points between any two evaluations A and B then "every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B". I'm not by any means super experienced at reading C standards language, but that seemed reasonably straightforward to me.Jinx
@Jinx The problem is that the bottom paragraph is from an informative annex in the standard. The C behavior should be fully defined by the normative clauses.Antiserum
@nielsen: The fact there are sequence points between things does not say what order they are in. There is a sequence point between E0 and E1 in int a[] = { E0, E1 }; because each is a full expression, but C 2018 6.7.9 23 says they are indeterminately sequenced (meaning they are sequenced but in any order).Yvor
@EricPostpischil This only adds to the problems with this answer. I was not aware of that distinction, but it makes good sense. Thank you for explaining.Antiserum
@EricPostpischil - Regarding your first comment. This clause of §5.1.2.3: "Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B." is offered within the context of a declarator (as clarified in the Annex C excerpt) and clearly prescribes a specified order of operation for two comma-delimited evaluations. Because in English the words "sequenced before" imply "placed to the left of", it is clear that in this case A is to be performed before B, in order from left to right.Wits
@ryyker: When 5.1.2.3 says “sequenced before”, it is talking about execution sequence, not about the order in which things appear in source code. In order for the condition “If A is sequenced before B” to be satisfied, there must be some statement that A is sequenced before B. There is no such statement in the passages quoted in this answer. Further, it is clear that the order initializers appear in source code is insufficient by itself to imply order, as 6.7.9 23 explicitly tells us that the order for E0 and E1 in int a[] = { E0, E1 }; is not specified. This answer is wrong.Yvor
@EricPostpischil - Do you not agree that sequenced implies both lexical position and in this context (declarators) specifies order of execution?Wits
@Antiserum - Annex C is intended here to clarify sequence points within context of declarators. The whole question is being asked within the context of a declarator. Annex C defines this specifically for §5.1.2.3. where the specified behavior is provided.Wits
@ryyker: “Do you not agree that sequenced implies both lexical position and in this context (declarators) specifies order of execution?”: No, I do not agree. That sentence solely talks about execution order and has nothing to do with lexical order. It is about execution order throughout C programs, which includes several things that do not occur in lexical order, including evaluation of the bodies of functions when they are called, the evaluation of arguments to functions, sequencing of evaluations in loops, evaluation of initializers in lists for aggregates, and more…Yvor
@Eric - Annex C includes exclusions, which is precisely what you are asserting in E0 and E1 in int a[] = { E0, E1 };, where there is no guarantee of sequencing between the two. But this example does not match the form of declarator OP is asking about, where each expression is clearly sequenced.Wits
… As an example, given two function arguments or two initializers in a list for an aggregate, one of the two expressions is sequenced before the other, and 5.1.2.3 3 applies to this, but the order is not specified, and we know that because the standard explicitly says so.Yvor
@ryyker: Re “this example does not match the form of declarator OP is asking about”: The passage cannot both mean that the things it is talking about are specified to be sequenced in lexical order when those things are simple initializers in declarators and that the things it is talking about are not specified to be sequenced in lexical order when those things are other expressions, including initializers in lists for aggregates. Nothing in it makes such a distinction, and nothing in it brings lexical order into what it says in any way.Yvor
@EricPostpischil - two initializers in a list for an aggregate is a known exclusion, where sequencing is not specified, this rule does not apply to int a = 0, b = a++, c = a++;Wits
@ryyker: Re “each expression is clearly sequenced”: No such thing is stated in any way in the passages quoted in this answer, let alone clearly stated in those passages.Yvor
IANAL. "Sequenced before" clearly refers to time (at least in the as-if sense) and not space in the source code. Perhaps what's confusing is the use of "sequence" in "a comma-separated sequence of declarators" (because it does not also mean "sequenced") or perhaps it is intended (used on purpose to also mean "sequenced"). I don't know.Ellanellard
@ryyker: I'm afraid Eric's answer is more concise and cites more pertinent paragraphs of the C Standard. I have to change the accepted answer.Schaper
L
1

simply do not do such things... it is a recipe for disaster... no, don't. fullstop.

Latinalatinate answered 15/8, 2023 at 21:56 Comment(2)
Indeed, this advice is more important than whether or not such a statement is well-defined. The fact that it is confusing, and raises the question of how it is to be evaluated and formal language semantics - means it should be avoided. See also this answer of mine (albeit to a C++-related question).Gerrigerrie
The question is tagged [language-lawyer]. Of course such things are a recipe for disaster, the definition in the question is for illustration only. I amended the question for clarification.Schaper
