Using pointed to content in assignment of a pointer

It has always been my understanding that the lack of a sequence point after the reading of the right expression in an assignment makes an example like the following produce undefined behavior:

void f(void)
{
   int *p;
   /*...*/
   p = (int [2]){*p};
   /*...*/
}
// p is assigned the address of the first element of an array of two ints, the
// first having the value previously pointed to by p and the second, zero. The
// expressions in this compound literal need not be constant. The unnamed object
// has automatic storage duration.

However, this is EXAMPLE 2 under "6.5.2.5 Compound literals" in the committee draft for the C11 standard, the version identified as n1570, which I understand to be the final draft (I don't have access to the final version).

So, my question: Is there something in the standard that gives this defined and specified behavior?

EDIT

I would like to expound on exactly what I see as the problem, in response to some of the discussion that has come up.

We have two conditions under which an assignment is explicitly stated to have undefined behavior, as per 6.5p2 of the standard quoted in the answer given by dbush:

1) A side effect on a scalar object is unsequenced relative to a different side effect on the same scalar object.

2) A side effect on a scalar object is unsequenced relative to a value computation using the value of the same scalar object.

An example of item 1 is "i = ++i + 1". In this case the side effect of writing the value i+1 into i due to ++i is unsequenced relative to the side effect of assigning the RHS to the LHS. There is a sequence point between the value calculations of each side and the assignment of RHS to LHS, as described in 6.5.16 paragraph 3, given in the answer by Jens Gustedt below. However, the modification of i due to ++i is not subject to that sequence point, otherwise the behavior would be defined.

In the example I give above, we have a similar situation. There is a value computation, which involves the creation of an array and the conversion of that array to a pointer to its first element. There is also a side effect of writing a value to part of that array, *p to the first element.

So, I don't see what guarantees we have in the standard that the modification of the otherwise uninitialized first element of the array will be sequenced before the writing of the array address to p. What about this modification (writing *p to the first element) is different from the modification of writing i+1 to i?

To put it another way, suppose an implementation looked at the statement of interest in the example as three tasks: 1st, allocate space for the compound literal object; 2nd: assign a pointer to said space to p; 3rd: write *p to the first element in the newly allocated space. The value computation for both RHS and LHS would be sequenced before the assignment, as computing the value of the RHS only requires the address. In what way is this hypothetical implementation not standard compliant?

Exactitude answered 12/2, 2018 at 20:3 Comment(10)
Why did you think it would produce undefined behavior? For example, i++ is equivalent to i = i + 1; which is well defined. - Monmouth
There are no side effects, so no sequence point is needed. - Fenestra
The side effect is the writing of a value to p. I think this differs from the i++ example, which I describe in my comment to the answer given by dbush below. - Exactitude
The side effect of the assignment itself can't take place until after the RHS is calculated. - Sabaean
@Barmar, Yes, but the right hand side can be calculated (the address of the array can be obtained), before the contents of the array are written. - Exactitude
So you haven't previously initialized p? Then there's UB from trying to dereference an uninitialized pointer. It has nothing to do with sequence points and the use of the variable twice in the assignment. - Sabaean
@Barmar, this example is taken from the standard, and is in context of compound literals. I think it is implied that p is assigned to point somewhere valid before the statement of interest. - Exactitude
In that case, what's the problem? Its contents are valid, and you're using the old contents in the compound literal before you reassign the pointer. - Sabaean
@Barmar, the problem is that this seems to be undefined behavior as per the rules of the standard. My reasoning for why this can be seen as undefined behavior is in the edit to my question above. - Exactitude
Initializing the array with *p is not a side effect. - Malvasia
M
1

In (int [2]){*p}, *p provides an initial value for the compound literal. This is not an assignment, and it is not a side effect. The initial value is part of the object when the object is created. There is no moment when the array exists and it is not initialized.

In p = (int [2]){*p}, we know the side effect of updating p is sequenced after the computation of the right side because C 2011 [N1570] 6.5.16 3 says “The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands.”
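
As a rough illustration of the two points above, here is a minimal sketch. The function name reseat and the local x are made up for this example, and it assumes p points at a valid int before the statement runs. The asserted outcome is what the cited wording requires: the unnamed array already holds its initial values by the time its address is stored into p.

```c
#include <assert.h>

/* Hypothetical helper (not from the standard's example): runs the
   statement from the question and copies out what the new array holds. */
void reseat(int old_value, int out[2])
{
    int x = old_value;
    int *p = &x;            /* assumption: p points at a valid int */

    p = (int [2]){*p};      /* the statement under discussion */

    assert(p != &x);        /* p now designates the unnamed array, not x */
    out[0] = p[0];          /* initialized from the old *p */
    out[1] = p[1];          /* element without an initializer is zero */
}
```

The results are copied into out because the compound literal has automatic storage duration tied to the enclosing block, so p must not be used after reseat returns.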

Malvasia answered 13/2, 2018 at 7:2 Comment(0)
L
6

You need to look at the definition of the assignment operator in 6.5.16, paragraph 3:

The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.

So here you clearly see that first it evaluates the expressions on both sides in any order or even concurrently, and then stores the value of the right into the object designated by the left.

Additionally, you should know that the LHS and RHS of an assignment are evaluated differently. The citations are a bit too long, so here is a summary:

  • For the LHS the evaluation leaves "lvalues", that is objects such as p, untouched. In particular it doesn't look at the contents of the object.

  • For the RHS there is "lvalue conversion", that is, for any object that is found there (e.g. *p) the contents of that object are loaded.

  • If the RHS contains an lvalue of array type, this array is converted to a pointer to its first element. This is what is happening to your compound literal.
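
The three rules in the summary can be sketched in a few lines of C. The function name rhs_rules_demo and the variables x, y and q are invented for this illustration, and the asserts only check what the summary describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo of the LHS/RHS rules summarized above. */
int rhs_rules_demo(void)
{
    int x = 7;
    int *p = &x;
    int y;
    int *q;

    y = *p;               /* LHS y is only designated, never read;
                             RHS *p undergoes lvalue conversion (loads 7) */
    q = (int [2]){*p};    /* array-typed RHS is converted to a pointer
                             to its first element */

    assert(q[0] == 7 && q[1] == 0);

    /* As the operand of sizeof, the array is not converted to a pointer. */
    size_t whole = sizeof ((int [2]){0});
    assert(whole == 2 * sizeof(int));

    return y + q[0];
}
```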

Edit: You added another question

What about this modification (writing *p to the first element) is different from the modification of writing i+1 to i?

The difference is simply that i is in the LHS of the assignment and thus has to be updated. The array from the compound literal is not in the LHS and thus is of no concern for the update.

Lajuanalake answered 12/2, 2018 at 20:56 Comment(9)
Thank you for the response. However, I am not seeing how we get that the assignment of *p to an element of the array must be sequenced before the value computation of the right side, which is simply getting the address of the array that is the compound literal. - Exactitude
@Kyle, hm, not sure that I see your problem. The whole compound literal, including its initializer, is the RHS; p is the LHS of the assignment. So the text states that the RHS and the designation of p are sequenced before the store operation. p is an lvalue here; it is only evaluated to determine the object p itself, not its contents. - Lajuanalake
@Kyle: There is no assignment of *p to an element of the array. *p is an initializer in a compound literal. Providing an initial value is not assignment; it is part of the creation of the compound literal object and so is part of the value computation of the right side. - Malvasia
@JensGustedt, I am not sure I am seeing how this leads to the conclusion of defined behavior in the example I gave. I have attempted to express my concern more clearly in an edit to my question, as it was too long for a comment. - Exactitude
@EricPostpischil, If the compound literal could not be accessed at all until the initialization of its contents were complete, that would answer my question. However, Annex C in the standard discusses sequence points and specifically excludes initializers in compound literals from those which are followed by sequence points (see fifth bullet). This seems to at least cast doubt on whether there is a standard guarantee that contents will be initialized before accessing the address. - Exactitude
@Kyle: The lack of a sequence point after the expression that is an initializer in a compound literal is irrelevant. There is no need for a sequence point because the evaluation of the compound literal (which includes its initialization) is sequenced before the update of the left operand by 6.5.16 paragraph 3, quoted in this answer. The current C standard does not rely only on sequence points for defining sequencing. The initialization is part of value computation, and value computation is before assignment, so the sequence is defined. - Malvasia
@Kyle: In what way can the compound literal be accessed before its initialization is complete? By the time it exists and is converted to a pointer to its first element, it is already initialized. The object is created with its initial value. Initialization does not occur in a separate assignment, and it is not a side effect. It is an inextricable part of creating the object defined by a compound literal. There is no moment when the object exists and does not have its initial value. - Malvasia
@EricPostpischil, that makes sense. Evaluating *p and getting uninitialized content from the compound literal would violate the standard's assertion of what the initial contents of the compound literal are. If you want to put your response into an answer I will select it. - Exactitude
I believe this is a bad example no matter what, and should be removed or rewritten. Artificial code like this is just bad programming practice, period. - Pia
D
4

Section 6.5p2 of the C standard details why this is valid:

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings. 84)

And footnote 84 states:

84) This paragraph renders undefined statement expressions such as

i = ++i + 1;
a[i++] = i; 

while allowing

i = i + 1;
a[i] = i;

The posted snippet from 6.5.2.5 falls under the latter: the only side effect is the store to p, and it is sequenced after the value computation that reads *p.
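
For comparison, the two forms the footnote allows can be exercised directly. This is only a sketch: the function name allowed_forms and its parameters are invented here. The two forms the footnote calls undefined are deliberately left out, since running them would prove nothing.

```c
#include <assert.h>

/* Hypothetical helper using only the two forms footnote 84 allows. */
int allowed_forms(int a[], int i)
{
    i = i + 1;          /* i is read to compute the value, then stored once */
    a[i] = i;           /* i is read twice but never modified here */
    assert(a[i] == i);
    return i;
}
```

Calling allowed_forms(a, 1) on an array of at least three elements stores 2 in a[2] and returns 2.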

Dameron answered 12/2, 2018 at 20:16 Comment(1)
Thank you for the response. However, I am not sure that this defines the behavior. In the two allowed examples in the footnote you provide, the value of i must be read to get the assigned value (and address to write the assignment in the latter case), so there is no option to order reading i after assigning it. However, in the example I gave, the compound literal could be allocated and produce an address which is assigned to p, only writing *p to part of that allocated space after the assignment. - Exactitude