Why is an "address-of" expression not an lvalue?

Asked 27/6, 2024 at 6:16 Answered 28/6, 2024 at 1:36

c pointers language-lawyer memory-address

The C Standard defines lvalue as:

An lvalue is an expression (with an object type other than void) that potentially designates an object; if an lvalue does not designate an object when it is evaluated, the behavior is undefined. When an object is said to have a particular type, the type is specified by the lvalue used to designate the object. A modifiable lvalue is an lvalue that does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including, recursively, any member or element of all contained aggregates or unions) with a const-qualified type.

I would like to understand exactly which part of this definition prevents "address-of" expressions such as &ch from being lvalues. When we take the address of an object, doesn't the address "designate" that object (so that if we had char ch, then &ch = 10 would be equivalent to ch = 10)? Or is it that &ch designates a temporary value, not an object? (Are there such things as temporary objects?)

My confusion initially came about while reading Pointers on C, which provides the following explanation:

The precedence table shows that the & operator produces an R-value as a result, and they cannot be used as L-values. But why? The answer is easy. When the expression &ch is evaluated, where is the result held in the computer? It must be somewhere, but you have no way of knowing where. The expression does not identify any specific location in the machine's memory, and thus is not an lvalue.

I am confused by the part in bold. Doesn't &ch identify the location of ch?

Accommodation answered 27/6, 2024 at 6:16 Comment(5)

&ch=10, if valid, would set the address of ch to 10. The address of variables (and other addressable values) is not under user control in C so not an l-value. – Sharasharai 27/6, 2024 at 6:31

I think your question is whether pointers could be automatically dereferenced when used as lvalues. I guess so, but C requires an explicit dereference operator *. – Sharasharai 27/6, 2024 at 6:33

Thanks @PaulHankin, I am actually not sure where my confusion lies! By the way, why would &ch=10 set the address of ch to 10, if it were valid? (Also, the rvalue in the example was supposed to be a char, not an int. I forgot to change it, sorry.) – Accommodation 27/6, 2024 at 6:50

@Accommodation &ch is the address of ch and therefore &ch=10 would be an attempt to assign the address itself. – Skiba 27/6, 2024 at 7:1

&ch identifies the location of ch. But we are not talking about the location of ch. We are talking about the location of &ch. It is the latter that does not exist. – Federalist 27/6, 2024 at 7:50

Obviously the contents of &ch do represent a specific location in the machine's memory, namely the address of ch. What the book probably means is that the temporary result &ch itself is likely stored in a register or such and we can't know where, based on the C code alone. It is also actually not the address of ch but a copy of that address.

Example:

int x = 1;
int* ptr = &x;
printf("%p\n", (void *)ptr);

This results in the following assembler (gcc 14 x86):

lea     rsi, [rsp+12]
mov     DWORD PTR [rsp+12], 1
call    printf

That is: [rsp+12] means the stack pointer plus a relative offset, which is the stack slot where x was allocated; this is the &x part of the C code. The lea instruction stores the address of that slot in the register rsi, and the mov instruction writes the value 1 to that slot. As per the ABI/calling convention, printf apparently expects the pointer argument in rsi and prints its contents, some hex address like 0x7ffdb6a2997c.

Now if we had somehow injected some inline assembler here and written to the register rsi, it wouldn't have affected where x is actually stored. We would only have changed what output printf gave us.

In general we do not want programmers to change the address of variables because that would change the whole way compilers and linkers work. The compiler gives each variable an address if it needs one (and wasn't placed in a register), and based on that address the compiler can conclude what's stored at one particular address and generate code accordingly. If we want to change where a variable is stored, we generally do that by copying the contents to a new location.

So an address of a variable is to be regarded as read-only and the & operator must result in an "rvalue" for that reason alone. We typically never change addresses of storage in run-time, we just use pointers to point elsewhere in memory.

Also of note is that the & and * operators work as each other's opposites. *&x is equivalent to x since the operators cancel each other out. & always results in an rvalue, whereas * applied to a pointer to an object results in an lvalue. Similarly, &*ptr is equivalent to ptr but it is an rvalue.

Brahma answered 27/6, 2024 at 8:27 Comment(1)

Yeah, no. I would not like to be the compiler and linker that was trying to build apps with simple vars whose address changes:( – Axillary 27/6, 2024 at 9:42

Informally, the 'L' in "lvalue" means left. An lvalue is roughly an expression that can appear on the left hand side of an assignment.

For example, 2 and 2+2 are not lvalues, because it doesn't make sense to assign a value to them.

Likewise, &ch yields the address of the object ch, but you cannot assign a value to that address; an assignment like &ch = some_value; doesn't make sense and so is not allowed. You can imagine that assignment changing the location at which ch is stored, but that's not a permitted operation.

As I said, this is only roughly what "lvalue" means, which is why the C standard doesn't define it that way. There are lvalues that cannot appear on the left side of an assignment, but for other reasons. An object defined with const cannot be assigned, but its name is still an lvalue.

Historically, the C term "lvalue" is derived from earlier programming languages that defined the terms differently. In the original terminology, an expression could be "evaluated for its lvalue" (determining what object it designates) or "evaluated for its rvalue" (determining what value it yields). In such a language, given:

x = y + 2;

x is "evaluated for its lvalue" because it's on the left side of an assignment, and any value previously stored in x is ignored. The expression y + 2 is "evaluated for its rvalue" and could not have been evaluated for its lvalue.

C changed the meanings, making "lvalue" refer to the expression itself rather than to the result of evaluating it, and almost entirely dropping the term "rvalue".

Selena answered 28/6, 2024 at 1:36 Comment(0)
