Pointer interconvertibility vs having the same address
S

4

33

The working draft of the standard N4659 says:

[basic.compound]
If two objects are pointer-interconvertible, then they have the same address

and then notes that

An array object and its first element are not pointer-interconvertible, even though they have the same address

What is the rationale for making an array object and its first element non-pointer-interconvertible? More generally, what is the rationale for distinguishing the notion of pointer-interconvertibility from the notion of having the same address? Isn't there a contradiction in there somewhere?

It would appear that given this sequence of statements

int a[10];

void* p1 = static_cast<void*>(&a[0]);
void* p2 = static_cast<void*>(&a);

int* i1 = static_cast<int*>(p1);
int* i2 = static_cast<int*>(p2);

we have p1 == p2; however, i1 is well defined, whereas using i2 would result in UB.
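
A minimal sketch of the situation above, wrapped in a function so it compiles on its own. The function name is made up for illustration. It only exercises the parts that are well defined under N4659: comparing the two void* values, using i1, and reaching the first element from the array pointer through the ordinary array-to-pointer conversion rather than a cast through void*.

```cpp
#include <cassert>

// Hypothetical helper, not from the question: returns true when every
// well-defined check below holds.
bool same_address_but_different_conversion() {
    int a[10] = {};

    void* p1 = static_cast<void*>(&a[0]);  // points to the element a[0]
    void* p2 = static_cast<void*>(&a);     // points to the array object a

    bool same_address = (p1 == p2);        // both represent the same address

    int* i1 = static_cast<int*>(p1);       // well defined: p1 points to an int
    *i1 = 7;

    // Sanctioned route from "pointer to array" to "pointer to first element":
    // dereference and let the array decay, no cast through void* involved.
    int (*pa)[10] = &a;
    int* first = *pa;

    return same_address && first == i1 && a[0] == 7;
}
```

The cast producing i2 is left out on purpose, since using its result is exactly what the question identifies as undefined.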

Sheikdom answered 21/12, 2017 at 11:35 Comment(17)
Could you link to the relevant draft please? n4296 (which is the draft I have bookmarked) doesn't include "pointer-interconvertible".Grosbeak
@MartinBonner This and thisLutero
@Someprogrammerdude An array is not a pointer, but nor is the first element of an array a pointer (in general). I guess that "pointer-interconvertible" is about standardizing when you can cast between base and derived pointers through static casts to void* and back (and when you can't).Grosbeak
Relevant: #47653805Lutero
@Someprogrammerdude A pointer to an array represents the address of that array. A pointer to the first element of said array represents the address of the first element. The two pointers represent the same address, but they are not convertible to each other.Sheikdom
@MartinBonner done.Sheikdom
I think there is little benefit in making such code defined, and the fewer rules there are, the happier the compiler/optimizer will be.Industrialism
@Industrialism Why define it for the first member of a struct then? What's the practical difference between that and the first element of an array?Sheikdom
I find this paragraph about static_cast<T*>(void*) in [expr.static.cast] also obscurantist: "Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible (6.9.2) with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion." Is the pointer value a valid pointer value if it points to the right address?Ulberto
Given the first member of a struct, we can use the cast to access its enclosing struct, thus access other members. But this is unnecessary for an array element. We can just take its address and do pointer arithmetic to access other elements.Industrialism
If you read the comments below this question (#47617008), you will see that the C++ memory model has been, and still is, mostly influenced by Boehm, who sells a garbage collector library. Since reading that comment, I suspect that inconsistencies in the C++ memory model result from the influence of his interests and not from rational reasons.Ulberto
@Industrialism What if the array is the first member of a struct and we have a pointer to the first element of the array and want to access that struct?Sheikdom
I think this is rare in practice... This is the reason why I say "little benefit" rather than "no benefit".Industrialism
Maybe the answer is that the standard, like code, after having been modified a few times by many different people, ends up looking like a soup where nobody knows anymore the rationale behind this floating maggot!Ulberto
@Ulberto Sells, you say?Dissatisfactory
@Dissatisfactory Sorry, I do not associate the idea of "selling" with "money", since I have worked as a researcher in a public research center! I associate it with the concept of value. For example, there is this (almost iso) morphism money/{material,services,etc...}, impact-factor/{researcher,post-doc,phd student,...}, manager-usefulness-perception/employees and so on, no matter the dimension along which the value is evaluated. I suppose you are close to the committee, or a member of it? Questions like this one are recurring. They never get a good answer. Is the committee still working on the object/memory model?Ulberto
"The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T). The value representation of an object is the set of bits that hold the value of type T. For trivially copyable types, the value representation is a set of bits in the object representation that determines a value, which is one discrete element of an implementation-defined set of values" [basic.types]/4 so at the end of the day, none of that matters.Clymer
D
28

There are apparently existing implementations that optimize based on this. Consider:

struct A {
    double x[4];
    int n;
};

void g(double* p);

int f() {
    A a { {}, 42 };
    g(&a.x[1]);
    return a.n; // optimized to return 42;
                // valid only if you can't validly obtain &a.n from &a.x[1]
}

Given p = &a.x[1];, g might attempt to obtain access to a.n by reinterpret_cast<A*>(reinterpret_cast<double(*)[4]>(p - 1))->n. If the inner cast successfully yielded a pointer to a.x, then the outer cast will yield a pointer to a, giving the class member access defined behavior and thus outlawing the optimization.
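
A rough sketch of the distinction this relies on, using the struct A from above. Both function names are invented for illustration. A pointer to the whole array member can legitimately be turned into a pointer to the enclosing A, because A is standard-layout and x is its first member. A pointer to an element cannot, so the sketch only performs in-array arithmetic on it and leaves the disputed cast chain in a comment.

```cpp
#include <cassert>

struct A {
    double x[4];
    int n;
};

// Receives a pointer to the whole array member. A is standard-layout and x is
// its first non-static data member, so a and a.x are pointer-interconvertible
// and the cast yields a pointer to the enclosing A.
// Precondition (assumed): pa really points to the x member of some A.
int read_n_through_array(double (*pa)[4]) {
    return reinterpret_cast<A*>(pa)->n;
}

// Receives a pointer to an element. Arithmetic inside the array is fine.
// The route discussed above,
//   reinterpret_cast<A*>(reinterpret_cast<double(*)[4]>(p - 1))->n
// is deliberately not executed here: a.x[0] and a.x are not
// pointer-interconvertible, so that member access would be undefined.
// Precondition (assumed): p points to an element other than the first.
double read_previous_element(double* p) {
    return *(p - 1);
}
```

Under this reading, a callee that only ever receives &a.x[1] has no defined way to reach a.n, which is what allows f to return 42 directly.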

Dissatisfactory answered 21/12, 2017 at 23:45 Comment(17)
They optimize even without restrict on g's argument? Oh, restrict isn't in C++ except as compiler extensions. Nevermind...Livelihood
Would this still hold if x were int[4] ?Pinxit
@Pinxit I'm not currently seeing why not.Dissatisfactory
@Dissatisfactory I guess it optimises because the standard gives an explicit license to optimise, and not the other way around. It wouldn't be able to optimise if x weren't an array. My question is, why does the standard draw a line between arrays and non-arrays? It seems completely arbitrary.Sheikdom
@n.m. The optimizer was there first, and the wording written to (partially) accommodate it.Dissatisfactory
Interesting. What compiler does this? I didn't find any on godbolt.org.Sheikdom
What does the inner cast actually yield? N4659 8.2.9/13 does define the behaviour of reinterpret_cast when there is NOT a pointer-interconvertible object at the location. The definition is "the pointer value is unchanged by the conversion", and surely the only possible meaning of that is that the result of the cast does point to the same byte in memory that the cast's argument pointed toPinxit
To forestall any strict aliasing argument about the double array; imagine g did ((A *)((char *)p - sizeof(double)))->n . What is the result of the cast to A * ?Pinxit
@Pinxit Just because it represents the same address ("points to the same byte") doesn't mean that it has the same pointer value in the abstract machine. The result of the inner cast is a pointer of type "pointer to array of 4 double" with the value "pointer to the first element of a.x" and therefore the result of the outer cast is a pointer of type "pointer to A" with the value "pointer to the first element of a.x", and since it does not actually point to an A object, has undefined behavior when the class member access expression is used to access a non-static data member of A.Dissatisfactory
"Strict aliasing" in the [basic.lval]/11 sense is irrelevant. reinterpret_cast<A*>(reinterpret_cast<double(*)[4]>(p - 1))->n performs no "access" within the meaning of that rule. As to your substitute expression, there being no char array anywhere, the pointer arithmetic is undefined.Dissatisfactory
Are you saying it is no longer well-defined to inspect any object (other than character arrays) by iterating over it with unsigned char * ?Pinxit
@Pinxit See core issue 1701. This part has never been properly specified in the standard, so it's not really meaningful to evaluate how it would work in the cleaned-up pointer model. When it is eventually specified, presumably the range of permissible pointer arithmetic would need to be limited to the memory reachable through the original pointer (just like launder) to permit the optimization at issue.Dissatisfactory
@Dissatisfactory in light of your last comment, would it be right to say that the core rationale (which this question is asking about) is that pointers to array elements can't "escape" their array? Compare with void h(double(*)[4]); h(&a.x); - presumably the optimization is no longer possible, since h might cast its argument to A *, with that cast being correct because a pointer to a standard-layout struct is interconvertible with a pointer to its first member.Pinxit
@Dissatisfactory "In C++, malloc has never created an object because [intro.object]/1 says that an object is *only* created by [list does not mention union access]." Are these guys seriously saying that changing the active member of a union never created an object? Was any use of a union illegal in C++? Or rather, can we take that std text as a joke?Clymer
This example doesn't match the OP. &a.x[1] is not the first member of a.x, thus it's not relevant. The compiler could do the optimization here even if a.x[0] was pointer-interconvertible with a.x.Plaudit
@Plaudit You (and g) can trivially obtain &a.x[0] when given &a.x[1]. That's just pointer arithmetic.Dissatisfactory
@Dissatisfactory Yes, but it is needlessly confusing given the context of the OP, in my opinion. What is it about x[1] that makes this a better answer than x[0]?Plaudit
L
3

More generally, what is the rationale for distinguishing the notion of pointer-interconvertibility from the notion of having the same address?

It is hard if not impossible to answer why certain decisions are made by the standard, but this is my take.

Logically, pointers point to objects, not addresses. Addresses are the value representations of pointers. The distinction is particularly important when reusing the space of an object containing const members:

struct S {
    const int i;
};

S s = {42};
auto ps = &s;
new (ps) S{420};
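foo(ps->i);  // UB under the N4659 (C++17) rules: S has a const member, so ps does not
             // automatically point to the new object; requires std::launder(ps)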

That a pointer with the same value representation can be used as if it were the same pointer should be thought of as the special case instead of the other way round.

Practically, the standard tries to place as few restrictions as possible on implementations. Pointer-interconvertibility is the condition under which pointers may be reinterpret_cast to one another and yield the correct result. Seeing as reinterpret_cast is meant to compile to nothing, it also means the pointers share the same value representation. Since that places more restrictions on implementations, the guarantee won't be given without compelling reasons.
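
As a hedged illustration of the const-member case above, here is a small sketch of the two routes that are well defined in C++17: using the pointer returned by placement new, or passing the old pointer through std::launder. The function name is hypothetical, and the object is heap-allocated here (an assumption that differs from the snippet above) to keep the storage-reuse rules for automatic objects out of the picture.

```cpp
#include <cassert>
#include <new>  // placement new, std::launder (C++17)

struct S {
    const int i;
};

// Hypothetical helper: reuses the storage of a heap-allocated S and reads the
// new value through two well-defined routes.
int read_after_reuse() {
    S* ps = new S{42};
    S* fresh = new (ps) S{420};             // pointer to the newly created object
    int via_new = fresh->i;                 // fine: fresh points to the new object
    int via_launder = std::launder(ps)->i;  // fine: old pointer, laundered
    delete fresh;
    return (via_new == 420 && via_launder == 420) ? 420 : -1;
}
```

Both ps and fresh hold the same address throughout. Only the abstract pointer value differs, which is the distinction this answer is drawing.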

Lutero answered 21/12, 2017 at 15:18 Comment(12)
I don't quite see how this is relevant to the question at hand. When two different objects reside at the same address at different times, using a pointer to one as if it were a pointer to the other conflicts with some reasonable optimisations. But we have an array and its element, i.e. an object and its subobject that sits at the beginning of the object. Why does this work for a struct and its first member, but not for an array and its first element? What is the conceptual difference here? What difficulties would ensue if it were allowed for arrays as well?Sheikdom
@n.m. My point is there might be nothing fundamental. Pointer interconvertibility is given only when compelling reasons arise since it is both logically weird (breaks abstraction), and it might restrict implementations. It isn't that there is reason not to, it's because there isn't reason to.Lutero
I don't see how the reasoning that applies to struct subobjects doesn't also apply to array subobjects for purposes of pointer convertibility. There must be something that applies to one and not the other, but I don't see what it is.Sheikdom
@n.m. Well, C supports struct subobjects, and legacy C code in C++ assumes it works.Fear
@n.m. There is also a similar limitation for the common initialization sequence that seems unexplainable (the concept used to allow or not to read a value within a non active member of a union). That applies to struct but not to arrays!Ulberto
@Yakk I can't find anything similar in the C standard, can you quote chapter and verse?Sheikdom
It'd be nice if there were a rationale document that explained why all these rules existPinxit
"Logically, pointers points to objects, not addresses" That contradicts "pointers have trivial type", but OKClymer
"logically weird (breaks abstraction)," C/C++ is "high level assembly", there is no "abstraction"Clymer
@Clymer Quite untrue, bool f(int x) { return x + 1 > x; } gets constant folded to true.Lutero
C/C++ don't give you access to signed two's complement operations. Well, unless you use volatile.Clymer
@n.1.8e9-where's-my-sharem. Re "I don't see how the reasoning that applies to struct subobjects doesn't also apply to array subobjects for purposes of pointer convertibility." That is the core issue, and after reading the discussion in groups.google.com/a/isocpp.org/g/std-proposals/c/gN-_7CJ58G4/m/…, I think the committee originally didn't want to allow either one. It needed pointing out that "Without support for this, struct sockaddr does not work" for them to allow interconvertibility with the first class member. So now we have this inconsistency.Heathheathberry
M
2

Because the committee wants to make clear that an array is a low-level concept and not a first-class object: for example, you cannot return an array or assign to it. Pointer-interconvertibility is meant to be a concept between objects of the same level: only standard-layout classes or unions.

The concept is seldom used in the whole draft: in [expr.static.cast], where it appears as a special case; in [class.mem], where a note says that for standard-layout classes, pointers to an object and its first subobject are interconvertible; in [class.union], where pointers to the union and its non-static data members are also declared interconvertible; and in [ptr.launder].

That last occurrence separates two use cases: either pointers are interconvertible, or one element is an array. This is stated in a remark and not in a note as it is in [basic.compound], which makes it clearer that pointer-interconvertibility deliberately does not cover arrays.
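
To make the [class.union] case concrete, here is a small sketch. The union U and the helper name are made up for illustration. It goes from a pointer to a union member back to the union object, which is one of the conversions pointer-interconvertibility is meant to cover.

```cpp
#include <cassert>

union U {
    int i;
    float f;
};

// Hypothetical helper: recovers the union object from a pointer to one of its
// non-static data members. A union object and its members are
// pointer-interconvertible, so the cast yields a pointer to the union.
// Precondition (assumed): pi points to the i member of some U.
U* union_from_member(int* pi) {
    return reinterpret_cast<U*>(pi);
}
```

No comparable conversion is listed for an array and its first element, which is the asymmetry the question is about.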

Micturition answered 21/12, 2017 at 16:38 Comment(6)
"allows an implementation to have different representions for object pointers and array pointers" Why a pointer to an array can be converted to a pointer to its struct super object then? (If it's the first member of course)Sheikdom
@n.m.: After some reflection, I think that a pointer to an array cannot be convertible to a pointer to its super object because an array is not a standard layout object. My opinion is that the first element of the array is pointer convertible to the super object but none to the array. Of course when converted to a byte pointer (char pointer before C++17) all 3 will be converted to the same pointer because the first byte of an array is the first byte of its first element. A pointer to an array can then be converted to a pointer to its first element (via a char* or reinterpret_cast), ...Micturition
... but nothing guarantees that the pointer to the array and the pointers to the objects (first element and super object) have same representation. A pointer to an object can be static_casted to a pointer to a subclass and back, but the subclass object can be at a different address.Micturition
An array and its first element have the same address. It's guaranteed by the standard. No one is talking about a subclass object of the first element of the array. Only about the first element itself. There is a guarantee that an array is convertible to its standard layout super object, so either these pointers must have the same representation, or the difference in the representation doesn't matter.Sheikdom
Furthermore, struct {int i;} and its first element are pointer-interconvertible even though the corresponding pointers are very explicitly allowed to have different representations.Sheikdom
@n.m.: I must acknowledge that I have no evidence for it, so it is only an opinion. As such, even though I have really appreciated the discussion in the comments, it has no place in an answer. Edited. Many thanks for the feedback.Micturition
G
1

Having read this section of the Standard closely, my understanding is that two objects are pointer-interconvertible, as the name suggests, if

  1. They are “interconnected” through their class definition (note that the pointer-interconvertible concept is defined for a class object and its first non-static data member).

  2. They point to the same address. But, because their types are different, we need to “convert” their pointers' types, using reinterpret_cast operator.

For an array object, mentioned in the question, the array and its first element have no interconnectivity in terms of class definition and also we don’t need to convert their pointer types to be able to work with them. They just point to the same address.
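
A short sketch of the two cases side by side, under the reading given in this answer. The types Header and Message and both function names are hypothetical. For a standard-layout class and its first member, the pointer type has to be converted with reinterpret_cast. For an array, the pointer to the first element is obtained by the built-in array-to-pointer conversion, with no cast at all.

```cpp
#include <cassert>
#include <type_traits>

struct Header {
    int kind;
};

struct Message {
    Header header;  // first non-static data member
    int payload;
};

static_assert(std::is_standard_layout<Message>::value,
              "the first-member guarantee applies to standard-layout classes");

// Class case: a Message and its first member are pointer-interconvertible.
// Precondition (assumed): h points to the header member of some Message.
Message* message_from_header(Header* h) {
    return reinterpret_cast<Message*>(h);
}

// Array case: no cast needed; dereferencing the array pointer and letting the
// array decay gives a pointer to the first element.
int* first_element(int (*pa)[3]) {
    return *pa;
}
```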

Gracia answered 12/6, 2021 at 9:38 Comment(0)
