Creating an invalid reference via reinterpret cast

I am trying to determine whether the following code invokes undefined behavior:

#include <iostream>

class A;

void f(A& f)
{
  char* x = reinterpret_cast<char*>(&f);
  for (int i = 0; i < 5; ++i)
    std::cout << x[i];
}

int main(int argc, char** argv)
{
  A* a = reinterpret_cast<A*>(new char[5]);
  f(*a);
}

My understanding is that reinterpret_casts to and from char* are compliant because the standard permits aliasing with char and unsigned char pointers (emphasis mine):

If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a cv-qualified version of the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
  • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
  • a char or unsigned char type.
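
As a minimal sketch of the last bullet (not part of the original question): reading the bytes of an existing object through unsigned char is the direction the rule permits. The helper name sum_bytes is hypothetical, and the example assumes a type without padding bits, such as std::uint32_t.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: adds up the bytes of an object's representation.
// Reading through unsigned char is covered by the last bullet above.
template <typename T>
unsigned sum_bytes(const T& obj)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    unsigned total = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        total += p[i];  // each read goes through an unsigned char lvalue
    return total;
}
```

For a std::uint32_t holding 0x01020304 the sum is 1 + 2 + 3 + 4 regardless of byte order, because only the order of the bytes changes between platforms, not their values.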

However, I am not sure whether f(*a) invokes undefined behavior by creating an A& reference from the invalid pointer. The deciding factor seems to be what the phrase "attempts to access" means in the context of the C++ standard.

My intuition is that this does not constitute an access, since an access would require A to be defined (it is declared, but not defined, in this example). Unfortunately, I cannot find a concrete definition of "access" in the C++ standard.

Does f(*a) invoke undefined behavior? What constitutes "access" in the C++ standard?

I understand that, regardless of the answer, it is likely a bad idea to rely on this behavior in production code. I am asking this question primarily out of a desire to improve my understanding of the language.

[Edit] @SergeyA cited this section of the standard. I've included it here for easy reference (emphasis mine):

5.3.1/1 [expr.unary.op]

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [Note: indirection through a pointer to an incomplete type (other than cv void) is valid. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to a prvalue, see 4.1. — end note ]
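
The note's "initialize a reference" case can be sketched as follows. This is only an illustration of what compiles while A is still incomplete; the function name address_via_reference is hypothetical, and the pointer here refers to a real A object, so it does not settle the question about the char buffer.

```cpp
class A;  // declared, but not yet defined

// Indirection through a pointer to an incomplete type is accepted here:
// the resulting lvalue is only used to initialize a reference.
const A* address_via_reference(const A* p)
{
    const A& r = *p;  // no lvalue-to-rvalue conversion takes place
    return &r;        // taking the address does not read the object's value
}

// The definition can appear later in the translation unit.
class A
{
public:
    int value = 42;
};
```

Calling address_via_reference(&obj) on an A object obj returns &obj. Replacing the body with something like `A copy = *p;` before the definition of A would be rejected, since that needs the complete type.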

Tracing the reference to 4.1, we find:

4.1/1 [conv.lval]

A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T.

When an lvalue-to-rvalue conversion is applied to an expression e, and either:

  • e is not potentially evaluated, or
  • the evaluation of e results in the evaluation of a member ex of the set of potential results of e, and ex names a variable x that is not odr-used by ex (3.2)

the value contained in the referenced object is not accessed.

I think our answer lies in whether *a satisfies the second bullet point. I am having trouble parsing that condition, so I am not sure.

Arad answered 6/4, 2016 at 17:9 Comment(1)
What is the alignment of A? – Gretta

char* x = reinterpret_cast<char*>(&f); is valid. Or, more specifically, access through x is allowed - the cast itself is always valid.

A* a = reinterpret_cast<A*>(new char[5]); is not valid - or, to be precise, access through a will trigger undefined behaviour.

The reason is that while it's OK to access an object through a char*, it's not OK to access an array of chars through a pointer to some unrelated object type. The standard allows the first, but not the second.

Or, in layman's terms, you can alias a type* through char*, but you can't alias char* through type*.
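
A commonly used way to go in the disallowed direction without aliasing is to copy the bytes into a real object with std::memcpy. The sketch below assumes a trivially copyable target type and a buffer of at least sizeof(std::uint32_t) bytes; the name load_u32 is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: builds a uint32_t from raw bytes by copying,
// rather than by casting the buffer to std::uint32_t* and dereferencing.
std::uint32_t load_u32(const unsigned char* bytes)
{
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);  // copies the object representation
    return value;
}
```

With a buffer of four bytes that are all 0x01, the result is 0x01010101 on any byte order. For other contents the numeric value depends on the platform's endianness.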

EDIT

I just noticed I didn't answer the direct question ("What constitutes "access" in the C++ standard?"). As far as I can tell, the standard does not define access (at least, I was not able to find a formal definition), but dereferencing a pointer is commonly understood to qualify as access.

Hyperventilation answered 6/4, 2016 at 17:12 Comment(7)
I agree that "access through a will trigger undefined behavior," but (as you mention) A* a = reinterpret_cast<A*>(new char[5]) may not constitute an access. Again, the answer comes down to what constitutes an "access". Is the pointer dereference or the conversion to an lvalue (I may be getting the type wrong) the "access"? – Arad
A* a = ... is not an access. But dereferencing a pointer is an access. – Hyperventilation
Sorry to be pedantic, but why not? It is not clear to me why A& b = *a is inherently any more of an access than A* b = a is. Neither of those statements requires A to be defined; only declared. Is there anything in the standard that hints at this interpretation? – Arad
Yes, and I said it several times - dereferencing a pointer. The specific standard term is indirection, and as specified in 5.3.1 / 1, indirection through a pointer to an incomplete type is allowed when initializing a reference. However, it is still understood to be access. – Hyperventilation
Thank you for the pointer to 5.3.1 / 1 - that is definitely relevant. However, I am specifically looking for a statement in the standard that says that dereferencing a pointer (indirection) constitutes "access". I followed the reference to 4.1 / 1, which defines when lvalue-to-prvalue conversion constitutes access. I am not positive, but I think A& b = *a is not an access because the result is not odr-used. – Arad
@MichaelKoval, well, to the best of my knowledge, the standard doesn't have a formal definition of access. However, this seems to be the actual essence of your question. In this case, I suggest you ask another, more straightforward question, free from type aliasing or anything else: what does constitute an access? Provide this example (with a reference to the same type, though). You are likely to get more interesting answers than here. – Hyperventilation
Fair enough. I will post another question specifically about the definition of "access" in the C++ standard. – Arad
