Pointer arithmetic on non-array types

Asked 19/1, 2016 at 14:51 Answered 20/1, 2016 at 23:32

c++ language-lawyer type-punning char-pointer

Let's consider the following piece of code:

struct Blob {
    double x, y, z;
} blob;

char* s = reinterpret_cast<char*>(&blob);
s[2] = 'A';

Assuming that sizeof(double) is 8, does this code trigger undefined behaviour?

Feeble answered 19/1, 2016 at 14:51 Comment(11)

This wouldn't even compile without a cast though. – Cortezcortical 19/1, 2016 at 14:55

Is this C? In C++ this will not compile. – Inlaid 19/1, 2016 at 14:55

This is not legal C or C++, you need a cast. Once you have a cast, this is legal and well-defined. Of course anything can happen if you try to access blob.x afterwards. – Reisinger 19/1, 2016 at 14:56

@Bathsheba, added the cast, question remains. – Feeble 19/1, 2016 at 14:57

@n.m. hint - the question is regarding pointer arithmetic on non-arrays. – Feeble 19/1, 2016 at 14:57

Pointer arithmetic is done on pointers. All pointers involved must point into the same array. Any POD/trivial-layout/whatever-its-name object can be treated as an array of char or its differently signed siblings. This is the basis of functions like memcpy or fread. – Reisinger 19/1, 2016 at 15:2

@hvd You do realize that the quote refers to the fact that both pointers point to the same array (or to the hypothetical element past it)? Your interpretation is incorrect. – Kellie 19/1, 2016 at 15:2

@n.m., this is a common-sense answer, not a language-lawyer one :) – Feeble 19/1, 2016 at 15:3

In this particular case, I bet there is no undefined behavior. Your struct has only doubles in it, so no padding is necessary. If padding were required, you could get unspecified behavior (I guess). A language-lawyer opinion is still required. – Mattland 19/1, 2016 at 15:6

I don't see any pointer arithmetic happening here. I see array accessing, but that's not pointer arithmetic. – Descombes 19/1, 2016 at 15:6

@NicolBolas s[2] = *(s+2) – Lenlena 19/1, 2016 at 15:7

Quoting from N4140 (roughly C++14):

3.9 Types [basic.types]

2 For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char.⁴² If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.

42) By using, for example, the library functions (17.6.1.2) std::memcpy or std::memmove.

3 For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2,⁴³ obj2 shall subsequently hold the same value as obj1. [ Example: ... ]

43) By using, for example, the library functions (17.6.1.2) std::memcpy or std::memmove.

This does, in principle, allow assignment directly to s[2], if you take the position that assigning to s[2] is indirectly required to be equivalent to copying all of some other Blob into an array that just happens to be bytewise identical except for the third byte, and then copying that array into your Blob. You're not assigning to s[0], s[1], etc., but for trivially copyable types including char, leaving them alone is equivalent to setting them to the exact value they already have, which also has no observable effect.

However, if the only way to get s[2] == 'A' is by memory manipulation, then a valid argument could also be made that what you're copying back into your Blob isn't the underlying bytes that made up any previous Blob. In that case, technically, the behaviour would be undefined by omission.

I do strongly suspect, especially given the "whether or not the object holds a valid value of type T" comment, that it's intended to be allowed.
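
The round trip that 3.9/2 describes can also be written out explicitly. The following is a minimal sketch and not part of the original answer: set_third_byte and third_byte are hypothetical helper names, and the code sidesteps the question of assigning through s directly by only ever indexing a real array of char.

```cpp
#include <cstring>

struct Blob {
    double x, y, z;
};

// Copies the bytes of b into an array of char, changes the third byte,
// and copies the array back into b. Only a genuine char array is indexed.
// (set_third_byte is a hypothetical name used for illustration.)
inline void set_third_byte(Blob& b, char value) {
    char bytes[sizeof(Blob)];
    std::memcpy(bytes, &b, sizeof b);   // object -> array of char
    bytes[2] = value;                   // ordinary array element assignment
    std::memcpy(&b, bytes, sizeof b);   // array of char -> object
}

// Returns the third byte of b's object representation without reading
// any member of b as a double.
inline char third_byte(const Blob& b) {
    char bytes[sizeof(Blob)];
    std::memcpy(bytes, &b, sizeof b);
    return bytes[2];
}
```

Calling set_third_byte(blob, 'A') leaves blob.y and blob.z untouched and changes only one byte of blob.x. Whether the array copied back still counts as "the underlying bytes" of a Blob is exactly the point the two readings above disagree on.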

Demijohn answered 19/1, 2016 at 15:9 Comment(1)

Comments are not for extended discussion; this conversation has been moved to chat. – Hawser 19/1, 2016 at 23:15

Chapter 3.10 of the standard seems to allow for that specific case, assuming that "access the stored value" means "read or write", which is unclear.

3.10-10

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

—(10.1) the dynamic type of the object,

—(10.2) a cv-qualified version of the dynamic type of the object,

—(10.3) a type similar (as defined in 4.4) to the dynamic type of the object,

—(10.4) a type that is the signed or unsigned type corresponding to the dynamic type of the object,

—(10.5) a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

—(10.6) an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

—(10.7) a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

—(10.8) a char or unsigned char type.

Kaminski answered 20/1, 2016 at 23:32 Comment(0)
