Can ptrdiff_t represent all subtractions of pointers to elements of the same array object?

Asked 20/3, 2018 at 9:27 Answered 23/3, 2018 at 15:45

Solved c++ arrays language-lawyer pointer-arithmetic ptrdiff-t

For subtraction of pointers i and j to elements of the same array object the note in [expr.add#5] reads:

[ Note: If the value i−j is not in the range of representable values of type std::ptrdiff_t, the behavior is undefined. — end note ]

But given [support.types.layout#2], which states that (emphasis mine):

The type ptrdiff_t is an implementation-defined signed integer type that can hold the difference of two subscripts in an array object, as described in [expr.add].

Is it even possible for the result of i-j not to be in the range of representable values of ptrdiff_t?

PS: I apologize if my question is caused by my poor understanding of the English language.

Justificatory answered 20/3, 2018 at 9:27 Comment(3)

On many popular architectures (most <=32 bit platforms) it would be rather difficult and expensive to provide a ptrdiff_t that can always hold i-j (and they do not in fact provide such a ptrdiff_t). The intent of the standard is not to make stuff difficult and expensive, or to make most existing implementations non-conforming, but rather the opposite. So yeah, it "can hold the difference"... when it can. – Bribe 20/3, 2018 at 9:41

Yes, the second quote says that if i and j are valid indices for the same array, a ptrdiff_t can represent the result of i - j. The first quote amounts to the reverse requirement: i - j must be representable in a ptrdiff_t, or the behaviour is undefined (I'd argue the first note is redundant given the presence of the second, but it probably reduces opportunities for language lawyers to find obscure exploitable loopholes in the language). – Strachan 20/3, 2018 at 9:42

One thing that is implicit is that, as far as I understand, problems can only occur for an array of a type with sizeof=1 (like char). (Or is there some corner case for sizeof=2 as well?) – Evenhanded 26/3, 2018 at 10:30

Is it even possible for the result of i-j not to be in the range of representable values of ptrdiff_t?

Yes, but it's unlikely.

In fact, [support.types.layout]/2 does not say much, except that the actual rules about pointer subtraction and ptrdiff_t are defined in [expr.add]. So let us look at that section.

[expr.add]/5

When two pointers to elements of the same array object are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header.

First of all, note that the case where i and j are subscript indexes of different arrays is not considered. This allows us to treat i-j the same way as P-Q, where P is a pointer to the element of an array at subscript i and Q is a pointer to the element of the same array at subscript j. Indeed, subtracting two pointers to elements of different arrays is undefined behavior:

[expr.add]/5

If the expressions P and Q point to, respectively, elements x[i] and x[j] of the same array object x, the expression P - Q has the value i−j ; otherwise, the behavior is undefined.
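
As a small illustration of the quoted rule, here is a minimal sketch (the helper name pointer_difference and the 8-element array are hypothetical, chosen only for this example). It forms P and Q from subscripts i and j of the same array and checks that the type of P - Q is std::ptrdiff_t:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: returns P - Q, where P points to x[i] and Q to x[j]
// of the same 8-element array. Index 8 is the one-past-the-end position.
inline std::ptrdiff_t pointer_difference(std::size_t i, std::size_t j) {
    static int x[8] = {};
    assert(i <= 8 && j <= 8);  // anything else would already be UB
    const int* p = x + i;
    const int* q = x + j;
    // The result type of pointer subtraction is std::ptrdiff_t.
    static_assert(std::is_same<decltype(p - q), std::ptrdiff_t>::value,
                  "pointer subtraction yields std::ptrdiff_t");
    return p - q;  // has the value i - j
}
```

With such a small array the difference trivially fits; the interesting case is an array whose element count exceeds the maximum value of std::ptrdiff_t.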

In conclusion, with the notation defined above, i-j and P-Q are defined to have the same value, with the latter being of type std::ptrdiff_t. But nothing is said about whether this type can actually hold such a value. This question can, however, be answered with the help of std::numeric_limits; in particular, one can detect whether an array some_array is too big for std::ptrdiff_t to hold all index differences:

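#include <cstddef>  // std::ptrdiff_t
#include <limits>   // std::numeric_limits

// some_array must be a real array (not a pointer) declared before this point.
static_assert(std::numeric_limits<std::ptrdiff_t>::max() > sizeof(some_array)/sizeof(some_array[0]),
    "some_array is too big: subtracting its first and one-past-the-end element indexes "
    "or pointers would lead to undefined behavior as per [expr.add]/5."
);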

Now, on usual targets, this would usually not happen, as sizeof(std::ptrdiff_t) == sizeof(void*), which means an array would need to be stupidly big for ptrdiff_t to overflow. But there is no guarantee of it.
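
The same idea can be sketched as a reusable predicate rather than a one-off static_assert. The names count_fits_in_ptrdiff and checked_index_difference are hypothetical, and this is only an illustration of the reasoning, not something the standard provides. The largest difference within an array of N elements is N (one-past-the-end minus first), so N <= max is the condition being tested:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical predicate: true if every index difference within an array of
// `count` elements (up to one-past-the-end minus first) fits in ptrdiff_t.
constexpr bool count_fits_in_ptrdiff(std::size_t count) {
    // Compare in an unsigned type wide enough for both operands.
    return count <= static_cast<std::uintmax_t>(
                        std::numeric_limits<std::ptrdiff_t>::max());
}

// Hypothetical checked subtraction of two subscripts: the precondition makes
// the representability requirement explicit instead of leaving it as UB.
inline std::ptrdiff_t checked_index_difference(std::size_t i, std::size_t j) {
    assert(count_fits_in_ptrdiff(i) && count_fits_in_ptrdiff(j));
    return static_cast<std::ptrdiff_t>(i) - static_cast<std::ptrdiff_t>(j);
}
```

Whether count_fits_in_ptrdiff can ever return false for a real array depends on the implementation: it requires the maximum object size to exceed the maximum value of std::ptrdiff_t.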

Cavernous answered 20/3, 2018 at 9:30 Comment(18)

In other words, if i and j are pointers to elements of the same array, the result of i-j is always in the range of representable values of type std::ptrdiff_t? – Justificatory 20/3, 2018 at 9:36

No, support.types.layout doesn't say any such thing. It says "... as described in expr.add" and expr.add describes a case where i-j is not representable. – Bribe 20/3, 2018 at 9:54

An interesting aspect: assuming size_t and ptrdiff_t have the same size, since the former is unsigned but the latter is signed, size_t could be used to define arrays larger than what a pointer difference can hold. What does the standard say about that? If any difference must be representable, we'd have to conclude (possibly without it actually being mentioned) that array sizes must not exceed std::numeric_limits<std::ptrdiff_t>::max(), no matter whether std::size_t is capable of holding such values or not... – Borgerhout 20/3, 2018 at 10:2

note: std::size_t as subscript index was an error anyway. We might see a standard std::index defined as a signed integer with sizeof(std::index) == sizeof(void*) someday. – Cavernous 20/3, 2018 at 10:4

@Borgerhout That's always been my personal interpretation of those quotes. – Pleasure 20/3, 2018 at 10:4

There are always numeric_limits. – Bribe 20/3, 2018 at 10:11

@Cavernous Then imagine we are on modern 64-bit linux with sizeof(unsigned long) == sizeof(size_t) == 8, would I then have to conclude that unsigned long as array subscript is an error, too? Or older 32-bit OS, sizeof(size_t) == sizeof(unsigned int) == 4, then I wouldn't even be able to use unsigned int as subscript - getting to microcontrollers, we might end up in not being able to use any unsigned subscript at all... – Borgerhout 20/3, 2018 at 10:12

@Borgerhout if it must be representable, then the note in expr.add/5 makes no sense, as it describes an impossible condition. Notes are not normative, but still, if we detect that our interpretation of the standard makes a non-normative part meaningless, we must assume that perhaps our interpretation is not the only possible one and may in fact be incorrect. Google PTRDIFF_MAX problems to see that there are in fact different interpretations assumed by actual implementations. – Bribe 20/3, 2018 at 10:23

@Cavernous If you happen to know, could you cite where the standard prohibits using size_t as an array subscript? I can't arrive at that with my own reasoning, as exceeding array bounds is already UB anyway, and with array size limited to what a ptrdiff_t can hold, there is no further need for such a restriction... – Borgerhout 20/3, 2018 at 10:32

@Borgerhout no it does not. But there is discussion about using another type than std::size_t as subscript type for the standard library in future versions. – Cavernous 20/3, 2018 at 10:33

@n.m. I.e. such large arrays are allowed, but calculating pointer differences within them leads to UB if the distances between elements are too large? – Borgerhout 20/3, 2018 at 10:51

@Borgerhout It would be rather impractical to disallow "large" arrays on e.g. 16-bit platforms. You would have a choice between 64K bytes of data memory but only 32K bytes max array size, or a wider than 16 bit ptrdiff_t, both of which are undesirable. So UB it is. – Bribe 20/3, 2018 at 11:40
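
The 16-bit trade-off described in this comment can be modelled portably. The sketch below is only an illustration under stated assumptions: it pretends that size_t is std::uint16_t and ptrdiff_t is std::int16_t (the aliases model_size_t and model_ptrdiff_t and all function names are hypothetical), and shows that only arrays with 1-byte elements can have more elements than the signed type can count:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumed model of a 16-bit target, not a real platform description.
using model_size_t = std::uint16_t;
using model_ptrdiff_t = std::int16_t;

// Largest element count of an array whose total size fits in model_size_t.
// elem_size must be nonzero.
constexpr std::uint32_t max_element_count(std::uint32_t elem_size) {
    return std::numeric_limits<model_size_t>::max() / elem_size;
}

// True if every index difference of the largest possible array of such
// elements is representable in model_ptrdiff_t.
constexpr bool every_difference_fits(std::uint32_t elem_size) {
    return max_element_count(elem_size) <=
           static_cast<std::uint32_t>(
               std::numeric_limits<model_ptrdiff_t>::max());
}

static_assert(!every_difference_fits(1), "up to 65535 chars, but max is 32767");
static_assert(every_difference_fits(2), "at most 32767 two-byte elements");

// Index difference in the model; the assert marks the case that would be
// undefined behavior for real pointer subtraction.
inline model_ptrdiff_t model_difference(std::uint32_t i, std::uint32_t j) {
    const std::int64_t d =
        static_cast<std::int64_t>(i) - static_cast<std::int64_t>(j);
    assert(d >= std::numeric_limits<model_ptrdiff_t>::min() &&
           d <= std::numeric_limits<model_ptrdiff_t>::max());
    return static_cast<model_ptrdiff_t>(d);
}
```

In this model, a 40000-element char array is a valid object, yet the difference between its last and first index (39999) would trip the assertion in model_difference.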

This doesn't really answer the question – Nikos 23/3, 2018 at 11:21

Why would you write sizeof(decltype(some_array[0])) instead of sizeof some_array[0]? – Nikos 24/3, 2018 at 3:21

@Nikos I got a new keyboard and I'm excited to write superfluous words :D all jokes aside, that was a mistake. – Cavernous 26/3, 2018 at 7:56

"Now, on usual target, this would not happen as sizeof(std::ptrdiff_t) == sizeof(void * )." This seems not to consider the sign bit, while the previous snippet does. – Pleasure 26/3, 2018 at 8:46

@Pleasure it just means ptrdiff_t can hold really large numbers on usual targets (32 or 64 bits), not that it can hold any void* difference (remember, you can only subtract pointers to elements of the same array). – Cavernous 26/3, 2018 at 8:50

For some bizarre reason, C11 no longer allows a 16-bit ptrdiff_t, even on a freestanding implementation where the total amount of storage is less than 32K. – Fanning 15/9, 2018 at 1:49

I think it is a defect in the wording.

The rule in [expr.add] is inherited from the corresponding rule for pointer subtraction in the C standard. In the C standard, ptrdiff_t is not required to be able to hold every difference of two subscripts in an array object.

The rule in [support.types.layout] comes from Core Language Issue 1122. It added direct definitions for std::size_t and std::ptrdiff_t, which was supposed to solve the problem of a circular definition. I don't see any reason (at least none mentioned in any official document) to require std::ptrdiff_t to hold every difference of two subscripts in an array object. My guess is that it simply uses an imprecise definition to solve the circular definition issue.

As further evidence, [diff.library] does not mention any difference between std::ptrdiff_t in C++ and ptrdiff_t in C. Since ptrdiff_t has no such constraint in C, std::ptrdiff_t should not have such a constraint in C++ either.

Deuno answered 23/3, 2018 at 15:45 Comment(1)

It might also be worth noting that the C standard explicitly lists the undefined behavior caused by pointer subtraction (if the result does not fit in a ptrdiff_t) in the informative Annex J under "J.2 Undefined behavior". – Justificatory 23/3, 2018 at 16:19
