Is it unspecified behavior to compare pointers to different arrays for equality?

Asked 5/2, 2011 at 21:26 Answered 6/2, 2011 at 2:59

Solved c++ pointers standards unspecified-behavior

The equality operators have the semantic restrictions of relational operators on pointers:

The == (equal to) and the != (not equal to) operators have the same semantic restrictions, conversions, and result type as the relational operators except for their lower precedence and truth-value result. [C++03 §5.10p2]

And the relational operators have a restriction on comparing pointers:

If two pointers p and q of the same type point to different objects that are not members of the same object or elements of the same array or to different functions, or if only one of them is null, the results of p<q, p>q, p<=q, and p>=q are unspecified. [§5.9p2]

Is this a semantic restriction which is "inherited" by equality operators?

Specifically, given:

int a[42];
int b[42];

It is clear that (a + 3) < (b + 3) is unspecified, but is (a + 3) == (b + 3) also unspecified?

Durwood answered 5/2, 2011 at 21:26 Comment(5)

Interesting question. If it was so, what about all the self-assignment tests if (this != &other) – Syzran 5/2, 2011 at 21:31

Except for the opaque phrasing, it seems really quite simple: The standard fully specifies under which circumstances two pointers compare not-equal. Which pointer value (address) of two unrelated objects is larger however, is simply (and - IMHO - rather obviously) unspecified. – Shoot 5/2, 2011 at 21:44

@Martin: What about a segmented architecture with near pointers having same offset, but for different segments? I don't think you'd want equality to be fully specified in this case, and the standard requires this comparison case to be well-formed (must compile, execute, etc.), near as I can tell. – Durwood 5/2, 2011 at 21:50

the standard does require meaningful results, even in that case -- i.e., == must only be true if both of them are null pointers, or else refer to the same object (and the converse for !=, of course). – Alvita 5/2, 2011 at 21:58

In segmented memory the == and != must compare the segment part of the pointer too. – Lighterman 25/10, 2019 at 13:9

The semantics for op== and op!= explicitly say that the mapping holds except for their truth-value result, so you need to look at what is defined for their truth-value result. If the standard said that the result is unspecified, then it would be unspecified. If it defines specific rules, then it is not. It says in particular:

Two pointers of the same type compare equal if and only if they are both null, both point to the same function, or both represent the same address
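As a minimal sketch of what that rule implies for the arrays in the question (the helper function names are made up for illustration and are not from the standard or the original post):

```cpp
#include <cassert>

// The same two arrays as in the question.
int a[42];
int b[42];

// Pointers to elements of different arrays designate different objects,
// so == has a specified result here: false.
bool different_arrays_equal() { return (a + 3) == (b + 3); }

// Two pointers to the same element represent the same address: true.
bool same_element_equal() { return (a + 3) == &a[3]; }

// Two null pointers of the same type compare equal: true.
bool nulls_equal() {
    int* p = 0;
    int* q = 0;
    return p == q;
}
```

Under this reading, only the ordering comparison (a + 3) < (b + 3) remains unspecified; the equality comparisons above all have a single permitted result.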

Snodgrass answered 5/2, 2011 at 21:31 Comment(3)

I was not reading it as the unspecified-ness of the result being included in "except for their truth-value result", since that would seem to negate "have the same semantic restrictions". I'm not sure this answer is the best way to interpret or not, but it would resolve this question. – Durwood 5/2, 2011 at 21:39

+1 : I totally agree about the interpretation of except for their truth-value result. It is horrible standardese though :-) – Shoot 5/2, 2011 at 21:39

The quoted text has been superseded by DR1652 – Ustkamenogorsk 8/2, 2018 at 1:23

The equality operators (== and !=) produce specified results as long as the pointers are to objects of the same type. Given two pointers to the same type, exactly one of the following is true:

both are null pointers, and they compare equal to each other.
both are pointers to the same object, and they compare equal to each other.
they are pointers to different objects, and they compare not-equal to each other.
at least one is not initialized, and the result of the comparison is not defined (and, in fact, the comparison itself may never happen--just trying to read the pointer to do the comparison gives undefined behavior).

Under the same constraints (both pointers are to the same type of object) the result from the ordering operators (<, <=, >, >=) is only specified if both of them are pointers to the same object, or to separate objects in the same array (and for this purpose, a "chunk" of memory allocated with malloc, new, etc., qualifies as an array). If the pointers refer to separate objects that are not part of the same array, the result is unspecified. If one or both of the pointers has not been initialized, you have undefined behavior.

Despite that, however, the comparison templates in the standard library (std::less, std::greater, std::less_equal and std::greater_equal) do all yield a meaningful result, even when/if the built-in operators do not. In particular, they are required to yield a total ordering. As such, you can get ordering if you want it, just not with the built-in comparison operators (though, of course, if either or both of the pointers is uninitialized, the behavior is still undefined).
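A small sketch of that difference, assuming two unrelated arrays (the names first, second, ordered_before and exactly_one_direction are hypothetical, chosen only for this example):

```cpp
#include <cassert>
#include <functional>

// Two unrelated arrays: comparing into both with the built-in < is unspecified.
int first[4];
int second[4];

// std::less on pointers is required to give a total order,
// even when the pointers refer to unrelated objects.
bool ordered_before(const int* p, const int* q) {
    return std::less<const int*>()(p, q);
}

// For two distinct pointers, a total order puts exactly one before the other.
bool exactly_one_direction(const int* p, const int* q) {
    return ordered_before(p, q) != ordered_before(q, p);
}
```

Which of the two arrays sorts first is not something the program can predict, but the answer is consistent: exactly_one_direction(first + 1, second + 1) holds, and within one array ordered_before agrees with the built-in <.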

Alvita answered 5/2, 2011 at 21:52 Comment(9)

The relational operators are defined, but unspecified in the given case. None of the comparison operators (when used on pointers) are undefined. I'm asking about == and not <, and std::equal_to says it uses == without being included in the special allowance for std::less, etc. – Durwood 5/2, 2011 at 21:53

@Fred -- the operators are defined, but the results are not. I guess I could try to re-word that to be a bit more clear, but (IMO) what I've said is already easier to understand than the wording in the standard. – Alvita 5/2, 2011 at 21:56

The results are unspecified, which is distinctly defined differently from undefined (as in undefined behavior). A terminological nitpick, perhaps, but important since UB has severe implications; and one which I found surprising, since I had also thought they were UB. – Durwood 5/2, 2011 at 21:58

At least IMO, saying a result is not defined is different from saying that the code has undefined behavior, but I've done a bit of editing to ensure against that misunderstanding. – Alvita 5/2, 2011 at 22:2

I think you are using "a result is not defined" to mean exactly what the standard means with "unspecified", which, in contrast to "implementation-defined", doesn't require documentation and thus doesn't require consistency. (I base the lack of consistency on the fact that undocumented behavior can always include non-obvious and unspecified factors affecting it.) We both definitely agree it's different from undefined, but if you do mean what the standard does with "unspecified", then I think it's best to use that word when talking standardese. – Durwood 5/2, 2011 at 23:20

I would like to point out that confusingly for C, comparing "unrelated" pointers gives rise to undefined behavior (6.5.8.5). – Agio 13/4, 2011 at 17:11

Your case 3 has a special case that should be mentioned: comparison of one-past-the-end pointer of one object, to the start pointer of another object. In the C++11 original text it actually said this case should compare equal; however DR 1652 changed/clarified it to be unspecified – Ustkamenogorsk 8/2, 2018 at 1:26

Late comment, I know – but 'exactly one of the following is true' lacks yet a case: one pointer null pointer, the other one pointing to a valid object... Would a (dangling) pointer to a destructed object count as 'uninitialised'??? – Clavicle 27/3, 2022 at 16:49

@Aconcagua: You're right, there is another case there. Comparing a null pointer to a non-null pointer should always compare as not equal. After you've destroyed the pointee object, attempting to use the pointer in any way is pretty much like it's uninitialized (comparison can fail completely), but in practice that's now so rare it hardly matters (the usual case it would fail would be on a segmented architecture, which can fail completely once a segment ceases to exist). – Alvita 27/3, 2022 at 18:51

Since there's confusion on conformance semantics, these are the rules for C++. C uses a completely different conformance model.

Undefined behaviour is an oxymoronic term: it means the translator, NOT your program, may do as it pleases. This generally means it can generate code which will also do anything it pleases (but that is a deduction). Where the Standard says behaviour is undefined, the text is actually of no significance to the user, in the sense that eliding this text will not change the requirements the Standard imposes on translators.
Ill formed program means that, unless otherwise specified, the behaviour of the translator is rigidly defined: it is required to reject your program and issue a diagnostic message. The primary special case here is the One-Definition Rule: if you breach that, your program is ill-formed but no diagnostic is required.
Implementation defined imposes a requirement on the translator that it contain documentation specifying the behaviour explicitly. In this special case Undefined Behaviour can be the result but must be explicitly stated.
Unspecified is a stupid term which means that the behaviour comes from a set. In this sense well-defined is just a special case where the set of permitted behaviours contains only one element. Unspecified does not require documentation, so in some sense it also means the same as implementation defined without documentation.

In general, the C++ Standard is not a Language Standard, it is a model for a language Standard. To generate an actual Standard you have to plug in various parameters. The easiest of these to recognize are the implementation defined limits.

There are a couple of silly conflicts in the Standard. For example, a legitimate translator can reject every apparently good C++ program on the basis that you are required to supply a main() function but the translator only supports identifiers of 1 character. This problem is resolved by the notion of QOI, or Quality of Implementation. It basically says: who cares, no one is going to buy that compiler just because it is conforming.

Technically, the unspecified nature of operator < when the pointers are to unrelated objects is probably intended to mean: you will get some kind of result which is either true or false, but your program will not crash. However, this is not the correct meaning of unspecified, so that is a Defect: unspecified imposes a burden on the Standards writers to document the set of allowed behaviours, because if the set is open, then it is equivalent to undefined behaviour.

I actually proposed std::less as a solution to the problem that some data structures require keys to be totally ordered, but pointers are not totally ordered by operator <. On most machines using linear addressing less is the same as <, but the less operation on, say, an x86 processor is potentially more expensive.
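To illustrate the data-structure case, here is a minimal sketch of an ordered container keyed by pointers to unrelated objects. It relies on std::set using std::less as its default comparator; the variable and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Three unrelated objects: the built-in < between their addresses is unspecified.
int x, y, z;

// std::set<const int*> orders its keys with std::less<const int*> by default,
// so the keys are totally ordered even though the objects are unrelated.
std::size_t count_distinct() {
    std::set<const int*> seen;
    seen.insert(&x);
    seen.insert(&y);
    seen.insert(&z);
    seen.insert(&x);  // duplicate key, ignored by the set
    return seen.size();
}
```

Here count_distinct() yields 3. A hand-written comparator that used the built-in < on the keys would not carry the same guarantee.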

Sheilasheilah answered 6/2, 2011 at 2:59 Comment(3)

What term would you prefer to describe the situation where there is no guarantee as to the value an expression will produce, but code may nonetheless legitimately evaluate the expression if it is prepared for any value that may result? For example, suppose f(x), f(y), and f(z) should ideally all yield the same value, but at most one of x, y, or z may be corrupt. If operations on a corrupt value yield indeterminate result, one could (absent side-effects) safely say temp = f(x); return temp==f(y) ? temp : f(z); even if one couldn't determine whether x, y, or z was corrupt. – Hemihedral 11/12, 2013 at 19:45

If invoking f() on a corrupt value were not guaranteed safe, then if one didn't have some other means of telling whether x, y, or z was corrupt one couldn't handle corruption. On the other hand, if the invocation is safe, then even though one wouldn't know when storing temp whether the value was good or not, one would nonetheless be able to determine that later. – Hemihedral 11/12, 2013 at 19:48

The implementation is not required to reject an ill-formed program. It must generate a diagnostic but it could carry on to generate a binary or do anything else. – Ustkamenogorsk 8/2, 2018 at 1:22
