It is undefined behavior in C to perform arithmetic comparison on two unrelated pointers, that is, two pointers that don't point to the same array or object.
More or less, supposing that by "arithmetic comparison" you mean relational expressions (<, <=, >=, or >). There are some additional combinations of operands for which these have defined behavior, however:
When two pointers are compared, [... if] both point one past the last element of the same array object [...;] pointers to structure members [of the same structure object ...;] pointers to members of the same union object [...; when] the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression[s] Q+1 [and] P.
[C17 6.5.8/5]
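As a minimal sketch of two of those defined cases (the struct pair type and both function names are made up for illustration, not taken from the standard):

```c
#include <stdbool.h>

struct pair { int first; int second; };   /* hypothetical example type */

/* Both pointers refer to the same array (one of them one past its end),
   so the relational comparison is defined by C17 6.5.8/5. */
bool element_precedes_end(void)
{
    int arr[4] = {0};
    int *p = &arr[1];
    int *end = arr + 4;   /* one past the last element: fine to form and compare,
                             but never to dereference */
    return p < end;
}

/* Pointers to members of the same structure object: the member declared
   later compares greater than the one declared earlier. */
bool later_member_is_greater(void)
{
    struct pair s = {0, 0};
    return &s.first < &s.second;
}
```

Both functions return true on a conforming implementation. Replacing either operand with the address of an unrelated object would put the comparison back into the undefined "all other cases".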
On the other hand, (in)equality tests between pointers of compatible type are well defined as long as the values are determinate, regardless of what they point to:
The == (equal to) and != (not equal to) operators are analogous to
the relational operators except for their lower precedence. [...] For
any pair of operands [satisfying the constraints on such expressions],
exactly one of the relations is true.
[C17 6.5.9/3, emphasis added]
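For instance, the following sketch (function names are mine, chosen only for illustration) compares pointers to two unrelated objects for equality, which is fine even though a relational comparison of the same two pointers would not be:

```c
#include <stdbool.h>

/* a and b are unrelated objects, so &a < &b would be undefined,
   but == and != on the same operands are well defined. */
bool distinct_objects_compare_unequal(void)
{
    int a = 0, b = 0;
    int *pa = &a, *pb = &b;
    return (pa != pb) && !(pa == pb);   /* exactly one of the relations holds */
}

/* Two pointers to the same object compare equal. */
bool same_object_compares_equal(void)
{
    int a = 0;
    int *p1 = &a, *p2 = &a;
    return p1 == p2;
}
```

Both functions return true.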
You go on to say,
One could, however, cast them to uintptr_t:
int a, b;
bool not_ub = (uintptr_t)&a < (uintptr_t)&b;
The cast is defined and the comparison is, too.
Yes, but the significance of such an integer comparison with respect to the original pointers is not defined.
However, is it UB to compare the two pointers using memcmp?
No. The specifications for memcmp() place no limitations of their own on what its arguments point to, and a restriction such as you postulate would not serve the purposes of the function. Its description simply says:
The memcmp function compares the first n characters of the object pointed to by s1 to the first n characters of the object pointed to by s2.
A footnote calls out potential problems with comparing structure and union padding, or the contents of char arrays containing strings, past the string terminator, but nowhere in the function description is there any reason to suppose that there are special cases for pointers to any kind of objects, including pointer objects.
And why should there be? memcmp() is not about the semantic values of the objects to which its arguments point. It is about their representations. In that respect, note well that given valid pointers to compatible types p1 and p2, the expression p1 == p2 evaluating to 1 does not imply that memcmp(&p1, &p2, sizeof p1) will return 0. That is, two pointers to the same object can have different representations.
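Here is a small sketch of what comparing pointer representations looks like in practice. The helper names same_representation and copy_has_same_representation are hypothetical, and the only case shown is one where the outcome is guaranteed: a pointer copied byte for byte has the same representation as the original.

```c
#include <stdbool.h>
#include <string.h>

/* Compare the object representations (the raw bytes) of two pointer
   objects. This is defined regardless of what the pointers point to;
   only the pointer objects themselves are read. */
bool same_representation(int *const *lhs, int *const *rhs)
{
    return memcmp(lhs, rhs, sizeof *lhs) == 0;
}

/* A byte-for-byte copy necessarily has the same representation,
   and therefore also compares equal with ==. */
bool copy_has_same_representation(void)
{
    int x = 0;
    int *p1 = &x;
    int *p2;
    memcpy(&p2, &p1, sizeof p1);
    return same_representation(&p1, &p2) && p1 == p2;
}
```

The implication runs in one direction only: identical bytes mean equal pointers, but the standard does not promise that equal pointers obtained by different routes have identical bytes.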
By the same token, nonzero results from memcmp() operating on a pair of pointers have no defined relationship with the relative locations in memory of the objects to which they point (not even under the assumption that the question is meaningful in the execution environment, which is not a given). For a rather pedestrian case, suppose that the environment represents pointers as 64-bit integers conveying indexes into a flat address space. For a given pair of distinct indices, the sign of the result from memcmp() can depend on the machine's endianness.
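To make that concrete, here is a sketch that models such a flat 64-bit address as a uint64_t and lays its bytes out in both orders by hand. Everything here is an assumption for illustration: flat_addr, store_le, store_be, cmp_le and cmp_be are made-up names, and real pointer representations need not look like this at all.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a pointer in a flat 64-bit address space. */
typedef uint64_t flat_addr;

/* Lay out the address with the least significant byte first. */
static void store_le(unsigned char out[8], flat_addr v)
{
    for (int i = 0; i < 8; ++i)
        out[i] = (unsigned char)(v >> (8 * i));
}

/* Lay out the address with the most significant byte first. */
static void store_be(unsigned char out[8], flat_addr v)
{
    for (int i = 0; i < 8; ++i)
        out[i] = (unsigned char)(v >> (8 * (7 - i)));
}

/* Sign (-1, 0 or 1) of memcmp() over the little-endian layouts. */
int cmp_le(flat_addr a, flat_addr b)
{
    unsigned char x[8], y[8];
    store_le(x, a);
    store_le(y, b);
    int r = memcmp(x, y, sizeof x);
    return (r > 0) - (r < 0);
}

/* Sign (-1, 0 or 1) of memcmp() over the big-endian layouts. */
int cmp_be(flat_addr a, flat_addr b)
{
    unsigned char x[8], y[8];
    store_be(x, a);
    store_be(y, b);
    int r = memcmp(x, y, sizeof x);
    return (r > 0) - (r < 0);
}
```

With a = 0x0100 and b = 0x00FF, cmp_le(a, b) is -1 because the first bytes compared are 0x00 and 0xFF, while cmp_be(a, b) is 1 because the first differing bytes are 0x01 and 0x00. The same two addresses thus order differently depending only on byte order.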