Rationale for pointer comparisons outside an array being UB

So, the standard (referring to N1570) says the following about comparing pointers:

C99 6.5.8/5 Relational operators

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. ... [snip obvious definitions of comparison within aggregates] ... In all other cases, the behavior is undefined.

What is the rationale for this instance of UB, as opposed to specifying (for instance) conversion to intptr_t and comparison of that?

Is there some machine architecture where a sensible total ordering on pointers is hard to construct? Is there some class of optimization or analysis that unrestricted pointer comparisons would impede?

A deleted answer to this question mentions that this piece of UB allows for skipping comparison of segment registers and only comparing offsets. Is that particularly valuable to preserve?

(That same deleted answer, as well as one here, note that in C++, std::less and the like are required to implement a total order on pointers, whether the normal comparison operator does or not.)
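
(For concreteness, here is a minimal sketch of the distinction I'm asking about; the functions and names below are my own illustration, not text from the standard.)

#include <stdint.h>

int arr[10];
int x, y;

int within_one_array(void)
{
    int *p = &arr[2], *q = &arr[7];
    return p < q;                 /* defined: both point into the same array */
}

int between_unrelated_objects(void)
{
    return &x < &y;               /* undefined per 6.5.8/5: unrelated objects */
}

int via_integer_conversion(void)
{
    /* the alternative the question asks about: convert to intptr_t (where it
       exists) and compare the integers; today the result of that comparison
       is implementation-specific rather than a guaranteed total order */
    return (intptr_t)(void *)&x < (intptr_t)(void *)&y;
}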

Hollis answered 1/7, 2015 at 1:19 Comment(11)
The committee did not answer this specific question in its official rationale for the standard, so any answer is speculative, unless possibly from a member of the committee. Nevertheless, official rationale for related aspects of the spec considers segmented architectures and addressability of bytes near aggregate object representations, so I'd be inclined to guess that @cheersandhth has it about right in that part of his answer that refers to C.Daleth
It follows, by the way, that although you can explicitly convert to intptr_t and compare, the result does not necessarily carry information about the relative locations of unrelated objects in memory. Putting it in those terms makes the restriction more sensible to me.Daleth
Segmented architectures make it hell — as found on Intel 80386 and relatives. It was particularly painful in the bad old days of 80286, 80186, 8086, 8088. You could have multiple bit representations for the same address. Even with modern Pentiums and x86_64s etc, you can in theory have segmented addresses; no-one is insane enough to use them, though. Well, almost no-one — I'm sure there are people out there still running software with segmented addresses.Yusem
This is not an opinion based question, as far as I know all undefined behavior in the C and C++ standards have rationales. Whether they are easy to discern from public sources is another question but that does not make them bad questions.Fia
Note that you can quite easily dodge this undefined behavior by casting the pointers to integers and then comparing the integers instead. 100% deterministic and well-defined behavior.Feucht
@JonathanLeffler This is still quite common in microcontrollers, since old 8/16 bit microcontroller architectures invented in the 70s, 80s and 90s are still used in production. You'll have "banked" memory and all kinds of strange pointer types to access it. Many embedded programmers fanatically cling on to this old crap for no sound reason.Feucht
@Lundin: Nothing in the Standard requires that comparisons of integer representations yield the same results as comparisons of pointers, even in scenarios where the latter are well-defined. Embedded programmers often need to be able to tweak existing designs. If a product has been selling well for ten years and people keep buying it, but a part becomes unavailable, being able to use the existing code with new parts may be preferable to rewriting everything from scratch. That may be especially true for products that need to interact with other equipment (which may depend upon some...Hovercraft
...undocumented aspects of its behavior, such as a delay between when it sees a certain input and when it triggers a certain output). Being able to engineer something with new parts that can serve as a drop-in replacement for something built with old parts is very useful in the real world. I'm not sure what you mean by "no sound reason".Hovercraft
@Hovercraft In the real world for 99% of all computers, a pointer converted to an integer will compare the same as a pointer compared with a pointer, because addresses are just numbers in 99% of all computers. As for clinging to old crap, I was rather referring to would-be engineers picking a new MCU for their new project and stating things like "lets take PIC because it is cheapest" and then they haven't checked MCU prices since the mid-90s, let alone considered that an extremely code inefficient CPU needs far more costly flash. Or "lets take PIC because I'm too lazy to learn new things".Feucht
The 8-bit PIC architectures are very good at things like bit-banging, and for many embedded-systems purposes having one primary accumulator and a large permanently-accessible working set is more useful than having a working set of 8 or 16 equivalent registers. For some kinds of code the ARM can spin circles around the PIC, but for other kinds of code the reverse is true. On the PIC, decrementing an 8-bit global variable in the common bank is one instruction; on the ARM it's 3-4. On the PIC, setting an I/O register bit in interrupt-safe fashion is one instruction; on most ARMs...Hovercraft
...it can be a lot more (generally requiring the access to be bracketed in a save/disable/restore sequence or to use an ldrex/strex/bne sequence). From an architectural perspective I know of nothing that's better than the PIC for the things it does well, though cheaper ARMs are often good enough even at those purposes.Hovercraft

Various comments in the ub mailing list discussion Justification for < not being a total order on pointers? strongly allude to segmented architectures being the reason, including the following comments. 1:

Separately, I believe that the Core Language should simply recognize the fact that all machines these days have a flat memory model.

and 2:

Then we maybe need an new type that guarantees a total order when converted from a pointer (e.g. in segmented architectures, conversion would require taking the address of the segment register and adding the offset stored in the pointer).

and 3:

Pointers, while historically not totally ordered, are practically so for all systems in existence today, with the exception of the ivory tower minds of the committee, so the point is moot.

and 4:

But, even if segmented architectures, unlikely though it is, do come back, the ordering problem still has to be addressed, as std::less is required to totally order pointers. I just want operator< to be an alternate spelling for that property.

Why should everyone else pretend to suffer (and I do mean pretend, because outside of a small contingent of the committee, people already assume that pointers are totally ordered with respect to operator<) to meet the theoretical needs of some currently non-existent architecture?

Counter to the trend of comments from the ub mailing list, FUZxxl points out that supporting DOS is a reason not to support totally ordered pointers.

Update

This is also supported by the Annotated C++ Reference Manual (ARM), which says this was due to the burden of supporting it on segmented architectures:

The expression may not evaluate to false on segmented architectures [...] This explains why addition, subtraction and comparison of pointers are defined only for pointers into an array and one element beyond the end. [...] Users of machines with a nonsegmented address space developed idioms, however, that referred to the elements beyond the end of the array [...] was not portable to segmented architectures unless special effort was taken [...] Allowing [...] would be costly and serve few useful purposes.
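
For example, the kind of boundary this draws (my own sketch, not taken from the ARM): the one-past-the-end form is blessed, while a before-the-beginning variant remains outside what the standard defines.

/* defined: p may be computed as, and compared against, the one-past-the-end
   address a + n, as long as it is not dereferenced there */
void clear_forward(int *a, int n)
{
    for (int *p = a; p < a + n; p++)
        *p = 0;
}

/* the problematic old idiom: once p reaches a, the decrement forms a - 1,
   a pointer before the start of the array, which the standard never allows
   and which can misbehave on segmented machines */
void clear_backward(int *a, int n)
{
    for (int *p = a + n - 1; p >= a; p--)
        *p = 0;
}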

Fia answered 1/7, 2015 at 2:47 Comment(2)
I think supporting DOS is a huge reason for not mandating totally ordered pointers. Where do these people get the idea that their machines are the only machines people write C code for?Arezzo
There are some algorithms that require a total ordering, but a lot more that merely require that a comparison between two arbitrary pointers have no side-effect other than yielding zero or one. Can you think of any sane reason the Standard shouldn't have required that implementations either guarantee that or predefine a macro indicating that they cannot?Hovercraft

The 8086 is a processor with 16-bit registers and a 20-bit address space. To cope with the lack of bits in its registers, a set of segment registers exists. On memory access, the dereferenced address is computed like this:

address = 16 * segment + offset

Notice that, among other things, an address generally has multiple representations. Comparing two arbitrary addresses is tedious, as the compiler first has to normalize both addresses and then compare the normalized results.
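
A hypothetical sketch of what that normalization amounts to (the helper name linear_address is mine, not something a real compiler or the standard specifies):

/* real-mode 8086: the 20-bit physical address behind a segment:offset pair */
unsigned long linear_address(unsigned segment, unsigned offset)
{
    return 16UL * (unsigned long)segment + offset;
}

/* e.g. 0x1234:0x0005 and 0x1200:0x0345 both name physical address 0x12345,
   so comparing arbitrary far pointers means normalizing both sides first,
   rather than simply comparing the 16-bit offsets */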

Many compilers specify (in the memory models where this is possible) that when doing pointer arithmetic, the segment part is to be left untouched. This has several consequences:

  • objects can have a size of at most 64 kB
  • all addresses in an object have the same segment part
  • comparing addresses within an object can be done just by comparing the offset part; that takes a single instruction

This fast comparison of course only works when the pointers are derived from the same base-address, which is one of the reasons why the C standard defines pointer comparisons only for when both pointers point into the same object.

If you want a well-ordered comparison for all pointers, consider converting the pointers to uintptr_t values first.
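
A minimal sketch of that approach, assuming uintptr_t is available on the implementation (the function name ptr_total_order is mine); note, as a comment above points out, that the standard does not promise this order reflects relative locations in memory, only that comparing the converted integers is itself well-defined:

#include <stdint.h>

/* sketch: qsort-style comparison ordering pointers by their converted
   uintptr_t values; returns -1, 0 or 1 */
int ptr_total_order(const void *a, const void *b)
{
    uintptr_t ua = (uintptr_t)a;
    uintptr_t ub = (uintptr_t)b;
    return (ua > ub) - (ua < ub);
}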

Arezzo answered 1/7, 2015 at 7:18 Comment(0)

I believe it's undefined so that C can be run on architectures where, in effect, "smart pointers" are implemented in hardware, with various checks to ensure that pointers never accidentally point outside of the memory regions they're defined to refer to. I've never personally used such a machine, but the way to think about them is that computing an invalid pointer is precisely as forbidden as dividing by 0; you're likely to get a run-time exception that terminates your program. Furthermore, what's forbidden is computing the pointer; you don't even have to dereference it to get the exception.
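
For instance (my own sketch, not from any particular architecture's manual), on such a checked machine the second computation below could already trap, even though nothing is ever dereferenced:

void demo(void)
{
    int a[10];
    int *one_past_end = a + 10;   /* fine: one past the end may be computed */
    int *out_of_range = a + 11;   /* undefined: merely forming this pointer
                                     could raise the hardware exception */
    (void)one_past_end;
    (void)out_of_range;
}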

Yes, I believe the definition also ended up permitting more-efficient comparisons of offset registers in old 8086 code, but that was not the only reason.

Yes, a compiler for one of these protected pointer architectures could theoretically implement the "forbidden" comparisons by converting to unsigned or the equivalent, but (a) it would likely be significantly less efficient to do so and (b) that would be a wantonly deliberate circumvention of the architecture's intended protection, protection which at least some of the architecture's C programmers would presumably want to have enabled (not disabled).

Sequestration answered 1/7, 2015 at 4:0 Comment(0)

Historically, saying that an action invoked Undefined Behavior meant that any program which made use of such actions could be expected to work correctly only on those implementations which defined, for that action, behavior meeting their requirements. Specifying that an action invoked Undefined Behavior didn't mean that programs using such actions should be considered "illegitimate", but was rather intended to allow C to be used to run programs that didn't require such actions on platforms which could not efficiently support them.

Generally, the expectation was that a compiler would either output the sequence of instructions which would most efficiently perform the indicated action in the cases required by the standard, doing whatever that sequence of instructions happened to do in other cases, or output a sequence of instructions whose behavior in such cases was deemed in some fashion more "useful" than the natural sequence. In cases where an action might trigger a hardware trap, or where triggering an OS trap might plausibly be considered preferable to executing the "natural" sequence of instructions, and where a trap might cause behaviors outside the control of the C compiler, the Standard imposes no requirements. Such cases are thus labeled as "Undefined Behavior".

As others have noted, there are some platforms where p1 < p2, for unrelated pointers p1 and p2, could be guaranteed to yield 0 or 1, but where the most efficient means of comparing p1 and p2 that would work in the cases defined by the Standard might not uphold the usual expectation that exactly one of p1 < p2, p1 > p2, or p1 == p2 holds. If a program written for such a platform knows that it will never deliberately compare unrelated pointers (implying that any such comparison would represent a program bug), it may be helpful to have stress-testing or troubleshooting builds generate code which traps on any such comparisons. The only way for the Standard to allow such implementations is to make such comparisons Undefined Behavior.

Until recently, the fact that a particular action would invoke behavior that was not defined by the Standard would generally only pose difficulties for people trying to write code on platforms where the action would have undesirable consequences. Further, on platforms where an action could only have undesirable consequences if a compiler went out of its way to make it do so, it was generally accepted practice for programmers to rely upon such an action behaving sensibly.

If one accepts the notions that:

  1. The authors of the Standard expected that comparisons between unrelated pointers would work usefully on those platforms, and only those platforms, where the most natural means of comparing related pointers would also work with unrelated ones, and

  2. There exist platforms where comparing unrelated pointers would be problematic

Then it makes complete sense for the Standard to regard unrelated-pointer comparisons as Undefined Behavior. Had they anticipated that even compilers for platforms which define a disjoint global ranking for all pointers might make unrelated-pointer comparisons negate the laws of time and causality (e.g. given:

int needle_in_haystack(char const *hs_base, int hs_size, char *needle)
{ return needle >= hs_base && needle < hs_base+hs_size; }

a compiler may infer that the program will never receive any input which would cause needle_in_haystack to be given unrelated pointers, and any code which would only be relevant when the program receives such input may be eliminated), I think they would have specified things differently. Compiler writers would probably argue that the proper way to write needle_in_haystack would be:

int needle_in_haystack(char const *hs_base, int hs_size, char *needle)
{
  for (int i=0; i<hs_size; i++)
    if (hs_base+i == needle) return 1;
  return 0;
}

since their compilers would recognize what the loop is doing and also recognize that it's running on a platform where unrelated pointer comparisons work, and thus generate the same machine code as older compilers would have generated for the earlier-stated formulation. As to whether it would be better to require that compilers provide a means of specifying that code resembling the former version should either behave sensibly on platforms that will support it or refuse compilation on those that won't, or better to require that programmers intending the former semantics should write the latter and hope that optimizers turn it into something useful, I leave that to the reader's judgment.

Hovercraft answered 1/7, 2015 at 15:30 Comment(0)
