Ramifications of C++20 requiring two's complement

C++20 will specify that signed integral types must use two's complement. This doesn't seem like a big change given that (virtually?) every implementation currently uses two's complement.

But I was wondering if this change might shift some "undefined behaviors" to be "implementation defined" or even "defined."

Consider the absolute value function, std::abs(int), and some of its overloads. The C++ standard includes this function by reference to the C standard, which says that the behavior is undefined if the result cannot be represented.

In two's complement, there is no positive counterpart to INT_MIN:

abs(INT_MIN) == -INT_MIN == undefined behavior

In sign-magnitude representation, there is:

-INT_MIN == INT_MAX

Thus it seemed reasonable that abs() was left with some undefined behavior.

Once two's complement is required, it would seem to make sense that abs(INT_MIN)'s behavior could be fully specified or, at least, implementation defined, without any issue of backward compatibility. But I don't see any such change proposed.
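To make the asymmetry concrete, here is a minimal sketch of two ways to sidestep the undefined case today. Both `unsigned_abs` and `checked_abs` are hypothetical helper names, not standard library functions, and the sketch assumes `unsigned int` has at least one more value bit than `int`, which holds on mainstream platforms.

```cpp
#include <climits>
#include <optional>

// Hypothetical helper (not in the standard library): returns |x| as an
// unsigned value, which can hold the magnitude of INT_MIN.
// Assumes unsigned int has at least one more value bit than int.
constexpr unsigned int unsigned_abs(int x) {
    // Conversion to unsigned is well defined (modulo 2^N), and unsigned
    // negation wraps, so no signed overflow occurs even for INT_MIN.
    const unsigned int ux = static_cast<unsigned int>(x);
    return x < 0 ? 0u - ux : ux;
}

// Hypothetical helper: reports failure instead of reaching undefined behavior.
constexpr std::optional<int> checked_abs(int x) {
    if (x == INT_MIN) return std::nullopt;  // -INT_MIN is not representable
    return x < 0 ? -x : x;
}

// In two's complement the negative range has exactly one extra value.
static_assert(INT_MIN + INT_MAX == -1);
static_assert(unsigned_abs(INT_MIN) == static_cast<unsigned int>(INT_MAX) + 1u);
static_assert(!checked_abs(INT_MIN).has_value());
static_assert(checked_abs(-5).value() == 5);
```

Neither helper changes what std::abs(INT_MIN) means; they only avoid evaluating it.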

The only drawback I see is that the C++ Standard would need to specify abs() explicitly rather than referencing the C Standard's description of abs(). (As far as I know, C is not mandating two's complement.)

Was this just not a priority for the committee or are there still reasons not to take advantage of the simplification and certainty that the two's complement mandate provides?

Atalante answered 5/8, 2019 at 17:20 Comment(1)
A note: C23 now removes support for any signed integer representation other than two's complement. Reference: en.cppreference.com/w/c/23 - Sowder
W
19

One of the specific questions considered by the committee was what to do about -INT_MIN, and the results of that poll were:

addition / subtraction / multiplication and -INT_MIN overflow is currently undefined behavior, it should instead be:

4: wrap
6: wrap or trap
5: intermediate values are mathematical integers
14: status quo (remain undefined behavior)

This was explicitly considered and people felt that the best option was keeping it undefined behavior.

To clarify "intermediate values are mathematical integers": another part of the paper explains that this means (int)a + (int)b > INT_MAX might be true.
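Under the current rules, that comparison on two int operands can never be true in a program with defined behavior, because the sum overflows first. As a rough illustration of what the "mathematical integers" option would have meant, the sketch below emulates it by widening before adding. The name `sum_exceeds_int_max` is made up for this example, and it assumes long long is wider than int.

```cpp
#include <climits>

// Sketch only: assumes long long is strictly wider than int.
static_assert(sizeof(long long) > sizeof(int),
              "this sketch assumes long long is wider than int");

// Hypothetical helper: evaluates a + b > INT_MAX as if the sum were a
// mathematical integer, by widening both operands before adding.
constexpr bool sum_exceeds_int_max(int a, int b) {
    return static_cast<long long>(a) + static_cast<long long>(b) > INT_MAX;
}

static_assert(sum_exceeds_int_max(INT_MAX, 1));
static_assert(!sum_exceeds_int_max(INT_MAX, 0));
static_assert(!sum_exceeds_int_max(INT_MIN, -1));
```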


Note that implementations are free to define specific behavior in these cases if they so choose. I don't know if any of them do.

Wanton answered 5/8, 2019 at 17:38 Comment(2)
Comments are not for extended discussion; this conversation has been moved to chat. - Rhyme
gcc -fwrapv is a C++ implementation that fully defines all signed integer overflow as 2's complement wrap-around. gcc -ftrapv defines it as trapping. The GCC default is that it's undefined behaviour so the optimizer can assume it doesn't happen, e.g. x = -x; would allow it to assume x != INT_MIN is true. - Ictus
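For code that wants wrap-around on a single operation without relying on a compiler flag, one option is to do the arithmetic in unsigned and convert back. This is a sketch, not what -fwrapv does internally; `wrapping_negate` is a hypothetical name. The conversion back to int is specified as modular from C++20 on and was implementation-defined (not undefined) before that.

```cpp
#include <climits>

// Hypothetical helper: negation with two's complement wrap-around.
// Unsigned arithmetic wraps by definition, so no signed overflow occurs.
// The unsigned-to-int conversion is modular since C++20; earlier standards
// left the out-of-range result implementation-defined.
constexpr int wrapping_negate(int x) {
    const unsigned int ux = static_cast<unsigned int>(x);
    return static_cast<int>(0u - ux);
}

static_assert(wrapping_negate(5) == -5);
static_assert(wrapping_negate(-5) == 5);
static_assert(wrapping_negate(INT_MIN) == INT_MIN);  // wraps back to itself
```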
M
3

The Committee that wrote C89 deliberately avoided any judgments about things that quality implementations "should" do when practical. The published Rationale indicates that they expected implementations to behave usefully in circumstances beyond those required by the Standard (and in the case of integer overflow, even documents some very specific expectations), but for whatever reason the Committee deliberately avoided saying such things within the Standard itself.

When later C or C++ committees added new features, they were willing to consider that those features might be supportable on some platforms and unsupportable on others. But there has almost never been any effort to revisit whether the Standard should recognize cases where many implementations would process code in the same useful and consistent fashion even though the Standard had imposed no requirements, or to provide a means by which a program could test whether an implementation supports such behavior, refuse to compile on one that doesn't, and have defined behavior on those that do.

The net effect is that something like unsigned mul_mod_65536(unsigned short x, unsigned short y) { return (x*y) & 0xFFFFu; } may arbitrarily disrupt the behavior of calling code if the arithmetical value of x*y is between INT_MAX+1u and UINT_MAX, even though the authors of the Standard said they expected that situation to be processed consistently by most implementations. Recent Standards have eliminated the main reason the authors of C89 would have expected some implementations to process this function strangely, but that doesn't mean implementations haven't decided to treat it weirdly in ways the authors of C89 could never have imagined and would never knowingly have allowed.
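The problem in that function is that both unsigned short operands are promoted to int (on platforms where int is wider than unsigned short), so x*y is a signed multiplication. A minimal sketch of a variant that keeps the arithmetic unsigned follows; the `_unsigned` suffix is added here only to distinguish it from the function quoted above.

```cpp
// Variant of the function from the text that forces unsigned arithmetic.
// Converting one operand to unsigned int makes the usual arithmetic
// conversions turn the other operand into unsigned int too, so the
// product wraps modulo 2^N instead of overflowing a signed int.
constexpr unsigned mul_mod_65536_unsigned(unsigned short x, unsigned short y) {
    return (static_cast<unsigned>(x) * y) & 0xFFFFu;
}

// 0xFFFF * 0xFFFF == 0xFFFE0001, whose low 16 bits are 0x0001.
static_assert(mul_mod_65536_unsigned(0xFFFF, 0xFFFF) == 1u);
static_assert(mul_mod_65536_unsigned(3, 4) == 12u);
```

With 32-bit int, the original version overflows for inputs such as 0xFFFF and 0xFFFF, because the mathematical product exceeds INT_MAX.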

Maas answered 5/8, 2019 at 19:37 Comment(7)
Isn't that example more a problem of integer promotion not preserving signedness rather than a signed integer implementation issue? - Walkway
@Spencer: My intention was to provide an example to illustrate the general principles that (1) if 99.9% of implementations would process a construct in the same useful fashion but 0.1% would have reason to do something else, that 0.1% would be adequate reason not to mandate the behavior, and (2) the fact that the fraction of implementations that would have a reason to do something else drops from 0.1% to zero would be insufficiently noteworthy to justify changing the Standard in response. - Maas
@seupercat Agreed, except to go further and say that the fraction going to zero removes this as a reason to preserve signedness in integer promotions. - Walkway
What has disappeared is not the need for the existing promotion rules when using the existing type, but rather the reason to treat signed and unsigned math differently in cases where all of their defined behaviors would be consistent. - Maas
That, and as a consequence, one rationale for changing the existing integer promotion rules has gone away. - Walkway
@Spencer: The fraction of implementations that would have any good reason to deviate from common practice has gone to zero, but that does not mean that the fraction of implementations that deviate from common practice has gone to zero. - Maas
Let us continue this discussion in chat. - Walkway
