Right shift and signed integer

Asked 22/9, 2011 at 22:46 Answered 1/2, 2019 at 15:52

On my compiler, the following pseudocode (values written out in binary):

sint32 word = (10000000 00000000 00000000 00000000);
word >>= 16;

produces a word whose bit pattern looks like this:

(11111111 11111111 10000000 00000000)

Can I rely on this behaviour for all platforms and C++ compilers?

Coelenteron answered 22/9, 2011 at 22:46 Comment(0)

From the following link:

INT34-C. Do not shift an expression by a negative number of bits or by greater than or equal to the number of bits that exist in the operand

Noncompliant Code Example (Right Shift)

The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation defined and can be either an arithmetic (signed) shift:

[Figure: arithmetic (signed) shift]

Or a logical (unsigned) shift:

[Figure: logical (unsigned) shift]

This noncompliant code example fails to test whether the right operand is greater than or equal to the width of the promoted left operand, allowing undefined behavior.

unsigned int ui1;
unsigned int ui2;
unsigned int uresult;
 
/* Initialize ui1 and ui2 */
 
uresult = ui1 >> ui2;

Making assumptions about whether a right shift is implemented as an arithmetic (signed) shift or a logical (unsigned) shift can also lead to vulnerabilities. See recommendation INT13-C. Use bitwise operators only on unsigned operands.
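
For contrast, here is a minimal sketch of what a checked version might look like. It is written in C++17 rather than C, and the helper name `checked_shift_right` is hypothetical, not taken from the CERT text. The shift count is tested against the operand's width before the shift happens.

```cpp
#include <limits>
#include <optional>

// Hypothetical helper (not from the CERT rule): returns std::nullopt when
// the shift count is not smaller than the width of unsigned int.
std::optional<unsigned int> checked_shift_right(unsigned int value,
                                                unsigned int count) {
    // Number of value bits in unsigned int on this implementation.
    constexpr unsigned int width = std::numeric_limits<unsigned int>::digits;
    if (count >= width) {
        return std::nullopt;  // shifting by >= width would be undefined behaviour
    }
    return value >> count;    // unsigned operand: always a logical shift
}
```

Returning `std::optional` is only one possible design. It forces the caller to decide what an out-of-range shift count should mean instead of silently picking a result.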

Dahlgren answered 22/9, 2011 at 23:5 Comment(3)

Was there no recommendation that actually focused on this issue? Because based on the name of that rule, it doesn't apply here... you're just quoting some of the provided background information. – Amiss 15/2, 2015 at 18:22

The 2nd, more relevant link has moved. The new link is wiki.sei.cmu.edu/confluence/display/c/… – Nels 29/10, 2018 at 20:14

If the amount you are shifting is a compile-time constant, you can force an arithmetic shift by using (signed) division instead. Thus, instead of 'a >> 16', which is implementation defined, you'd write 'a / (1 << 16)' which the compiler will almost certainly replace with an arithmetic shift. – Satinwood 21/1, 2020 at 19:48
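
One caveat on the division suggestion: since C++11, signed division truncates toward zero, whereas an arithmetic shift rounds toward negative infinity. The two only agree for non-negative values and for exact multiples of the divisor. A rough sketch of a helper that reproduces the arithmetic-shift result using only well-defined operations (the name `floor_div_pow2` is made up for illustration, and it assumes C++14 or later and a count between 0 and 30):

```cpp
#include <cstdint>

// Hypothetical helper: computes floor(value / 2^count) without relying on
// how >> treats negative operands. Assumes 0 <= count <= 30.
constexpr std::int32_t floor_div_pow2(std::int32_t value, int count) {
    const std::int32_t divisor = std::int32_t{1} << count;
    const std::int32_t quotient = value / divisor;   // truncates toward zero
    const std::int32_t remainder = value % divisor;  // takes the sign of value
    // A negative remainder means truncation rounded up, so step down by one.
    return (remainder < 0) ? quotient - 1 : quotient;
}

static_assert(-1 / 65536 == 0, "plain division truncates toward zero");
static_assert(floor_div_pow2(-1, 16) == -1, "floor matches an arithmetic shift");
```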

From the latest C++20 draft:

Right-shift on signed integral types is an arithmetic right shift, which performs sign-extension.
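
To make that concrete, here is a small sketch using the value from the question. The wrapper name `shift_right_signed` is hypothetical. It assumes the code is compiled as C++20, where the checks below are guaranteed to hold. Under earlier standards the same checks are implementation-defined, although they happen to pass on implementations that use an arithmetic shift.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical wrapper around the built-in operator. Under C++20 the result
// is sign-extended; count must be in the range [0, 31].
constexpr std::int32_t shift_right_signed(std::int32_t value, int count) {
    return value >> count;
}

// 10000000 00000000 00000000 00000000 shifted right by 16 ...
constexpr std::int32_t shifted =
    shift_right_signed(std::numeric_limits<std::int32_t>::min(), 16);

// ... gives 11111111 11111111 10000000 00000000, i.e. -32768.
static_assert(shifted == -32768, "sign bit is replicated");
static_assert(static_cast<std::uint32_t>(shifted) == 0xFFFF8000u,
              "bit pattern from the question");
```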

Biscuit answered 1/2, 2019 at 15:52 Comment(9)

Wait! Does that mean all signed values are required to use 2s complement? – Sixpenny 23/6, 2019 at 8:27

@Sixpenny Yes. – Biscuit 24/6, 2019 at 23:35

@Sixpenny Yes. C++20 now assumes 2s complement. Why? Because it's 2022 and f**k that thing in grandad's basement. Every credible platform in the world is 2s complement and we should stop dancing round supporting obscure and arcane hardware architectures. – Cofferdam 26/8, 2022 at 18:16

This is the corresponding paper, with a chapter surveying existing architectures: open-std.org/jtc1/sc22/wg21/docs/papers/2018/… (mostly 2s complement; the final wording came from another paper). Some very large computers, which cost millions, are 1s complement, e.g. en.wikipedia.org/wiki/CDC_6600 or en.wikipedia.org/wiki/UNIVAC_1100/2200_series – Gyre 26/8, 2022 at 19:18

@Cofferdam Maybe one day we can also have a unified ISA (metaphor: A strong ARM lifts better)? Standardized SSE instructions? Or why not just reveal the underlying shift instruction to the user, not as a compiler intrinsic (i.e. not portable), but as a language feature? The rule of only using bitwise operations on unsigned values seems so pointless, because in the end it's only bits in a computer, and the hardware doesn't care about signed/unsigned. Am I wrong in wanting these things? – Unregenerate 10/8, 2023 at 12:42

@Unregenerate With C++ one can put the compiler intrinsics into the member operators of a custom arithmetic class. For more advanced stuff, one can define member functions with more parameters. What would be missing? More syntax sugar? More operator symbols? For simulating/validating/running Nvidia CUDA algorithm code on the CPU I created a class, which keeps the local variables for all 32 threads of a warp and which also supports operations like warp shuffle and supports conditionals with lambda blocks. So really a lot is already possible. – Gyre 10/8, 2023 at 13:8

@Gyre Thank you for the insight! Regarding what "would be missing": I don't understand the notion of undefined behaviour for something low-level like a bit-shift operator, because it has an exact instruction in the ISA. I personally am not a fan of syntax sugar, but the Java choice of a dedicated >>> operator for 0-insert on signed values seems like a good option. Probably there's some history that I'm missing here, but I would prefer >> to always be arithmetic, and let people use another operator symbol for the logical op. I'm just asking because I want to understand the problem :) – Unregenerate 10/8, 2023 at 14:13

@Unregenerate The reasons are multifold: 1. The semantics of most operators on the standard types go back half a century -> Compatibility with existing C code. 2. Differences between target architectures -> Portability of algorithms while still using common native ISA instructions, which are available everywhere. 3. E.g. If signed values never overflow, the optimizer can deduce rules like 'adding a positive number always increases the value' -> Performance Optimization. 4. No good, convincing, comprehensive proposal was put together yet -> Improvements will hopefully come in the end. – Gyre 10/8, 2023 at 15:45

@Unregenerate If (as people once were) you haven't settled on the layout of signed integers, defining shift at the bit-pattern level still isn't fully portable (or useful) for signed integers on all platforms. The way forward is further standardisation, but arcane platforms shouldn't be left out in the cold. C++ has done a lot to provide platform-independent constructs, so there are built-in ways to be platform agnostic. But it has taken time and there is more to do. – Cofferdam 10/8, 2023 at 16:9

No, you can't rely on this behaviour. Right shifting of negative quantities (which I assume your example is dealing with) is implementation defined.

Quaggy answered 22/9, 2011 at 22:51 Comment(2)

Okay, that's fair. I still wonder though, if the compiler creates a binary that does use this method, would it work as expected across most hardware at least? – Coelenteron 22/9, 2011 at 22:58

If you compile something for a certain architecture, that's supposed to work the same across all implementations of that architecture. x86, for example, has different shift operations for sign-extension and non-sign-extension shifts, and it's the compiler that decides which one to use. It probably won't work at all (read: anything at all, not just this behaviour) on other architectures. – Quaggy 22/9, 2011 at 23:1

In C++, no. It is implementation and/or platform dependent.

In some other languages, yes. In Java, for example, the >> operator is precisely defined to always fill using the leftmost bit (thereby preserving the sign). The >>> operator fills using 0s. So if you want reliable behavior, one possible option would be to change to a different language. (Although obviously, this may not be an option depending on your circumstances.)
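
If switching languages isn't an option, the zero-fill half of that pair can at least be approximated in C++ by shifting an unsigned copy of the value, since conversion to unsigned and unsigned shifts are fully defined. A minimal sketch, with a hypothetical helper name. Note that, unlike Java, it does not mask the shift count and simply assumes it lies between 0 and 31.

```cpp
#include <cstdint>

// Hypothetical helper loosely mirroring Java's >>> for 32-bit values.
// Converting to uint32_t is well defined (modulo 2^32), and shifting an
// unsigned value always fills with zeros. Assumes 0 <= count <= 31.
constexpr std::uint32_t logical_shift_right(std::int32_t value, int count) {
    return static_cast<std::uint32_t>(value) >> count;
}

static_assert(logical_shift_right(-1, 28) == 0xFu, "upper bits are zero-filled");
```

The result is returned as `std::uint32_t` on purpose: converting it back to a signed type is only guaranteed to be value-preserving when the top bit is clear.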

Reserved answered 23/9, 2011 at 6:55 Comment(0)

AFAIK integers may be represented as sign-magnitude in C++, in which case sign extension would fill with 0s. So you can't rely on this.

Berlinda answered 22/9, 2011 at 22:53 Comment(1)

You're right. The standard makes some requirements that make two's complement the optimal representation, but in general, signed integers can be represented in any way the implementation wants. – Quaggy 22/9, 2011 at 22:56
