How can I reliably perform an arithmetic shift right in C++?
I

4

1

The "arithmetic shift right" operation is similar to a normal (logical) shift right, except the most significant (i.e. shifted-in) bits are filled with the sign bit rather than 0. Unfortunately, in C++ (prior to C++20, and also in C), the result of performing a right shift on a negative signed integer is [compiler/platform] implementation-defined.

Is there a way to perform an "arithmetic shift right" that is guaranteed to provide a correct result regardless of implementation details? Ideally the code is simple enough to inline, and does not contain any conditionals or branches.
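
To make the target behaviour concrete: on a two's complement value, an arithmetic shift right by sh gives the same result as dividing by 2^sh and rounding toward negative infinity. The following is only a reference sketch for checking candidate implementations against, not a branch-free solution. The names power_of_two and sar_reference are hypothetical, and sh is assumed to be in the range 0..31.

```cpp
#include <cstdint>

// Hypothetical helper: 2^sh as a 64-bit value, so that sh == 31 does not overflow.
constexpr std::int64_t power_of_two(unsigned sh) { return std::int64_t(1) << sh; }

// Reference semantics only (assumes 0 <= sh <= 31): floor(val / 2^sh).
// C++ integer division truncates toward zero, so subtract 1 whenever the
// remainder is negative to round toward negative infinity instead.
constexpr std::int32_t sar_reference(std::int32_t val, unsigned sh) {
    return static_cast<std::int32_t>(
        val / power_of_two(sh) - ((val % power_of_two(sh)) < 0 ? 1 : 0));
}

static_assert(sar_reference(-128, 4) == -8, "exact division needs no correction");
static_assert(sar_reference(-3, 1) == -2, "rounds toward negative infinity");
static_assert(-3 / 2 == -1, "plain division rounds toward zero instead");
```

The last two assertions show why a plain division by 2^sh is not a drop-in replacement for negative values that are not exact multiples of 2^sh.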

Inly answered 17/6, 2023 at 6:49 Comment(4)
If the shift count is a constant, write the shift as a division by 2^count instead. The compiler will typically optimize it to a SAR operation.Luellaluelle
True, but the provided answer handles the (pre-C++20) case where the shift count is not constant.Inly
Is there a practical need for this though? C++20 probably standardized existing implementations, so just use >>. Do you know an implementation where this doesn't work?Slipslop
There are plenty of projects where C++20 is not an available option. One example would be platforms currently without a C++20 compiler. It may be arithmetic shift on prior versions, but you have no guarantee; it is explicitly "implementation defined". It might work for you, but may not work for someone else. A "probably" shouldn't go into aerospace / life-support code, for example...Inly
T
1

Just use the >> operator, but use a detour via a wider integer:

#include <stdio.h>
#include <stdint.h>

int32_t sar(int32_t val, unsigned sh)
{
    return (int32_t)((int64_t)val >> sh);
}
volatile int32_t value=-128;
volatile unsigned shift=1;
int main(void)
{
    volatile int32_t result = sar (value, shift);
    printf("sar(%d, %u) = %d\n", value, shift, result);
    return 0;
}

My gcc inlines sar() without declaring it inline and compiles main() to:

main:
.LFB31:
    .cfi_startproc
    endbr64
    sub rsp, 24
    .cfi_def_cfa_offset 32
    mov ecx, DWORD PTR shift[rip]
    movsx   rax, DWORD PTR value[rip]
    lea rsi, .LC0[rip]
    mov edi, 1
    sar rax, cl ;<--------Shift Arithmetically Right
    mov DWORD PTR 12[rsp], eax
    mov r8d, DWORD PTR 12[rsp]
    xor eax, eax
    mov ecx, DWORD PTR shift[rip]
    mov edx, DWORD PTR value[rip]
    call    __printf_chk@PLT
    xor eax, eax
Telegony answered 17/6, 2023 at 9:57 Comment(3)
Sorry but this looks dumb to me. Not only is casting to a wider integer not necessary for right shifting, but your function is still potentially right shifting a negative number which is not defined behaviour in c.Feat
@SimonGoater Casting to a wider signed integer fills the new bits with the existing sign bit. Shifting that entire bit package to the right ensures that those correct sign bits are shifted in to our 32 bit target. Casting it to uint64_t before shifting alters the compiler output to an unsigned shr rax, cl instruction. However, that doesn't change the output of my sample, because the upper 32 bits are discarded anyway.Telegony
This is an interesting approach, at least when wider integers are available. Thanks!Inly
I
0

Here is a C++ inline function that performs an "arithmetic shift right" on a signed 32-bit integer, regardless of implementation details and with no conditionals or branches. It can be easily adapted to C if needed.

#include <cstdint>
   
inline int32_t sar(int32_t val, unsigned int sh)
{
  uint32_t uval = static_cast<uint32_t>(val);
  uint32_t result = (uval >> sh) | -((uval & 0x80000000) >> sh);
  return static_cast<int32_t>(result);
}

Explanation:

The function name sar stands for "shift arithmetic right", and is reminiscent of common assembly mnemonics. The function accepts a signed 32-bit integer val as the value to shift, and an unsigned integer sh as the number of bits to shift right. Note: In C and C++, shifting by a number of bits equal to or larger than the bit width of the (promoted) value being shifted is undefined behavior! Limit sh to a maximum of 31 in this case (for example, by masking or clamping it) to avoid this.

Since the result of a right shift on a signed integer is implementation-defined, all of our operations will be done using unsigned numbers. We begin by casting our input value to an unsigned integer uval.

Next, we perform the right shift. Since this is an unsigned shift, the most significant (i.e. shifted-in) bits are filled with 0. However, for a proper arithmetic shift right, we would want them filled with the sign bit, which is the most-significant bit of the original value.

The expression -((uval & 0x80000000) >> sh) provides the string of high-order sign bits that we need. First, we use bitwise AND (&) with a mask to extract the most significant bit, which is the sign bit. Then, we shift this bit to the right sh places. Next, we negate the result, which, on unsigned integers, performs a 2's complement operation. This gives us a number with all higher-order bits also set equal to the [shifted] sign bit! Finally, we perform a bitwise OR (|) to combine these sign bits with our shifted uval, filling the high-order bits with the sign bit.
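
As a worked example of the steps above, here is a small sketch that splits the expression into its two halves and checks the intermediate bit patterns for val = -128 and sh = 4. The helper names logical_part and sign_fill are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Hypothetical helpers that expose the two halves of the expression.
constexpr std::uint32_t logical_part(std::uint32_t uval, unsigned sh) {
    return uval >> sh;  // zero-filled (logical) shift
}
constexpr std::uint32_t sign_fill(std::uint32_t uval, unsigned sh) {
    // 0 - x on an unsigned value is the two's complement negation of x.
    return static_cast<std::uint32_t>(0u - ((uval & 0x80000000u) >> sh));
}

// val = -128 has the bit pattern 0xFFFFFF80; shift by 4.
static_assert(logical_part(0xFFFFFF80u, 4) == 0x0FFFFFF8u, "high bits are zero-filled");
static_assert(sign_fill(0xFFFFFF80u, 4) == 0xF8000000u, "ones from bit 27 upward");
static_assert((logical_part(0xFFFFFF80u, 4) | sign_fill(0xFFFFFF80u, 4)) == 0xFFFFFFF8u,
              "bit pattern of -8");
// Non-negative input: the mask is zero, so the OR changes nothing.
static_assert(sign_fill(0x00000080u, 4) == 0u, "no fill for non-negative values");
```

Negating a single set bit at position k yields ones from position k upward and zeros below it, so the OR never touches the low-order bits of the shifted value.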

In C++11 or later, we can use the following template to handle any signed integer type:

#include <type_traits>

template<typename T>
typename std::enable_if<std::is_signed<T>::value && std::is_integral<T>::value, T>::type
sar(T val, unsigned int sh) {
    using UnsignedT = typename std::make_unsigned<T>::type;
    UnsignedT uval = static_cast<UnsignedT>(val);
    UnsignedT high_bit = static_cast<UnsignedT>(-1);
    high_bit = high_bit ^ (high_bit >> 1);
    UnsignedT result = (uval >> sh) | -((uval & high_bit) >> sh);
    return static_cast<T>(result);
}

The explanation of the calculation of high_bit from the template type T is left as an exercise for the reader.
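
For readers who want to check their reasoning, one way to read the two high_bit lines is sketched below. The helper name top_bit is hypothetical; it simply repeats the same two steps for any signed integer type so the result can be inspected per width.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper mirroring the two high_bit lines of the template.
template <typename T>
constexpr typename std::make_unsigned<T>::type top_bit() {
    using U = typename std::make_unsigned<T>::type;
    // static_cast<U>(-1) is all ones; shifting it right by one clears only the
    // top bit; XOR of the two patterns leaves exactly that top bit set.
    return static_cast<U>(static_cast<U>(-1) ^ (static_cast<U>(-1) >> 1));
}

static_assert(top_bit<std::int8_t>() == 0x80u, "8-bit sign bit");
static_assert(top_bit<std::int32_t>() == 0x80000000u, "32-bit sign bit");
static_assert(top_bit<std::int64_t>() == 0x8000000000000000ull, "64-bit sign bit");
```

The point of deriving the mask this way is that it needs no knowledge of the type's width, so the same template body works for every signed integer type.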

In C++20 and later, the right bit-shift operator >> is guaranteed to be arithmetic shift right for signed integers. For earlier language versions, there are of course a variety of library and other solutions to this problem, but this answer, based on pure C++ code, is intended to be pedantic.

Inly answered 17/6, 2023 at 6:49 Comment(4)
Hmm, you are also turning on the low order bits with the final OR.Yogi
I don't think so... code has been tested. The high_bit 1s mask extends from the shifted sign bit to the most-significant bit only.Inly
The final cast is still implementation specific (pre-C++20) (see Integral_conversions).Primateship
@Primateship interesting point. I suppose the final cast could be replaced with a reinterpret_cast, as long as the platform uses 2's complement representation for signed integers. However, even reinterpret_cast has some implementation-defined behavior. It seems the only reliable solution, without depending on the platform or compiler, is to use C++20!Inly
F
0

Expanding a little on others' answers, here are C versions of the functions which calculate 'shift-arithmetic-right' on 32- and 64-bit signed integers without branching. The final cast is problematic, however.

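#include <stdint.h>

/* Unsigned shift, then OR in a mask of sign bits for the vacated positions. */
int32_t sar32(int32_t val, uint8_t sh) {
  sh &= 0x1f;  /* keep the count in 0..31; larger values wrap around */
  uint32_t uval = (uint32_t)val;
  uint32_t result = (uval >> sh) | -((uval & 0x80000000) >> sh);
  return (int32_t)result;  /* implementation-defined if result > INT32_MAX */
}

/* Widening variant: converting to uint64_t sign-extends, so the upper 32 bits supply the fill. */
int32_t sar32b(int32_t val, uint8_t sh) {
  sh &= 0x1f;
  uint64_t uval = val;
  return (int32_t)(uval >> sh);  /* narrowing cast is implementation-defined when the value does not fit */
}

/* 64-bit version of sar32. */
int64_t sar64(int64_t val, uint8_t sh) {
  sh &= 0x3f;  /* keep the count in 0..63; larger values wrap around */
  uint64_t uval = (uint64_t)val;
  uint64_t result = (uval >> sh) | -((uval & 0x8000000000000000ULL) >> sh);
  return (int64_t)result;
}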

These functions sanitise the input of sh to make the shifting safe, but they do this in a way which wraps around if values outside of what is permitted are entered. To avoid the wrapping, something like

sh = (sh >= 0x1f ? 0x1f : sh & 0x1f);

could be used but this introduces branching. One way to avoid this would be to introduce another variable

uint8_t sh2 = ((sh >= 0x1f)*0x1f) | (sh & 0x1f);

and shift with that instead.
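
To compare the two ways of sanitising the count, here is a small sketch of the wrapping mask and the clamping expression side by side. The function names are hypothetical, and both assume a 32-bit operand (valid counts 0..31). Whether the comparison compiles without a branch depends on the compiler and target, although a flag-setting instruction is the common outcome.

```cpp
#include <cstdint>

// Hypothetical names; both assume a 32-bit operand (valid counts 0..31).
constexpr std::uint8_t wrap_shift31(std::uint8_t sh) {
    return static_cast<std::uint8_t>(sh & 0x1f);  // 32 -> 0, 33 -> 1, ...
}
constexpr std::uint8_t clamp_shift31(std::uint8_t sh) {
    // (sh >= 0x1f) is 0 or 1; multiplying by 0x1f gives an all-or-nothing mask.
    return static_cast<std::uint8_t>(((sh >= 0x1f) * 0x1f) | (sh & 0x1f));
}

static_assert(wrap_shift31(33) == 1, "masking wraps");
static_assert(clamp_shift31(33) == 31, "clamping saturates");
static_assert(clamp_shift31(5) == 5, "in-range counts are unchanged");
```

With clamping, an oversized count behaves like a shift by 31, so a negative input gives -1 and a non-negative input gives 0.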

It's worth mentioning, I think, that while the function below compiles with gcc (which ensures sign extension) and gives no warning even with the -Wall -fsanitize=undefined flags, it should not be used where strict compliance with the C standard is required, because right-shifting negative integer values is implementation-defined behaviour in C.

int32_t sar(int32_t val, uint8_t sh)
{
    return val >> (sh & 0x1f); // DO NOT USE IF val < 0!!!
}

Below are 32- and 64-bit non-branching functions that use union-based type punning, which is (allegedly) not in any way undefined behaviour in 'modern' C, and that do not wrap the shift amount. This approach may not carry over to C++.

int32_t sar32(int32_t val, uint8_t sh) {
  uint8_t sh2 = ((sh >= 0x1f)*0x1f) | (sh & 0x1f);
  union {
    int64_t i;
    uint64_t u;
  } input = {0};
  input.i = val;
  input.u >>= sh2;
  return (int32_t)input.i;
}

int64_t sar64(int64_t val, uint8_t sh) {
  uint8_t sh2 = ((sh >= 0x3f)*0x3f) | (sh & 0x3f);
  union {
    int64_t i;
    uint64_t u;
  } input = {0};
  input.i = val;
  input.u = (input.u >> sh2) | -((input.u & 0x8000000000000000UL) >> sh2);
  return input.i;
}

A somewhat laborious approach (which might convert better to other languages such as C++) would be to use memcpy.

#include <stdint.h>
#include <string.h>  /* memcpy */

int32_t sar32(int32_t val, uint8_t sh) {
  uint8_t sh2 = ((sh >= 0x1f)*0x1f) | (sh & 0x1f);
  int32_t result;
  uint32_t uval32, uval32mask;
  memcpy(&uval32, &val, 4);
  uval32mask = -(uval32 >> 31);
  uval32 = (uval32 >> sh2) | (uval32mask << (31 - sh2));
  memcpy(&result, &uval32, 4);
  return result;
}

int64_t sar64(int64_t val, uint8_t sh) {
  uint8_t sh2 = ((sh >= 0x3f)*0x3f) | (sh & 0x3f);
  int64_t result;
  uint64_t uval64, uval64mask;
  memcpy(&uval64, &val, 8);
  uval64mask = -(uval64 >> 63);
  uval64 = (uval64 >> sh2) | (uval64mask << (63 - sh2));
  memcpy(&result, &uval64, 8);
  return result;
}
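
The memcpy idea carries over to C++ largely unchanged. The following is a minimal sketch, assuming sh is at most 31 (checked with an assert here rather than clamped); the name sar32_memcpy is hypothetical. Since int32_t, where it exists, is specified as a two's complement type without padding bits, copying the bytes in and out sidesteps the implementation-defined narrowing cast.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical C++ counterpart of the memcpy-based sar32 (assumes sh <= 31).
inline std::int32_t sar32_memcpy(std::int32_t val, unsigned sh) {
    assert(sh <= 31 && "shift count must be below the bit width");
    std::uint32_t bits;
    std::memcpy(&bits, &val, sizeof bits);       // reinterpret the object representation
    // All ones if the sign bit is set, zero otherwise.
    const std::uint32_t fill = static_cast<std::uint32_t>(0u - (bits >> 31));
    // Shift logically, then set the top sh + 1 bits from the fill mask.
    bits = (bits >> sh) | (fill << (31 - sh));
    std::int32_t result;
    std::memcpy(&result, &bits, sizeof result);  // copy the pattern back
    return result;
}
```

Compilers usually turn fixed-size memcpy calls like these into plain register moves, but that is an optimisation, not a guarantee.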
Feat answered 17/6, 2023 at 11:5 Comment(6)
Thanks for the C implementation. The lack of warnings could be because GCC is only considering its own implementation (which may be well-defined), or perhaps because the result of the shift is implementation-defined, not undefined.Inly
Is the final cast back to a signed integer well-defined? My parade was rained on too...Inly
I've spent some time looking into this today, to find a definitive answer. The fact that gcc gives no warning is no reassurance, given what else is written above. I think it is okay, but I want to find some credible reference to substantiate this claim. I'll keep looking....Feat
@KevinH.Patterson Admitted defeat on the casting. Please let me know if you find anything wrong with the union based solutions I posted.Feat
(((uint8_t)(sh <= 0x1f) - 1) & 0x1f) | (sh & 0x1f) also works for sh2 if the multiplication is undesirable.Feat
Even if the union escapes the warning label of "implementation-defined" (by intention or oversight), if the underlying architecture does not use 2's complement representation for negative integers, all these approaches will fail (perhaps unlikely in real life, but technically still possible). C++20 requires 2's complement, and provides a well-defined bit_cast for these operations. It also defines >> to perform arithmetic shift on signed integers. All of these posts have been noble efforts, but as a great appreciator of C++, I feel that noblesse oblige compels me to concede victory to C++20.Inly
S
0

Just use >> directly on signed integers.

The major compilers document that it performs an arithmetic shift:

  • GCC

  • MSVC

  • Clang doesn't document implementation-defined behavior at all, but since it goes out of its way to serve as a drop-in replacement for GCC and MSVC, it should be safe too.

And, as you said, C++20 guarantees sign extension for >>. I'm fairly sure this just standardizes what the compilers were doing anyway.

And to be certain, add a test:

static_assert(-4 >> 1 == -2, ">> doesn't do sign extension");
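
If a hard compile error is too strict, the same probe can instead select between the native operator and a fallback. This is only a sketch under stated assumptions: sar_portable and sar_checked are hypothetical names, sh is assumed to be in 0..31, and a single probe value is taken as representative of the compiler's behaviour.

```cpp
#include <cstdint>

// The same compile-time probe, stored as a constant instead of asserted.
constexpr bool native_shift_is_arithmetic = ((-4) >> 1) == -2;

// Hypothetical fallback: unsigned shift plus sign-fill mask (sh assumed 0..31).
constexpr std::int32_t sar_portable(std::int32_t val, unsigned sh) {
    return static_cast<std::int32_t>(
        (static_cast<std::uint32_t>(val) >> sh) |
        (0u - ((static_cast<std::uint32_t>(val) & 0x80000000u) >> sh)));
}

// Hypothetical wrapper: the condition is a compile-time constant, so
// optimizing compilers normally keep only one arm of the conditional.
constexpr std::int32_t sar_checked(std::int32_t val, unsigned sh) {
    return native_shift_is_arithmetic ? (val >> sh) : sar_portable(val, sh);
}

static_assert(sar_checked(-128, 1) == -64, "sign bit is shifted in");
static_assert(sar_portable(-128, 1) == -64, "fallback agrees");
```

On the compilers listed above the probe is true, so sar_checked reduces to the plain >> operator.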
Slipslop answered 17/6, 2023 at 12:57 Comment(3)
Thanks for the additional insight. My question (and answer especially) was intended to be mainly pedagogical, in terms of branch and condition-free bitwise algorithms.Inly
@KevinH.Patterson I kind of understand that, but I've seen too many programmers create unnecessary problems for themselves to achieve theoretical correctness.Slipslop
This is true, I've seen it as well. A potentially practical use for this approach might be on (perhaps less-mainstream) C++ compilers targeting microcontrollers which may not have a "shift arithmetic right" instruction. But in most common applications it's probably unnecessary.Inly
