This question is motivated by my implementing cryptographic algorithms (e.g. SHA-1) in C/C++, writing portable, platform-agnostic code, and thoroughly avoiding undefined behavior.
Suppose that a standardized crypto algorithm asks you to implement this:
b = (a << 31) & 0xFFFFFFFF
where `a` and `b` are unsigned 32-bit integers. Notice that in the result, we discard any bits above the least significant 32 bits.
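Just to pin down the intended semantics, here is a sketch using a 64-bit intermediate (assuming uint64_t is available; this is only a reference for the desired result, not the code in question):

uint32_t a = (...);
uint64_t wide = (uint64_t)a << 31;            // no bits are lost in the 64-bit intermediate
uint32_t b = (uint32_t)(wide & 0xFFFFFFFFu);  // keep only the least significant 32 bits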
As a first naive approximation, we might assume that `int` is 32 bits wide on most platforms, so we would write:
unsigned int a = (...);
unsigned int b = a << 31;
We know this code won't work everywhere, because `int` is 16 bits wide on some systems, 64 bits on others, and possibly even 36 bits. But using `stdint.h`, we can improve this code with the `uint32_t` type:
uint32_t a = (...);
uint32_t b = a << 31;
So we are done, right? That's what I thought for years. ... Not quite. Suppose that on a certain platform, we have:
// stdint.h
typedef unsigned short uint32_t;
The rule for performing arithmetic operations in C/C++ is that if the type (such as `short`) is narrower than `int`, then it gets widened to `int` if all its values can fit, or to `unsigned int` otherwise.
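Before the exotic platform below, this promotion is easy to observe on an everyday machine. A minimal C++ sketch, assuming the common case of a 16-bit short and a 32-bit int (the function name is just for illustration):

#include <type_traits>

void promotion_demo() {
    unsigned short x = 0xFFFFu;
    // Every unsigned short value fits in a 32-bit int, so x is promoted to
    // (signed) int before the addition; the result type is int, not unsigned.
    static_assert(std::is_same<decltype(x + x), int>::value,
                  "holds when int is wider than short");
}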
Let's say that the compiler defines `short` as 32 bits (signed) and `int` as 48 bits (signed). Then these lines of code:
uint32_t a = (...);
uint32_t b = a << 31;
will effectively mean:
unsigned short a = (...);
unsigned short b = (unsigned short)((int)a << 31);
Note that `a` is promoted to `int` because all of `ushort` (i.e. `uint32`) fits into `int` (i.e. `int48`).
But now we have a problem: shifting non-zero bits left into the sign bit of a signed integer type is undefined behavior. This problem happened because our `uint32` was promoted to `int48`, instead of being promoted to `uint48` (where left-shifting would be okay).
Here are my questions:
1. Is my reasoning correct, and is this a legitimate problem in theory?
2. Is this problem safe to ignore because on every platform the next integer type is double the width?
3. Is it a good idea to defend against this pathological situation by pre-masking the input like this: `b = (a & 1) << 31;`? (This will necessarily be correct on every platform, but it could make a speed-critical crypto algorithm slower than necessary; see also the sketch just after this list.)
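For illustration, here is the pre-masked version next to another idiom, forcing the operand to an unsigned type before the shift; the `+ 0u` trick is only a sketch of an alternative, and I have not measured which form optimizers handle better:

uint32_t a = (...);
uint32_t b1 = (a & 1) << 31;               // pre-mask: only bit 0 survives, so the shift
                                           // can never reach a sign bit
uint32_t b2 = (uint32_t)((a + 0u) << 31);  // a + 0u has an unsigned type at least 32 bits
                                           // wide, so the shift is well defined; converting
                                           // back to uint32_t keeps only the low 32 bits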
Clarifications/edits:
I'll accept answers for C or C++ or both. I want to know the answer for at least one of the languages.
The pre-masking logic may hurt bit rotation. For example, GCC will compile `b = (a << 31) | (a >> 1);` to a 32-bit bit-rotation instruction in assembly language. But if we pre-mask the left shift, it is possible that the new logic will not be translated into a bit rotation, which means that 4 operations are now performed instead of 1. (A related sketch follows below.)
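For what it is worth, the rotation can also be written with the same unsigned-forcing idiom (again only a sketch; whether GCC still emits a single rotate instruction for it is exactly what I am unsure about):

uint32_t b = (uint32_t)(((a + 0u) << 31) | ((a + 0u) >> 1));  // well defined for any a;
                                                              // the cast discards any bits
                                                              // above the low 32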
Comments:

- `(a << 31) & 0xFFFFFFFF` does not jibe. The code does a mask after the shift. – Explication
- … `(a << 31) & 0xFFFFFFFF` as the theoretical infinite-width int code in Python. – Zoophilia
- … `std::numeric_limits<T>::digits`. – Canticle
- … `31u`, then `a` will be promoted to `uint48`. – Seethe
- `uint32_t a; uint32_t b = (a & 1) << 31;` --> no UB nor ID. – Explication
- … `1U`, not `1`. – Coleworth
- … `int` for 32 bits. – Hydrocortisone
- `uint32_t a; uint32_t b = (a & 1) << 31;` and `uint32_t a; uint32_t b = (a & 1u) << 31;` work the same. – Explication
- … `using my_uint_at_least32 = std::conditional_t<(sizeof(std::uint32_t) < sizeof(unsigned)), unsigned, std::uint32_t>;`. – Hydrocortisone
- … `(x & 0xffffffff) << 31` makes no sense at all. – Pensile
- … `typedef unsigned short uint32_t` be dismissed as their own fault if it doesn't work? – Leupold