I was reading the C Standard the other day, and noticed that unlike signed integer overflow (which is undefined), unsigned integer overflow is well defined. I've seen it used in a lot of code for maximums, etc., but given the voodoo around overflow, is this considered good programming practice? Is it in any way insecure? I know that a lot of modern languages like Python do not support it; instead they keep extending the size of large numbers.
Unsigned integer overflow (in the shape of wrap-around) is routinely taken advantage of in hashing functions, and has been since the year dot.
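As a rough illustration of the kind of hash that leans on wrap-around, here is a sketch of 32-bit FNV-1a. The function name fnv1a_32 is made up for this example. The multiplication is expected to overflow on most inputs, and the unsigned type reduces it modulo 2^32.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a byte buffer; fnv1a_32 is an illustrative name. */
uint32_t fnv1a_32(const unsigned char *data, size_t len)
{
    uint32_t hash = 2166136261u;              /* FNV offset basis */
    for (size_t i = 0; i < len; ++i) {
        hash ^= data[i];
        /* The product routinely exceeds 32 bits; converting it back to
           uint32_t keeps only the low 32 bits, which is the intended wrap. */
        hash = (uint32_t)(hash * 16777619u);  /* FNV prime */
    }
    return hash;
}
```

The explicit cast is not strictly needed here, but it makes the reliance on wrapping visible to whoever reads the code next.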
To put it shortly:
It is perfectly legal/OK/safe to use unsigned integer overflow as you see fit as long as you pay attention and adhere to the definition (for whatever purpose - optimization, super clever algorithms, etc.)
Just because you know the minutiae of the standard doesn't mean the person maintaining your code does. That person may have to waste time worrying about this while debugging later, or have to go look up the standard to verify this behavior later.
Sure, we expect proficiency with the reasonable features of a language in a working programmer -- and different companies / groups have a different expectation about where that reasonable proficiency is. But for most groups this seems to be a bit much to expect the next person to know off the top of his/her head and not have to think about it.
If that weren't enough, you're more likely to run into compiler bugs when you're working around the edges of the standard. Or worse, the person porting this code to a new platform may run into them.
In short, I vote don't do it!
One way I can think of for unsigned integer overflow to cause a problem is subtracting from a small unsigned value, so that the result wraps to a large positive value.
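One possible guard against that kind of accidental wrap is to compare before subtracting. This is only a sketch; checked_sub_u32 is a hypothetical helper name, not something from a standard library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stores a - b in *out and returns true, or returns false (leaving *out
   untouched) when b > a and the subtraction would wrap. */
bool checked_sub_u32(uint32_t a, uint32_t b, uint32_t *out)
{
    if (b > a) {
        return false;   /* a - b would wrap to a large positive value */
    }
    *out = a - b;
    return true;
}
```

A caller that really wants the wrapped value can still write (uint32_t)(a - b) directly; the helper is for the cases where wrapping would be a bug.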
Practical advice for integer overflows:
http://www.gnu.org/software/hello/manual/autoconf/Integer-Overflow-Basics.html#Integer-Overflow-Basics
If a and b are e.g. UInt32, the expression (UInt32)(a-b) will yield wrapping behavior, and it will be clear that such behavior is expected. The expectation that (a-b) is supposed to yield wrapping behavior, however, is far less obvious, and on machines where int is bigger than 32 bits, it might in fact not yield such behavior. – Melanoma
I use it all the time to tell if it is time to do something.
UInt32 now = GetCurrentTime();
if( now - then > 100 )
{
    // do something
}
As long as you check the value before 'now' laps 'then', you are fine for all values of 'now' and 'then'.
EDIT: I guess this is really an underflow.
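A variant of the check above with the wrap made explicit might look like the following. It is a sketch that assumes a 32-bit tick counter; interval_elapsed is a hypothetical name, and uint32_t stands in for UInt32.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when more than `interval` ticks separate `then` from `now`, including
   the case where the 32-bit tick counter has wrapped once in between. */
bool interval_elapsed(uint32_t now, uint32_t then, uint32_t interval)
{
    /* The cast forces the difference back to 32 bits, so the result is the
       same whether or not the operands were promoted to a wider int. */
    return (uint32_t)(now - then) > interval;
}
```

For example, with then = 0xFFFFFFF0 and now = 50 the difference is 66 ticks, even though now is numerically smaller than then.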
This would fail if int is 64 bits, since the operands to the subtraction would be extended to signed 64-bit integers, so if 'now' is 0 and 'then' is 4294967295u (0xFFFFFFFF) the result of the subtraction would not be 1 but -4294967295. Casting the result of the subtraction to a UInt32 before the comparison would avoid that problem since (UInt32)(-4294967295) would be 1. – Melanoma
Another place where unsigned overflow is useful is when you have to iterate backwards from a given unsigned value:
void DownFrom( unsigned n )
{
    unsigned m;
    /* Stops once m has wrapped past 0 to the largest unsigned value. */
    for( m = n; m != (unsigned)-1; --m )
    {
        DoSomething( m );
    }
}
Other alternatives are not as neat. Trying to do m >= 0 doesn't work unless you change m to signed, but then you might be truncating the value of n or, worse, converting it to a negative number on initialisation.
Otherwise you have to do != 0 or > 0 and then manually do the 0 case after the loop.
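For comparison, one more way to write the countdown is a do/while that tests the old value before decrementing. This is a sketch with made-up names (visit_down_from, out, cap); it records the visited values in a buffer instead of calling DoSomething so that it stands on its own. Unlike the sentinel comparison, it also visits every value when n is the largest unsigned value.

```c
#include <stddef.h>

/* Writes n, n-1, ..., 0 into out (at most cap entries) and returns how many
   values were written. */
size_t visit_down_from(unsigned n, unsigned *out, size_t cap)
{
    size_t count = 0;
    unsigned m = n;
    do {
        if (count == cap) {
            break;              /* caller's buffer is full */
        }
        out[count++] = m;
    } while (m-- != 0);         /* test the old value, then decrement */
    return count;
}
```

The final decrement still wraps m around, but the wrapped value is never used.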
I wouldn't rely on it just for readability reasons. You're going to be debugging your code for hours before you figure out where you're resetting that variable to 0.
It's fine to rely on overflow as long as you know WHEN it will occur ...
I, for example, had trouble with a C implementation of MD5 when migrating to a more recent compiler. The code expected overflow, but it also expected 32-bit ints.
With 64 bits the results were wrong!
Fortunately that's what automated tests are for: I caught the problem early, but this could have been a real horror story had it gone unnoticed.
You could argue "but this happens rarely": yes, but that's what makes it even more dangerous! When there is a bug, everybody is suspicious of code written in the last few days. No one is suspicious of code that "just worked for years", and usually no one still knows how it works.
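The MD5 code in question isn't shown, so the following is only a guess at the kind of primitive involved: 32-bit rotation and addition written so that the result does not depend on the width of int. The names rotl32 and add32 are hypothetical.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is reduced modulo 32). */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    if (n == 0) {
        return x;               /* avoids a shift by 32, which is undefined */
    }
    /* Cast the shifted value back to uint32_t so bits pushed above bit 31
       are discarded even when the operand was promoted to a wider int. */
    return (uint32_t)(x << n) | (x >> (32u - n));
}

/* Addition modulo 2^32, written so the wrap does not depend on int's width. */
uint32_t add32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

Code that instead uses plain unsigned or unsigned long and assumes it is exactly 32 bits wide will silently keep the extra high bits on a platform where that type is wider.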
Signed numbers can be represented in different ways on CPUs, although 99.999% of all current CPUs use two's-complement notation. Since this covers the majority of machines out there, it is difficult to find different behaviour, although the compiler might check it (fat chance). The C spec, however, must account for 100% of compilers, so it has not defined the behaviour of signed overflow.
So relying on overflow would make things more confusing, which is a good reason to avoid it. However, if you have a really good reason (say, a performance boost by a factor of 3 for a critical part of the code), then document it well and use it.
If you use it wisely (well commented and readable), you can benefit from it by having smaller and faster code.
siukurnin makes a good point: you need to know when overflows will occur. The easiest way to avoid the portability issue he described is to use the fixed-width integer types from stdint.h. uint32_t is an unsigned integer of exactly 32 bits on every platform that provides it, so a value stored in it always wraps at 2^32 no matter which system the code is compiled for.
I would suggest always having an explicit cast any time one is going to rely upon unsigned numbers wrapping. Otherwise there may be surprises. For example, if "int" is 64 bits, code like:
UInt32 foo, bar;
if ((foo-bar) < 100) // Report if foo is between bar and bar+99, inclusive
    ... do something
may fail, since "foo" and "bar" would get promoted to 64-bit signed integers. Adding a typecast back to UInt32 before checking the result against 100 would prevent problems in that case.
Incidentally, I believe the only portable way to directly get the bottom 32 bits of the product of two UInt32s is to cast one of the operands to a UInt64 prior to doing the multiply. Otherwise the UInt32s might be converted to signed Int64s, with the multiplication overflowing and yielding undefined behavior.
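The widening multiply described above could be sketched like this, using the stdint.h types in place of UInt32 and UInt64. The name mul_lo32 is made up for the example.

```c
#include <stdint.h>

/* Low 32 bits of a * b. Converting one operand to uint64_t first makes the
   whole multiplication unsigned and 64 bits wide, so it cannot hit signed
   overflow regardless of how wide int is. */
uint32_t mul_lo32(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}
```

For instance, 0xFFFFFFFF times 0xFFFFFFFF is 2^64 - 2^33 + 1, whose low 32 bits are 1.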