Curious arithmetic error- 255x256x256x256=18446744073692774400
Asked Answered
V

5

25

I encountered a strange thing when I was programming under c++. It's about a simple multiplication.

Code:

unsigned __int64 a1 = 255*256*256*256;
unsigned __int64 a2= 255 << 24; // same as the above

cerr()<<"a1 is:"<<a1;
cerr()<<"a2 is:"<<a2;

interestingly the result is:

a1 is: 18446744073692774400 
a2 is: 18446744073692774400 

whereas it should be:(using calculator confirms)

4278190080

Can anybody tell me how could it be possible?

Verticillate answered 20/9, 2012 at 13:46 Comment(14)
It's called undefined behaviourSklar
Serious passions on post rating :) From -3 to 2 and back in 5 seconds.Genuflection
Don't see why this received 8 upvotes...Tusche
@H2CO3: Because it's a "non-obvious language quirk that is super interesting" and not because the question is well formulated or researched...Goings
@H2CO3: Yeah, let's nobody code anything ever in a language we don't 110% understand. You do realize, don't you, that if we took that mindset we'd never learn anything new?Discovert
An annoying thing about C++ compilers is that using them without turning warnings on will let you make tons of mistakes you shouldn't have to. When I compile this code in GCC with g++ -Wall -Wextra -pedantic, the compiler tells me quite clearly: "warning: integer overflow in expression [-Woverflow]". Microsoft Visual Studio also diagnoses this problem: "warning C4307: '*' : integral constant overflow". I hope this serves as a lesson to compile with warnings on in the future.Ellord
Kind of the same thing: #12488423Cullin
@R.MartinhoFernandes For the multiplication, my g++ (4.5.1, yeah, old) warns with default warning level.Cube
Another way to avoid overflow might be using literal suffixes. Depending on your compiler, 255ULL * 256 * 256 * 256 could work.Olivette
@Olivette Not only "could", must, per the standard.Cube
@DanielFischer: Is unsigned long long guaranteed to have at least 64 bits? (Admittedly, it would be silly for a compiler to support a __int64 type and not make it an alias for long long.)Olivette
@DanielFischer: hence "depending on your compiler", since everyone's still busy implementing the current standard. unsigned long long isn't in C++03, so to use it you need either a compiler-specific extension to C++03 or (for the time being) a compiler-specific partial implementation of C++11. In practice everybody has long long.Alarise
@SteveJessop Ah, yes, I forgot how lagging C++ was in that respect.Cube
@aschepler: the minimum conforming value of ULLONG_MAX is 18446744073709551615, which is the standard's way of saying "at least 64 bits".Alarise
C
39
 255*256*256*256

all operands are int you are overflowing int. The overflow of a signed integer is undefined behavior in C and C++.

EDIT:

note that the expression 255 << 24 in your second declaration also invokes undefined behavior if your int type is 32-bit. 255 x (2^24) is 4278190080 which cannot be represented in a 32-bit int (the maximum value is usually 2147483647 on a 32-bit int in two's complement representation).

C and C++ both say for E1 << E2 that if E1 is of a signed type and positive and that E1 x (2^E2) cannot be represented in the type of E1, the program invokes undefined behavior. Here ^ is the mathematical power operator.

Chelyuskin answered 20/9, 2012 at 13:49 Comment(10)
Thanks,It worked with unsigned __int64 a1 = 255*256*256; a1=a1*256;Verticillate
just my curiosity, but is int64 should have 64 bits? and 255*256*256*256 = ff 00 00 00 and is 32bit numberIseabal
@user902383: Those 255s and 256s are ints. So C++ does all the intermediate steps with ints (which are more than likely 32 bits). Numerically, it winds up with 0xFF000000. Convert that to signed 64-bit, and you get 0xFFFFFFFFFF000000 (sign extension). This is just one way it could happen, but it's by far the most common one.Discovert
Isn't this different than in C, where the type of a literal is determined by its size?Rime
@rubenvb: AFAIK, even in C, whole-number literals are ints unless you specifically tell them to be otherwise (via a cast or suffix). They just get converted to the appropriate size as needed, which in this case is after all the math is done.Discovert
@Discovert No, since C99, the type is determined by size, the first in a list (int, long int, long long int for decimal constants without suffix, plus the corresponding unsigned types for octal or hexadecimal constants; minus the types excluded by a suffix) the constant fits in is the type.Cube
Anyway, 256 is guaranteed to have type int since it fits in an int. Multiplying it by something else doesn't change that, the literals are all small.Alarise
One could argue the whole multiplication is a constant literal, but I'm sure I'm wrong. "If only C(++) was designed to be nice" is very applicable here.Rime
@rubenvb: yeah, you couldn't argue that based on the standard, because the definition of a literal in the standard is a simple matter of syntax, and 256*255 isn't it :-)Alarise
@Rime It's a "constant expression", and overflow in that makes the program ill-formed.Cube
D
17

Your literals are int. This means that all the operations are actually performed on int, and promptly overflow. This overflowed value, when converted to an unsigned 64bit int, is the value you observe.

Devanagari answered 20/9, 2012 at 13:50 Comment(8)
The fundamental mistake of the OP is far wider spread than this example: It's the misguided assumption that the type of the left-hand side somehow exudes magic powers on the right hand side and makes it clairvoyantly change behaviour.Kattegat
@KerrekSB Come on. People simply don't expect overflow since the end of the 8 bit era, and it's not unreasonable to expect literals (or expressions of only literals) to be typed automatically.Palanquin
@Potatoswatter: I think it would be fairly insane if the type of an expression depended on the values of its constituents... how would you even specify that? What's the type of 256 * n?Kattegat
@KerrekSB hence, "only literals." But it's common (or good practice, IMHO) to specify a large constant number as a product of several smaller ones. This does happen in real life.Palanquin
@KerrekSB, Doesn't Pyhton do that? I think I saw something about it upgrading the type upon overflow.Cullin
@Potatoswatter: I know that you meant only literals, but again, how would you specify that? It seems to me that you'd get into a mess.Kattegat
@chris: yes, Python does that, by dynamically allocating enough memory for the value. That's "too heavyweight" for C.Alarise
@Kerrek: I don't think it would be difficult to define "expressions only involving literals", no more difficult than defining "integer constant expression" in C++. But even if the standard did do that, someone would want another special case where the type is different, and the rules would get more complicated again. Btw, the lhs does exude magical powers on the rhs in one case, which is assigning a function or member function pointer from an overloaded name. Dig deep enough and even misguided assumptions can become guided in special cases ;-)Alarise
H
16

It is perhaps worth explaining what happened to produce the number 18446744073692774400. Technically speaking, the expressions you wrote trigger "undefined behavior" and so the compiler could have produced anything as the result; however, assuming int is a 32-bit type, which it almost always is nowadays, you'll get the same "wrong" answer if you write

uint64_t x = (int) (255u*256u*256u*256u);

and that expression does not trigger undefined behavior. (The conversion from unsigned int to int involves implementation-defined behavior, but as nobody has produced a ones-complement or sign-and-magnitude CPU in many years, all implementations you are likely to encounter define it exactly the same way.) I have written the cast in C style because everything I'm saying here applies equally to C and C++.

First off, let's look at the multiplication. I'm writing the right hand side in hex because it's easier to see what's going on that way.

255u * 256u               = 0x0000FF00u
255u * 256u * 256u        = 0x00FF0000u
255u * 256u * 256u * 256u = 0xFF000000u (= 4278190080)

That last result, 0xFF000000u, has the highest bit of a 32-bit number set. Casting that value to a signed 32-bit type therefore causes it to become negative as-if 232 had been subtracted from it (that's the implementation-defined operation I mentioned above).

(int) (255u*256u*256u*256u) = 0xFF000000 = -16777216

I write the hexadecimal number there, sans u suffix, to emphasize that the bit pattern of the value does not change when you convert it to a signed type; it is only reinterpreted.

Now, when you assign -16777216 to a uint64_t variable, it is back-converted to unsigned as-if by adding 264. (Unlike the unsigned-to-signed conversion, this semantic is prescribed by the standard.) This does change the bit pattern, setting all of the high 32 bits of the number to 1 instead of 0 as you had expected:

(uint64_t) (int) (255u*256u*256u*256u) = 0xFFFFFFFFFF000000u

And if you write 0xFFFFFFFFFF000000 in decimal, you get 18446744073692774400.

As a closing piece of advice, whenever you get an "impossible" integer from C or C++, try printing it out in hexadecimal; it's much easier to see oddities of twos-complement fixed-width arithmetic that way.

Hexachlorophene answered 20/9, 2012 at 15:51 Comment(1)
The conversion to unsigned types is prescribed by the standard (4.7. in C++03, probably somewhere similar in C++11; 6.3.1.3 in C99 and C11) to be reduction modulo 2^WIDTH.Cube
A
0

The answer is simple -- overflowed.

Aconite answered 21/9, 2012 at 3:29 Comment(0)
C
0

Here Overflow occurred on int and when you are assigning it to unsigned int64 its converted in to 18446744073692774400 instead of 4278190080

Carpio answered 21/9, 2012 at 5:19 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.