Which integer operations are unsafe?

Asked 13/6, 2018 at 12:15 Answered 13/6, 2018 at 16:23

My application evaluates some integer expressions specified by the user. I want to detect all potential errors and report them.

All computations are done in int64_t (signed). Formulas may include almost all C++ binary operators (+, -, *, /, %, |, ||, &, &&, and six comparison operators) and integers (possibly negative).

The question is: what errors can possibly happen while evaluating such expression that can make my program terminate? I came up with two of them:

Division (or modulus) by zero
Division of std::numeric_limits<int64_t>::min() by -1.

Signed integer overflow also may occur, but, to my best knowledge, in such setting it cannot do anything harmful on most CPUs, so we ignore it.

Picture answered 13/6, 2018 at 12:15 Comment(17)

CPUs aren't (directly) relevant. Your CPU doesn't know C++. What matters is what the compiler does with your code, and it may well translate it to something that only works if overflow doesn't occur. – Orcus 13/6, 2018 at 12:19

Signed integer overflow leads to undefined behavior. It might be harmless, or it might summon nasal demons. – Doorway 13/6, 2018 at 12:19

IIRC shifting negative numbers had a tricky part, too, but you don't allow << / >> anyway. – Orcus 13/6, 2018 at 12:20

As do assigning operators /= etc. Which might throw std::bad_alloc and the like. But you dont have them either. – Reformism 13/6, 2018 at 12:21

@KavehVahedipour operators for builtin types? Throw?.. – Trifling 13/6, 2018 at 12:23

@Someprogrammerdude You're definitely right, and I even faced situations where they lead to ridiculous optimization results. Though constant values were almost always involved. In this case I'm satisfied with the result "if overflow happens then any result may be returned". And, while it is of course UB, I cannot see any (reasonable) ways for my program to terminate. – Picture 13/6, 2018 at 12:26

you mentioned division by zero. Don't forget % by 0 – Tonneson 13/6, 2018 at 12:28

@GonenI Thanks, I handled it but didn't add to the post. Edited. – Picture 13/6, 2018 at 12:28

What do you mean by unsafe and errors? Are you asking which operators might participate in invoking UB? – Strachan 13/6, 2018 at 12:34

It's strange you disallow the overflow case std::numeric_limits<int64_t>::min() /-1 but allow all other forms of overflow. – Cultivable 13/6, 2018 at 12:40

@Cultivable Because this kind of overflow terminates the program immediately, while others generally do not. – Picture 13/6, 2018 at 12:45

@IvanSmirnov They might, though. If you're so worried about illegal math operations you absolutely should care about UB, which overflow is. – Flagpole 13/6, 2018 at 12:50

To add to @Gonen I #2 "Division of std::numeric_limits<int64_t>::min() by -1." applies to % also as the result is the remainder of a problematic division. – Metaphysic 13/6, 2018 at 12:55

List omits operators <<, >> which have UB/IDB. – Metaphysic 13/6, 2018 at 12:57

@Metaphysic Yes, this is intentional. – Picture 13/6, 2018 at 13:3

Here is a signed overflow horror story: #48731806 – Monarchism 13/6, 2018 at 13:12

@Monarchism I had this example in mind saying 'constants were involved'. Do you have examples of similar situations but without loops and compile-time constants? – Picture 13/6, 2018 at 13:15

Here is a good reference: https://wiki.sei.cmu.edu/confluence/display/c/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow

As it explains, signed integer overflow is undefined behavior. You may think this doesn't matter because you have observed that INT64_MAX + x doesn't do anything strange on your particular system. You may also think that it will never do anything strange, because the optimizer can't know the value of x.

But undefined behavior is still undefined, and among many other possible outcomes, some platforms could terminate your program (which you said you want to avoid), because they implement overflow trapping or arithmetic exceptions in hardware.

To write a conforming C++ program that does arithmetic on signed integers, you must check their values first. A cheap and easy way which might be good enough is simply to check that each integers is within [INT64_MIN/2, INT64_MAX/2] before you add or subtract. For a more detailed method, see here: How to detect integer overflow?

Beebread answered 13/6, 2018 at 12:34 Comment(0)

It is not the operations that are unsafe per se. It's the signed integer overflow which is undefined behavior that is the problem. That way (almost) all the operators can participate in causing UB, although you will probably get there using the arithmetic operators. Long story short: don't let the signed integer overflow / invoke UB.

Strachan answered 13/6, 2018 at 12:47 Comment(0)

To summarize the previous observations, there are exactly two possible causes of undefined behavior relating to your list of operations:

division by zero
overflow (whether negative or positive) – your example of negating std::numeric_limits<int64_t>::min() – whether by division or otherwise – is just an example of that.

Only the arithmetic operators (the first five from your list) are affected by either of these issues, all others have well-defined behavior for all inputs.

What I want to do is expand on the dangers of integer overflow and undefined behavior. First, I'd highly recommend you to watch Undefined Behavior is Awesome by Piotr Padlewski, and the Garbage In, Garbage Out talk by Chandler Carruth.

Also, consider how integer overflow is a recurring theme in CVEs (software vulnerability reports). The integer overflow itself does not usually cause direct damage, but many other problems can ensue as a result of the overflow. You could liken the overflow to a pin prick, that by itself is mostly harmless, but can help dangerous toxins and germs to bypass your body's immune system.

There was at least one hole in OpenSSH which was directly related to integer overflow, for example, and this one did not even involve any "crazy" compiler optimizations, or, for that matter, any optimizations at all.

Finally, things like UBSAN (the undefined behavior sanitizer in Clang/GCC) exist. If you allow signed integer overflow in one place, and try to get meaningful results from UBSAN, you may get unexpected traps and/or too many false positives.

TL;DR: Avoid all undefined behavior.

John Zwinck has mentioned adding range checks as a remedy, carefully avoiding any intermediate operations that would overflow. Assuming you only have to support GCC, there are also two command line options that should help you a lot, if you feel lazy:

-ftrapv will cause signed integer overflow to trap.
-fwrapv will cause signed integers to wrap on overflow.

Which one is safer? Actually, this highly depends on your application domain. Your opinion seems to be that less chance of crashing equals "safer". It could be so, however, consider the above-mentioned OpenSSH vulnerability. What would you rather have an SSH server do when fed garbage data, and possibly shellcode, from the remote client?

A) terminate (as would happen with -ftrapv)
B) proceed and possibly execute the shellcode (as would happen with -fwrapv)

I'm pretty sure most admins would go for A), even more so if the process to be terminated is not the process listening on the actual socket(s) but has been specifically fork()ed to handle the current connection, so there isn't even much of a DoS. In other words, while -fwrapv gives you defined behavior, it does not necessarily mean that the behavior is expected at the point of use, and therefore "safe".

Also, I recommend that you avoid making false dichotomies in your mind such as process crash vs. proceeding with garbage data. You can choose from a wide range of error handling strategies if you add the right checks, whether using special return values or exception handling, to safely get out of a tight space without having to stop servicing requests altogether.

Caughey answered 13/6, 2018 at 16:23 Comment(0)

Recommended topics

Hot tags