Is there a C++ implementation that detects all undefined behavior?
Asked Answered
F

10

73

A huge number of operations in C++ result in undefined behavior, where the spec is completely mute about what the program's behavior ought to be and allows for anything to happen. Because of this, there are all sorts of cases where people have code that compiles in debug but not release mode, or that works until a seemingly unrelated change is made, or that works on one machine but not another, etc.

My question is whether there is a utility that looks at the execution of C++ code and flags all instances where the program invokes undefined behavior. While it's nice that we have tools like valgrind and checked STL implementations, these aren't as strong as what I'm thinking about - valgrind can have false negatives if you trash memory that you still have allocated, for example, and checked STL implementations won't catch deleting through a base class pointer.

Does this tool exist? Or would it even be useful to have it lying around at all?

EDIT: I am aware that in general it is undecidable to statically check whether a C++ program may ever execute something that has undefined behavior. However, it is possible to determine whether a specific execution of a C++ produced undefined behavior. One way to do this would be to make a C++ interpreter that steps through the code according to the definitions set out in the spec, at each point determining whether or not the code has undefined behavior. This won't detect undefined behavior that doesn't occur on a particular program execution, but it will find any undefined behavior that actually manifests itself in the program. This is related to how it is Turing-recognizable to determine if a TM accepts some input, even if it's still undecidable in general.

Thanks!

Fideliafidelio answered 30/8, 2011 at 2:1 Comment(11)
Due to the wide variety of U.B. possible, and the sheer difficulty of writing a C++ implementation in the first place, it might be hard to find a tool to detect all kinds of U.B... but it'd be interesting to head what's out there!Mot
This stack overflow question is very relevant: programmers.stackexchange.com/questions/99692/…Aquarelle
I was offered a cake if I'd just make a list of all UB in C++. (by the NEN colleague who'd made the C list - he wasn't impressed with the amount of UB in C++)Philae
It's obviously impossible to detect all kinds of undefined behaviour. As an example, for integers x and n, x << n is undefined for n greater than the number of digits in x. If a tool is supposed to determine whether this condition holds or not, it will in general need to solve the halting problem, which is impossible.Angloirish
@Joren- You can't statically detect this, but you could detect it at runtime if you instrumented the program.Fideliafidelio
@templatetypedef: The halting problem is a very well known problem that has been proven to be undecidable many many years ago. There is not essential difference between analysing a program statically, and simulating its run: in both cases you just take the program as an input and output YES or NO.Rearrange
@templatetypedef: I think I understand what you mean now, thanks a lot for your explanation. Maybe there could be a C++ interpreter, where, e.g., each variable is marked as undefined until it gets assigned some value explicitly. Yes, I think it would be a very useful tool to have. I do not know if one exists. I guess it would be difficult to implement due to the use of pointers.Rearrange
Wouldn't it be easier to create a new version of a C++ compiler, or at least a spec for conformance, that removes as many undefined behaviors as possible? Also, where are some actual comprehensive lists of unspecified/undefined/implementation-defined behaviors? (I'm having trouble finding them.)Tipper
"As an example, for integers x and n, x << n is undefined for n greater than the number of digits in x" -- No, it is undefined for n greater than the number of bits in x's representation, which is implementation-defined. Even if it weren't, this would have nothing to do with the halting problem.Lavinalavine
constexpr is required to detect undefined behaviour in C++11/14, so perhaps that could be turned into a is_ub trait.Whop
The answer to your question is summarised in an excellent blog post: blog.regehr.org/archives/1520Aleksandrovsk
C
22

John Regehr in Finding Undefined Behavior Bugs by Finding Dead Code points out a tool called STACK and I quote from the site (emphasis mine):

Optimization-unstable code (unstable code for short) is an emerging class of software bugs: code that is unexpectedly eliminated by compiler optimizations due to undefined behavior in the program. Unstable code is present in many systems, including the Linux kernel and the Postgres database server. The consequences of unstable code range from incorrect functionality to missing security checks.

STACK is a static checker that detects unstable code in C/C++ programs. Applying STACK to widely used systems has uncovered 160 new bugs that have been confirmed and fixed by developers.

Also in C++11 for the case of constexpr variables and functions undefined behavior should be caught at compile time.

We also have gcc ubsan:

GCC recently (version 4.9) gained Undefined Behavior Sanitizer (ubsan), a run-time checker for the C and C++ languages. In order to check your program with ubsan, compile and link the program with -fsanitize=undefined option. Such instrumented binaries have to be executed; if ubsan detects any problem, it outputs a “runtime error:” message, and in most cases continues executing the program.

and Clang Static Analyzer which includes many checks for undefined behavior. For example clangs -fsanitize checks which includes -fsanitize=undefined:

-fsanitize=undefined: Fast and compatible undefined behavior checker. Enables the undefined behavior checks that have small runtime cost and no impact on address space layout or ABI. This includes all of the checks listed below other than unsigned-integer-overflow.

and for C we can look at his article It’s Time to Get Serious About Exploiting Undefined Behavior which says:

[..]I confess to not personally having the gumption necessary for cramming GCC or LLVM through the best available dynamic undefined behavior checkers: KCC and Frama-C.[...]

Here is a link to kcc and I quote:

[...]If you try to run a program that is undefined (or one for which we are missing semantics), the program will get stuck. The message should tell you where it got stuck and may give a hint as to why. If you want help deciphering the output, or help understanding why the program is undefined, please send your .kdump file to us.[...]

and here are a link to Frama-C, an article where the first use of Frama-C as a C interpreter is described and an addendum to the article.

Catheterize answered 27/1, 2014 at 18:21 Comment(4)
Since you mention Frama-C, I have to add that it is a C interpreter (with all the undefined-behavior detecting ability that Regehr expects from the checkers he mentions in that blog post and in the article cs.utah.edu/~regehr/papers/pldi12-preprint.pdf where such a checker was needed for reduced Csmith-generated programs), and a static analyzer that can detect undefined behavior for billions of possible program inputs (which an interpreter cannot do), and more. Anyone needing help to use it for one thing or another would only have to look at the documentation or ask nicely.Latricialatrina
@PascalCuoq thank you, I could not find a nice summary on the site when I looked otherwise I would have added one. You should feel welcome to edit my post to add details that you feel are helpful.Catheterize
I have added a link to the article that contains the initial motivation for adding to an existing Frama-C analysis the ability to interpret C programs. The initial motivation was to test large parts of the framework by interpreting programs into results that could be compared to results obtained by compilation (with a reference compiler) and execution.Latricialatrina
This excellent blog post basically answers the question at 2017: blog.regehr.org/archives/1520 (Thanks Pascal and John)Aleksandrovsk
C
23

This is a great question, but let me give an idea for why I think it might be impossible (or at least very hard) in general.

Presumably, such an implementation would almost be a C++ interpreter, or at least a compiler for something more like Lisp or Java. It would need to keep extra data for each pointer to ensure you did not perform arithmetic outside of an array or dereference something that was already freed or whatever.

Now, consider the following code:

int *p = new int;
delete p;
int *q = new int;

if (p == q)
    *p = 17;

Is the *p = 17 undefined behavior? On the one hand, it dereferences p after it has been freed. On the other hand, dereferencing q is fine and p == q...

But that is not really the point. The point is that whether the if evaluates to true at all depends on the details of the heap implementation, which can vary from implementation to implementation. So replace *p = 17 by some actual undefined behavior, and you have a program that might very well blow up on a normal compiler but run fine on your hypothetical "UB detector". (A typical C++ implementation will use a LIFO free list, so the pointers have a good chance of being equal. A hypothetical "UB detector" might work more like a garbage collected language in order to detect use-after-free problems.)

Put another way, the existence of merely implementation-defined behavior makes it impossible to write a "UB detector" that works for all programs, I suspect.

That said, a project to create an "uber-strict C++ compiler" would be very interesting. Let me know if you want to start one. :-)

Chardin answered 30/8, 2011 at 5:33 Comment(15)
The code is perfectly legal. The actual rule in C++ is that *p must reference an object. This isn't the case after delete but is the case after if(p==q). Oh, and the actual behavior here is unspecified, not implementation-defined, but that's really a minor difference.Philae
Actually, this code is undefined. You cannot use pointer any way (including checking its address) after you free it.Vesicate
@Philae No, it isn't. p has been deleted, so even comparing it to another pointer results in undefined behavior.Gothic
@H2CO3: Why? See this question which quotes §5.9. There are 6 enumerated cases, and the 6th catch-all clause merely says it's unspecified.Philae
@Philae That only concerns valid pointers. thisGothic
@MSalters: See this answer. It appears the p == q test is undefined in C++03 and implementation-defined in C++11, both of which are news to me.Chardin
@H2CO3: Your linked answer is equally unsourced. Could be true, but for the moment I don't see how your assertion would trump §5.9. Out of context that appears to be a limitative enumeration of possibilities.Philae
@Philae The answer is not unsourced, its source is the C++ standard paper.Gothic
@Philae Edit: except that I was confused about different versions of the standard. So I'm only partially right in that it's UB, since it's only UB in C++03. However, it's not "unspecified behavior" (there's no such thing in the standard, only values can be unspecified) in C++11 either, but implementation-defined. (Also: before asserting quickly that I'm lying or making things up just so as to make it appear that I'm right [because, honestly, that's what you sugarcoated by writing "unsourced"...], please read the comments, where I quote the relevant parts of the Standard as requested.)Gothic
@H2CO3: The reason I ask for a source is because 5.9 is rather prescriptive, and I'd need to evaluate the two statements in relation to each other. There are more cases in the standard where two statements in isolation would appear to contradict each other, and one needs the relevant context to determine which of the two applies.Philae
In C you can't write if (p==q) if p was freed beforeSemibreve
Yes, you can do it in C tooJoachima
This is detectable with the help of scope. In C++ everything is limited to a scope, kind of like what the Rust compiler does except not super-strictly checked like in Rust.Brawn
@xfix: Encountering a compiler that actually enforces this is most likely going to result in me filing a bug report against the compiler that reads something along the lines of don't assume the standard library functions do what you think they do. (I have an implementation of global new and delete I could link this against that makes this code defined.)Populate
I could conceive of an implementation which could detect this behaviour. If a pointer contains an address, a range, and a generation number, then the equality operator could check just the address, but the indirection could check all three.Macnair
C
22

John Regehr in Finding Undefined Behavior Bugs by Finding Dead Code points out a tool called STACK and I quote from the site (emphasis mine):

Optimization-unstable code (unstable code for short) is an emerging class of software bugs: code that is unexpectedly eliminated by compiler optimizations due to undefined behavior in the program. Unstable code is present in many systems, including the Linux kernel and the Postgres database server. The consequences of unstable code range from incorrect functionality to missing security checks.

STACK is a static checker that detects unstable code in C/C++ programs. Applying STACK to widely used systems has uncovered 160 new bugs that have been confirmed and fixed by developers.

Also in C++11 for the case of constexpr variables and functions undefined behavior should be caught at compile time.

We also have gcc ubsan:

GCC recently (version 4.9) gained Undefined Behavior Sanitizer (ubsan), a run-time checker for the C and C++ languages. In order to check your program with ubsan, compile and link the program with -fsanitize=undefined option. Such instrumented binaries have to be executed; if ubsan detects any problem, it outputs a “runtime error:” message, and in most cases continues executing the program.

and Clang Static Analyzer which includes many checks for undefined behavior. For example clangs -fsanitize checks which includes -fsanitize=undefined:

-fsanitize=undefined: Fast and compatible undefined behavior checker. Enables the undefined behavior checks that have small runtime cost and no impact on address space layout or ABI. This includes all of the checks listed below other than unsigned-integer-overflow.

and for C we can look at his article It’s Time to Get Serious About Exploiting Undefined Behavior which says:

[..]I confess to not personally having the gumption necessary for cramming GCC or LLVM through the best available dynamic undefined behavior checkers: KCC and Frama-C.[...]

Here is a link to kcc and I quote:

[...]If you try to run a program that is undefined (or one for which we are missing semantics), the program will get stuck. The message should tell you where it got stuck and may give a hint as to why. If you want help deciphering the output, or help understanding why the program is undefined, please send your .kdump file to us.[...]

and here are a link to Frama-C, an article where the first use of Frama-C as a C interpreter is described and an addendum to the article.

Catheterize answered 27/1, 2014 at 18:21 Comment(4)
Since you mention Frama-C, I have to add that it is a C interpreter (with all the undefined-behavior detecting ability that Regehr expects from the checkers he mentions in that blog post and in the article cs.utah.edu/~regehr/papers/pldi12-preprint.pdf where such a checker was needed for reduced Csmith-generated programs), and a static analyzer that can detect undefined behavior for billions of possible program inputs (which an interpreter cannot do), and more. Anyone needing help to use it for one thing or another would only have to look at the documentation or ask nicely.Latricialatrina
@PascalCuoq thank you, I could not find a nice summary on the site when I looked otherwise I would have added one. You should feel welcome to edit my post to add details that you feel are helpful.Catheterize
I have added a link to the article that contains the initial motivation for adding to an existing Frama-C analysis the ability to interpret C programs. The initial motivation was to test large parts of the framework by interpreting programs into results that could be compared to results obtained by compilation (with a reference compiler) and execution.Latricialatrina
This excellent blog post basically answers the question at 2017: blog.regehr.org/archives/1520 (Thanks Pascal and John)Aleksandrovsk
E
13

Using g++

-Wall -Werror -pedantic-error

(preferably with an appropriate -std argument as well) will pick up quite a few case of U.B.


Things that -Wall gets you include:

-pedantic
Issue all the warnings demanded by strict ISO C and ISO C++; reject all programs that use forbidden extensions, and some other programs that do not follow ISO C and ISO C++. For ISO C, follows the version of the ISO C standard specified by any -std option used.

-Winit-self (C, C++, Objective-C and Objective-C++ only)
Warn about uninitialized variables which are initialized with themselves. Note this option can only be used with the -Wuninitialized option, which in turn only works with -O1 and above.

-Wuninitialized
Warn if an automatic variable is used without first being initialized or if a variable may be clobbered by a "setjmp" call.

and various disallowed things you can do with specifiers to printf and scanf family functions.

Epergne answered 30/8, 2011 at 2:40 Comment(2)
That's true, though it does not insert checks for runtime operations that might end up resulting in undefined behavior.Fideliafidelio
Absolutely this will only detect a subset of all U.B, but you do get aliasing checks which I didn't list above, which should let you know about some of the cases for dynamic failures as well.Epergne
R
13

Clang has a suite of sanitizers that catch various forms of undefined behavior. Their eventual goal is to be able to catch all C++ core language undefined behavior, but checks for a few tricky forms of undefined behavior are missing right now.

For a decent set of sanitizers, try:

clang++ -fsanitize=undefined,address

-fsanitize=address checks for use of bad pointers (not pointing to valid memory), and -fsanitize=undefined enables a set of lightweight UB checks (integer overflow, bad shifts, misaligned pointers, ...).

-fsanitize=memory (for detecting uninitialized memory reads) and -fsanitize=thread (for detecting data races) are also useful, but neither of these can be combined with -fsanitize=address nor with each other because all three have an invasive impact on the program's address space.

Riffe answered 31/7, 2013 at 19:53 Comment(0)
P
5

You might want to read about SAFECode.

This is a research project from the University of Illinois, the goal is stated on the front page (linked above):

The purpose of the SAFECode project is to enable program safety without garbage collection and with minimal run-time checks using static analysis when possible and run-time checks when necessary. SAFECode defines a code representation with minimal semantic restrictions designed to enable static enforcement of safety, using aggressive compiler techniques developed in this project.

What is really interesting to me is the elimination of the runtime checks whenever the program can be proved to be correct statically, for example:

int array[N];
for (i = 0; i != N; ++i) { array[i] = 0; }

Should not incur any more overhead than the regular version.

In a lighter fashion, Clang has some guarantees about undefined behavior too as far as I recall, but I cannot get my hands on it...

Policyholder answered 30/8, 2011 at 6:46 Comment(0)
P
3

The clang compiler can detect some undefined behaviors and warn against them. Probably not as complete as you want, but it's definitely a good start.

Pew answered 30/8, 2011 at 2:7 Comment(0)
T
3

Unfortunately I'm not aware of any such tool. Typically UB is defined as such precisely because it would be hard or impossible for a compiler to diagnose it in all cases.

In fact your best tool is probably compiler warnings: They often warn about UB type items (for example, non-virtual destructor in base classes, abusing the strict-aliasing rules, etc).

Code review can also help catch cases where UB is relied upon.

Then you have to rely on valgrind to capture the remaining cases.

Telstar answered 30/8, 2011 at 3:24 Comment(0)
R
1

Just as a side observation, according to the theory of computability, you cannot have a program that detects all possible undefined behaviours.

You can only have tools that use heuristics and detect some particular cases that follow certain patterns. Or you can in certain cases prove that a program behaves as you want. But you cannot detect undefined behaviour in general.

Edit

If a program does not terminate (hangs, loops forever) on a given input, then its output is undefined.

If you agree on this definition, then determining whether a program terminates is the well-known "Halting Problem", which has been proven to be undecidable, i.e. there exists no program (Turing Machine, C program, C++ program, Pascal program, in whatever language) that can solve this problem in general.

Simply put: there exists no program P that can take as input any program Q and input data I and print as output TRUE if Q(I) terminates, or else print FALSE if Q(I) does not terminate.

For more information you can look at http://en.wikipedia.org/wiki/Halting_problem.

Rearrange answered 30/8, 2011 at 7:25 Comment(13)
While you're correct that you can't statically find all instances of undefined behavior, you can dynamically detect ant time the program causes UB.Fideliafidelio
I can obviously construct a language for which I can prove all possible UB. Define a "Turing+" machine which exhibits UB if any cell contains the value "2". Trivially, it's Turing-complete (any problem can be coded with a finite number of "0" and "1"s) and equally trivial, I can detect all possible UB (value "2" present in a finite program). Which bit of C++ makes this specifically impossible ?Philae
@templatetypedef: No, not even dynamically. Why would you think that this is possible?Rearrange
@MSalters: What is Turing-Complete? The Turing machine which exibits UB if any cell contains the value "2"? Why would it be? What do you mean by Turing complete? That it can compute all computable functions?Rearrange
@Giorgio: Standard term, see en.wikipedia.org/wiki/Turing_completeness. My proposed "Turing+" machine obviously is.Philae
@MSalters: Why is your Turing maching universal? I do not see why it can take the encoding of any Turing machine and simulate its computation. I mean, of course you can encode any Turing Machine with a sequence of 0 and 1, but is your Turing machine able to use such an encoding to simulate it? I do not see this. And also, what does this have to do with the fact that it exhibits UB if any cell contains the value "2"?Rearrange
@Giorgio: My machine processes a tape with only zeroes and ones exxactly like a regular Turing machine. Therefore, any function that is translated into a tape for a Turing machine (i.e. computable function) can be used without modification in my machine, and will produce the same results. Therefore my machine can copute all computable functions. (1st half of the proof). The "2" value in my language proves that there is at least one language in which you can detect all UB. (2nd half of proof). Combined, it proves there is a Turing-complete language in which all UB is detectable.Philae
@MSalters: Maybe we should discuss the details in chat. Anyway, if you were right, your result would contraddict the theorem on the halting problem, which would be quite surprising.Rearrange
@Giorgio- Let me clarify- You cannot dynamically detect whether the program could ever trigger UB, but for any trace of a program execution you can decide whether or not the program had UB. You could do this by writing an interpreter that looked at every line of code that was about to be executed and determining whether it would result in UB. You might not detect that UB could happen in a different execution, but on any particular program run you could say whether or not UB occurred.Fideliafidelio
@templatetypedef: But if a program is in an infinite loop you only have a partial trace: you cannot say that UB "has occurred" because the UB "is occurring" (program never stops), i.e. there is no event after which you can look at the trace and say: "Hey from here on I know the program will never stop, I have UB.". Maybe I am overlooking something important in your argument?Rearrange
Maybe we mean two different things. As UB I intend that the program hangs / loops forever, not the case in which it outputs something random, e.g. because it reads some uninitialized variable.Rearrange
My definition of undefined behavior is what it says in the spec is undefined, which is things like dereferencing bad pointers, doing unsafe arithmetic on pointers, etc.Fideliafidelio
OK: Maybe you can detect this kind of UB for a particular run on a particular input. But can you write a tool that, for any program P, finds out whether P will or will not have UB for all possible inputs? E.g., that each time you reference a pointer, this pointer will have been initialized correctly, no matter what the input is? You have to check all possible inputs (an infinite number thereof): how do you plan to do this?Rearrange
A
0

Undefined behaviour is undefined. The best you can do is conform to the standard pedantically, as others have suggested, however, you can not test for what is undefined, because you don't know what it is. If you knew what it was and standards specified it, it would not be undefined.

However, if you for some reason, do actually rely on what the standard says is undefined, and it results in a particular result, then you may choose to define it, and write some unit tests to confirm that for your particular build, it is defined. It is much better, however, to simply avoid undefined behaviour whenever possible.

Aquarelle answered 30/8, 2011 at 3:0 Comment(8)
Let me clarify the question - my goal is to be able to compile and run a C++ program and have the compiled executable, at each point that undefined behavior could result, immediately reports an error. The idea is that the C++ spec very precisely defines what's undefined, and I was curious if someone had built a tool that detected every instance where this would occur.Fideliafidelio
That is more or less "conform to the standard pedantically", but using other runtime tools such as valgrind can be extremely useful. I highly recommend valgrind.Aquarelle
It's definitely "conforming to the standard pedantically." I guess what I'm looking for is a tool to see if you've done this or not. :-) And valgrind is really awesome, I second that!Fideliafidelio
The "tool" in C++ really is the compiler. C++ is an enormously complicated and context-dependant language, in order to parse it, you pretty much have to compile it.Aquarelle
The issue is that the compiler can only check statically for code that results in undefined behavior. Many types of undefined behavior can (provably!) be only caught at runtime.Fideliafidelio
"In order to parse it you pretty much have to compile it"? No. Parsing, symbol table construction and type checking don't require "compiling"; compiling produces machine code.Supertanker
@Ira: Unless it's C++, where producing the symbol table and type checking is very, very involved.Aquarelle
@Arafangion: Yes, that's what the "front end" of the compiler does. I know, my team offers one of these commercially, including full C++11, with all that symbol table and type checking stuff. See www.semanticdesigns.com/Products/FrontEnds/CppFrontEnd.html. But it isn't a compiler by a long shot (although it has lot of other uses).Supertanker
P
-1

Take a look at PCLint its pretty decent at detecting a lot of bad things in C++.

Here's a subset of what it catches

Procurator answered 30/8, 2011 at 3:1 Comment(6)
They don't even reference a C++ standard, they seem to be a tool that merely checks for, eg, conformance to guidelines such as MISRA. It smells like a tool that really checks a C project, but they support some C++ so evidently claim to support C++ as well. From the looks of the website, they don't even support GCC(!) Their most recent 'bug of the month', would surely be caught by valgrind.Aquarelle
@arafangion - It doesn't need to support gcc, its a static analysis tool. Its support for C++ is, in my experience, good. Try feeding the online demo some bad code and see what it does: gimpel-online.com/OnlineTesting.html (The "Do it yourself" one is a good place to start.). Its very very far from just a MISRA standards type tool - I've never used it for that kind of work.Procurator
The examples they give use pretty bad C++ code, I will be impressed if they were to show examples of undefined behaviour in good C++ code.Aquarelle
Once I ran PCLint on my project's code. It found a method QString foo() which contained a line: return false; This compiles OK but PCLint complained that the line looked suspicious.Rearrange
@Aquarelle - Isn't any C++ code that contains undefined behavior "bad" by definition ;) . But joking aside, why don't you try pasting some seemingly good but U.B. code into it and see what it says?Procurator
@Michael: Most of the examples I can think of are picked up by compilers quite reliably. All of the examples that that PCLint site shows are basically C issues transliterated into C++.Aquarelle

© 2022 - 2024 — McMap. All rights reserved.