What is the subset of problems that static analysis cannot capture?
Asked Answered
K

1

6

I am trying to understand the difference between static analysis and dynamic analysis for the purpose of program flow execution, for detection of security vulnerabilities.

It is fairly clear that dynamic analysis's primary weakness is that it cannot explore all possible states that a program can get into, because it relies on actually running the program with a particular set of inputs.

However, static analysis seems to reason about all possible program states, so I cannot envision a scenario where static analysis might fail, even though I am sure that such a scenario does exist. Most of the references I have looked at seem to vaguely say that the "abstract state analysis" is not as precise as what dynamic analysis can provide, but that is too fluffy for me.

Can anyone provide a simple explanation with concrete examples of where static analysis fails and dynamic analysis would be needed?

Karol answered 8/6, 2014 at 0:18 Comment(5)
What is your definition of "fails?" As in "sometimes gives the wrong answer," or "cannot possibly get the right answer?"Burl
For static analysis to catch everything, it would need to solve the halting problem.Iormina
@templatetypedef, "Cannot get the right answer with higher probability than by randomly guessing"Karol
@user2357112, I can always write a program that will flag every line of code as vulnerable, and it would then "catch everything". It would also catch innocent things.Karol
Okay, let me provide a more formal specification of "catch everything". Suppose you want to take in code and compute whether the relationship between the code's input and output has a certain undesirable (insecure) property. Rice's theorem says there is no algorithm that will do that.Iormina
C
1

Static analysis cannot ever be complete for all programs given a Turing complete input format (this includes almost all programming languages) as one cannot in general determine whether a piece of code is ever executed or not: you cannot determine whether the code prior to it halts — i.e., finishes executing (if it goes into an infinite loop, then any "problem" beyond it is bogus as it is unreachable) — a problem known as the halting problem.

However, it is, in principle, possible to find all possible problems if you also permit the analysis to output "problems" that don't actually exist. This is what virtually all static analysis tools do — large amounts of engineering effort is spent on minimizing the number of false problems they report.

Furthermore, it is worthwhile to note that some state exploration systems essentially do execute the program for every state (typically stopping a new exploration if the state has become equivalent) — however, many programs have impractically large input state spaces (consider any program taking textual input!) making them practically impossible to fully explore all states.

Cassiecassil answered 8/6, 2014 at 0:28 Comment(5)
If you can augment this answer with a concrete example (in any language, or even pseudocode), it would address my question much more thoroughly and directly.Karol
I have read quite a bit of descriptions similar to this, but nothing concrete.Karol
@Karol https://mcmap.net/q/298235/-what-exactly-is-the-halting-problem includes the closest thing you'll get to a practical example of the halting problem; is it clear to you that if the halting problem is unsolvable that you cannot reason about any further bit of the program?Cassiecassil
Yes, it is clear. Are you implying that the inability to determine whether a piece of code is ever executed or not is the only limitation of static analysis?Karol
@Karol It and other challenges that are essentially equivalent are the main challenges; they aren't the only limitation.Cassiecassil

© 2022 - 2024 — McMap. All rights reserved.