to see if branch prediction actually slows down my program or helps
Branch prediction doesn't slow down programs. When people talk about the cost of missed predictions, they're talking about how much more expensive a mispredicted branch is compared to a correctly predicted branch.
If branch prediction didn't exist, all branches would be as expensive as a mispredicted one.
So what "misprediction delay is between 10 and 20 clock cycles" really means is that successful branch prediction saves you 10 to 20 cycles.
Removing the branches not only improves runtime performance of the code, it also helps the compiler to optimize the code.
Why use branch prediction then?
Why use branch prediction over removing branches? You shouldn't. If a compiler can remove branches, it will (assuming optimizations are enabled), and if programmers can remove branches (assuming it doesn't harm readability or it's a performance-critical piece of code), they should.
That hardly makes branch prediction useless though. Even if you remove as many branches as possible from a program, it will still contain many, many branches. Because of this, and because of how expensive mispredicted branches are, branch prediction is essential for good performance.
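As a rough illustration of what removing a branch by hand can look like, here is a minimal sketch in C. The function names and the threshold parameter are made up for this example, and an optimizing compiler may well turn the first version into branch-free code on its own. Note that the loop itself still needs a branch in both versions.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy version: the `if` typically becomes a conditional jump
   (unless the optimizer replaces it on its own). */
int64_t sum_at_least_branchy(const int32_t *data, size_t n, int32_t threshold) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (data[i] >= threshold)
            sum += data[i];
    }
    return sum;
}

/* Branchless version: the comparison yields 0 or 1, and negating it
   gives a mask of all zero bits or all one bits. */
int64_t sum_at_least_branchless(const int32_t *data, size_t n, int32_t threshold) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        int64_t mask = -(int64_t)(data[i] >= threshold); /* 0 or -1 */
        sum += data[i] & mask; /* adds data[i] or 0, no data-dependent jump */
    }
    return sum;
}
```

Both functions return the same result for the same input. Whether the second one is actually faster depends on the data and the compiler, so it is worth measuring before giving up the more readable version.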
Is there a way to force the compiler to generate assembly code without branches?
An optimizing compiler will already remove branches from a program when it can (without changing the semantics of the program), but, unless we're talking about a very simple int main() {return 0;}-type program, it's impossible to remove all branches. Loops require branches (unless they're unrolled, but that only works if you know the number of iterations ahead of time) and so do most if- and switch-statements. If you can minimize the number of ifs, switches and loops in your program, great, but you won't be able to remove all of them.
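To make the unrolling point concrete, here is a small sketch in C (the function names are invented for this example). It only works because the iteration count, 4, is known ahead of time; a compiler with optimizations enabled will often do the same transformation itself.

```c
/* Loop version: each iteration needs a conditional jump to decide
   whether to go around again (unless the compiler unrolls it). */
int sum4_loop(const int v[4]) {
    int sum = 0;
    for (int i = 0; i < 4; ++i)
        sum += v[i];
    return sum;
}

/* Hand-unrolled version: straight-line code with no loop branch. */
int sum4_unrolled(const int v[4]) {
    return v[0] + v[1] + v[2] + v[3];
}
```

If the length were only known at run time, some branch deciding when to stop would have to remain.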
or to disable branch prediction on the CPU, so I can compare both results?
To the best of my knowledge, it is impossible to disable branch prediction on x86 or x86-64 CPUs. And as I said, doing so would never improve performance (though it might make execution time more predictable, but that's not usually a requirement in the contexts where these CPUs are used).
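Since the predictor can't be switched off, one way to get a feel for its effect is to run the same branchy loop over data that is easy to predict (sorted) and data that is hard to predict (random order). The following is only a sketch under assumptions: the helper names, the value range 0 to 255 and the cutoff 128 are arbitrary choices for this example, and clock() gives only coarse CPU-time readings.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

static int cmp_int32(const void *a, const void *b) {
    int32_t x = *(const int32_t *)a, y = *(const int32_t *)b;
    return (x > y) - (x < y);
}

/* Fill the array with pseudo-random values in [0, 255]. */
void fill_random(int32_t *data, size_t n, unsigned seed) {
    srand(seed);
    for (size_t i = 0; i < n; ++i)
        data[i] = rand() % 256;
}

/* Sort ascending, so the branch below becomes highly predictable. */
void sort_values(int32_t *data, size_t n) {
    qsort(data, n, sizeof *data, cmp_int32);
}

/* The loop under test: one data-dependent branch per element. */
int64_t branchy_sum(const int32_t *data, size_t n) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        if (data[i] >= 128)
            sum += data[i];
    return sum;
}

/* Runs branchy_sum `repeats` times, stores the last result in *out,
   and returns the CPU time used in seconds. */
double time_branchy_sum(const int32_t *data, size_t n, int repeats, int64_t *out) {
    volatile int64_t sink = 0; /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (int r = 0; r < repeats; ++r)
        sink = branchy_sum(data, n);
    clock_t end = clock();
    *out = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To use it, fill one array with fill_random, copy it, call sort_values on the copy, and pass both to time_branchy_sum with a large array and enough repeats to get measurable times. The two sums must be equal; any difference in time comes from how the data is ordered. Be aware that with optimizations enabled the compiler may replace the if with branch-free code, in which case the difference can disappear. On Linux, a hardware counter such as the one reported by perf stat -e branch-misses gives a more direct view than wall-clock timing.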
branch prediction is but didn't notice at the bottom he provided some useful information. – Husserl