I came across this question, which compared the performance of various compilers on computing fibonaci numbers the naive way.
I tried doing this with Haskell to see how it compares to C.
C code:
#include <stdio.h>
#include <stdlib.h>
int fib (int n) {
if (n < 2) return 1;
return fib (n-1) + fib (n-2);
}
int main (int argc, char* argv[]) {
printf ("%i\n", fib (atoi(argv[1])));
return 0;
}
Result:
> gcc -O3 main.c -o fib
> time ./fib 40
165580141
real 0m0.421s
user 0m0.420s
sys 0m0.000s
Haskell:
module Main where
import System.Environment (getArgs)
fib :: Int -> Int
fib n | n < 2 = 1
| otherwise = fib (n-1) + fib (n-2)
main = getArgs >>= print . fib . read . head
Result:
> ghc -O3 -fllvm -optlo-O3 Main.hs -o fib
> time ./fib 40
165580141
real 0m1.476s
user 0m1.476s
sys 0m0.000s
Profiling with
> ghc -O3 -fllvm -optlo-O3 -prof -auto-all -caf-all -rtsopts Main.hs -fforce-recomp -o fib
> ./fib 40 +RTS -prof
reveals that fib
takes 100% time and alloc, no surprise. I took some profile of the heap, but don't know what they imply:
> ./fib 40 +RTS -hc
> ./fib 40 +RTS -hd
So my question: Is there anything I can do from my side to make this Haskell program's performance closer to C, or this is just the way GHC does things that happens to make it slower in this micro-benchmark? (I'm not asking for an asymptotically faster algorithm to compute fibs.)
Thank you very much.
[EDIT]
It turned out that ghc -O3
was faster than ghc -O3 -fllvm -optlo-O3
in this case. But optlo-block-placement
made an observable difference for the LLVM backend:
> ghc -O3 Main.hs -o fib -fforce-recomp
> time ./fib 40
165580141
real 0m1.283s
user 0m1.284s
sys 0m0.000s
> ghc -O3 -fllvm -optlo-O3 -o fib -fforce-recomp
> time ./fib 40
165580141
real 0m1.449s
user 0m1.448s
sys 0m0.000s
> ghc -O3 -fllvm -optlo-O3 -optlo-block-placement -o fib -fforce-recomp
> time ./fib 40
165580141
real 0m1.112s
user 0m1.096s
sys 0m0.016s
The reason I wanted to investigate this was because both C and OCaml were significantly faster than Haskell for this program. I sort of couldn't accept that and wanted to learn more to make sure I already did everything I could :D
> ocamlopt main.ml -o fib
> time ./fib 40
165580141
real 0m0.668s
user 0m0.660s
sys 0m0.008s
gcc -O3
propably turns it into a non-recursive loop. I don't know if GHC can do the same. – Kisumu-S
to check if it does optimize it (I'd check myself, but I don't have easy access to a Unix machine ATM). – Kisumu-O3
is pretty impressive for an imperative language compiler. But back to this program, how can thisfib
function possibly be turned into non-recursive? You have 2 resursive calls, and the final pending addition is inevitable. And the original author took care to give the argument at runtime so compilers couldn't perform constant propagation like that ingcc -O3
from the link you gave. – Jipijapa-O2
but still generated the loop). And while the two recursive calls seem to make it harder, I'd just look and see instead of hypothizing. Again, you wouldn't be the first to underestimate the compiler... – Kisumu