why tail recursive gcd is faster than while loop with rubinius
Asked Answered
D

1

6

I have these two implementations of gcd function :

def gcd1(a,b)
  if a==b
    a
  elsif a>b
    if (a%b)==0
      b
    else
      gcd1(a%b,b)
    end
  else
    if (b%a)==0
      a
    else
      gcd1(a,b%a)
    end
  end
end
def gcd2(a,b)
  if(a==b)
    return a
  elsif b>a
    min,max=a,b
  else
    min,max=b,a
  end
  while (max%min)!=0
    min,max=max%min,min
  end
  min
end

The function gcd1 is tail recursive while gcd2 uses a while loop.

I have verified that rubinius does TCO by benchmarking factorial function, only with the factorial function the benchmarks showed that the recursive version and the iteration version are "same-ish"(I used benchmark-ips).

But for the above, benchmarks shows that gcd1 is faster by at least a factor of two than gcd2(recursion twice as fast than iteration, even faster).

The code I used to benchmark is this :

Benchmark.ips do |x|
  x.report "gcd1 tail recursive" do
    gcd1(12016,18016)
  end
  x.report "gcd2 while loop" do
    gcd2(12016,18016)
  end
  x.compare!
end

the result :

Warming up --------------------------------------
 gcd1 tail recursive    47.720k i/100ms
     gcd2 while loop    23.118k i/100ms
Calculating -------------------------------------
 gcd1 tail recursive    874.210k (± 7.1%) i/s -      4.343M
     gcd2 while loop    299.676k (± 6.6%) i/s -      1.503M

Comparison:
 gcd1 tail recursive:   874209.8 i/s
     gcd2 while loop:   299676.2 i/s - 2.92x slower

I'm running Arch linux x64, processor i5-5200 2.2 GHZ quad-core.

The ruby implementation is Rubinius 3.40 .

So how can recursion be faster than the loop ?

Update

Just to say that fibonacci has the same situation : tail recursive version is at least twice as fast as the loop version, the program I used for fibonacci : http://pastebin.com/C8ZFB0FR

Disintegration answered 19/6, 2016 at 21:29 Comment(3)
I don't have time to go through your code, so I can't give you an answer, but I wanted to address one small thing: you ask "So how can recursion be faster than the loop ?" and I just wanted to point out that tail-recursion is a loop, it is exactly equivalent. Maybe Rubinius's inliner works better than its loop unroller? You'll really have to look at the generated native machine code, in order to get a definitive answer.Warmth
@JörgWMittag Yes I know tail-recursion is equivalent to a loop , that's why I'm asking because if equivalent they should have similar performance !!Disintegration
Well, in my book, 2–3× is similar :-D Like I said, one possible reason could be that Rubinius is better at inlining than it is at loop unrolling. This would produce a larger block of straight-line code (i.e. code without conditionals) for the recursive version than for the loop version, and thus a larger block of straight-line code to run optimizations on. But that's just a guess. You'll have to look at the code in various stages of optimization in the compiler, the JITter and at the generated native machine code to figure out what's really going on.Warmth
E
2

In the example you're using it only takes 3 calls/loops to get to the answer, so I don't think you're actually testing the right thing. Try with two consecutive Fibonacci numbers instead (eg 2000th and 2001th) and the results should not differ that much.

(I'm sorry, I don't have reputation to make a comment yet).

EDIT: I finally managed to install [a part of] rubinius and managed to re-create the phenomena you're referring to. This is not about recursion, it's about multiple assignment. If you change

      while n>0
        a,b=b,a+b
        n-=1
      end

to

      while n>0
        t=a
        a=b
        b=t+b
        n-=1
      end

the while loop version should perform (a bit) faster. The same applies to the original GCD program, i.e. replacing

        min,max=max%min,min

with

        t=min
        min=max%t
        max=t

is the thing.

This was not the case with ruby-2.1, i.e. the while loop seemed faster, just in the form you provided.

Hope that helps!

Ernest answered 24/6, 2016 at 3:0 Comment(3)
same result with fib(2000), this is the code I used : pastebin.com/C8ZFB0FRDisintegration
As of Fibonacci numbers I just meant to use them as arguments to gcd functions, as with fib(2000) and fib(2001) they have to perform 2000 steps (testing on 3 steps only is never a good idea), but anyway your fibonacci test was even nicer case.Ernest
Who would have thought of, I also tried gcd(a,b) where a is fib(2000) and b is fib(2001) and gcd with multiple assignment (they aren't calculated inside the benchmark), it reported the loop was slower but by 1.17x onlyDisintegration

© 2022 - 2024 — McMap. All rights reserved.