Are Ana-/Catamorphisms just slower?

Asked 14/5, 2014 at 20:9 Answered 17/5, 2014 at 8:43

Solved haskell kdtree catamorphism recursion-schemes

After writing this article I decided to put my money where my mouth is and started to convert a previous project of mine to use recursion-schemes.

The data structure in question is a lazy kdtree. Please have a look at the implementations with explicit and implicit recursion.

This is mostly a straightforward conversion along the lines of:

data KDTree v a = Node a (Node v a) (Node v a) | Leaf v a

data KDTreeF v a f = NodeF a f f | Leaf v a

Now after benchmarking the whole shebang I find that the KDTreeF version is about two times slower than the normal version (find the whole run here).

Is it just the additional Fix wrapper that slows me down here? And is there anything I could do against this?

Caveats:

At the moment this is specialized to (V3 Double).
This is cata- after anamorphism application. Hylomorphism isn't suitable for kdtrees.
I use cata (fmap foo algebra) several times. Is this good practice?
I use Edwards recursion-schemes package.

Edit 1:

Is this related? https://ghc.haskell.org/trac/ghc/wiki/NewtypeWrappers Is newtype Fix f = Fix (f (Fix f)) not "free"?

Edit 2:

Just did another bunch of benchmarks. This time I tested tree construction and deconstruction. Benchmark here: https://dl.dropboxusercontent.com/u/2359191/2014-05-15-kdtree-bench-03.html

While the Core output indicates that intermediate data structures are not removed completely and it is not surprising that the linear searches dominate now, the KDTreeFs now are slightly faster than the KDTrees. Doesn't matter much though.

Bangweulu answered 14/5, 2014 at 20:9 Comment(3)

I would benchmark that against data KDTree v a = Node a (Identity (Node v a)) (Identity (Node v a)) | Leaf v a, just to see if the extra indirection is the culprit. Make sure Identity is not a newtype. – Byington 14/5, 2014 at 20:24

dl.dropboxusercontent.com/u/2359191/… now the _nn tests are wrapped in data Identity a = MkId { unId :: a} as you suggested. I'd say it is slower now ... but not as slow as the morphism version. – Bangweulu 14/5, 2014 at 20:46

Whoever voted to close this question: care to elaborate? – Bangweulu 19/5, 2014 at 9:1

I have just implemented the Thing + ThingF + Base instance variant of the tree. And guess what ... this one is amazingly fast.

I was under the impression that this one would be the slowest of all variants. I really should have read my own post ... the line where I write:

there is no trace of the TreeF structure to be found

Let the numbers speak for themselves, kdtreeu is the new variant. The results are not always as clear as for these cases, but in most cases they are at least as fast as the explicit recursion (kdtree in the benchmark).

Benchmark results

Bangweulu answered 17/5, 2014 at 8:43 Comment(4)

By now I accelerated the kdtreeu_nn5 case too, turns out that the cata (fmap foo algebra) pattern is not useful in this case. – Bangweulu 19/5, 2014 at 9:3

When you finalize this last variant, will you push it to github or make it available somehow? – Byington 19/5, 2014 at 9:7

This is still WIP, but the concept should be clear here: github.com/fhaust/kdtree/blob/master/src/Data/KDTree.hs#L52-61 – Bangweulu 19/5, 2014 at 9:39

Aaand ... I just remembered that I merged KDTreeU into KDTree. If you want comparative benchmarks you'll have to benchmark a current vs one of the old versions. And the corresponding lines are: github.com/fhaust/kdtree/blob/master/src/Data/… – Bangweulu 19/5, 2014 at 9:46

I wasn't using recursion schemes, but rather my own "hand-rolled" cata, ana, Fix/unFix to do generation of (lists of) and evaluation of programs in a small language in the hope of finding one that matched a list of (input, output) pairs.

In my experience, cata optimized better than direct recursion and gave a speed boost. Also IME, ana prevented stack overflow errors that my naive generator was causing, but that make have centered around generation of the final list.

So, my answer would be that no, they aren't always slower, but I don't see any obvious problems; so they may simply be slower in your case. It's also possible that recursion-schemes itself is just not optimized for speed.

Infare answered 16/5, 2014 at 19:54 Comment(1)

Just did a last test run against a hand rolled cata. Still not as fast as the implicit one nor the one from recursion-schemes. So I think I just give up at this point and concentrate on different things. Thanks for your answer :) – Bangweulu 17/5, 2014 at 7:8

Caveats:

Edit 1:

Edit 2:

Recommended topics

Hot tags