Haskell: List fusion, where is it needed?
Asked Answered
S

2

6

Lets say we have the following:

l = map f (map g [1..100])

And we want to do:

head l

So we get:

head (map f (map g [1..100]))

Now, we have to get the first element of this. map is defined something like so:

map f l = f (head l) : (map f (tail l))

So then we get:

f (head (map g [1..100]))

And then applied again:

f (g (head [1..100]))

Which results in

f (g 1)

No intermediate lists are formed, simply due to laziness.

Is this analysis correct? And with simple structures like this:

foldl' ... $ map f1 $ map f2 $ createlist

are intermediate lists ever created, even without "list fusion"? (I think laziness should eliminate them trivially).


The only place I can see a reason to keep a list is if we did:

l' = [1..100]
l = map f (map g l')

Where we might want to keep l', if it is used elsewhere. However, in the l' case above here, it should be fairly trivial for the compiler to realise its quicker just to recalculate the above list than store it.

Shaunshauna answered 8/6, 2012 at 8:28 Comment(0)
L
7

In your map example the list cells are created and then immediately garbage-collected. With list fusion no GC or thunk manipulation (which are both not free) is needed.

Langsyne answered 8/6, 2012 at 8:45 Comment(0)
I
7

Fusion benefits most when the intermediate data structures are expensive. When the structure being passed is lazy, and accessed in a sequential manner, the structures may be relatively cheap (as is the case with lists).

  • For lazy structures, fusion removes the constant factor of creating a thunk, and immediately garbage collecting it.
  • For strict structures, fusion can remove O(n) work.

The greatest benefits are thus for strict arrays and similar structures. However, even fusing lazy lists is still a win, due to increased data locality, due to fewer intermediate structures being allocated as thunks.

Imitate answered 8/6, 2012 at 11:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.