Lazy vs eager evaluation and doubly-linked list building
I can't sleep! :)

I've written a small program that builds a doubly linked list in Haskell. The key language property that made it possible was lazy evaluation (see the code below). My question is: can I do the same in a pure functional language with eager evaluation? In any case, what properties must an eager functional language have to be able to build such a structure (impurity?)?

import Data.List

data DLList a = DLNull |
    DLNode { prev :: DLList a
           , x :: a
           , next :: DLList a
           }
  deriving (Show)

walkDLList :: (DLList a -> DLList a) -> DLList a -> [a]
walkDLList _ DLNull = []
walkDLList f n@(DLNode _ x _)  = x : walkDLList f (f n)

-- Returns first and last items.
makeDLList :: [a] -> (DLList a, DLList a)
makeDLList xs = let (first, last) = step DLNull xs in (first, last)
  where
    step prev [] = (DLNull, prev)
                         -- Here I use laziness. 'next' is not built yet, it's a thunk.
    step prev (x : xs) = let this = DLNode prev x next 
                             (next, last) = step this xs
                         in (this, last)

testList :: [Int] -> IO ()
testList l = let
    (first, last) = makeDLList l
    byNext = walkDLList next first
    byPrev = walkDLList prev last
  in do
    putStrLn $ "Testing: " ++ show l
    print byNext
    print byPrev

main = do
  testList []
  testList [1, 2, 3, 4]
Commanding answered 14/2, 2012 at 12:51 Comment(1)
check out this code. :) It returns just the head node, and is exceptionally simple.Asch
A doubly-linked list can be implemented in a purely functional way in an eager language as a zipper on a singly-linked list. See, for example, Rosetta Code > Doubly-linked list > OCaml > Functional.
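A minimal sketch of such a zipper in Haskell (the `Zipper` type and the `nextZ`/`prevZ` names are mine, not from the linked page); nothing here relies on laziness, so the same definitions carry over directly to an eager language like OCaml:

```haskell
-- A list zipper: the elements before the focus (stored reversed),
-- and the focus together with everything after it.
data Zipper a = Zipper [a] [a] deriving (Show, Eq)

fromList :: [a] -> Zipper a
fromList = Zipper []

-- Move the focus one step right or left; Nothing at either end.
nextZ :: Zipper a -> Maybe (Zipper a)
nextZ (Zipper ps (x:xs)) = Just (Zipper (x:ps) xs)
nextZ _                  = Nothing

prevZ :: Zipper a -> Maybe (Zipper a)
prevZ (Zipper (p:ps) xs) = Just (Zipper ps (p:xs))
prevZ _                  = Nothing

main :: IO ()
main = do
  let Just z  = nextZ (fromList [1, 2, 3 :: Int])  -- focus moves to 2
      Just z' = prevZ z                            -- and back to 1
  print z   -- Zipper [1] [2,3]
  print z'  -- Zipper [] [1,2,3]
```

Moving the focus allocates one new `Zipper` but shares both spines with the old one, which is the trade-off discussed in the comments below.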

Odrick answered 14/2, 2012 at 19:33 Comment(8)
I think the point of a doubly-linked list is that it does not allocate new storage on traversal, unlike the code you linked to, Dan. In it, prev (next alist) is not the same storage as alist, unlike the usual C implementation, say, and unlike the Haskell code in the question above, where after let (alist, zlist) = makeDLList xs we have prev (next alist) == alist and next (prev zlist) == zlist, where == means "the same actual storage", or in Lisp parlance "eq-equivalent" (for non-empty xs, of course).Asch
@WillNess regarding "not allocat[ing] new storage on traversal", that is an implementation detail, and comes at a cost. You cannot "insert" into an immutable doubly-linked list (in the style given by OP) without making a copy of the entire list. Well, you can usually get away with less copying due to laziness, but the point is the new doubly-linked list cannot share any nodes with the old doubly-linked list. The immutable zipper implementation allows more sharing, though at the cost of more storage required for traversal (but not much more, again due to the ability to share).Odrick
@WillNess n.b. Haskell makes no guarantees about eq-ness. GHC's parallel garbage collector takes advantage of this, and can actually remove sharing that you might expect to be there. This is rare, and barely affects performance, though, and due to immutable data, it will never affect your results.Odrick
I'd expect that if the same name is used twice, (a pointer to) the same value would be used in each place: in let {this = DLNode prev x next ; (next, last) = step this xs}, this is this, right? Isn't that what laziness means? One name, one (pointer to an) entity, no? You're quite right about the whole-list copy on inserts. Maybe the zipper is better in some situations, and the DL-list in others. Still, they're different.Asch
also, if GHC would remove sharing from two uses of same variable, how would it impact the various corecursive definitions which rely on it, like e.g. fibs = 0:scanl (+) 1 fibs? Would it even work, or suffer a drastic loss of efficiency?Asch
@WillNess fib # n is typically shared by fib # n+1 and fib # n+2. Suppose the garbage collector removed this sharing, making two copies of the thunk for fib # n. There would be minimal loss of efficiency, since the two copies of fib # n would probably still share fib # n-1 and fib # n-2. The only loss in efficiency is that that particular node would be calculated twice, and the cost per node of a fib calculation is fairly trivial. The same name used twice always represents the same value, but perhaps surprisingly, does not have to be the same pointer (though it usually is).Odrick
You suggest only some cells in a fib list might get "unshared", once in a long while. But there is a difference between a pointer to a node's value cell and a pointer to the node itself. The latter can't be unshared, I suspect, or else the whole list would get split after that point, and all the nodes after it would have to be calculated twice, leading potentially to the terrible inefficiency of the naive doubly-recursive version. Same here: this denotes the node itself, so it must get shared, i.e. be the same. Without that, any code we write becomes a house of cards.Asch
Consider this e.g.: fixc f = f(fixc f) vs fixs f = x where x = f x. First is "copying", second is "sharing". This is essential e.g. here: fibs = fix ((0:).(1:).next) where next(a:t@(b:_))=(a+b):next t. With the "sharing" fix it runs in linear time as expected, but with the "copying" one, it runs in quadratic time.Asch
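The distinction the last comment draws can be made runnable; this is a sketch, with the comment's next renamed to nexts so it doesn't clash with the record field in the question's code:

```haskell
import Data.Function (fix)  -- fix f = let x = f x in x  (the "sharing" version)

-- "Copying" fixed point: every unfolding rebuilds f's argument from scratch.
fixc :: (a -> a) -> a
fixc f = f (fixc f)

-- With the sharing fix this runs in linear time; swapping in fixc makes
-- each recursive tail a fresh copy, and the cost blows up to quadratic.
fibs :: [Integer]
fibs = fix ((0 :) . (1 :) . nexts)
  where nexts (a : t@(b : _)) = (a + b) : nexts t

main :: IO ()
main = print (take 10 fibs)  -- [0,1,1,2,3,5,8,13,21,34]
```

Both combinators compute the same values; only the sharing of the partially built list differs, which is exactly the eq-ness question being argued above.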
As long as a language has something like closures, lambdas etc., you can always simulate laziness. You could rewrite that code even in Java (without mutating variables, etc.); you just need to wrap every "lazy" operation in something like

interface Thunk<A> {
   A eval();  
}

Of course this would look terrible, but it is possible.

Bundle answered 14/2, 2012 at 13:18 Comment(3)
You need something updatable to get the effect of lazy evaluation (evaluate at most once), otherwise you'll be emulating call-by-name.Theocrasy
So, augustus, your point is that closures and lambdas alone are not enough?Commanding
I think he is just referring to the fact that the result of eval needs to be cached, so the thunk is not recomputed should eval be run multiple times.Selfsupporting
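The cached-thunk idea the comments describe can be sketched in Haskell's IO monad, where nothing is implicitly lazy; newThunk below is a made-up helper for illustration, not a library function:

```haskell
import Data.IORef

-- A memoizing thunk: the computation runs at most once, and the result
-- is cached in a mutable cell. This updatable cell is exactly what
-- call-by-need adds over plain call-by-name.
newThunk :: IO a -> IO (IO a)
newThunk compute = do
  cache <- newIORef Nothing
  return $ do
    cached <- readIORef cache
    case cached of
      Just v  -> return v
      Nothing -> do
        v <- compute
        writeIORef cache (Just v)
        return v

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  force <- newThunk $ do
    modifyIORef counter (+ 1)   -- count how many times we really compute
    return (2 + 2 :: Int)
  _ <- force
  _ <- force                    -- the second call hits the cache
  runs <- readIORef counter
  print runs                    -- 1
```

Without the writeIORef step the same wrapper would re-run compute on every force, i.e. call-by-name, which is the distinction drawn in the first comment.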
In the non-backtracking subset of Prolog, which can be seen as an explicitly set-once eager pure functional language, you can build doubly-linked lists easily. It's referential transparency that makes this hard in Haskell, for it forbids Prolog's explicit setting of named, explicitly not-yet-set logical variables, and instead forces Haskell to achieve the same effect in the warped way of "tying the knot". I think.

Plus, there really isn't much difference between Haskell's guarded recursion under lazy evaluation and Prolog's open-ended lists built in tail-recursion-modulo-cons fashion. IMO. Here, for instance, is an example of lazy lists in Prolog. The memoized shared storage is used as a universal access mediator, so the results of previous calculations can be arranged to be cached.

Come to think of it, you can use C in a restricted manner as an eager pure functional language, if you never reset any variable nor any pointer once it is set. You still have null pointers, just as Prolog has not-yet-set variables, so it too is explicitly set-once. And of course you can build doubly-linked lists with it.

So the only question that remains is, do you admit such set-once languages as pure?

Asch answered 14/2, 2012 at 17:52 Comment(0)