What is spine strictness
Asked Answered
D

1

32

In Haskell, the term spine strictness is often mentioned in relation to lazy evaluation. Though I have a vague understanding of that it means, it would be nice to have a more concrete explanation about:

  • What is the spine of a data structure
  • What does spine strictness mean?
  • What are the benefits when comparing spine strict data structures with lazy ones?
Doretheadoretta answered 7/3, 2014 at 12:2 Comment(0)
H
37

Here's an example

> length (undefined : 3 : 4 : undefined : [])
4
> length (2 : 3 : 4 : 5 : undefined)
<<loop>>

The first list contains bottoms as elements, but the "shape" of the list is fully defined. Roughly speaking, every list cell has a clearly defined "pointer" to its next element. This "shape" is called the spine.

The second list, by comparison, has completely defined elements, yet its spine is not defined. This is because it does not end with the empty list [], but with a non-terminating expression undefined. In this case the spine is not defined.

The function length cares about the spine, not the elements. So it is able to work in the first case (thanks to laziness), but not the second. We say that length is strict in the spine, but not in the elements of the list.

Similarly, in tree data structures, the spine is the "shape" of the tree. Some functions such as tree height can be written without inspecting elements, but only the spine. Such functions are strict in the spine.

While some functions have to be spine-strict (e.g. length), others can be written both in a spine-strict or spine-lazy fashion. For instance map on lists is spine-lazy: it will return the first element of the output before accessing all the spine of its input. A stricter variant can be obtained by

map' :: (a->b) -> [a] -> [b]
map' _ []     = []
map' f (x:xs) = (f x :) $! map' f xs

Whether this is beneficial depends on the context. Consider

-- apply n times function f
iter n f = foldr (.) id $ replicate n f

list1 = iter 1000 (map  succ) [1..10]
list2 = iter 1000 (map' succ) [1..10]

If I demand head list1 I will force the application of the 1000 maps only at the first element of the list. This means that after that there will be 1000 allocated thunks in memory holding up space.

Instead, head list2 will force the application of the 1000 maps on the whole list. So, all the 1000 thunks can be immediately garbage collected, reclaiming memory.

Haunted answered 7/3, 2014 at 12:45 Comment(5)
in the first example however, length (2 : 3 : 4 : 5 : undefined) will not loop but give an exception immediately because undefined in haskell is not a 'not terminating expression' though I get what you mean, but showing it in repl style makes it confusingDoretheadoretta
@Markus1189: If you're still wondering why it's called spine, try the following in GHCi: let x = map (+1) [1..10], length x, :sprint x. Then you will clearly see the spine of the listEugene
@Eugene Thanks for pointing out the sprint command! Very useful! One quick question I have though is why let x = "foo" :sp x gives me x = _ but let x = ['f', 'o', 'o'] :sp x gives me x = "foo"Leidaleiden
@Leidaleiden Interesting. I believe GHC(i) may be playing some optimization here. Instead of allocating the list-of-chars immediately, the string might be stored in a more compact form which is not unfolded until the string is really demanded. This is just a guess, though.Haunted
@semicolon: chi is right. Use :set -ddump-simpl -ddump-ds to see more. "foo" is a CString, whereas ['f','o','o'] is already a list and evaluated at that point. By the way, it's better to ask such questions, since comment notifications might not get noticed.Eugene

© 2022 - 2024 — McMap. All rights reserved.