This is primary a question of runtime value representation.
Most types in Haskell are "lifted", i.e., they may contain bottom
. In reality, this means that even Int
is represented by a pointer, either to a machine Int or a computation (which could diverge, i.e.,bbottom
).
Here is a quick primer on boxed vs. lifted (vs. primitive).
I.e., Array#
is not lifted, as it may not be bottom, but it is boxed, as it may contain lifted values.
So what does that have to do with non-strict evaluation?
A non-evaluated computation is called a "thunk".
Having a linked list of pointers to Ints is way worse for cache locality then having a compact array of machine integers and so is chasing pointers to thunks.
That is why there has been much research and engineering on demand analysis - if a the value of a computation will be needed anyway, it can be made strict.
AFAIK, "boxity" analysis is part of demand analysis. Anyway, GHC will try to get rid of pointers where possible.
But unfortunately I don't have any empirical data or studies on this.
EDIT:
Some more reading on strictness/boxity analysis:
GHC -fstrictness
Demand analyzer in GHC
SO Answer by Richard Eisenberg, giving some more clarification on "levity" (lifted/unlifted), bottom and undefined
.
Edit2:
Found a paper from 2002:
Nethercote, Mycroft; The cache behaviour of large lazy functional programs on stock hardware