"Sharing" means f x
re-uses the x
that it creates; but with _Y g = g . g . g . g . ...
, each g
calculates its output anew (cf. this and this).
In that context, the sharing version has much worse memory usage, leads to a space leak.1
The definition of _Y
mirrors the usual lambda calculus definition's effect for the Y combinator, which emulates recursion by duplication, while true recursion refers to the same (hence, shared) entity.
In
x = f x
(_Y g) = g (_Y g)
both x
s refer to the same entity, but each of (_Y g)
s refer to equivalent, but separate, entity. That's the intention of it, anyway.
Of course thanks to referential transparency there's no guarantee in Haskell the language for any of this. But GHC the compiler does behave this way.
_Y g
is a common sub-expression and it could be "eliminated" by a compiler by giving it a name and reusing that named entity, subverting the whole purpose of it. That's why the GHC has the "no common sub-expressions elimination" -fno-cse
flag which prevents this explicitly. It used to be that you had to use this flag to achieve the desired behaviour here, but not anymore. GHC won't be as aggressive at common sub-expressions elimination anymore, with the more recent (read: several years now) versions.
disclaimer: I'm the author of that part of the page you're referring to. Was hoping for the back-and-forth that's usual on wiki pages, but it never came, so my work didn't get reviewed like that. Either no-one bothered, or it is passable (lacking major errors). The wiki seems to be largely abandoned for many years now.
1 The g
function involved,
(3:) . minus [5,7..] . foldr (\ (x:xs) ⟶ (x:) . union xs) []
. map (\ p ⟶ [p², p² + 2p..])
produces an increasing stream of all odd primes given an increasing stream of all odd primes. To produce a prime N
in value, it consumes its input stream up to the first prime above sqrt(N)
in value, at least. Thus the production points are given roughly by repeated squaring, and there are ~ log (log N)
of such g
functions in total in the chain (or "tower") of these primes producers, each immediately garbage collectible, the lowest one producing its primes given just the first odd prime, 3
, known a priori.
And with the two-staged _Y2 g = g x where { x = g x }
there would be only two of them in the chain, but only the top one would be immediately garbage collectible, as discussed at the referenced link above.