Test if a value has been evaluated to weak head normal form

Asked 24/2, 2015 at 2:51 Answered 25/2, 2015 at 16:45

haskell lazy-evaluation thunk weak-head-normal-form

In Haskell, is it possible to test if a value has been evaluated to weak head normal form? If a function already exists, I would expect it to have a signature like

evaluated :: a -> IO Bool

There are a few places that similar functionality lives.

A previous answer introduced me to the :sprint ghci command, which will print only the portion of a value that has already been forced to weak head normal form. :sprint can observe whether or not a value has been evaluated:

> let l = ['a'..]
> :sprint l
l = _
> head l
'a'
> :sprint l
l = 'a' : _

It's possible in IO to examine properties that would otherwise be off-limits. For example, it's possible to compare in IO to see if two values came from the same declaration. This is provided by the StableNames in System.Mem.StableName and used famously to solve the observable sharing problem in data-reify. The related StablePtr does not provide a mechanism to check if the referenced value is in weak head normal form.

Aribold answered 24/2, 2015 at 2:51 Comment(3)

Amusingly, this is already the second result in Google for "haskell check if whnf", at least for me :). – Astroid 24/2, 2015 at 2:59

On a hopeful note, the GHC Commentary suggests that each heap object info table includes the closure type, which looks like it should give you the kind of thing you're looking for. On a less-hopeful note, I have the feeling that GHC's unboxing transformation will make this whole idea rather slippery. – Yokoyokohama 24/2, 2015 at 3:56

@Yokoyokohama Thanks. The GHC Commentary was very helpful in writing an approximate answer. – Aribold 24/2, 2015 at 16:57

I'm not sure that there's anything pre-packaged for this. However, one can code it up:

import Data.IORef
import System.IO.Unsafe

track :: a -> IO (a, IO Bool)
track val = do
    ref <- newIORef False
    return
        ( unsafePerformIO (writeIORef ref True) `seq` val
        , readIORef ref
        )

Here's an example usage in ghci:

*NFTrack> (value, isEvaluated) <- track (undefined:undefined)
*NFTrack> isEvaluated
False
*NFTrack> case value of _:_ -> "neat!"
"neat!"
*NFTrack> isEvaluated
True

Of course, this will be tracking whether the wrapped write-and-then-return-the-original-value thunk is evaluated to WHNF, not whether the thing passed to track is evaluated to WHNF, so you'll want to put this as close to the thunk you're interested in as possible -- e.g. it will not be able to tell you whether a thunk made by somebody else has already been evaluated by somebody else before the tracking started. And of course consider using MVar instead of IORef if you need thread-safety.

Sample answered 24/2, 2015 at 3:36 Comment(2)

I believe NOINLINE is usually needed around unsafePerformIO (along with asbestos underwear) to keep it from being optimized away. – Yokoyokohama 24/2, 2015 at 3:59

AFAICS, here False means that the value is not in whnf, while True means that it is either in whnf or it is being evaluated right now (and has not yet produced a whnf). Maybe using val `seq` unsafePerformIO (writeIORef ref True) `seq` val can lead to the opposite guarantee (True guaranteeing whnf, and False non-whnf/evaluation in progress). With a 3-states state one could represent this even more precisely: unevaluated/in progress/whnf. – Insider 24/2, 2015 at 8:58

The ghci implementation for :sprint ultimately uses unpackClosure# from ghc-prim to inspect a closure. This can be combined with knowledge of the format of heap objects to determine if a closure has been evaluated all the way to weak head normal form.

There are a few ways to reproduce the inspection done by the ghci implementation for :sprint. The GHC api exposes getClosureData :: DynFlags -> a -> IO Closure in RtClosureInspect. The vacuum package, which depends only on ghc-prim, reproduces the code from RtClosureInspect and exposes getClosure :: a -> IO Closure. It's not immediately obvious how to examine either of these Closure representations to, for example, follow an indirection. The ghc-heap-view package inspects closures and exposes both a getClosureData :: a -> IO Closure and a detailed view of the Closure. ghc-heap-view depends on the GHC api.

We can write evaluated in terms of getBoxedClosureData from ghc-heap-view.

import GHC.HeapView

evaluated :: a -> IO Bool
evaluated = go . asBox
    where
        go box = do
            c <- getBoxedClosureData box
            case c of
                ThunkClosure     {} -> return False
                SelectorClosure  {} -> return False
                APClosure        {} -> return False
                APStackClosure   {} -> return False
                IndClosure       {indirectee = b'} -> go b'
                BlackholeClosure {indirectee = b'} -> go b'
                _ -> return True

This handling of blackhole closures may be incorrect while the blackhole is being evaluated. The handling of selector closures may be incorrect. The assumption that AP closures aren't in weak head normal form may be incorrect. The assumption that all other closures are in WHNF is almost certainly incorrect.

Example

Our example will require two concurrent threads to observe in one thread that the other is evaluating expressions.

import Data.Char
import Control.Concurrent

We can communicate information sideways out of a function without resorting to anything unsafe by selectively forcing evaluation. The following builds a stream of pairs of thunks in which we can choose to force one or the other of the pair.

mkBitStream :: Integer -> [(Integer, Integer)]
mkBitStream a = (a+2, a+3) : mkBitStream (a+1)

zero forces the first one and one forces the second one.

zero :: [(x, y)] -> [(x, y)]
zero ((x, _):t) = x `seq` t

one :: [(x, y)] -> [(x, y)]
one ((_, y):t) = y `seq` t

copy is an evil identity function that has the side effect of forcing bits in a stream based on inspecting the data.

copy :: (a -> Bool) -> [(x, y)] -> [a] -> [a]
copy f bs []     = []
copy f bs (x:xs) = let bs' = if f x then one bs else zero bs
                   in bs' `seq` (x:copy f bs' xs)

readBs reads our bit stream by checking if each of the thunks in a pair has been evaluated.

readBs :: [(x, y)] -> IO ()
readBs bs@((f, t):bs') = do
    f' <- evaluated f
    if f'
    then putStrLn "0" >> readBs bs'
    else do
        t' <- evaluated t
        if t'
        then putStrLn "1" >> readBs bs'
        else readBs bs

Forcing copy when printing it has the side effect of printing the information observed about the read string.

main = do
    let bs = mkBitStream 0
    forkIO (readBs bs)
    text <- getLine
    putStrLn (copy isAlpha bs text)
    getLine

If we run the program and provide the input abc123 we observe the side effect corresponding to checking if each of the characters isAlpha

abc123
abc123
1
1
1
0
0
0

Aribold answered 24/2, 2015 at 16:55 Comment(1)

I think you could probably improve this answer by adding a brief description of at least the most relevant closure types. – Yokoyokohama 24/2, 2015 at 21:35

A negative answer, for the record: It does not appear to be feasible to reuse the mechanism of sprint, because it is tightly tied to interpreted interactive evaluation as opposed to primitive runtime structures — as far as I can tell; I've never looked at GHC internals before.

I started by searching for “sprint” in the GHC source on GitHub, which turns out to share an implementation with the “print” command but for a Bool flag called force, and followed definitions until I got to RtClosureInspect.cvObtainTerm which appears to be a specialized evaluator.

Electrodeposit answered 24/2, 2015 at 3:52 Comment(1)

Thanks. This was helpful in writing an approximate answer. The closure inspection code used in RtClosureInspect is exposed in the GHC api. – Aribold 24/2, 2015 at 17:11

There has been a proposal recently, maybe it's somewhere implemented already https://mail.haskell.org/pipermail/libraries/2015-February/024917.html

Malagasy answered 25/2, 2015 at 16:45 Comment(0)

Example

Recommended topics

Hot tags