NFData instance for the Coyoneda type

E

2

7

I have a Haskell project that uses fmap on data structures very intensively. In order to avoid traversing the same data structure again and again, while retaining the possibility to use fmap liberally, I decided to use the Coyoneda-type to guard some of the bigger structures.

The Coyoneda type has constructor Coyoneda :: (x -> y) -> f x -> Coyoneda f y. The idea is that by parametricity, more precisely by the co-Yoneda lemma, the types f a and Coyoneda f a are isomorphic, but the advantage of the Coyoneda type is that it postpones the actual structure traversal.

For example, in the following code, the first expression traverses the underlying structure 3 times, and the second one only once:

fmap k $ fmap h $ fmap g $ (x :: f a)
lowerCoyoneda $ fmap k $ fmap h $ fmap g $ liftCoyoneda $ (x :: f a)

Indeed, the second line reduces as follows:

lowerCoyoneda $ fmap k $ fmap h $ fmap g $ liftCoyoneda $ x
lowerCoyoneda $ fmap k $ fmap h $ fmap g $ Coyoneda id x
lowerCoyoneda $ fmap k $ fmap h $ Coyoneda g x
lowerCoyoneda $ fmap k $ Coyoneda (h . g) x
lowerCoyoneda $ Coyoneda (k . h . g) x
fmap (k . h . g) x

so it effectively postpones the actual structure traversal until you apply lowerCoyoneda.

This had a massive impact on time and a big impact on space performance, but I am still not satisfied. For this reason, I want to start using deepseq (or similar) as (indirectly) provided by the NFData typeclass. So I want to implement NFData for my functors, which now have Coyoneda-guards in their definitions.

The trouble is that reducing lambdas to normal form doesn't shrink the size of the data in their closure.

Mathematically, it would be sound to simply reduce Coyoneda g x to Coyoneda id (fmap g x). I am wondering if there is some unsafe hack - some sort of runtime rewrite rule - to implement this. Other solutions are also appreciated.

Emalee answered 13/8, 2019 at 17:53 Comment(4)

(Nevermind, I see why it would be a problem now - the primitive is rnf, not deepseq.) I'm not certain about whether it would actually pay off for performance anyway. Using rnf/deepseq is usually very bad for performance. It tends to change O(n) algorithms to O(n^2). – Skyjack 13/8, 2019 at 18:48

"Mathematically, it would be sound to simply reduce Coyoneda g x to Coyoneda id (fmap g x)" - isn't that just liftCoyoneda . lowerCoyoneda? – Venuti 13/8, 2019 at 18:58

Benjamin Hodgson: Yes, but the way I understand, rnf is not a function with an input and an output, but calling it reduces its argument "wherever it is in memory" so to say, i.e. in such a way that anything sharing that argument will also work on the reduced version in the future. I would like to apply liftCoyoneda . lowerCoyoneda as part of the reduction performed by rnf, so not as an ordinary Haskell function. If there is any Haskell feature that allows me to do this, it is going to be an exceptionally unsafe feature. – Emalee 13/8, 2019 at 22:21

These "slides" seem relevant. – Emalee 27/8, 2019 at 9:10

C

1

Yes, you can do something like that, and yes, it's kind of ugly.

module Data.Functor.Coyoneda.NF where

import qualified Data.Functor.Coyoneda as S
import Data.IORef
import Control.DeepSeq
import System.IO.Unsafe
import Control.Exception

newtype Coyoneda f a =
  UnsafeCoyonedaFromRef {unsafeCoyonedaToRef :: IORef (S.Coyoneda f a)}

newCoyonedaIO :: S.Coyoneda f a -> IO (Coyoneda f a)
newCoyonedaIO c = UnsafeCoyonedaFromRef <$> newIORef c

newCoyoneda :: S.Coyoneda f a -> Coyoneda f a
newCoyoneda = unsafePerformIO . newCoyonedaIO

unsafeReadCoyonedaIO :: Coyoneda f a -> IO (S.Coyoneda f a)
unsafeReadCoyonedaIO (UnsafeCoyonedaFromRef ref) = readIORef ref

unsafeReadCoyoneda :: Coyoneda f a -> S.Coyoneda f a
unsafeReadCoyoneda = unsafePerformIO . unsafeReadCoyonedaIO

{-| `fmap` happens under the reference, but does NOT traverse `f`. -}
instance Functor (Coyoneda f) where
  fmap f c = unsafePerformIO $ do
    q <- unsafeReadCoyonedaIO c
    newCoyonedaIO $ fmap f q

instance (Functor f, NFData (f a)) => NFData (Coyoneda f a) where
  rnf (UnsafeCoyonedaFromRef ref) = unsafePerformIO $ do
    co <- readIORef ref
    let fx = S.lowerCoyoneda co
    -- We use evaluate because we want to be really sure the reduction to NF
    -- succeeds and we don't install bottom in the IORef.
    evaluate (rnf fx)
    writeIORef ref (S.liftCoyoneda fx)

liftCoyoneda :: f a -> Coyoneda f a
liftCoyoneda = newCoyoneda . S.liftCoyoneda

lowerCoyoneda :: Functor f => Coyoneda f a -> f a
lowerCoyoneda = S.lowerCoyoneda . unsafeReadCoyoneda

What makes unsafeReadCoyoneda "unsafe"? It subtly breaks referential transparency. If someone can extract the wrapped f a, then they may be able to do something like traverse the value, forcing the supposedly hidden elements. Or if f has a somewhat bogus fmap that changes its structure then it's possible to observe a change from supposedly pure code (ouch).

Currin answered 14/8, 2019 at 12:31 Comment(5)

For other readers: various precautions are apparently necessary when using unsafePerformIO. – Emalee 14/8, 2019 at 12:48

@dremodaris, while I swam some laps, I realized I made a couple mistakes, one of which is important. I hope everything is right now. This particular application is sufficiently pure in spirit that most of the nastiest unsafePerformIO caveats shouldn't apply. – Currin 14/8, 2019 at 13:17

@dremodaris, I believe it's safe in a world without fully-polymorphic seq in which every Functor instance is lawful. In real Haskell, it's a little unsafe. – Currin 14/8, 2019 at 13:34

@dremodaris, actually, I think there's no way to be totally safe here..... Suppose fmap f (xs :: [a]) = drop 1 (map f xs). Now fmap f will drop 1 element but fmap f . force will drop 2. – Currin 14/8, 2019 at 14:23

Yes, I've defined a copy of unsafeReadCoyoneda called readCoyoneda which takes a Functor instance and am pretending that all such instances are well-behaved. I'm building further on that. Among other things, I've defined a bidirectional Coyoneda pattern as you have suggested in an earlier version of your answer. – Emalee 14/8, 2019 at 14:51

U

1

The trouble is that reducing lambdas to normal form doesn't reduce the thunk in their body

Functions do not contain "thunks". They contain a pointer to immutable code, possibly along with a closure of values of captured variables. Thunks might evaluate to functions, but functions themselves are always considered to be fully evaluated.

Also, since functions contain pointers to immutable machine code, function bodies cannot be updated or modified at runtime in any way. rnf is a bit of a misnomer, since GHC can't evaluate under function binders, thus can't reduce to normal forms in the usual sense in the theory of lambda calculi.

rnf does exactly the same as seq on functions, and seq itself only does anything noteworthy if you have a thunk which reduces to a function. This is not common in Haskell code; one such example would be a lazy lookup from a structure which contains functions.

rnf should be used sparingly, since it fully traverses data structures. The reason people usually use rnf is to get rid of space leaks, not to (directly) make things faster.

In summary, you could try using seq on the function part of Coyoneda, and probably should not use rnf, but I doubt that seq would help in a measurable way.

Ungenerous answered 13/8, 2019 at 21:29 Comment(4)

Thanks for rectifying the terminology. After applying the Coyoneda trick, the remaining space leak is more detrimental than the remaining time leak, and I hope that rnfing a data structure right before storing it in the state might resolve things. The problem with applying seq instead is - heh - that it doesn't go deep enough, so I would need to apply seq all over, but I suspect that this would require me to duplicate a lot of code where I might need a strict and a lazy version. – Emalee 13/8, 2019 at 22:26

I also think you're right about seq not helping much when applying it to a function: it doesn't make the closure go away. – Emalee 13/8, 2019 at 22:27

@Emalee if you have space leaks in a data structure, you can try making the data structure strict, besides using rnf. – Cunctation 14/8, 2019 at 10:18

The data structures go through a computation-heavy phase, and are then stored somewhere. During computation, I think laziness and Coyoneda are quite useful, but I would like to shrink the data before storing it. – Emalee 14/8, 2019 at 12:16

C

1