Tying the knot on mutually recursive ADTs with well-typed error handling
Asked Answered
R

1

10

(Note: this post is a literate-haskell file. You can copy-paste it into a text buffer, save it as someFile.lhs, and then run it using ghc.)

Problem description: I want ot create a graph with two different node types that reference each other. The example below is very simplified. The two data types A and B, are virtually identical here, but there's a reason for them to be different in the original program.

We'll get the boring stuff out of the way.

> {-# LANGUAGE RecursiveDo, UnicodeSyntax #-}
> 
> import qualified Data.HashMap.Lazy as M
> import Data.HashMap.Lazy (HashMap)
> import Control.Applicative ((<*>),(<$>),pure)
> import Data.Maybe (fromJust,catMaybes)

The data type definitions are themselves trivial:

> data A = A String B
> data B = B String A

In order to symbolize a difference between the two, we'll give them a different Show instance.

> instance Show A where
>   show (A a (B b _)) = a ++ ":" ++ b
> 
> instance Show B where
>   show (B b (A a _)) = b ++ "-" ++ a

And then tying the knot is of course trivial.

> knot ∷ (A,B)
> knot = let a = A "foo" b
>            b = B "bar" a
>        in (a,b)

This results in:

ghci> knot
(foo:bar,bar-foo)

That's exactly what I want!

Now the tricky part. I want to create this graph at runtime from user input. This means I need error handling. Let's simulate some (valid but nonsensical) user input:

> alist ∷ [(String,String)]
> alist = [("head","bot"),("tail","list")]
> 
> blist ∷ [(String,String)]
> blist = [("bot","tail"),("list","head")]

(the user would of course not input these lists directly; the data would first be massaged into this form)

It is trivial to do this without error handling:

> maps ∷ (HashMap String A, HashMap String B)
> maps = let aMap = M.fromList $ makeMap A bMap alist
>            bMap = M.fromList $ makeMap B aMap blist
>        in (aMap,bMap)
> 
> makeMap ∷ (String → b → a) → HashMap String b
>           → [(String,String)] → [(String,a)]
> makeMap _ _ [] = []
> makeMap c m ((a,b):xs) = (a,c a (fromJust $ M.lookup b m)):makeMap c m xs

This will obviously fail as soon as the input list of Strings references something that isn't found in the respective maps. The "culprit" is fromJust; we just assume that the keys will be there. Now, I could just ensure that the user input is actually valid, and just use the above version. But this would require two passes and wouldn't be very elegant, would it?

So I tried using the Maybe monad in a recursive do binding:

> makeMap' ∷ (String → b → a) → HashMap String b
>           → [(String,String)] → Maybe (HashMap String a)
> makeMap' c m = maybe Nothing (Just . M.fromList) . go id
>   where go l [] = Just (l [])
>         go l ((a,b):xs) = maybe Nothing (\b' → go (l . ((a, c a b'):)) xs) $
>                                 M.lookup b m
> 
> maps' ∷ Maybe (HashMap String A, HashMap String B)
> maps' = do rec aMap ← makeMap' A bMap alist
>                bMap ← makeMap' B aMap blist
>            return (aMap, bMap)

But this ends up looping indefinitely: aMap needs bMap to be defined, and bMap needs aMap. However, before I can even begin to access the keys in either map, it needs to be fully evaluated, so that we know whether it is a Just or a Nothing. The call to maybe in makeMap' is what bites me here, I think. It contains a hidden case expression, and thus a refutable pattern.

The same would be true for Either so using some ErrorT transformer won't help us here.

I don't want to fall back to run-time exceptions, as that would bounce me back to the IO monad, and that would be admitting defeat.

The minimal modification to the above working example is to just remove fromJust, and only take the results that actually work.

> maps'' ∷ (HashMap String A, HashMap String B)
> maps'' = let aMap = M.fromList . catMaybes $ makeMap'' A bMap alist
>              bMap = M.fromList . catMaybes $ makeMap'' B aMap blist
>          in (aMap, bMap)
> 
> makeMap'' ∷ (String → b → a) → HashMap String b → [(String,String)] → [Maybe (String,a)]
> makeMap'' _ _ [] = []
> makeMap'' c m ((a,b):xs) = ((,) <$> pure a <*> (c <$> pure a <*> M.lookup b m))
>                            :makeMap'' c m xs

This doesn't work either, and, curiously, results in slightly different behaviour!

ghci> maps' -- no output
^CInterrupted.
ghci> maps'' -- actually finds out it wants to build a map, then stops.
(fromList ^CInterrupted

Using the debugger showed that these aren't even infinite loops (as I would have expected) but execution just stops. With maps' I get nothing, with the second attempt, I at least get to the first lookup, and then stall.

I'm stumped. In order to create the maps, I need to validate the input, but in order to validate it, I need to create the maps! The two obvious answers are: indirection, and pre-validation. Both of these are practical, if somewhat inelegant. I would like to know whether it is possible to do the error-catching in-line.

Is what I'm asking possible with Haskell's type system? (It probably is, and I just can't find out how.) It is obviously possible by percolating exceptions to the toplevel at fromJust, then catching them in IO, but that's not how I'd like to do it.

Raddled answered 3/1, 2014 at 2:5 Comment(6)
I actually really like the idea of just doing a quick preprocessing check and have a feeling its going to end up much simpler than the error-handling approaches you are thinking about. Additionally, are you sure you want to represent your graph using a tied-knot representation? The structure can unravel if you map over it and its harder to check nodes for identity or avoid loops during traversals. Directly using the adjacency lists you made or using mutable references might be alternative solutions to consider.Implicit
I've done something similar in a ReaderT maps (Writer (Either errs assocs)) a monad. The tricky bit is that when you tell the writer about an association (or error) you have to unconditionally tell (ie, tell a thunk that later will decide if it's an error or an association). Then when you tie the knot, run the writer and at that point force the log while constructing the maps that will be fed back into the reader. The whole thing turned out to be quite fragile, so while it was an interesting exercise, it's worth rewriting as two separate passes, IMO.Indifference
@Indifference - isn't what you describe effectively a two-pass algorithm? One pass in the Writer and one pass making the maps for the ReaderT`.Ewe
What do you actually want to do for invalid input - just produce Nothing or Left error or similar for the whole calculation?Ewe
@GaneshSittampalam ultimately, I would like to produce a Left error, but since both Maybe and Either fall under the same abstraction classes, I went with the simpler one in the reduced minimal example here.Raddled
@Lambdageek, I think I know what you mean. Here's a similar idea. I also think that it could be accomplished in an imperative-style fashion using ST. I do fear that you're correct, and the entire thing might turn out to be too fragile. I'll have to see if it works well within the context of the program I'm writing (an interpreter, and this is the type graph, where A are like records, and B are the actual types.)Raddled
I
6

The problem is that when you tie the knot you don't "build" the structures of A and B but rather just declare how they are supposed to be built and then they get evaluated when needed. This naturally means that if the validation is done "in-line" with evaluation then the error checking must happen in IO because that's the only thing that can trigger evaluation (in your case, it's when you print the output of show).

Now, if you want to detect the error earlier you must declare the structure so that we can validate each node without traversing the whole infinite, cyclic structure. This solution is semantically the same as pre-validating the input, but hopefully you'll find it syntactically more elegant

import Data.Traversable (sequenceA)

maps' :: Maybe (HashMap String A, HashMap String B)
maps' =
  let maMap = M.fromList $ map (makePair A mbMap) alist
      mbMap = M.fromList $ map (makePair B maMap) blist
      makePair c l (k,v) = (k, c k . fromJust <$> M.lookup v l)
  in (,) <$> sequenceA maMap <*> sequenceA mbMap

This first defines the mutually recursive maps maMap and mbMap which have the types HashMap String (Maybe A) and HashMap String (Maybe B) respectively, meaning that they'll contain all the node keys but the keys are associated with Nothing if the node was invalid. The "cheating" happens in

c k . fromJust <$> M.lookup v l

This will essentially just look up the referenced node with M.lookup and if that succeeds we just assume that the returned node is valid and use fromJust. This prevents the infinite loop that would otherwise occur if we tried to validate the Maybe layers all the way down. If the lookup fails then this node is invalid i.e. Nothing.

Next we turn the HashMap String (Maybe a) maps "inside out" into Maybe (HashMap String a) maps using sequenceA from Data.Traversable. The resulting value is Just _ only if every node inside the map was Just _ and Nothing otherwise. This guarantees that the fromJust we used above cannot fail.

Ias answered 3/1, 2014 at 12:6 Comment(1)
Thanks, I believe this might be the best method. Pre-validating the Strings themselves via Sets would be more cumbersome. I had thought of the possibility of generating a HashMap String (Maybe A), but shied away because it meant using two passes. However, I believe the Reader/Writer method might be too complicated.Raddled

© 2022 - 2024 — McMap. All rights reserved.