Validations in Haskell

I have a few nested records that I need to validate, and I wonder what is an idiomatic Haskell way to do it.

To simplify:

data Record = Record {
  recordItemsA :: [ItemA],
  recordItemB :: ItemB
} deriving (Show)

data ItemA = ItemA {
  itemAItemsC :: [ItemC]
} deriving (Show)

Requirements are:

  • Collect and return all validation errors
  • Some validations may be across items, e.g. ItemsA against ItemB
  • Strings are sufficient to represent errors

I currently have code that feels awkward:

type ErrorMsg = String

validate :: Record -> [ErrorMsg]
validate record =
  recordValidations ++ itemAValidations ++ itemBValidations
  where
    recordValidations :: [ErrorMsg]
    recordValidations = ensure (...) $
      "Invalid combination: " ++ (show $ recordItemsA record) ++ " and " ++ (show $ recordItemB record)
    itemAValidations :: [ErrorMsg]
    itemAValidations = concat $ map validateItemA $ recordItemsA record
    validateItemA :: ItemA -> [ErrorMsg]
    validateItemA itemA = ensure (...) $
      "Invalid itemA: " ++ (show itemA)
    itemBValidations :: [ErrorMsg]
    itemBValidations = validateItemB $ recordItemB record
    validateItemB :: ItemB -> [ErrorMsg]
    validateItemB itemB = ensure (...) $
      "Invalid itemB: " ++ (show itemB)

ensure :: Bool -> ErrorMsg -> [ErrorMsg]
ensure b msg = if b then [] else [msg]
Phial answered 4/1, 2012 at 3:16 Comment(6)
Have you considered bitbucket.org/dibblego/validation? - Commutation
Thanks for the suggestion, it looks very interesting. The same project uses uu-parsinglib for parsing, so applicative-style validation would be a good fit. - Phial
noob question here: what is the (...) notation? - Halpern
(...) just stands for the boring parts I omitted; it is not some fancy operator. - Phial
@MauricioScheffer that link is no longer valid :( EDIT: I found this, is this what you meant? hackage.haskell.org/package/Validation - Reformism
@NoICE yes, it moved to github: github.com/tonymorris/validation - Commutation
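
The applicative style mentioned in the comments above can be sketched without any library. This is a minimal, hand-rolled illustration and not the API of the validation package: the `Validation` type, the `Item` record, and every function name below are assumptions made up for the example. The point it shows is that `<*>` combines the errors from both sides instead of stopping at the first failure.

```haskell
type ErrorMsg = String

-- Hand-rolled for illustration only; NOT the validation package's API.
data Validation e a = Failure e | Success a
  deriving (Show)

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

-- Unlike Either, failures on both sides are combined with (<>).
instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)
  Failure e1 <*> Success _  = Failure e1
  Success _  <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

-- Hypothetical record, not taken from the question.
data Item = Item
  { itemName     :: String
  , itemQuantity :: Int
  } deriving (Show)

-- Assumed rule: the name must not be empty.
validateName :: String -> Validation [ErrorMsg] String
validateName name
  | null name = Failure ["Name must not be empty"]
  | otherwise = Success name

-- Assumed rule: the quantity must not be negative.
validateQuantity :: Int -> Validation [ErrorMsg] Int
validateQuantity q
  | q < 0     = Failure ["Quantity must not be negative: " ++ show q]
  | otherwise = Success q

-- Build an Item only if every field passes; otherwise collect all errors.
mkItem :: String -> Int -> Validation [ErrorMsg] Item
mkItem name q = Item <$> validateName name <*> validateQuantity q

-- Extract the collected messages (empty on success).
errorsOf :: Validation [ErrorMsg] a -> [ErrorMsg]
errorsOf (Failure es) = es
errorsOf (Success _)  = []

main :: IO ()
main = mapM_ putStrLn $ errorsOf (mkItem "" (-1))
```

With an empty name and a negative quantity, both messages are reported rather than only the first. There is deliberately no `Monad` instance here: a monadic bind would have to stop at the first failure, which defeats the purpose of collecting all of them.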

What you have already is basically fine, it just needs some clean-up:

  • The sub-validations should be top-level definitions, as they're fairly involved. (By the way, type signatures on where clause definitions are usually omitted.)
  • The naming convention is inconsistent; pick one and stick to it
  • Lots of (++)s in sequence can get ugly — use concat (or perhaps unwords) instead
  • Minor formatting quirks (there are some superfluous parentheses, concat . map f is concatMap f, etc.)

The product of all this:

validateRecord :: Record -> [ErrorMsg]
validateRecord record = concat
  [ ensure (...) . concat $
      [ "Invalid combination: ", show (recordItemsA record)
      , " and ", show (recordItemB record)
      ]
  , concatMap validateItemA $ recordItemsA record
  , validateItemB $ recordItemB record
  ]

validateItemA :: ItemA -> [ErrorMsg]
validateItemA itemA = ensure (...) $ "Invalid itemA: " ++ show itemA

validateItemB :: ItemB -> [ErrorMsg]
validateItemB itemB = ensure (...) $ "Invalid itemB: " ++ show itemB

I think that's pretty good. If you don't like the list notation, you can use the Writer [ErrorMsg] monad:

import Control.Monad (unless)
import Control.Monad.Writer (Writer, tell)

validateRecord :: Record -> Writer [ErrorMsg] ()
validateRecord record = do
  ensure (...) . concat $
    [ "Invalid combination: ", show (recordItemsA record)
    , " and ", show (recordItemB record)
    ]
  mapM_ validateItemA $ recordItemsA record
  validateItemB $ recordItemB record

validateItemA :: ItemA -> Writer [ErrorMsg] ()
validateItemA itemA = ensure (...) $ "Invalid itemA: " ++ show itemA

validateItemB :: ItemB -> Writer [ErrorMsg] ()
validateItemB itemB = ensure (...) $ "Invalid itemB: " ++ show itemB

ensure :: Bool -> ErrorMsg -> Writer [ErrorMsg] ()
ensure b msg = unless b $ tell [msg]
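
To get a plain list of messages back out of the Writer version, `execWriter` can be used. The following is a runnable sketch under stated assumptions: the `(...)` conditions are elided in the code above, so the three rules below are invented placeholders, as are the definitions of `ItemB` and `ItemC`, which the question never shows. `Control.Monad.Writer` comes from the mtl package.

```haskell
import Control.Monad (unless)
import Control.Monad.Writer (Writer, execWriter, tell)

type ErrorMsg = String

-- Hypothetical definitions; the question does not define ItemB or ItemC.
newtype ItemC = ItemC Int deriving (Show)
newtype ItemA = ItemA { itemAItemsC :: [ItemC] } deriving (Show)
newtype ItemB = ItemB { itemBLimit :: Int } deriving (Show)

data Record = Record
  { recordItemsA :: [ItemA]
  , recordItemB  :: ItemB
  } deriving (Show)

ensure :: Bool -> ErrorMsg -> Writer [ErrorMsg] ()
ensure b msg = unless b $ tell [msg]

validateRecord :: Record -> Writer [ErrorMsg] ()
validateRecord record = do
  -- Assumed cross-item rule: no more ItemA entries than ItemB's limit.
  ensure (length (recordItemsA record) <= itemBLimit (recordItemB record)) . concat $
    [ "Invalid combination: ", show (recordItemsA record)
    , " and ", show (recordItemB record)
    ]
  mapM_ validateItemA $ recordItemsA record
  validateItemB $ recordItemB record

validateItemA :: ItemA -> Writer [ErrorMsg] ()
validateItemA itemA =
  -- Assumed rule: an ItemA must contain at least one ItemC.
  ensure (not (null (itemAItemsC itemA))) $ "Invalid itemA: " ++ show itemA

validateItemB :: ItemB -> Writer [ErrorMsg] ()
validateItemB itemB =
  -- Assumed rule: the limit must not be negative.
  ensure (itemBLimit itemB >= 0) $ "Invalid itemB: " ++ show itemB

-- Run the validation and keep only the accumulated messages.
validationErrors :: Record -> [ErrorMsg]
validationErrors = execWriter . validateRecord

main :: IO ()
main = mapM_ putStrLn $
  validationErrors (Record [ItemA [], ItemA [ItemC 1]] (ItemB 1))
```

The sample record in `main` breaks two of the assumed rules (two items against a limit of one, and one empty `ItemA`), so two messages are printed.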
Izak answered 4/1, 2012 at 3:40 Comment(5)
Is this true? See #8732358 - Botulinus
@pat: Huh, right you are. I've removed the statement from my answer. - Izak
You should use Data.Sequence and replace [ErrorMsg] with (Seq ErrorMsg) as the Monoid. Then, when the Writer has finished, you can turn the Seq ErrorMsg into a [ErrorMsg] with Data.Foldable.toList. - Botulinus
A Seq would probably not be ideal due to constant factors, but a difference list would be ideal here. Still, premature optimisation and all that :) - Izak
Yes, you're right; see here for more info. The difference list package is here - Botulinus
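
The `Seq` suggestion from the comments above might look like the sketch below. It uses `Data.Sequence` from the containers package and `Control.Monad.Writer` from mtl; the `validateAll` check over a list of numbers is a made-up stand-in for the real record validation. A difference list would slot in the same way, since only the monoid passed to `tell` changes.

```haskell
import Control.Monad (unless)
import Control.Monad.Writer (Writer, execWriter, tell)
import Data.Foldable (toList)
import Data.Sequence (Seq)
import qualified Data.Sequence as Seq

type ErrorMsg = String

-- Same shape as the list-based ensure, but appending to a Seq.
ensure :: Bool -> ErrorMsg -> Writer (Seq ErrorMsg) ()
ensure b msg = unless b $ tell (Seq.singleton msg)

-- Hypothetical check, standing in for the real validation:
-- every number must be non-negative.
validateAll :: [Int] -> Writer (Seq ErrorMsg) ()
validateAll = mapM_ $ \n -> ensure (n >= 0) $ "Negative value: " ++ show n

-- Convert back to a plain list once the Writer has finished.
validationErrors :: [Int] -> [ErrorMsg]
validationErrors = toList . execWriter . validateAll

main :: IO ()
main = mapM_ putStrLn $ validationErrors [1, -2, 3, -4]
```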

Read the article "8 ways to report errors in Haskell". For your particular case, since you need to collect all errors and not only the first one, the Writer monad approach suggested by @ehird seems to fit best, but it's good to know the other common approaches.

Patio answered 4/1, 2012 at 11:2 Comment(0)

Building on @ehird's answer, you could introduce a Validate typeclass:

class Validate a where
  validate :: a -> [ErrorMsg]

instance Validate a => Validate [a] where
  validate = concatMap validate

instance Validate Record where
  validate record = concat
    [ ensure (...) . concat $
      [ "Invalid combination: ", show (recordItemsA record)
      , " and ", show (recordItemB record)
      ]
    , validate $ recordItemsA record
    , validate $ recordItemB record
    ]

instance Validate ItemA where
  validate itemA = ensure (...) $ "Invalid itemA: " ++ show itemA

instance Validate ItemB where
  validate itemB = ensure (...) $ "Invalid itemB: " ++ show itemB
Botulinus answered 4/1, 2012 at 3:47 Comment(4)
I don't think this is necessarily a good idea; plain functions keep things simpler, and if there are ever two different kinds of validation that can be applied to a single type, this falls down. The lifting to lists is clever, though. - Izak
True, but couldn't one make the same argument about any typeclass...? i.e. what if there are ever two different kinds of show that can be applied to a single type? - Botulinus
Indeed, that's why I'm conservative about using typeclasses :) Show has the constraint that it's basically just for debugging and quick hacks, its output should be syntactically-valid Haskell, and it should preferably be semantically-valid Haskell that evaluates to a value equal to the argument passed to show. Most wishes for "alternate Show instances" are trying to go against these informal constraints. It's about trade-offs; e.g. there isn't much desire to use two sets of numeric functions on the same type, and if there is, it's heavily outweighed by the convenience of Num. - Izak
Yes, I see. The real power of type classes is in being able to write generic functions that use them as contexts in their type signatures. With the exception of container instances that simply broadcast the function to sub-items (as in the list instance of Validate above), it would probably be pretty rare to find a function that would want to validate some piece of data, knowing only that it was an instance of Validate. Usually, such a function would know the specific type of the data, which renders the typeclass moot. - Botulinus

Rather than validating your data afterwards, you might consider using lenses from the excellent fclabels package as the interface to your data (instead of pattern matching and data constructors), so that your data is always correct.

Check out the variant that supports failure here and build your lens by passing a setter and getter that do some validation on the datatype to the lens function.

If you need some more complicated error reporting or whatnot, take a look at the implementation of the Maybe variant of lens and define your lens in terms of the abstract interface.
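
The underlying idea, a setter that validates, can be sketched in plain Haskell without fclabels. Nothing below is the fclabels API: `ItemB`'s single field, `mkItemB`, and `setLimit` are hypothetical names chosen for the example, and the rule being enforced is an assumption. In a real module the `ItemB` constructor would be left out of the export list so that all values pass through the checks.

```haskell
type ErrorMsg = String

-- Hypothetical definition; the question does not show ItemB's fields.
-- In a real module, export the type but not the constructor.
newtype ItemB = ItemB { itemBLimit :: Int } deriving (Show, Eq)

-- Smart constructor. Assumed rule: the limit must not be negative.
mkItemB :: Int -> Either ErrorMsg ItemB
mkItemB n
  | n < 0     = Left ("Invalid limit: " ++ show n)
  | otherwise = Right (ItemB n)

-- Validating setter: every update goes back through the smart constructor.
setLimit :: Int -> ItemB -> Either ErrorMsg ItemB
setLimit n _ = mkItemB n

main :: IO ()
main = do
  print (mkItemB 5 >>= setLimit 7)     -- valid update
  print (mkItemB 5 >>= setLimit (-1))  -- rejected update
```

Note that `Either` stops at the first failure, so this style prevents invalid values from existing but does not by itself collect every error, which was one of the question's requirements.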

Boardman answered 4/1, 2012 at 22:16 Comment(0)
