Testing functions in Haskell that do IO
Asked Answered
W

5

34

Working through Real World Haskell right now. Here's a solution to a very early exercise in the book:

-- | 4) Counts the number of characters in a file
numCharactersInFile :: FilePath -> IO Int
numCharactersInFile fileName = do
    contents <- readFile fileName
    return (length contents)

My question is: How would you test this function? Is there a way to make a "mock" input instead of actually needing to interact with the file system to test it out? Haskell places such an emphasis on pure functions that I have to imagine that this is easy to do.

Winniewinnifred answered 10/9, 2011 at 6:37 Comment(8)
What then will remain to test after "mocking" filesystem? length function?Parapsychology
Haskell may place an emphasis on purity, but the IO monad is NOT pure.Immobility
If you did something more interesting than length on the contents of the file, you could easily test that function, which would be String -> a for some a.Flick
@Dietrich Yes it is. Its unsafePerformIO that is impure ;)Shulman
@monadic: IO is impure even without unsafePerformIO or any other unsafe* function.Immobility
@Dietrich No. >>= and >> and return are pure.Shulman
See also Testing IO actions with Monadic QuickCheckWithhold
@Derek You may be interested in trying out hs-di as it makes it possible to mock dependencies of functions that do IO. It's a dependency injection library for Haskell. A benefit I see of using a DI library is that it makes code testable with very little and much simpler modifications -- compared to some other approaches I see listed in the answers, e.g. introducing type classes or identifying and manually trying to factor out pure parts of a computation.Felicio
B
19

As Alexander Poluektov already pointed out, the code you are trying to test can easily be separated into a pure and an impure part. Nevertheless I think it is good to know how to test such impure functions in haskell.
The usual approach to testing in haskell is to use quickcheck and that's what I also tend to use for impure code.

Here is an example of how you might achieve what you are trying to do which gives you kind of a mock behavior * :

import Test.QuickCheck
import Test.QuickCheck.Monadic(monadicIO,run,assert)
import System.Directory(removeFile,getTemporaryDirectory)
import System.IO
import Control.Exception(finally,bracket)

numCharactersInFile :: FilePath -> IO Int
numCharactersInFile fileName = do
    contents <- readFile fileName
    return (length contents)

Now provide an alternative function (Testing against a model):

numAlternative ::  FilePath -> IO Integer
numAlternative p = bracket (openFile p ReadMode) hClose hFileSize

Provide an Arbitrary instance for the test environment:

data TestFile = TestFile String deriving (Eq,Ord,Show)
instance Arbitrary TestFile where
  arbitrary = do
    n <- choose (0,2000)
    testString <- vectorOf n $ elements ['a'..'z'] 
    return $ TestFile testString

Property testing against the model (using quickcheck for monadic code):

prop_charsInFile (TestFile string) = 
  length string > 0 ==> monadicIO $ do
    (res,alternative) <- run $ createTmpFile string $
      \p h -> do
          alternative <- numAlternative p
          testRes <- numCharactersInFile p
          return (testRes,alternative)
    assert $ res == fromInteger alternative

And a little helper function:

createTmpFile :: String -> (FilePath -> Handle -> IO a) -> IO a
createTmpFile content func = do
      tempdir <- catch getTemporaryDirectory (\_ -> return ".")
      (tempfile, temph) <- openTempFile tempdir ""
      hPutStr temph content
      hFlush temph
      hClose temph
      finally (func tempfile temph) 
              (removeFile tempfile)

This will let quickCheck create some random files for you and test your implementation against a model function.

$ quickCheck prop_charsInFile 
+++ OK, passed 100 tests.

Of course you could also test some other properties depending on your usecase.


* Note about the my usage of the term mock behavior:
The term mock in the object oriented sense is perhaps not the best here. But what is the intention behind a mock?
It let's you test code that needs access to a resource that usually is

  • either not available at testing time
  • or is not easily controllable and thus not easy to verify.

By shifting the responsibility of providing such a resource to quickcheck, it suddenly becomes feasible to provide an environment for the code under test that can be verified after a test run.
Martin Fowler describes this nicely in an article about mocks :
"Mocks are ... objects pre-programmed with expectations which form a specification of the calls they are expected to receive."
For the quickcheck setup I'd say that files generated as input are "pre-programmed" such that we know about their size (== expectation). And then they are verified against our specification (== property).

Borborygmus answered 10/9, 2011 at 8:16 Comment(2)
This doesn't act, speak, or talk like a mock. A mock should not be touching the file system. I do think the OP might need to introduce a typeclass to abstract the use of IO in order to use mock object, just as any other language does. I.e in C++ virtual functions are required, in Java, interfaces are required to use mocking frameworks.Malta
@user239558: good remark. I added a clarification to my usage of the term mock.Borborygmus
C
46

You can make your code testable by using a type-class-constrained type variable instead of IO.

First, let's get the imports out of the way.

{-# LANGUAGE FlexibleInstances #-}
import qualified Prelude
import Prelude hiding(readFile)
import Control.Monad.State

The code we want to test:

class Monad m => FSMonad m where
    readFile :: FilePath -> m String

-- | 4) Counts the number of characters in a file
numCharactersInFile :: FSMonad m => FilePath -> m Int
numCharactersInFile fileName = do
    contents <- readFile fileName
    return (length contents)

Later, we can run it:

instance FSMonad IO where
    readFile = Prelude.readFile

And test it too:

data MockFS = SingleFile FilePath String

instance FSMonad (State MockFS) where 
               -- ^ Reader would be enough in this particular case though
    readFile pathRequested = do
        (SingleFile pathExisting contents) <- get
        if pathExisting == pathRequested
            then return contents
            else fail "file not found"


testNumCharactersInFile :: Bool
testNumCharactersInFile =
    evalState
        (numCharactersInFile "test.txt") 
        (SingleFile "test.txt" "hello world")
      == 11

This way your code under test needs very little modification.

Charyl answered 10/9, 2011 at 21:16 Comment(7)
This is the only correct approach IMO. It is unfortunate that these isn't a set of standard mockable typeclasses for this, along with standard mockability.Malta
Agreed. Best attempt at addressing a "real" mock.Matteroffact
This solution uses TypeSynonymInstances. Is this frowned upon?Matteroffact
@Dave, that extension is only used in the testing code for brewity and you can easily avoid that by making a newtype.Charyl
I know I'm asking a noob question (and I'm a noob haskeller), but I've tried not to use the language extension and I wasn't able to get it working, could you share the code again please?Haematoid
@Milad, I added the import section so the code is now self-contained. If you want to get rid of FlexibleInstances, you'd have to wrap State MockFS into a newtype.Charyl
Hi, thanks for the response. I've removed FlexibleInstances, and I have added the newtype, but it doesn't compile -- see gist.github.com/miladhub/e039b38d1dcb24f4234da79d8780fe58 The error is: ` testing.hs:20:18: error: • Expected kind ‘* -> ’, but ‘StateMockFS’ has kind ‘’ • In the first argument of ‘FSMonad’, namely ‘StateMockFS’ In the instance declaration for ‘FSMonad StateMockFS’ | 20 | instance FSMonad StateMockFS where | ^^^^^^^^^^^ `Haematoid
B
19

As Alexander Poluektov already pointed out, the code you are trying to test can easily be separated into a pure and an impure part. Nevertheless I think it is good to know how to test such impure functions in haskell.
The usual approach to testing in haskell is to use quickcheck and that's what I also tend to use for impure code.

Here is an example of how you might achieve what you are trying to do which gives you kind of a mock behavior * :

import Test.QuickCheck
import Test.QuickCheck.Monadic(monadicIO,run,assert)
import System.Directory(removeFile,getTemporaryDirectory)
import System.IO
import Control.Exception(finally,bracket)

numCharactersInFile :: FilePath -> IO Int
numCharactersInFile fileName = do
    contents <- readFile fileName
    return (length contents)

Now provide an alternative function (Testing against a model):

numAlternative ::  FilePath -> IO Integer
numAlternative p = bracket (openFile p ReadMode) hClose hFileSize

Provide an Arbitrary instance for the test environment:

data TestFile = TestFile String deriving (Eq,Ord,Show)
instance Arbitrary TestFile where
  arbitrary = do
    n <- choose (0,2000)
    testString <- vectorOf n $ elements ['a'..'z'] 
    return $ TestFile testString

Property testing against the model (using quickcheck for monadic code):

prop_charsInFile (TestFile string) = 
  length string > 0 ==> monadicIO $ do
    (res,alternative) <- run $ createTmpFile string $
      \p h -> do
          alternative <- numAlternative p
          testRes <- numCharactersInFile p
          return (testRes,alternative)
    assert $ res == fromInteger alternative

And a little helper function:

createTmpFile :: String -> (FilePath -> Handle -> IO a) -> IO a
createTmpFile content func = do
      tempdir <- catch getTemporaryDirectory (\_ -> return ".")
      (tempfile, temph) <- openTempFile tempdir ""
      hPutStr temph content
      hFlush temph
      hClose temph
      finally (func tempfile temph) 
              (removeFile tempfile)

This will let quickCheck create some random files for you and test your implementation against a model function.

$ quickCheck prop_charsInFile 
+++ OK, passed 100 tests.

Of course you could also test some other properties depending on your usecase.


* Note about the my usage of the term mock behavior:
The term mock in the object oriented sense is perhaps not the best here. But what is the intention behind a mock?
It let's you test code that needs access to a resource that usually is

  • either not available at testing time
  • or is not easily controllable and thus not easy to verify.

By shifting the responsibility of providing such a resource to quickcheck, it suddenly becomes feasible to provide an environment for the code under test that can be verified after a test run.
Martin Fowler describes this nicely in an article about mocks :
"Mocks are ... objects pre-programmed with expectations which form a specification of the calls they are expected to receive."
For the quickcheck setup I'd say that files generated as input are "pre-programmed" such that we know about their size (== expectation). And then they are verified against our specification (== property).

Borborygmus answered 10/9, 2011 at 8:16 Comment(2)
This doesn't act, speak, or talk like a mock. A mock should not be touching the file system. I do think the OP might need to introduce a typeclass to abstract the use of IO in order to use mock object, just as any other language does. I.e in C++ virtual functions are required, in Java, interfaces are required to use mocking frameworks.Malta
@user239558: good remark. I added a clarification to my usage of the term mock.Borborygmus
S
9

For that you will need to modify the function such that it becomes:

numCharactersInFile :: (FilePath -> IO String) -> FilePath -> IO Int
numCharactersInFile reader fileName = do
                         contents <- reader fileName
                         return (length contents)

Now you can pass any mock function that takes a file path and return IO string such as:

fakeFile :: FilePath -> IO String
fakeFile fileName = return "Fake content"

and pass this function to numCharactersInFile.

Sire answered 10/9, 2011 at 7:46 Comment(1)
Replace IO with Monad m => m and it becomes Haskell-ish enough.Charyl
P
8

The function consists from two parts: impure (reading part content as String) and pure (calculating the length of String).

The impure part cannot be "unit"-tested by definition. The pure part is just call to the library function (and of course you can test it if you want :) ).

So there is nothing to mock and nothing to unit-test in this example.

Put it another way. Consider you have an equal C++ or Java implementation (*): reading content and then calculating its length. What would you really want to mock and what would remain for testing afterwards?


(*) which is of course not the way you will do in C++ or Java, but that's offtopic.

Parapsychology answered 10/9, 2011 at 6:56 Comment(6)
I have downvoted because this is not an answer. This answer can be summarized to "you can't", which is not really true and thus not useful. Also, I fail to see what definition of unit-testing disallows testing of impure functions (they do it all the time in Java :D).Charyl
I agree and disagree with this post. The key insight is right -- test the pure part and impure parts separately. The "mock" of the results of a readFile is just a string -- precisely what you pass to length to begin with! My disagreement is just that one can of course unit-test readFile -- its just that when doing so, one of course should not mock it.Pebrook
@Pebrook All in all, it depends on what definition of the "unit-testing" you use. I use one that prohibits touching functions that do IO things.Parapsychology
FWIW, the question as it stands does not contain the word 'unit'.Clyve
@steamer25: But the text that follows implied it: why would one want to mock something if s/he does (non-unit) testing?Parapsychology
@Alexander Poluekov: Maybe automated behavioral/functional testing? I.e., "Given a file that doesn't exist, the system should..."Clyve
M
6

Based on my layman's understanding of Haskell, I've come to the following conclusions:

  1. If a function makes use of the IO monad, mock testing is going to be impossible. Avoid hard-coding the IO monad in your function.

  2. Make a helper version of your function that takes in other functions that may do IO. The result will look like this:

numCharactersInFile' :: Monad m => (FilePath -> m String) -> FilePath -> m Int
numCharactersInFile' f filePath = do
    contents <- f filePath
    return (length contents)

numCharactersInFile' is now testable with mocks!

mockFileSystem :: FilePath -> Identity String
mockFileSystem "fileName" = return "mock file contents"

Now you can verify that numCharactersInFile' returns the the expected results w/o IO:

18 == (runIdentity .  numCharactersInFile' mockFileSystem $ "fileName")

Finally, export a version of your original function signature for use with IO

numCharactersInFile :: IO Int
numCharactersInFile = NumCharactersInFile' readFile

So, at the end of the day, numCharactersInFile' is testable with mocks. numCharactersInFile is just a variation of numCharactersInFile'.

Matteroffact answered 4/2, 2013 at 19:42 Comment(1)
FYI, this is a combination of Ankur's solution combined incorporating Rotsor's comment.Matteroffact

© 2022 - 2024 — McMap. All rights reserved.