First up, a simplified version of the task I want to accomplish: I have several large files (amounting to 30GB) that I want to prune for duplicate entries. To this end, I establish a database of hashes of the data, and open the files one-by-one, hashing each item, and recording it in the database and the output file iff its hash wasn't already in the database.
I know how to do this with iteratees, enumerators, and I wanted to try conduits. I also know how to do it with conduits, but now I want to use conduits & persistent. I'm having problems with the types, and possibly with the entire concept of ResourceT
.
Here's some pseudo code to illustrate the problem:
withSqlConn "foo.db" $ runSqlConn $ runResourceT $
sourceFile "in" $= parseBytes $= dbAction $= serialize $$ sinkFile "out"
The problem lies in the dbAction
function. I would like to access the database here, naturally. Since the action it does is basically just a filter, I first thought to write it like that:
dbAction = CL.mapMaybeM p
where p :: (MonadIO m, MonadBaseControl IO (SqlPersist m)) => DataType -> m (Maybe DataType)
p = lift $ putStrLn "foo" -- fine
insert $ undefined -- type error!
return undefined
The specific error I get is:
Could not deduce (m ~ b0 m0)
from the context (MonadIO m, MonadBaseControl IO (SqlPersist m))
bound by the type signature for
p :: (MonadIO m, MonadBaseControl IO (SqlPersist m)) =>
DataType -> m (Maybe DataType)
at tools/clean-wac.hs:(33,1)-(34,34)
`m' is a rigid type variable bound by
the type signature for
p :: (MonadIO m, MonadBaseControl IO (SqlPersist m)) =>
DataType -> m (Maybe (DataType))
at tools/clean-wac.hs:33:1
Expected type: m (Key b0 val0)
Actual type: b0 m0 (Key b0 val0)
Note that this might be due to wrong assumptions I made in designing the type signature. If I comment out the type signature and also remove the lift
statement, the error message turns into:
No instance for (PersistStore ResourceT (SqlPersist IO))
arising from a use of `p'
Possible fix:
add an instance declaration for
(PersistStore ResourceT (SqlPersist IO))
In the first argument of `CL.mapMaybeM', namely `p'
So this means that we can't access the PersistStore
at all via ResourceT
?
I cannot write my own Conduit either, without using CL.mapMaybeM
:
dbAction = filterP
filterP :: (MonadIO m, MonadBaseControl IO (SqlPersist m)) => Conduit DataType m DataType
filterP = loop
where loop = awaitE >>= either return go
go s = do lift $ insert $ undefined -- again, type error
loop
This resulted in yet another type error I don't fully understand.
Could not deduce (m ~ b0 m0)
from the context (MonadIO m, MonadBaseControl IO (SqlPersist m))
bound by the type signature for
filterP :: (MonadIO m,
MonadBaseControl IO (SqlPersist m)) =>
Conduit DataType m DataType
`m' is a rigid type variable bound by
the type signature for
filterP :: (MonadIO m,
MonadBaseControl IO (SqlPersist m)) =>
Conduit DataType m DataType
Expected type: Conduit DataType m DataType
Actual type: Pipe
DataType DataType DataType () (b0 m0) ()
In the expression: loop
In an equation for `filterP'
So, my question is: is it possible to use persistent like I intended to inside a conduit at all? And if, how? I am aware that since I can use liftIO
inside the conduit, I could just go and use, say HDBC
, but I wanted to use persistent explicitly in order to understand how it works, and because I like its db-backend agnosticism.
lift
instead ofliftIO
? – DocliftIO
imposes a constraint on the entiredo
block. But that only explains why the first error message differs from the second. I'll update the post in a sec, to reflect what'll happen if you remove the liftIO statement. – Kathrinekathrynlift
already imposesIO
restrictions on the monad type. I noted you have to remove thelift
statement altogether to reach that error message. If you don't (but keeplift $ print ""
in) you instead getCouldn't match expected type 'SqlPersist m0 a0' with actual type 'IO ()'
. – KathrinekathrynfilterP :: (MonadIO m, MonadBaseControl IO (SqlPersist m)) => Conduit DataType m DataType
. What you probably want isConduit DataType (SqlPersist m) DataTpe
. I think that might clear up a fair amount of the problems. – DocConduit
is run byrunResourceT
which requires its argument to be instantiated to at leastResourceT m
, notSqlPersist m
. It also imposes onm
the constraintMonadBaseControl IO m
, so that has to be in the conduit's type signature. – Kathrinekathryn