Functions that only work with one constructor of a type
Asked Answered
P

2

7

I'm writing a lib for message queues. Queues can be either Direct or Topic. Direct queues have a static binding key, while Topic queues can have dynamic ones.

I want to write a function publish that only works on Direct queues. This works:

{-# LANGUAGE DataKinds #-}

type Name = Text
type DirectKey = Text
type TopicKey = [Text]

data QueueType
  = Direct DirectKey
  | Topic TopicKey

data Queue (kind :: a -> QueueType)
  = Queue Name QueueType

This requires two separate constructors

directQueue :: Name -> DirectKey -> Queue 'Direct

topicQueue :: Name -> TopicKey -> Queue 'Topic

But when I go to write publish, there's an extra pattern that I need to match which should be impossible

publish :: Queue 'Direct -> IO ()
publish (Queue name (Direct key)) =
   doSomething name key
publish _ =
   error "should be impossible to get here"

Is there a better way to model this problem so that I don't need that pattern match? Direct queues should always have that Text metadata, and Topic queues should always have that [Text] metadata. Is there a better way to enforce this at both the type and value level?

Photochromy answered 12/10, 2016 at 21:0 Comment(4)
Is there a reason you need a parameterized Queue type, instead of two distinct types newtype DirectQueue = DQ Text and newtype TopicQueue = TQ [Text]?Therine
@Therine - They have a lot in common with each other. I took out the extra info to simplify the problem. I'll add it back in to the example to demonstrate.Photochromy
I added back in a Name for the queues to show that they have information in common. In my real application, there are several functions that should work with any queue, and a Queue has two non-distinct fields in addition to queue type.Photochromy
Have a read of this, it is similar to your question: #39531532Leshia
S
6

How about making Queue a plain polymorphic type

data Queue a = Queue Name a

And then defining separate Queue DirectKey and Queue TopicKey types? Then you wouldn't need to pattern-match in publish :: Queue DirectKey -> IO ().

If, apart from that, you need functions that should work in any Queue, perhaps you could define some common operations in a typeclass of which DirectKey and TopicKey would be instances, and then have signatures like

commonFunction :: MyTypeclass a => Queue a -> IO ()

Maybe you could put such functions directly in the typeclass

class MyTypeclass a where
    commonFunction :: Queue a -> IO ()
Sanguinary answered 12/10, 2016 at 21:33 Comment(2)
This is great. I was definitely overthinking it. Thanks!Photochromy
This is a good talk on how adding a type parameter to another type can be useful youtube.com/watch?v=BHjIl81HgfESanguinary
E
3

Your code doesn't compile as-is (it requires PolyKinds to be turned on as well) so I don't know if it's only an accident, but it looks like you were trying to go for the approach where you know from the type of a queue which constructors could be involved, and so can statically guarantee that a function can only be called on a certain kind of queue.

You can in fact get that approach to work, using multiple constructors of a GADT (as opposed to using multiple completely separate types, with a type class to bring them together when necessary, in teh approach suggested in @danidiaz' answer).

But first why your current code doesn't work. In your queue type:

data Queue (kind :: a -> QueueType)
  = Queue Name QueueType

you're parameterising the Queue type by a type variable (called kind), allowing you to tag a Queue at the type level by what kind of QueueType you want to be in it. But only constructor Queue Name QueueType makes no reference to kind at all; it's a phantom type. That QueueType slot can be filled by any valid queue type regardless of what kind is in the Queue kind type of the queue.

This means that GHC was correct in wanting you to add a case to publish that would match a topic key inside a Queue 'Direct; your data type definition says that such values can exist.

What GADTs allow you to do is to explicitly declare the full type of each constructor separately, including the return type. So you can set up a relation between the type of the value you're constructing and the constructors (or their parameters) that could possibly be used to make a value of that type.

In concrete terms, we can make a type for your queues such that Queue 'Direct can only contain a direct queue type, and Queue 'Topic can only contain a topic queue type, and you can handle either by polymorphically accepting a Queue a.

It's simplest to make QueueType just be used for the tag, and have a separate GADT storing the data. In your original code you were able to reuse the data-holding constructors lifted to the type level and unapplied, but that makes your kind signatures unnecessarily complicated (bringing in the need for PolyKinds), and if you need to add more (and different numbers of!) parameters to the data constructors it will become increasingly difficult to shoehorn their unapplied types to fit the same kind when lifted to type level. So:

data QueueType
  = Direct
  | Topic

data QueueData (a :: QueueType)
  where DirectData :: DirectKey -> QueueData 'Direct
        TopicData :: TopicKey -> QueueData 'Topic

So we've got QueueType just to lift with DataKinds (there's often no need to ever actually use such a type at the value level). Then we've got the type QueueData parametersised by a type-level QueueType. One constructor takes a DirectKey and constructs a QueueData 'Direct, the other takes a TopicKey and constructs a QueueData 'Topic.

Then it's simple to have a Queue type that similarly is tagged with the type of queue being represented:

data Queue (a :: QueueType)
  = Queue Name (QueueData a)

Now if a function works on any queue (say because it only needs access to the name outside of the QueueData), it can take a Queue a:

getName :: Queue a -> Text
getName (Queue name _) = name

You can also take a Queue a if you can explicitly handle all cases, and you'll get warnings when you miss a case:

getKeyText :: Queue a -> Text
getKeyText (Queue _ (DirectData key)) = key
getKeyText (Queue _ (TopicData keys)) = mconcat keys

And finally, when taking a Queue 'Direct as in your publish function, GHC knows that DirectData is the only possible constructor for the QueueData. So you don't need to add an error case as in the OP, and it would in fact be detected as a type error if you tried to handle a TopicData inside there.

Full example:

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Text (Text)

type Name = Text
type DirectKey = Text
type TopicKey = [Text]

data QueueType
  = Direct
  | Topic

data QueueData (a :: QueueType)
  where DirectData :: DirectKey -> QueueData 'Direct
        TopicData :: TopicKey -> QueueData 'Topic

data Queue (a :: QueueType)
  = Queue Name (QueueData a)


getName :: Queue a -> Text
getName (Queue name _) = name

getKeyText :: Queue a -> Text
getKeyText (Queue _ (DirectData key)) = key
getKeyText (Queue _ (TopicData keys)) = mconcat keys

publish :: Queue 'Direct -> IO ()
publish (Queue name (DirectData key))
  = doSomething name key
  where doSomething = undefined
Evieevil answered 13/10, 2016 at 1:5 Comment(2)
I think PolyKinds is rather more than necessary. KindSignatures should be sufficient.Giaour
@Giaour The OP's code as written has a kind variable a in the kind of kind :: a -> QueueType. My system (ghc 8.0.1) certainly won't accept it with KindSignatures and no PolyKinds.Evieevil

© 2022 - 2024 — McMap. All rights reserved.