This question is actually a small lattice of very closely related questions; I don't think it makes much sense to break it up as yet.
One of the fundamental ways to create a Vector
is using unsafeFreeze
. As its name suggests, unsafeFreeze
is truly unsafe. In particular, nothing prevents the MVector
passed to unsafeFreeze
from being modified after it is frozen. This leads to two different problems:
It can make "immutable" vectors change in value. This is just the sort of spooky action at a distance Haskell generally shuns.
Modifying a frozen vector can (at least potentially) confuse the garbage collector. There is no documented guarantee that the garbage collector will scan frozen arrays to ensure their contents are evacuated. More generally, mutating vectors while they are frozen is absolutely forbidden, and the result of doing so is utterly unspecified.
The vector
package[1] offers two efficient, seemingly safe, primitives for creating immutable vectors: create
and createT
:
create :: (forall s. ST s (MVector s a)) -> Vector a
createT :: Traversable t => (forall s. ST s (t (MVector s a))) -> t (Vector a)
Ignoring vector fusion business, the basic implementations look like
create m = runST $ m >>= unsafeFreeze
createT m = runST $ m >>= traverse unsafeFreeze
create
is pretty clearly safe. It runs the given ST s
action, which must create a fresh MVector s
(the type of runST
ensures it can't use an existing one, and also ensures that fixST
can't play any funny tricks), freezes it, and returns the frozen vector.
createT
is pretty clearly safe when the Traversable
instance is lawful. With lists, for example, createT m
runs an action producing a list of MVector
s, then freezes them all. Parametricity in s
then seems sufficient, as for create
, to ensure nothing bad happens. Note that the action may create a list with multiple copies of the same MVector
. These will be frozen twice, but there shouldn't be any harm in that. Lawful Traversable
instances all look pretty much like decorated lists, so they should behave similarly. Now I finally reach my first question:
Is createT
safe when used with an unlawful Traversable
instance?
Unlawfully dropping, duplicating or rearranging some elements or changing decorations (violating the identity law) doesn't pose any obvious difficulty. Parametricity prevents any interesting violations of the naturality law, so that's out. I haven't been able to find a way to cause trouble by violating the composition law or totality, but that is no guarantee that there isn't one.
One obvious way to generalize createT
is to allow the user to pass their own traversal function:
createTOf
:: (forall f x y. Applicative f => (x -> f y) -> t x -> f (u y))
-> (forall s. ST s (t (MVector s a))) -> u (Vector a)
createTOf trav m = runST $ m >>= trav unsafeFreeze
Note that I've allowed the traversal to change the container type from t
to u
. This allows the user to, for example, produce a Vector (MVector s a)
but get back a [Vector a]
. When t ~ u
, this is obviously just as safe as createT
with an unlawful Traversable
instance. Does the extra flexibility of changing the "container" type reduce safety? Edit: I just realized I can answer this one: no, it makes no difference. See [2] below for an explanation.
When we use createT
, we might not actually want a container of vectors; perhaps we want to traverse that container to get something else. We can write something like
traverse f <$> createT m
The extra type flexibility of createTOf
means we don't necessarily have a Traversable
on our hands, and can't necessarily do this. But using the composition law for Traversable
, we can integrate this traversal into the creation function:
createTOfThen
:: Applicative g
=> (forall f x y. Applicative f => (x -> f y) -> t x -> f (u y))
-> (Vector a -> g b)
-> (forall s. ST s (t (MVector s a)))
-> g (u b)
createTOfThen trav f m =
runST $ m >>= getCompose . trav (Compose . fmap f . unsafeFreeze)
Is createTOfThen
safe if trav
is not a lawful traversal?
I did say I'd be talking about a lattice, right? The next question is how much (if at all) we can weaken the polymorphism of the traversal without causing trouble. Everything will typecheck even if the traversal is only required to be polymorphic in s
, but that's obviously unsafe, as it can interleave freezing with modification however it likes.
Revealing to the that the end result holds Vector
values seems likely to be harmless enough, but we surely can't let the traversal know both that it is operating in ST s
and that it is handling MVector s a
values. But could we let it know one of those facts? Fixing the Applicative
would certainly be useful:
createTOf'
:: (forall s x y. (x -> ST s y) -> t x -> ST s (u y))
-> (forall s. ST s (t (MVector s a))) -> u (Vector a)
createTOfThen'
:: Applicative g
=> (forall s x y. (x -> Compose (ST s) g y) -> t x -> Compose (ST s) g (u y))
-> (Vector a -> g b)
-> (forall s. ST s (t (MVector s a)))
-> g (u b)
This would offer more efficient creation of things like vectors of vectors, as vectors can be traversed more efficiently in ST
than in arbitrary Applicative
functors. It would also reduce dependence on inlining, as we'd avoid dealing with the Applicative
dictionary.
On the flip side, I suspect we can let the traversal know it's handling MVector
s ... as long as we don't let it know what state thread they're associated with. This is sufficient to unbox them, and also (perhaps unfortunately) to get their sizes.
Edit! If the traversal is allowed to know that it's producing Vector
s (which seems like the very least likely thing to be problematic), then createTOfThen
can be implemented in terms of createTOf
:
createTOfThen trav post m = getConst $
createTOf (\f ->
fmap Const . getCompose . (trav (Compose . fmap post . f))) m
Taking the lattice in a third direction, let's move on to rank-2 traversals. The rank2classes
package offers its own Traversable
class, which I'll call R2.Traversable
:
class (R2.Functor g, R2.Foldable g) => R2.Traversable g where
R2.traverse :: Applicative m
=> (forall a. p a -> m (q a))
-> g p -> m (g q)
We can play exactly the same games with this to produce heterogeneous containers of Vector
s:
createTHet
:: R2.Traversable t
=> (forall s. ST s (t (MVector s)))
-> t Vector
createTHet m = runST $ m >>= R2.traverse unsafeFreeze
createTHetOf
:: (forall h f g.
(Applicative h => (forall x. f x -> h (g x)) -> t f -> h (u g)))
-> (forall s. ST s (t (MVector s)))
-> u Vector
createTHetOf trav m = runST $ m >>= trav unsafeFreeze
createTHetOfThen
:: Applicative q
=> (forall h f g.
(Applicative h => (forall x. f x -> h (g x)) -> t f -> h (u g)))
-> (forall x. Vector x -> q (r x))
-> (forall s. ST s (t (MVector s)))
-> q (u r)
createTHetOfThen trav post m =
runST $ m >>= getCompose . trav (Compose . fmap post . unsafeFreeze)
along with similar versions where the traversal is allowed to know it's working in ST s
. I would imagine that the safety properties of the rank-2 versions are identical to those of the corresponding rank-1 versions, but I don't have a clue how one might go about proving such.
Just for fun, I think the top of my lattice is the following monstrosity. If any of these ideas are unsafe, this one probably is:
createTHetOfThen'
:: (forall s1 s2.
((forall x. MVector s2 x -> Compose (ST s1) q (r x)) -> t (MVector s2) -> Compose (ST s1) q (u r)))
-> (forall x. Vector x -> q (r x))
-> (forall s. ST s (t (MVector s)))
-> q (u r)
createTHetOfThen' trav post m =
runST $ m >>= getCompose . trav (Compose . fmap post . unsafeFreeze)
[1] I've linked to Stackage because Hackage is down today. If I remember and have time, I'll fix up the links later.
[2] The proof comes from Data.Functor.Sum
. Given a non-type-changing createTOfSame
, we can write
createTOf
:: (forall f a b. Applicative f => (a -> f b) -> t a -> f (u b))
-> (forall s. ST s (t (MVector s a)))
-> u (Vector a)
createTOf trav m =
case createTOfSame (\f (InL xs) -> InR <$> trav f xs)
(InL <$> m) of
InR u -> u
This will actually be total, although the "traversal" is partial: we always case match on what we will surely find.