Truly unordered foldable bag
Asked Answered
S

1

6

I want a Bag container which hides its 'real' order from its clients.

It also must be fully polymorphic, that is shouldn't require any constraints over its element type.

I found at least three implementations of bags: Bag module from ghc package, Data.Bag from bag and Math.Combinatorics.Multiset from multiset-comb.

However, they all have toList and fold* operations which expose the internal order of elements which may depend on implementation details or the order of bag construction.

toList is impossible, at least with type Bag a -> [a]. However, folding does not always expose the order.

For example, fold (+) 0 does not expose.

The question is, how should I design the folding interface? Is there a necessary and sufficient condition for safety of the a -> a -> a folding function? As fmap does not expose the order, is there a loss of genericity from folding with a -> b -> b?

I'm thinking of commutative monoids - they seem sufficient, but I'm not sure if associativity and identity element are necessary.

Savino answered 5/9, 2012 at 12:32 Comment(4)
You can implement toList in terms of foldl or foldr with type signature a -> b -> b (i.e. a -> [a] -> [a],) which definitely does expose the inherent order. However, I'm not quite sure why you would really not want to do that. Even with a fold like a -> a -> a, you're not safe from somebody exposing the inner order with unsafePerformIO.Wizard
I was thinking that perhaps reduce :: Monoid m => (a -> m) -> Bag a -> m would do the trick, with the idea is that the only thing that the caller of reduce can see is individual Bag elements in isolation. The problem is that the caller can also implement the Monoid instance, and thus is able to observe some concrete order of the elements. I'm afraid it's impossible to have an operation that enumerates elements and yet forbid clients from coupling themselves to the contingent order in which it produces the elements. The only solution may be to produce elements in a well-defined order.Porous
Use answers for such extended comments. CommutativeMonoid m => (a -> m) -> Bag a -> m is a good idea as its type passes enough information to the caller to avoid ill-behaved functions and design a library of well-defined functions. I don't want to restrain callers from hurting themselves if they want to, I just want to avoid unintended bugs. For example, if I use only stock commutative monoids to reduce I get a pretty strong guarantee that my code doesn't depend on the order. I asked the question because I encountered a list monad which is not quite a monad as it violates laws by permuting.Savino
A well-defined order is impossible to achieve without adding constraints.Savino
S
6

An identity is probably necessary if your bags can be empty - you have to return something in that case, and if you want your fold to be a homomorphism (so that combining the results of folding some bags is the same as folding a bag made by combining the bags, which is a pretty natural property to expect), it must be an identity element.

Associativity, likewise, is a good idea. Suppose I have a type and operation like so:

data T a = Zero | One a | Two (T a) (T a)
  deriving (Eq, Ord, Show)

(+-+) :: Ord a => T a -> T a -> T a
Zero +-+ x = x
x +-+ Zero = x
a +-+ b = Two (min a b) (max a b)

Clearly (+-+) is commutative and has an identity, but is non-associative. Suppose I then implement a bag as a list:

newtype Bag a = Bag [a]

-- pre-condition: f is commutative and z is an identity for it
foldBag :: (a -> a -> a) -> a -> Bag a -> a
foldBag f z (Bag xs) = foldr f z xs

foldNonAssoc :: (Ord a) => Bag (T a) -> T a
foldNonAssoc = foldBag (+-+) Zero

Even if I demand the stated precondition, I can still use my foldNonAssoc to distinguish between Bag [One 1,One 2,One 3], which will fold to Two (One 1) (Two (One 2) (One 3)) and Bag [One 3,One 2,One 1], which will fold to Two (One 3) (Two (One 1) (One 2)) (notice that not all of the structure is preserved, but on a long list I'll get the entire list order back except for the ordering of the last two elements).

A priori, if you combine all your items with an operation, you'll have a tree of applications, something like a +-+ (b +-+ (c +-+ d)). Commutativity will let you do some rearrangement, but no matter what you do, c will always be combined with d. So if you want that to be the same as (a +-+ c) +-+ (b +-+ d), you really need associativity, too.

Straticulate answered 5/9, 2012 at 13:17 Comment(2)
It is however worth mentioning that you have to trust your users to establish those preconditions - there isn't really any way of enforcing them statically.Straticulate
I understand that it's not possible without putting enormous proof burden on the users. I only need some way to tell the expected invariants to the user. Using class Monoid m => CommutativeMonoid m with no 'methods' seems appropriate - users can declare their monoids commutative if they are sure they are.Savino

© 2022 - 2024 — McMap. All rights reserved.