Why does length return 1 for a tuple with 2 elements, and gives an error for a tuple with more elements?
Asked Answered
P

2

29

I'm studying Haskell using the book "Haskell Programming from First Principles", and towards the end of Chapter 4, "Basic datatypes", I've come across something that confused me. The book mentions a function length and says that it works on Listss. Everything is fine with that, but when I try this length function with various Tuples, what I see confused me:

First, let's see the type of length:

:t length
length :: Foldable t => t a -> Int

OK, so I read above as "takes a Foldable, which I think as a list for convenience, and returns an Int, that is the number of elements in the list." Hence my first confusion: Why does the the following work at all:

length (1, 1)
1

Because to me, it seems like I've just passed a tuple with two elements to length, and it returned 1. Is tuple a list? Is tuple Foldable? And of course, why 1?

Now I go one step further:

length (1, 1, 1)

<interactive>:6:1:
    No instance for (Foldable ((,,) t0 t1))
      arising from a use of ‘length’
    In the expression: length (1, 1, 1)
    In an equation for ‘it’: it = length (1, 1, 1)

<interactive>:6:9:
    No instance for (Num t0) arising from the literal ‘1’
    The type variable ‘t0’ is ambiguous
    Note: there are several potential instances:
      instance Num Integer -- Defined in ‘GHC.Num’
      instance Num Double -- Defined in ‘GHC.Float’
      instance Num Float -- Defined in ‘GHC.Float’
      ...plus two others
    In the expression: 1
    In the first argument of ‘length’, namely ‘(1, 1, 1)’
    In the expression: length (1, 1, 1)

<interactive>:6:12:
    No instance for (Num t1) arising from the literal ‘1’
    The type variable ‘t1’ is ambiguous
    Note: there are several potential instances:
      instance Num Integer -- Defined in ‘GHC.Num’
      instance Num Double -- Defined in ‘GHC.Float’
      instance Num Float -- Defined in ‘GHC.Float’
      ...plus two others
    In the expression: 1
    In the first argument of ‘length’, namely ‘(1, 1, 1)’
    In the expression: length (1, 1, 1)

Another try:

length (1::Int, 1::Int, 1::Int)

<interactive>:7:1:
    No instance for (Foldable ((,,) Int Int))
      arising from a use of ‘length’
    In the expression: length (1 :: Int, 1 :: Int, 1 :: Int)
    In an equation for ‘it’: it = length (1 :: Int, 1 :: Int, 1 :: Int)

But the following works:

length (1::Int, 1::Int)
1

Is there any good explanation for the behavior I'm observing above? Am I misreading the type of length? Or is there something else going on behind the scenes?

Provenance answered 6/4, 2016 at 19:38 Comment(4)
yes that's a common complaint - it's because of the way Foldable works with only the last part of the tuple and the fact that there is a Foldable instance for 2-tuples and only for those - I recommend you just take it as is - if you want you can find plenty of discussions around this - here is an example: Haskell Foldable WatsCerenkov
Even more confusion awaits me here! I haven't seen this coming.Purely
And some empathy for my poor mind. But still, confusion persists :(Purely
Related questionScoggins
T
28

You have encountered a Haskell cause célèbre that has sparked much discussion and gnashing of teeth.

Basically, for the purposes of Foldable (the typeclass that provides length), 2-tuples are not considered a container of two elements, but a container of one element accompanied by some context.

You can extract a list of elements of type a from any Foldable a. Notice that for 2-tuples the type variable of the Foldable is that of the second element of the tuple, and it can be different from the type of the first element.

If you had a ('c',2) :: (Char,Int) tuple, it would be no mystery that you couldn't extract two Ints in that case! But when the types are equal it becomes confusing.

As for why length (1::Int, 1::Int, 1::Int) fails, 3-tuples don't have a Foldable instance defined, but perhaps they should have one, for consistency. 3-tuples would also have length 1.

By the way, the Identity functor, that could be considered a kind of 1-tuple, is also Foldable and of course has length 1 as well.

Should the Foldable instance for tuples exist at all? I think the underlying philosophy in favor of yes is one of, shall we call it, "plenitude". If a type can be made an instance of a typeclass in a well defined, lawful way, it should have that instance. Even if it doesn't seem very useful and, in some cases, may be confusing.

Twitty answered 6/4, 2016 at 19:59 Comment(14)
I'm trying to interpret what you wrote by also examining the fact that minimum (1, 2) returns 2, minimum (2,1) returns 1, and minimum ('c', 2) returns 2.Purely
@Emre Sevinç When you have a tuple of numbers, it is often the case that they are semantically different, even if the types are equal... Some integer paired with the number of times it appears somewhere, for example. If the numbers are semantically equal, they should better be in a "uniform" container like a list or a vector.Twitty
@EmreSevinç I wonder what you would expect minimum ('c', 2) to return.Sympathize
Just to make things clear, I'm not trying to use a tuple where using a list makes much more sense. All I'm trying to understand is: what's happening behind-the-scenes when functions such as length and minimum are applied to 2-tuples, 3-tuples, etc.Purely
Daniel, for minimum ('c', 2), I can't say anything right now, but for length (1, 1) my initial reaction would be: It should not work, (1, 1) is not a list, length is for lists. But after reading same things again and again, and obviously looking at the fact that length is defined for Foldable, it is obvious that I have to learn much more about Foldable to really understand the answers given so far.Purely
@EmreSevinç Foldable is basically the "convertible into list" typeclass. All its operations could be performed with the same results on the list obtained from the Foldable value using toList. They are defined directly in Foldable for efficiency reasons (for example, if a container keeps track of its own length, that is more efficient that traversing the full length of the list). Also, if a container has more than one type parameter (like 2-tuples do, and maps) for the purposes of Foldable its "element type" will always be its last parameter.Twitty
Checking this answer shared above by the author, and seeing that in Data.Foldable Foldable ((,) a) instance has a function such as toList :: (a, b) -> [b], my current understanding is that, somehow, a 2-tuple is converted to a list that has the tuple's last item, and in turn, functions such as length and minimum is applied to that list. This seems to answer "how?". The "why?" is still not clear for me (and apparently there's some controversy (?))Purely
@Emre Sevinç It's more a question of "why not?" than "why?".Twitty
@EmreSevinç do you mean “why does toList only use the second element” (I think I've explained that in the linked post), or “why does length operate on the list generated by the Foldable instance” – that is basically a matter of necessity, to keep the semantics consistent. For instance, you might want to compute the average of all numbers in some container. If you do this as sum xs / length xs, but the sum only goes over one element yet the length is two, you get a complete garbage result. (Admittedly, that would be a rather naïve thing to do, anyway.)Scoggins
@Scoggins my "why" was more like "why make a design / naming decision that leads to such a great confusion". I mean something like that. I think I'll need to read this and this to understand it better.Purely
Yeah. I think it would have been better to never give (a,) a Foldable instance,Scoggins
"If a type can be made an instance of a typeclass in a well defined, lawful way, it should have that instance. Even if it doesn't seem very useful and, in some cases, may be confusing." I completely disagree with this. What makes you think this is a good idea?Giroux
@Giroux This is highly subjective, but I would dislike the lack of uniformity in how some types are deemed worthy of instances and others not, when they could have them just as well. Someone somewhere might need them, why force an orphan instance upon him? Less subjectively, the more instances a type has the more code you can potentially get "for free" when composing types. The composition of two Foldable functors is Foldable for example. Opting out by defining your own tuple-like type is relatively easy as well, I often do it for folds with complex accumulators to make code more legible.Twitty
I have a, I think fairly important, follow up question. In what situations, if any, is it useful to have a Foldable instance for (a,)? Can someone give me a small and concrete(!) example?Torytoryism
S
14

I like danidiaz's answer because it provides the high-level intuition about how the Foldable instance for tuples works and what it intuitively means. However it seems there is still some confusion about the mechanics of it; so in this answer I will focus on the "behind-the-scenes" bits. The full text of the Foldable instance in question is available online and looks like this:

instance Foldable ((,) a) where
    foldMap f (_, y) = f y
    foldr f z (_, y) = f y z

You can already see from this instance that the first part of each tuple is completely ignored in all Foldable methods. However, to complete the picture, we need to look at the definitions for minimum and length. Since this instance does not include definitions for minimum and length, we should look at the default definitions for these. The class declaration for Foldable looks like this (with irrelevant bits elided):

class Foldable t where
    length :: t a -> Int
    length = foldl' (\c _ -> c+1) 0

    foldl' :: (b -> a -> b) -> b -> t a -> b
    foldl' f z0 xs = foldr f' id xs z0
      where f' x k z = k $! f z x

    minimum :: forall a . Ord a => t a -> a
    minimum = fromMaybe (error "minimum: empty structure") .
       getMin . foldMap (Min #. (Just :: a -> Maybe a))

So now, let's expand these definitions and see where they get us.

length (a, b)
= { definition of length }
foldl' (\c _ -> c+1) 0 (a, b)
= { definition of foldl' }
foldr (\x k z -> k $! (\c _ -> c+1) z x) id (a, b) 0
= { definition of foldr }
(\x k z -> k $! (\c _ -> c+1) z x) b id 0
= { beta reduction }
id $! (\c _ -> c+1) 0 b
= { id $! e = e }
(\c _ -> c+1) 0 b
= { beta reduction }
1

Note that the conclusion holds regardless of what we plug in for a and b. Now let's do minimum. For our purposes, we will replace (#.) with (.) -- the only difference is efficiency, which we don't care about for this particular line of reasoning.

minimum (a, b)
= { definition of minimum }
( fromMaybe (error "minimum: empty structure")
. getMin
. foldMap (Min . Just)
) (a, b)
= { definition of (.) }
( fromMaybe (error "minimum: empty structure")
. getMin
) (foldMap (Min . Just) (a, b))
= { definition of foldMap }
( fromMaybe (error "minimum: empty structure")
. getMin
) ((Min . Just) b)
= { definition of (.) }
fromMaybe (error "minimum: empty structure")
(getMin (Min (Just b)))
= { definition of getMin }
fromMaybe (error "minimum: empty structure") (Just b)
= { definition of fromMaybe }
b
Sympathize answered 6/4, 2016 at 20:38 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.