In Functional Programming, why the Maybe functor is not defined as data Maybe a = Nothing | a?
Asked Answered
M

4

5

I was wondering why in Haskell (and FP in general) functors such as the Maybe functor are defined as

data Maybe a = Nothing | Just a

Instead of simply

data Maybe a = Nothing | a

The second option would still be functorial, as the Maybe functor takes a type a to a type Nothing + a. What am I missing?

Minh answered 31/5, 2023 at 17:49 Comment(6)
The second option isn't valid syntax.Helpmate
What's the difference between a and a? (Hint: One is a Maybe can you tell which?)Hillery
Closely related: Why do we need sum types?Shrubby
This syntax is not valid in Haskell, but in "FP in general" there are no common rules. You are free to invent a language where such a syntax works.Travistravus
@Shrubby Why do Maybe/Optional types use a Just/Some type instead of the actual type? is linked from that question, and seems still more closely related.Sumbawa
@Sumbawa Indeed -- that's close enough to count as a duplicate, I guess, though I do like the short and sweet answers given here so far.Shrubby
J
13

This is what's called a "non-discriminated union", as opposed to a "discriminated" one - the Just thingy is the "discriminator". And some languages do have them - e.g. TypeScript.

But they come with their own difficulties. Take type inference: if I write let x = "foo", what type is x? Is it String or Maybe String?

Ultimately, though, it's just a choice that language designers make. In practice discriminated unions turn out to be more useful than non-discriminated ones, so Haskell has those.

Jiujitsu answered 31/5, 2023 at 18:6 Comment(0)
K
5

The right-hand side of a data definition must define data constructors. A bare a is just a type variable.

data Maybe a = Nothing | Just a

Nothing is nullary data constructor that has type Maybe a. Just is unary data constructor that has type a -> Maybe a. Similar to a function, it takes an argument of type a to produce a value of type Maybe a. (Unlike a function, you can use it for pattern matching.)

   > :t Nothing
   Nothing :: Maybe a
   > :t Just
   Just :: a -> Maybe a
   > :t Just 'c'
   Just 'c' :: Maybe Char
   > :t Just 3
   Just 3 :: Num a => Maybe a
Kneepad answered 31/5, 2023 at 18:5 Comment(0)
R
5

Let's start from something fundamental: you can pattern match on a Maybe a. That means that at runtime the evaluator needs to be able to look at a value of type Maybe a in memory and tell which branch to take in the following code:

case (mx :: Maybe a) of
  Nothing -> ...
  Just x -> ...

And the evaluator needs to be able to make that distinction without knowing anything about the type a. So at runtime the structure in memory that represents mx :: Maybe a has to have enough information in it to make this distinction. Nothing is easy, since it's just a constant and contains no other values, so it can just be a known fixed thing; for example's sake let's say we decide it should be a machine word with all 0 bits. Then when pattern matching is looking at mx if it finds an all-zero word we take the Nothing branch and if we find something non-zero we take the Just branch. Easy, right?

For that to work we've assumed that a Just x value will be different from the Nothing value no matter what x is. But this has to work without knowing the type a, so x could be any value of any type, and we have no way to control how those are represented in memory. If we just represent our Just x value as the memory structure for x with no additional wrapping, then any time the representation for x happens to be (or start with) an all-zero word we'll think we had Nothing when that's actually not the case.

In particular the type a could be Maybe b, which means our Just x could be Just Nothing; the value inside could be Nothing so automatically by definition there are some x values that have the same representation in memory as Nothing. But nested maybes are by no means the only case where the nested value might have a memory representation that collides with that of Nothing, so don't think it's just this one obscure case that's a problem; False is also an empty constructor and there's no reason the algorithms assigning memory representations wouldn't assign it the same bit pattern as Nothing; the same probably goes for LT :: Ordering, or the first value of any enum type. There's no reason primitive number and character types would avoid that specific bit pattern either (whether it's all-zero or we pick something else).

There are only 2 ways you can make this work, really.

One is to give up on the Maybe type being a compositional type that works like every other ADT. Instead you make it a special case thing where every type has a "nullable" version, but it doesn't nest. And you represent the null case by something globally identifiable (like a null pointer) rather than using memory layout used by all other values. This is the approach taken by a number of imperative languages that have approached this from the direction of improving "null safety", but it's not the approach Haskell takes. Maybe is a perfectly ordinary ADT that doesn't need to be baked into the compiler, it is nestable and compositional, doesn't impose any extra rules that have to be memorised by learners.

The second is to not try to represent Just x with the same memory structure as x. There needs to be a memory record for the Just data constructor itself, with at least one bit distinguishing it from Nothing, and room for it to contain x (rather than simply being x). Now we can rely on being able to distinguish Nothing from Just, and once we know we've got a Just we know where to look for the x part regardless of what type it is. This is what Haskell does.

(There is a third imaginable strategy, which is to give up on the representations of types being decided independently. The language implementation would instead have to arrange for some global registry so that no two values are ever represented by the same structure in memory, regardless of the types involved; effectively keeping track of which bit patterns are "already assigned" globally across all modules in the program. But even if you could implement it, that rule itself dictates that we can't have Just x :: Maybe a represented the same as x :: a, so it still requires a wrapper around the x).


So basically the Just constructor is going to exist in the runtime system. You have to make quite dramatic changes to how Haskell works before you can avoid that. Even if the language syntax for defining Maybe allowed data Maybe a = Nothing | a, the second case would still have a wrapper constructor there, we just wouldn't have a name for it in the surface syntax.

And given that it has to be there for the language to work the way it does, it dramatically simplifies the syntax of the language to always have an explicit name for the data constructor, even when it only contains one value. For example, in a pattern match, the compiler can tell what the programmer intended to match by looking at the constructors. Otherwise how do you tell which Nothing branch is which in this code:

-- Your proposed syntax
case (mmx :: Maybe (Maybe Int)) of
  Nothing -> ...
  Nothing -> ...
  x -> ...

-- Standard Haskell sytax
case (mmx :: Maybe (Maybe Int)) of
  Nothing -> ...
  Just Nothing -> ...
  Just (Just x) -> ...

For that matter how do I know that final x branch is supposed intended to be matching the Int that might be contained in the nested maybe? It's a bare variable, so it's a valid pattern at any of the levels? If I saw that pattern match without the explicit Maybe (Maybe Int) type annotation, how would I even know how many levels there are supposed to be? The x could just as validly be matching the Int contained in a Maybe (Maybe (Maybe (Maybe (Maybe Int)))) (with all the other levels of Nothing being left to error out as failed pattern matches).

And if a bare value without an explicit constructor is a valid option for a normal data declaration for Maybe, it would have to be for any other type. A number of packages define a type that looks something like data Result a = OK a | Error String; why wouldn't they be defined as data Result = a | String? Now if I pass a value "hello" to a function that requires a Result a, am I giving it the error case? Or am I giving it the success case when a is String? Or am I giving it the success case when a is Maybe (Identity (SomeOtherTypeWithABareStringCase))? They would all look the same.

Or even worse, why shouldn't data Either a b = Left a | Right b be written as data Either a b = a | b? Now any expression could be interpreted as any path through any number of nestings of Either. It would be impossible for anyone (or the compiler) to tell anything about the type of any code by looking at it.

Obviously nobody would design the language allowing all of my dumb examples above. Either there would be a lot of extra restrictions on when you could use this feature, or there would be extra syntactic markers added to disambiguate all of the difficult cases (or both).

But remember again that the constructors still need to really exist in the system at runtime. So this feature of not having to write Just doesn't gain the system any capabilities. It only saves us a few key-presses writing the code. And in return we either have to lose capabilities (if the compiler restricts us to avoiding situations that would create an impossible ambiguous mess), or we have to write markers all over the place (which is basically equivalent to going back to explicit constructors anyway).

The closest I can see working is for this to only be allowed for a few built in types (not syntax that can be used in user-defined types), and add some special rules for those types to disambiguate common cases. Basically a somewhat less restricted version of the "nullable types" idea, so that a Maybe a is still a "normal" enough type to allow instances of user-defined classes and such, even though Maybe a wouldn't be definable by users.

But I don't want that. Haskell's beauty and power comes from having simple and consistent building blocks that are designed to be compositional. I don't want weird special cases to save a few key-presses. An ADT is a set of constructors, each of which has a list of fields. You build or pattern match values of an ADT by writing the constructor names. That fundamental language feature already works to cover all of the functionality gained by a nullable type feature so we don't need such a feature at all. Just a perfectly ordinary ADT called Maybe.

(I do want to save a few key-presses because Haskellers are just as lazy as programmers in any other language! But I want to do so via a feature that can work generally across the whole language, not a special-case hack just for Maybe. If you can't make this "bare value with no constructor" work as a general feature of ADTs I don't want it at all.)

Rucker answered 1/6, 2023 at 5:40 Comment(0)
S
1

What am I missing?

That x | y isn’t a type.

It’s just part of the short syntax of data definitions. You can write out the full definition in GADT syntax without it.

data Maybe a = Nothing | Just a
data Maybe (a0 :: Type) :: Type where
  Nothing :: forall (a1 :: Type). Maybe a1
  Just    :: forall (a2 :: Type). a2 -> Maybe a2

The vertical bar is meant to illustrate the syntax of valid values, like in BNF, and their typing rules:

  • For any a :: Type, if n = Nothing, then n :: Maybe a

  • For any a :: Type, if j = Just x, and x :: a, then j :: Maybe a

Type systems are extremely sensitive to seemingly small changes. If you interpret data Maybe a = Nothing | a in the same way as above, it says:

  • For any a :: Type, if x :: a, then x :: Maybe a

And that would drastically change the meaning:

  • Every type t is a subtype of Maybe t
  • | denotes untagged union (upper bound / join)
  • Maybe (Maybe t) is equivalent to Maybe t (union is idempotent)
  • Maybe t means “nullable t
Stagehand answered 4/6, 2023 at 23:26 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.