Let's start from something fundamental: you can pattern match on a Maybe a
. That means that at runtime the evaluator needs to be able to look at a value of type Maybe a
in memory and tell which branch to take in the following code:
case (mx :: Maybe a) of
Nothing -> ...
Just x -> ...
And the evaluator needs to be able to make that distinction without knowing anything about the type a
. So at runtime the structure in memory that represents mx :: Maybe a
has to have enough information in it to make this distinction. Nothing
is easy, since it's just a constant and contains no other values, so it can just be a known fixed thing; for example's sake let's say we decide it should be a machine word with all 0 bits. Then when pattern matching is looking at mx
if it finds an all-zero word we take the Nothing
branch and if we find something non-zero we take the Just
branch. Easy, right?
For that to work we've assumed that a Just x
value will be different from the Nothing
value no matter what x
is. But this has to work without knowing the type a
, so x
could be any value of any type, and we have no way to control how those are represented in memory. If we just represent our Just x
value as the memory structure for x
with no additional wrapping, then any time the representation for x
happens to be (or start with) an all-zero word we'll think we had Nothing
when that's actually not the case.
In particular the type a
could be Maybe b
, which means our Just x
could be Just Nothing
; the value inside could be Nothing
so automatically by definition there are some x
values that have the same representation in memory as Nothing
. But nested maybes are by no means the only case where the nested value might have a memory representation that collides with that of Nothing
, so don't think it's just this one obscure case that's a problem; False
is also an empty constructor and there's no reason the algorithms assigning memory representations wouldn't assign it the same bit pattern as Nothing
; the same probably goes for LT :: Ordering
, or the first value of any enum type. There's no reason primitive number and character types would avoid that specific bit pattern either (whether it's all-zero or we pick something else).
There are only 2 ways you can make this work, really.
One is to give up on the Maybe
type being a compositional type that works like every other ADT. Instead you make it a special case thing where every type has a "nullable" version, but it doesn't nest. And you represent the null case by something globally identifiable (like a null pointer) rather than using memory layout used by all other values. This is the approach taken by a number of imperative languages that have approached this from the direction of improving "null safety", but it's not the approach Haskell takes. Maybe
is a perfectly ordinary ADT that doesn't need to be baked into the compiler, it is nestable and compositional, doesn't impose any extra rules that have to be memorised by learners.
The second is to not try to represent Just x
with the same memory structure as x
. There needs to be a memory record for the Just
data constructor itself, with at least one bit distinguishing it from Nothing
, and room for it to contain x
(rather than simply being x
). Now we can rely on being able to distinguish Nothing
from Just
, and once we know we've got a Just
we know where to look for the x
part regardless of what type it is. This is what Haskell does.
(There is a third imaginable strategy, which is to give up on the representations of types being decided independently. The language implementation would instead have to arrange for some global registry so that no two values are ever represented by the same structure in memory, regardless of the types involved; effectively keeping track of which bit patterns are "already assigned" globally across all modules in the program. But even if you could implement it, that rule itself dictates that we can't have Just x :: Maybe a
represented the same as x :: a
, so it still requires a wrapper around the x
).
So basically the Just
constructor is going to exist in the runtime system. You have to make quite dramatic changes to how Haskell works before you can avoid that. Even if the language syntax for defining Maybe
allowed data Maybe a = Nothing | a
, the second case would still have a wrapper constructor there, we just wouldn't have a name for it in the surface syntax.
And given that it has to be there for the language to work the way it does, it dramatically simplifies the syntax of the language to always have an explicit name for the data constructor, even when it only contains one value. For example, in a pattern match, the compiler can tell what the programmer intended to match by looking at the constructors. Otherwise how do you tell which Nothing
branch is which in this code:
-- Your proposed syntax
case (mmx :: Maybe (Maybe Int)) of
Nothing -> ...
Nothing -> ...
x -> ...
-- Standard Haskell sytax
case (mmx :: Maybe (Maybe Int)) of
Nothing -> ...
Just Nothing -> ...
Just (Just x) -> ...
For that matter how do I know that final x
branch is supposed intended to be matching the Int
that might be contained in the nested maybe? It's a bare variable, so it's a valid pattern at any of the levels? If I saw that pattern match without the explicit Maybe (Maybe Int)
type annotation, how would I even know how many levels there are supposed to be? The x
could just as validly be matching the Int
contained in a Maybe (Maybe (Maybe (Maybe (Maybe Int))))
(with all the other levels of Nothing
being left to error out as failed pattern matches).
And if a bare value without an explicit constructor is a valid option for a normal data declaration for Maybe
, it would have to be for any other type. A number of packages define a type that looks something like data Result a = OK a | Error String
; why wouldn't they be defined as data Result = a | String
? Now if I pass a value "hello"
to a function that requires a Result a
, am I giving it the error case? Or am I giving it the success case when a
is String
? Or am I giving it the success case when a
is Maybe (Identity (SomeOtherTypeWithABareStringCase))
? They would all look the same.
Or even worse, why shouldn't data Either a b = Left a | Right b
be written as data Either a b = a | b
? Now any expression could be interpreted as any path through any number of nestings of Either
. It would be impossible for anyone (or the compiler) to tell anything about the type of any code by looking at it.
Obviously nobody would design the language allowing all of my dumb examples above. Either there would be a lot of extra restrictions on when you could use this feature, or there would be extra syntactic markers added to disambiguate all of the difficult cases (or both).
But remember again that the constructors still need to really exist in the system at runtime. So this feature of not having to write Just
doesn't gain the system any capabilities. It only saves us a few key-presses writing the code. And in return we either have to lose capabilities (if the compiler restricts us to avoiding situations that would create an impossible ambiguous mess), or we have to write markers all over the place (which is basically equivalent to going back to explicit constructors anyway).
The closest I can see working is for this to only be allowed for a few built in types (not syntax that can be used in user-defined types), and add some special rules for those types to disambiguate common cases. Basically a somewhat less restricted version of the "nullable types" idea, so that a Maybe a
is still a "normal" enough type to allow instances of user-defined classes and such, even though Maybe a
wouldn't be definable by users.
But I don't want that. Haskell's beauty and power comes from having simple and consistent building blocks that are designed to be compositional. I don't want weird special cases to save a few key-presses. An ADT is a set of constructors, each of which has a list of fields. You build or pattern match values of an ADT by writing the constructor names. That fundamental language feature already works to cover all of the functionality gained by a nullable type feature so we don't need such a feature at all. Just a perfectly ordinary ADT called Maybe
.
(I do want to save a few key-presses because Haskellers are just as lazy as programmers in any other language! But I want to do so via a feature that can work generally across the whole language, not a special-case hack just for Maybe
. If you can't make this "bare value with no constructor" work as a general feature of ADTs I don't want it at all.)
a
anda
? (Hint: One is aMaybe
can you tell which?) – Hillery