Why can the Monad interface not be declared in Java?

Before you start reading: This question is not about understanding monads; it is about identifying the limitation of the Java type system that prevents the declaration of a Monad interface.


In my effort to understand monads I read this SO-answer by Eric Lippert on a question which asks about a simple explanation of monads. There, he also lists the operations which can be executed on a monad:

  1. That there is a way to take a value of an unamplified type and turn it into a value of the amplified type.
  2. That there is a way to transform operations on the unamplified type into operations on the amplified type that obeys the rules of functional composition mentioned before
  3. That there is usually a way to get the unamplified type back out of the amplified type. (This last point isn't strictly necessary for a monad but it is frequently the case that such an operation exists.)

After reading more about monads, I identified the first operation as the return function and the second operation as the bind function. I was not able to find a commonly used name for the third operation, so I will just call it the unbox function.

To better understand monads, I went ahead and tried to declare a generic Monad interface in Java. For this, I first looked at the signatures of the three functions above. For the Monad M, it looks like this:

return :: T1 -> M<T1>
bind   :: M<T1> -> (T1 -> M<T2>) -> M<T2>
unbox  :: M<T1> -> T1

The return function is not executed on an instance of M, so it does not belong in the Monad interface. Instead, it will be implemented as a constructor or factory method.

Also for now, I omit the unbox function from the interface declaration, since it is not required. There will be different implementations of this function for the different implementations of the interface.

Thus, the Monad interface only contains the bind function.

Let's try to declare the interface:

public interface Monad {
    Monad bind();
}

There are two flaws:

  • The bind function should return the concrete implementation, but it only returns the interface type. This is a problem, since the unbox operations are declared on the concrete subtypes. I will refer to this as problem 1.
  • The bind function should receive a function as a parameter. We will address this later.

Using the concrete type in the interface declaration

This addresses problem 1: If my understanding of monads is correct, then the bind function always returns a new monad of the same concrete type as the monad on which it was called. So, if I have an implementation of the Monad interface called M, then M.bind will return another M, not just a Monad. I can implement this using generics:

public interface Monad<M extends Monad<M>> {
    M bind();
}

public class MonadImpl<M extends MonadImpl<M>> implements Monad<M> {
    @Override
    public M bind() { /* do stuff and return an instance of M */ }
}

At first, this seems to work, however there are at least two flaws with this:

  • This breaks down as soon as an implementing class does not provide itself but another implementation of the Monad interface as the type parameter M, because then the bind method will return the wrong type. For example the

    public class FaultyMonad<M extends MonadImpl<M>> implements Monad<M> { ... }
    

    will return an instance of MonadImpl where it should return an instance of FaultyMonad. However, we can specify this restriction in the documentation and consider such an implementation as a programmer error.

  • The second flaw is more difficult to resolve. I will call it problem 2: When I try to instantiate the class MonadImpl I need to provide the type of M. Let's try this:

    new MonadImpl<MonadImpl<MonadImpl<MonadImpl<MonadImpl< ... >>>>>()
    

    To get a valid type declaration, this has to go on infinitely. Here is another attempt:

    public static <M extends MonadImpl<M>> MonadImpl<M> create() {
        return new MonadImpl<M>();
    }
    

    While this seems to work, we just deferred the problem to the caller. Here is the only usage of that function that works for me:

    public void createAndUseMonad() {
        MonadImpl<?> monad = create();
        // use monad
    }
    

    which essentially boils down to

    MonadImpl<?> monad = new MonadImpl<>();
    

    but this is clearly not what we want.

Using a type in its own declaration with shifted type parameters

Now, let's add the function parameter to the bind function: As described above, the signature of the bind function looks like this: T1 -> M<T2>. In Java, this is the type Function<T1, M<T2>>. Here is the first attempt to declare the interface with the parameter:

public interface Monad<T1, M extends Monad<?, ?>> {
    M bind(Function<T1, M> function);
}

We have to add the type T1 as generic type parameter to the interface declaration, so we can use it in the function signature. The first ? is the T1 of the returned monad of type M. To replace it with T2, we have to add T2 itself as a generic type parameter:

public interface Monad<T1, M extends Monad<T2, ?, ?>,
                       T2> {
    M bind(Function<T1, M> function);
}

Now, we get another problem. We added a third type parameter to the Monad interface, so we had to add a new ? to the usage of it. We will ignore the new ? for now to investigate the now first ?. It is the M of the returned monad of type M. Let's try to remove this ? by renaming M to M1 and by introducing another M2:

public interface Monad<T1, M1 extends Monad<T2, M2, ?, ?>,
                       T2, M2 extends Monad< ?,  ?, ?, ?>> {
    M1 bind(Function<T1, M1> function);
}

Introducing another T3 results in:

public interface Monad<T1, M1 extends Monad<T2, M2, T3, ?, ?>,
                       T2, M2 extends Monad<T3,  ?,  ?, ?, ?>,
                       T3> {
    M1 bind(Function<T1, M1> function);
}

and introducing another M3 results in:

public interface Monad<T1, M1 extends Monad<T2, M2, T3, M3, ?, ?>,
                       T2, M2 extends Monad<T3, M3,  ?,  ?, ?, ?>,
                       T3, M3 extends Monad< ?,  ?,  ?,  ?, ?, ?>> {
    M1 bind(Function<T1, M1> function);
}

We see that this will go on forever if we try to resolve all ?. This is problem 3.

Summing it all up

We identified three problems:

  1. Using the concrete type in the declaration of the abstract type.
  2. Instantiating a type which receives itself as generic type parameter.
  3. Declaring a type which uses itself in its declaration with shifted type parameters.

The question is: What is the feature that is missing in the Java type system? Since there are languages which work with monads, these languages have to somehow declare the Monad type. How do these other languages declare the Monad type? I was not able to find information about this. I only find information about the declaration of concrete monads, like the Maybe monad.

Did I miss anything? Can I properly solve one of these problems with the Java type system? If I cannot solve problem 2 with the Java type system, is there a reason why Java does not warn me about the not instantiable type declaration?


As already stated, this question is not about understanding monads. If my understanding of monads is wrong, you might give a hint about it, but don't attempt to give an explanation. If my understanding of monads is wrong the described problems remain.

This question is also not about whether it is possible to declare the Monad interface in Java. This question already received an answer by Eric Lippert in his SO-answer linked above: It is not. This question is about what exactly is the limitation that prevents me from doing this. Eric Lippert refers to this as higher types, but I can't get my head around them.

"Most OOP languages do not have a rich enough type system to represent the monad pattern itself directly; you need a type system that supports types that are higher types than generic types. So I wouldn't try to do that. Rather, I would implement generic types that represent each monad, and implement methods that represent the three operations you need: turning a value into an amplified value, turning an amplified value into a value, and transforming a function on unamplified values into a function on amplified values."

Hallee answered 11/3, 2016 at 23:36 Comment(2)
Higher kinded types. - Ratiocinate
@BrianGoetz: Good morning Brian, thanks for commenting. Do feel free to correct any errors I've made about Java generics and whatnot! - Mccain

"What is the feature that is missing in the Java type system? How do these other languages declare the Monad type?"

Good question!

"Eric Lippert refers to this as higher types, but I can't get my head around them."

You are not alone. But they are actually not as crazy as they sound.

Let's answer both of your questions by looking at how Haskell declares the monad "type" -- you'll see why the quotes in a minute. I have simplified it somewhat; the standard monad pattern also has a couple other operations in Haskell:

class Monad m where
  (>>=) :: m a -> (a -> m b) -> m b
  return :: a -> m a

Boy, that looks both incredibly simple and completely opaque at the same time, doesn't it?

Here, let me simplify that a bit more. Haskell lets you declare your own infix operator for bind, but we'll just call it bind:

class Monad m where
  bind :: m a -> (a -> m b) -> m b
  return :: a -> m a

All right, now at least we can see that there are the two monad operations in there. What does the rest of this mean?

The first thing to get your head around, as you note, is "higher kinded types". (As Brian points out, I somewhat simplified this jargon in my original answer. Also quite amusing that your question attracted the attention of Brian!)

In Java, a "class" is a kind of "type", and a class may be generic. So in Java we've got int and IFrob and List<IBar> and they're all types.

From this point on throw away any intuition you have about Giraffe being a class that is a subclass of Animal, and so on; we won't need that. Think about a world with no inheritance; it will not come into this discussion again.

What are classes in Java? Well, the easiest way to think of a class is that it is a name for a set of values that have something in common, such that any one of those values can be used when an instance of the class is required. You have a class Point, let's say, and if you have a variable of type Point, you can assign any instance of Point to it. The Point class is in some sense just a way to describe the set of all Point instances. Classes are a thing that is higher than instances.

In Haskell there are also generic and non-generic types. A class in Haskell is not a kind of type. In Java, a class describes a set of values; any time you need an instance of the class, you can use a value of that type. In Haskell a class describes a set of types. That is the key feature that the Java type system is missing. In Haskell a class is higher than a type, which is higher than an instance. Java only has two levels of hierarchy; Haskell has three. In Haskell you can express the idea "any time I need a type that has certain operations, I can use a member of this class".

(ASIDE: I want to point out here that I am making a bit of an oversimplification. Consider in Java for example List<Integer> and List<String>. These are two "types", but Java considers them to be one "class", so in a sense Java also has classes which are "higher" than types. But then again, you could say the same in Haskell, that list x and list y are types, and that list is a thing that is higher than a type; it's a thing that can produce a type. So it would in fact be more accurate to say that Java has three levels, and Haskell has four. The point remains though: Haskell has a concept of describing the operations available on a type that is simply more powerful than Java has. We'll look at this in more detail below.)

So how is this different than interfaces? This sounds like interfaces in Java -- you need a type that has certain operations, you define an interface that describes those operations. We'll see what is missing from Java interfaces.

Now we can start making sense of this Haskell:

class Monad m where

So, what is Monad? It's a class. What is a class? It's a set of types that have something in common, such that whenever you need a type that has certain operations, you can use a Monad type.

Suppose we have a type that is a member of this class; call it m. What are the operations that must be on this type in order for that type to be a member of the class Monad?

  bind :: m a -> (a -> m b) -> m b
  return :: a -> m a

The name of the operation comes to the left of the ::, and the signature comes to the right. So to be a Monad, a type m must have two operations: bind and return. What are the signatures of those operations? Let's look at return first.

  a -> m a

m a is Haskell for what in Java would be M<A>. That is, this means m is a generic type, a is a type, m a is m parametrized with a.

x -> y in Haskell is the syntax for "a function which takes type x and returns type y". It's Function<X, Y>.

Put it together, and we have return is a function that takes an argument of type a and returns a value of type m a. Or in Java

static <A>  M<A> Return(A a);

bind is a little bit harder. I think the OP well understands this signature, but for readers who are unfamiliar with the terse Haskell syntax, let me expand on this a bit.

In Haskell, functions only take one argument. If you want a function of two arguments, you make a function that takes one argument and returns another function of one argument. So if you have

a -> b -> c

Then what have you got? A function that takes an a and returns a b -> c. So suppose you wanted to make a function that took two numbers and returned their sum. You would make a function that takes the first number, and returns a function that takes a second number and adds it to the first number.

In Java you'd say

static <A, B, C>  Function<B, C> F(A a)

So if you wanted a C and you had an A and a B, you could say

F(a).apply(b)
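
Java applies a Function value through its apply method, so a curried call is written as a chain of apply calls. A minimal sketch of the two-number sum described above; the names Currying, add, and ADD are made up for illustration:

```java
import java.util.function.Function;

class Currying {
    // A curried sum: takes the first number and returns a function
    // that takes the second number and adds it to the first.
    static Function<Integer, Integer> add(int a) {
        return b -> a + b;
    }

    // The same idea as a value: a function that returns a function.
    static final Function<Integer, Function<Integer, Integer>> ADD =
        a -> b -> a + b;

    public static void main(String[] args) {
        System.out.println(add(2).apply(3));        // prints 5
        System.out.println(ADD.apply(2).apply(3));  // prints 5
    }
}
```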

Make sense?

All right, so

  bind :: m a -> (a -> m b) -> m b

is effectively a function that takes two things, an m a and an a -> m b, and returns an m b. Or, in Java, it is directly:

static <A, B> Function<Function<A, M<B>>, M<B>> Bind(M<A> m)

Or, more idiomatically in Java:

static <A, B> M<B> Bind(M<A> m, Function<A, M<B>> f)

So now you see why Java cannot represent the monad type directly. It does not have the ability to say "I have a class of types that have this pattern in common".

Now, you can make all the monadic types you want in Java. The thing you can't do is make an interface that represents the idea "this type is a monad type". What you would need to do is something like:

typeinterface Monad<M>
{
  static <A>    M<A> Return(A a);
  static <A, B> M<B> Bind(M<A> m, Function<A, M<B>> f);
}

See how the type interface talks about the generic type itself? A monadic type is any type M that is generic with one type parameter and has these two static methods. But you can't do that in the Java or C# type systems. Bind of course could be an instance method that takes an M<A> as this. But there is no way to make Return anything but static. Java gives you no ability to (1) parameterize an interface by an unconstructed generic type, or (2) specify that static members are part of the interface contract.

"Since there are languages which work with monads, these languages have to somehow declare the Monad type."

Well you'd think so but actually not. First off, of course any language with a sufficient type system can define monadic types; you can define all the monadic types you want in C# or Java, you just can't say what they all have in common in the type system. You can't make a generic class that can only be parameterized by monadic types, for instance.
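
The JDK itself illustrates this. The sketch below assumes Java 8 or later; the class BindShapes and its three method names are hypothetical. Optional.flatMap, Stream.flatMap, and CompletableFuture.thenCompose all have the shape of bind, yet nothing in the type system relates them, so the three wrappers cannot be collapsed into one method that is generic in M:

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class BindShapes {
    // bind with M = Optional
    static <A, B> Optional<B> bindOptional(Optional<A> m, Function<A, Optional<B>> f) {
        return m.flatMap(f);
    }

    // bind with M = Stream
    static <A, B> Stream<B> bindStream(Stream<A> m, Function<A, Stream<B>> f) {
        return m.flatMap(f);
    }

    // bind with M = CompletableFuture
    static <A, B> CompletableFuture<B> bindFuture(CompletableFuture<A> m,
                                                  Function<A, CompletableFuture<B>> f) {
        return m.thenCompose(f);
    }

    public static void main(String[] args) {
        Optional<Integer> o = bindOptional(Optional.of(20), x -> Optional.of(x + 1));
        List<Integer> s = bindStream(Stream.of(1, 2), x -> Stream.of(x, x * 10))
            .collect(Collectors.toList());
        int f = bindFuture(CompletableFuture.completedFuture(20),
                           x -> CompletableFuture.completedFuture(x + 1)).join();
        System.out.println(o.get()); // prints 21
        System.out.println(s);       // prints [1, 10, 2, 20]
        System.out.println(f);       // prints 21
    }
}
```

The three method bodies are identical in shape; only the type constructor differs, and that is exactly the part Java cannot abstract over.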

Second, you can embed the monad pattern in the language in other ways. C# has no way to say "this type matches the monad pattern", but C# has query comprehensions (LINQ) built into the language. Query comprehensions work on any monadic type! It's just that the bind operation has to be called SelectMany, which is a little weird. But if you look at the signature of SelectMany, you'll see that it is just bind:

  static IEnumerable<R> SelectMany<S, R>(
    IEnumerable<S> source,
    Func<S, IEnumerable<R>> selector)

That's the implementation of SelectMany for the sequence monad, IEnumerable<T>, but in C# if you write

from x in a from y in b select z

then a's type can be of any monadic type, not just IEnumerable<T>. What is required is that a is M<A>, that b is M<B>, and that there is a suitable SelectMany that follows the monad pattern. So that's another way of embedding a "monad recognizer" in the language, without representing it directly in the type system.

(The previous paragraph is actually a lie of oversimplification; the binding pattern used by this query is slightly different than the standard monadic bind for performance reasons. Conceptually this recognizes the monad pattern; in actuality the details differ slightly. Read about them here http://ericlippert.com/2013/04/02/monads-part-twelve/ if you're interested.)
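
Java has no comprehension syntax, so the same chaining has to be written out by hand. As a rough analogue of the query above, assuming Optional as the monadic type and hypothetical names Comprehension and sum, nested flatMap and map calls play the role of the two from clauses and the select:

```java
import java.util.Optional;

class Comprehension {
    // Hand-written counterpart of "from x in a from y in b select x + y":
    // flatMap is bind, and the inner map produces the selected value.
    static Optional<Integer> sum(Optional<Integer> a, Optional<Integer> b) {
        return a.flatMap(x -> b.map(y -> x + y));
    }

    public static void main(String[] args) {
        System.out.println(sum(Optional.of(2), Optional.of(3)));   // prints Optional[5]
        System.out.println(sum(Optional.of(2), Optional.empty())); // prints Optional.empty
    }
}
```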

A few more small points:

"I was not able to find a commonly used name for the third operation, so I will just call it the unbox function."

Good choice; it is usually called the "extract" operation. A monad need not have an extract operation exposed, but of course somehow bind needs to be able to get the A out of the M<A> in order to call the Function<A, M<B>> on it, so logically some sort of extraction operation usually exists.

A comonad -- a backwards monad, in a sense -- requires an extract operation to be exposed; extract is essentially return backwards. A comonad also requires an extend operation that is sort of bind turned backwards. It has the signature static M<B> Extend(M<A> m, Func<M<A>, B> f).

Mccain answered 12/3, 2016 at 15:57 Comment(12)
Thanks for this great answer! Especially the explanation about Haskell's three level type system and your made up notation for a typeinterface helped my understanding a lot. - Hallee
@EricLippert, could a "MonadFactory" base class be used to work around this limitation in C# or Java? What couldn't the MonadFactory do relative to a true higher type system other than a concise syntax? - Brotherinlaw
@EduardoS: I don't know; give it a shot and report back your findings! - Mccain
In the typeinterface notation: Is the type parameter A the same for the Return and Bind functions? The notation implies that they can be different. - Hallee
Well, they're generic. If you have an int in hand then you can get an M<int>. If you have an M<int> in hand and a Func<int, M<string>> then you can get an M<string>. I'm not quite sure I'm following your train of thought here though. - Mccain
@StefanDollase: If your question is whether we can alpha-rename it to M<A> Return(A); M<C> Bind(M<B>, Func<B, M<C>>) then yes, that renaming is perfectly sensible. - Mccain
@EricLippert: Indeed, your last comment answers the question: The A in Return can be different from the A in Bind for one concrete type of the Monad class. Thanks for the verification. - Hallee
@EricLippert The key feature of higher order types is to be able to pass a generic type G<A, B> as a type constructor to another generic type H<G<_, _>>. By this, I mean that you omit the type parameters A and B while passing G to H. Now, H can use G multiple times with different type parameters. In this example, G is a type while H is a type class. Is this correct? - Hallee
@EricLippert Also, since G<_, _> is not a fully constructed type, it cannot directly be used as a type. As opposed to this, H<G<_, _>> is a fully constructed type, because the missing type information for G is specified by H. Thus H<G<_, _>> can directly be used as a type. Again, is this correct? - Hallee
@stefandollase That all sounds reasonable, yes. - Mccain
@EricLippert Thanks, I think now I understand the concept :-) - Hallee
@EricLippert bind doesn't need to be static, since we're in OOP world it should be an instance method. Like flatMap on Optional. Monad<T> bind(Function<T, Monad<T>> f); This of course doesn't negate your argument for the unit or as you call it return method. That's basically the constructor for the monad and we can't mandate that. Also the type system would never check if the implementing type is truly a monad, but that's of course as usual. - Oligopsony

If you look at what the AspectJ project is doing, it is similar to applying monads to Java. The way they do it is to post-process the byte code of the classes to add the additional functionality-- and the reason they have to do that is because there is no way within the language without the AspectJ extensions to do what they need to do; the language is not expressive enough.

A concrete example: say you start with class A. You have a monad M such that M(A) is a class that works just like A, but all method entrances and exits get traced to log4j. AspectJ can do this, but there is no facility within the Java language itself that would let you.

This paper describes how Aspect-Oriented Programming as in AspectJ might be formalized as monads.

In particular, there is no way within the Java language to specify a type programmatically (short of byte-code manipulation a la AspectJ). All types are pre-defined when the program starts.

Charteris answered 12/3, 2016 at 0:14 Comment(0)

Good question indeed! :-)

As @EricLippert pointed out, the type of polymorphism that is known as "type classes" in Haskell is beyond the grasp of Java's type system. However, at least since the introduction of the Frege programming language it has been shown that a Haskell-like type system can indeed be implemented on top of the JVM.

If you want to use higher-kinded types in the Java language itself you have to resort to libraries like highJ or Cyclops. Both libraries do provide a monad type class in the Haskell sense (see here and here, respectively, for the sources of the monad type class). In both cases, be prepared for some major syntactic inconveniences; this code will not look pretty at all and carries a lot of overhead to shoehorn this functionality into Java's type system. Both libraries use a "type witness" to capture the core type separately from the data type, as John McClean explains in his excellent introduction. However, in neither implementation will you find anything as simple and straightforward as Maybe extends Monad or List extends Monad.

The secondary problem of specifying constructors or static methods with Java interfaces can be easily overcome by introducing a factory (or "companion") interface that declares the static method as a non-static one. Personally, I always try to avoid anything static and use injected singletons instead.
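
To make the type-witness and companion ideas concrete, here is a minimal sketch. It is not the actual API of highJ or Cyclops; every name in it (Kind, Mu, narrow, MaybeMonad, MonadDemo, addOne) is hypothetical, and null is used to encode "Nothing" purely to keep the example short.

```java
import java.util.function.Function;

// Kind<F, A> stands for "F applied to A". F is a witness type naming the
// type constructor, because Java cannot pass Maybe itself unapplied.
interface Kind<F, A> {}

// The type class. Its instances are ordinary objects (companions), so
// "return" (here: pure) becomes a non-static method.
interface Monad<F> {
    <A> Kind<F, A> pure(A a);
    <A, B> Kind<F, B> bind(Kind<F, A> m, Function<A, Kind<F, B>> f);
}

final class Maybe<A> implements Kind<Maybe.Mu, A> {
    // Witness type: never instantiated, only used as a type argument.
    static final class Mu {}

    final A value; // null encodes "Nothing" in this sketch

    private Maybe(A value) { this.value = value; }
    static <A> Maybe<A> just(A a) { return new Maybe<>(a); }
    static <A> Maybe<A> nothing() { return new Maybe<>(null); }

    // Downcast that is safe only by convention: it assumes Maybe is the
    // sole implementation of Kind<Mu, A>.
    static <A> Maybe<A> narrow(Kind<Mu, A> k) { return (Maybe<A>) k; }
}

// The Monad instance for Maybe, declared separately from the data type.
final class MaybeMonad implements Monad<Maybe.Mu> {
    @Override
    public <A> Kind<Maybe.Mu, A> pure(A a) { return Maybe.just(a); }

    @Override
    public <A, B> Kind<Maybe.Mu, B> bind(Kind<Maybe.Mu, A> m,
                                         Function<A, Kind<Maybe.Mu, B>> f) {
        Maybe<A> maybe = Maybe.narrow(m);
        if (maybe.value == null) {
            return Maybe.<B>nothing();
        }
        return f.apply(maybe.value);
    }
}

class MonadDemo {
    // Written once for every F that has a Monad<F> instance; the
    // constraint "F is a monad" is passed as an ordinary argument.
    static <F> Kind<F, Integer> addOne(Monad<F> monad, Kind<F, Integer> m) {
        return monad.bind(m, x -> monad.pure(x + 1));
    }

    public static void main(String[] args) {
        Monad<Maybe.Mu> monad = new MaybeMonad();
        Maybe<Integer> result = Maybe.narrow(addOne(monad, Maybe.just(41)));
        System.out.println(result.value); // prints 42
    }
}
```

The price is visible at the call site: results come back as Kind<Maybe.Mu, Integer> and must be narrowed by hand, which is the syntactic overhead mentioned above.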

Long story short, yes, it is possible to represent HKTs in Java but at this point it is very inconvenient and not very user friendly.

Audition answered 6/8, 2019 at 1:17 Comment(0)

Yes, we cannot override a static method in a class, and we cannot declare a constructor in an interface.

  • Use an abstract class to simulate the Monad type class in Haskell:

import java.util.function.Function;

public abstract class Monad<T> {
    public static <T> Monad<T> Unit(T a){
        throw new UnsupportedOperationException("Call Unit in abstract class: Monad");
    }
    public <R> Monad<R> OUnit(R a){
        throw new UnsupportedOperationException("Call OUnit in abstract class: Monad");
    }
    public <B> Monad<B> bind(Function<T, Monad<B>> func){
        throw new UnsupportedOperationException("Call bind in abstract class: Monad");
    }
    public <B> Monad<B> combine(Monad<B> b){
        return this.bind(unused -> b);
    }
}
public class Maybe<T> extends Monad<T> {
    public boolean has;
    public T val;
    public Maybe(T value) {
        this.has = true;
        this.val = value;
    }
    public Maybe(){
        has = false;
    }
    public static <T> Maybe<T> Unit(T a) {
        return new Maybe<T>(a);
    }
    public static <T> Maybe<T> Unit() {
        return new Maybe<T>();
    }
    @Override
    public <R> Maybe<R> OUnit(R a) {
        return new Maybe<R>(a);
    }
    public <R> Maybe<R> OUnit() {
        return new Maybe<R>();
    }
    @Override
    public <B> Monad<B> bind(Function<T, Monad<B>> func){
        if (this.has){
            return func.apply(this.val);
        }
        return new Maybe<B>();
    }
    @Override
    public String toString(){
        if (this.has){
            return "Maybe " + val.toString();
        }
        return "Nothing";
    }
}
public class Main {
/*
example :: (Monad m, Show (m n), Num n) => m n -> m n -> IO ()
example a b = do
  print $ a >> b
  print $ b >> a
  print $ a >>= (\x -> return $ x+x)
  print $ b >>= (\x -> return $ x+x)

main = do
  example (Just 10) (Just 5)
  example (Right 10) (Left 5)
*/
    public static void example(Monad<Integer> a, Monad<Integer> b){
        System.out.println(a.bind(x -> b));
        System.out.println(b.bind(x -> a));

        System.out.println(a.bind(x -> a.OUnit(x*2)));
        System.out.println(b.bind(x -> b.OUnit(x*2)));

        System.out.println(a.combine(a));
        System.out.println(a.combine(b));
        System.out.println(b.combine(a));
        System.out.println(b.combine(b));
    }
    // Monad can also be used with arbitrary Objects
    public static void example2(Monad<Object> a, Monad<Object> b){
        System.out.println(a.bind(x -> b));
        System.out.println(b.bind(x -> a));

        System.out.println(a.combine(a));
        System.out.println(a.combine(b));
        System.out.println(b.combine(a));
        System.out.println(b.combine(b));
    }
    public static void main(String[] args){
        System.out.println("Example 1:");
        example(Maybe.<Integer>Unit(10), Maybe.<Integer>Unit());
        System.out.println("\n\nExample 2:");
        example(Maybe.<Integer>Unit(1), Maybe.<Integer>Unit(3));
        System.out.println("\n\nExample 3:");
        example2(Maybe.<Object>Unit(10), Maybe.<Object>Unit());
    }
}
  • Use an interface to simulate the Monad type class in Haskell:

import java.util.function.Function;

public interface Monad<T> {
    static <T> Monad<T> Unit(T a){
        throw new UnsupportedOperationException("call Unit in Monad interface");
    }
    <R> Monad<R> OUnit(R a);
    <B> Monad<B> bind(Function<T, Monad<B>> func);
    default <B> Monad<B> combine(Monad<B> b){
        return bind(x -> b);
    }
}
// in class Maybe, replace extends with implements
// in class Main, unchanged

The output is the same.
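
One caveat applies to both variants: because bind is typed against Monad<B> rather than the concrete type, the compiler cannot check that the function passed to bind stays within the same monad (problem 1 from the question). The reduced sketch below keeps only bind; it is a simplified reconstruction, not the code above, and the Id and MixDemo classes are hypothetical.

```java
import java.util.function.Function;

interface Monad<T> {
    <B> Monad<B> bind(Function<T, Monad<B>> func);
}

final class Maybe<T> implements Monad<T> {
    final T val; // null means "Nothing" in this sketch

    Maybe(T val) { this.val = val; }

    @Override
    public <B> Monad<B> bind(Function<T, Monad<B>> func) {
        if (val == null) {
            return new Maybe<B>(null);
        }
        return func.apply(val);
    }
}

// A second, unrelated monadic type, for illustration only.
final class Id<T> implements Monad<T> {
    final T val;

    Id(T val) { this.val = val; }

    @Override
    public <B> Monad<B> bind(Function<T, Monad<B>> func) {
        return func.apply(val);
    }
}

class MixDemo {
    public static void main(String[] args) {
        Monad<Integer> start = new Maybe<>(1);
        // Compiles, although the function returns an Id, not a Maybe.
        Monad<Integer> mixed = start.bind(x -> new Id<>(x + 1));
        System.out.println(mixed instanceof Maybe); // prints false
    }
}
```

In Haskell the signature m a -> (a -> m b) -> m b rules this out, because the same m appears in all three positions.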

Backfire answered 2/8, 2022 at 6:27 Comment(0)
