Why are new java.util.Arrays methods in Java 8 not overloaded for all the primitive types?

I'm reviewing the API changes for Java 8 and I noticed that the new methods in java.util.Arrays are not overloaded for all primitives. The methods I noticed are:

Currently these new methods only handle int, long, and double primitives.

int, long, and double are probably the most widely used primitives so it makes sense that if they had to limit the API that they would choose those three, but why did they have to limit the API?
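To make the gap concrete, here is a minimal sketch using two of the Java 8 additions, Arrays.setAll and Arrays.stream. The class and method names (ArraysPrimitiveOverloads, squares, alphabet, sum) are made up for illustration; the only point is that an int[] has a direct overload while a char[] needs a manual loop.

```java
import java.util.Arrays;

class ArraysPrimitiveOverloads {

    // Uses the int-specialized overload: setAll(int[], IntUnaryOperator).
    static int[] squares(int n) {
        int[] a = new int[n];
        Arrays.setAll(a, i -> i * i);
        return a;
    }

    // There is no setAll(char[], ...) overload, so a plain loop is one workaround.
    static char[] alphabet(int n) {
        char[] c = new char[n];
        for (int i = 0; i < n; i++) {
            c[i] = (char) ('a' + i);
        }
        return c;
    }

    // Arrays.stream(int[]) returns an IntStream; no char[] counterpart exists.
    static int sum(int[] a) {
        return Arrays.stream(a).sum();
    }
}
```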

Gyrostatics answered 7/4, 2014 at 17:11 Comment(2)
Is this different from the question "Why aren't there {Boolean,Byte,Char,Float,Short}UnaryOperator interfaces in java.util.function"? One would need those for the extra overloadings to work without unboxing and null checks on the operator output. - Rundell
It is a nuanced difference in my mind. The decision to support primitives or not, vs. having support for some primitives, but not all, where previously there was full primitive support. - Gyrostatics

To address the questions as a whole, and not just this particular scenario, I think we all want to know...

Why There's Interface Pollution in Java 8

For instance, in a language like C#, there is a set of predefined function types accepting any number of arguments with an optional return type (Func and Action each one going up to 16 parameters of different types T1, T2, T3, ..., T16), but in the JDK 8 what we have is a set of different functional interfaces, with different names and different method names, and whose abstract methods represent a subset of well known function arities (i.e. nullary, unary, binary, ternary, etc). And then we have an explosion of cases dealing with primitive types, and there are even other scenarios causing an explosion of more functional interfaces.

The Type Erasure Issue

So, in a way, both languages suffer from some form of interface pollution (or delegate pollution in C#). The only difference is that in C# they all have the same name. In Java, unfortunately, due to type erasure, there is no difference between Function<T1,T2> and Function<T1,T2,T3> or Function<T1,T2,T3,...Tn>, so evidently, we couldn't simply name them all the same way and we had to come up with creative names for all possible types of function combinations. For further reference on this, please refer to How we got the generics we have by Brian Goetz.
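As a rough illustration of what erasure means in practice (a generic sketch, not taken from the Goetz article), type arguments are not available at run time, so two differently parameterized lists share a single runtime class. ErasureDemo is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {

    // Both variables refer to plain ArrayList instances at run time;
    // the <String> and <Integer> arguments exist only at compile time.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }
}
```

Here sameRuntimeClass() returns true, since both objects are instances of the same ArrayList class.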

Don't think the expert group did not struggle with this problem. In the words of Brian Goetz in the lambda mailing list:

[...] As a single example, let's take function types. The lambda strawman offered at devoxx had function types. I insisted we remove them, and this made me unpopular. But my objection to function types was not that I don't like function types -- I love function types -- but that function types fought badly with an existing aspect of the Java type system, erasure. Erased function types are the worst of both worlds. So we removed this from the design.

But I am unwilling to say "Java never will have function types" (though I recognize that Java may never have function types.) I believe that in order to get to function types, we have to first deal with erasure. That may, or may not be possible. But in a world of reified structural types, function types start to make a lot more sense [...]

An advantage of this approach is that we can define our own interface types with methods accepting as many arguments as we would like, and we could use them to create lambda expressions and method references as we see fit. In other words, we have the power to pollute the world with yet even more new functional interfaces. Also, we can create lambda expressions even for interfaces in earlier versions of the JDK or for earlier versions of our own APIs that defined SAM types like these. And so now we have the power to use Runnable and Callable as functional interfaces.

However, these interfaces become more difficult to memorize since they all have different names and methods.
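A minimal sketch of the advantage described above: any interface with a single abstract method can be the target of a lambda, whether we define it ourselves or it predates Java 8. TriFunction is a hypothetical interface written for this example (it is not part of the JDK), as are the class and method names.

```java
import java.util.concurrent.Callable;

class CustomSamDemo {

    // Hypothetical three-argument functional interface; the JDK does not ship one.
    @FunctionalInterface
    interface TriFunction<A, B, C, R> {
        R apply(A a, B b, C c);
    }

    static int volume(int w, int h, int d) {
        TriFunction<Integer, Integer, Integer, Integer> f = (x, y, z) -> x * y * z;
        return f.apply(w, h, d);
    }

    // Callable predates Java 8, yet it works as a lambda target.
    static String callLegacy() {
        Callable<String> task = () -> "done";
        try {
            return task.call();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Runnable works the same way.
    static int runLegacy() {
        int[] counter = {0};
        Runnable r = () -> counter[0]++;
        r.run();
        return counter[0];
    }
}
```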

Still, I am one of those wondering why they didn't solve the problem as in Scala, defining interfaces like Function0, Function1, Function2, ..., FunctionN. Perhaps, the only argument I can come up with against that is that they wanted to maximize the possibilities of defining lambda expressions for interfaces in earlier versions of the APIs as mentioned before.

Lack of Value Types Issue

So, evidently type erasure is one driving force here. But if you are one of those wondering why we also need all these additional functional interfaces with similar names and method signatures, whose only difference is the use of a primitive type, then let me remind you that Java also lacks value types like those in a language like C#. This means that the type arguments used in our generic classes can only be reference types and not primitive types.

In other words, we can't do this:

List<int> numbers = asList(1,2,3,4,5);

But we can indeed do this:

List<Integer> numbers = asList(1,2,3,4,5);

The second example, though, incurs the cost of boxing and unboxing the wrapped objects back and forth from/to primitive types. This can become really expensive in operations dealing with collections of primitive values. So, the expert group decided to create this explosion of interfaces to deal with the different scenarios. To make things "less worse" they decided to only deal with three basic types: int, long and double.
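A small sketch of the difference, using hypothetical class and method names. The first method works entirely on boxed Integer values, the second stays in the primitive domain, and the third unboxes once and then continues with an IntStream.

```java
import java.util.List;
import java.util.stream.IntStream;

class BoxingDemo {

    // Stream<Integer>: every intermediate result is an Integer object.
    static int sumBoxed(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // IntStream: the values stay primitive ints from start to finish.
    static int sumPrimitive(int[] numbers) {
        return IntStream.of(numbers).sum();
    }

    // Start as a reference stream, end as a primitive stream.
    static int sumUnboxedOnce(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).sum();
    }
}
```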

Quoting the words of Brian Goetz in the lambda mailing list:

[...] More generally: the philosophy behind having specialized primitive streams (e.g., IntStream) is fraught with nasty tradeoffs. On the one hand, it's lots of ugly code duplication, interface pollution, etc. On the other hand, any kind of arithmetic on boxed ops sucks, and having no story for reducing over ints would be terrible. So we're in a tough corner, and we're trying to not make it worse.

Trick #1 for not making it worse is: we're not doing all eight primitive types. We're doing int, long, and double; all the others could be simulated by these. Arguably we could get rid of int too, but we don't think most Java developers are ready for that. Yes, there will be calls for Character, and the answer is "stick it in an int." (Each specialization is projected to ~100K to the JRE footprint.)

Trick #2 is: we're using primitive streams to expose things that are best done in the primitive domain (sorting, reduction) but not trying to duplicate everything you can do in the boxed domain. For example, there's no IntStream.into(), as Aleksey points out. (If there were, the next question(s) would be "Where is IntCollection? IntArrayList? IntConcurrentSkipListMap?") The intention is many streams may start as reference streams and end up as primitive streams, but not vice versa. That's OK, and that reduces the number of conversions needed (e.g., no overload of map for int -> T, no specialization of Function for int -> T, etc.) [...]

We can see that this was a difficult decision for the expert group. I think few would agree that this is elegant, but most of us would most likely agree it was necessary.
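The "stick it in an int" advice quoted above is visible in the JDK itself: String.chars() returns an IntStream rather than a dedicated stream of char. A small sketch, with CharsAsInts and its methods being illustrative names only:

```java
class CharsAsInts {

    // Each char travels through the pipeline as an int.
    static long countUpperCase(String s) {
        return s.chars().filter(Character::isUpperCase).count();
    }

    // Converting back to text requires an explicit step, here appendCodePoint.
    static String upperCase(String s) {
        return s.chars()
                .map(Character::toUpperCase)
                .collect(StringBuilder::new,
                         StringBuilder::appendCodePoint,
                         StringBuilder::append)
                .toString();
    }
}
```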

For further reference on the subject you may want to read The State of Value Types by John Rose, Brian Goetz, and Guy Steele.

The Checked Exceptions Issue

There was a third driving force that could have made things even worse: Java supports two kinds of exceptions, checked and unchecked. The compiler requires that we handle or explicitly declare checked exceptions, but it requires nothing for unchecked ones. This creates an interesting problem, because the method signatures of most of the functional interfaces do not declare any checked exceptions. So, for instance, this is not possible:

Writer out = new StringWriter();
Consumer<String> printer = s -> out.write(s); //oops! compiler error

It cannot be done because the write operation throws a checked exception (i.e. IOException) but the signature of the Consumer method does not declare that it throws any exception at all. So, the only solution to this problem would have been to create even more interfaces, some declaring exceptions and some not (or to come up with yet another mechanism at the language level for exception transparency). Again, to make things "less worse" the expert group decided to do nothing in this case.

In the words of Brian Goetz in the lambda mailing list:

[...] Yes, you'd have to provide your own exceptional SAMs. But then lambda conversion would work fine with them.

The EG discussed additional language and library support for this problem, and in the end felt that this was a bad cost/benefit tradeoff.

Library-based solutions cause a 2x explosion in SAM types (exceptional vs not), which interact badly with existing combinatorial explosions for primitive specialization.

The available language-based solutions were losers from a complexity/value tradeoff. Though there are some alternative solutions we are going to continue to explore -- though clearly not for 8 and probably not for 9 either.

In the meantime, you have the tools to do what you want. I get that you prefer we provide that last mile for you (and, secondarily, your request is really a thinly-veiled request for "why don't you just give up on checked exceptions already"), but I think the current state lets you get your job done. [...]

So, it's up to us, the developers, to craft yet even more interface explosions to deal with these on a case-by-case basis:

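import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// A Consumer-like SAM type whose method may throw IOException.
@FunctionalInterface
interface IOConsumer<T> {
   void accept(T t) throws IOException;
}

// Adapts an IOConsumer to a plain Consumer by rethrowing IOException unchecked.
static <T> Consumer<T> exceptionWrappingBlock(IOConsumer<T> b) {
   return e -> {
      try { b.accept(e); }
      catch (IOException ex) { throw new UncheckedIOException(ex); }
   };
}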

In order to do:

Writer out = new StringWriter();
Consumer<String> printer = exceptionWrappingBlock(s -> out.write(s));
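For completeness, here is a self-contained sketch that puts the pieces together and also shows what a caller sees when the write fails. IoConsumerDemo, joinAll and failureIsUnchecked are hypothetical names, and wrapping in UncheckedIOException (rather than a bare RuntimeException) is an assumption of this sketch.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.function.Consumer;

class IoConsumerDemo {

    @FunctionalInterface
    interface IOConsumer<T> {
        void accept(T t) throws IOException;
    }

    // Assumption of this sketch: only IOException is caught and rewrapped.
    static <T> Consumer<T> exceptionWrappingBlock(IOConsumer<T> b) {
        return e -> {
            try {
                b.accept(e);
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
        };
    }

    // Writes every part to a StringWriter through a plain Consumer.
    static String joinAll(String... parts) {
        Writer out = new StringWriter();
        Consumer<String> printer = exceptionWrappingBlock(s -> out.write(s));
        Arrays.asList(parts).forEach(printer);
        return out.toString();
    }

    // A Writer that always fails, to show how the checked exception surfaces.
    static boolean failureIsUnchecked() {
        Writer broken = new Writer() {
            @Override public void write(char[] buf, int off, int len) throws IOException {
                throw new IOException("simulated failure");
            }
            @Override public void flush() { }
            @Override public void close() { }
        };
        Consumer<String> printer = exceptionWrappingBlock(s -> broken.write(s));
        try {
            printer.accept("x");
            return false;
        } catch (UncheckedIOException ex) {
            // The original IOException is preserved as the cause.
            return "simulated failure".equals(ex.getCause().getMessage());
        }
    }
}
```

The trade-off is that callers no longer get a compile-time reminder to handle the failure; they have to know to catch the unchecked wrapper.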

Probably, in the future when we get Support for Value Types in Java and Reification, we will be able to get rid of (or at least no longer need to use anymore) some of these multiple interfaces.

In summary, we can see that the expert group struggled with several design issues. The constraint of keeping backward compatibility made things difficult, and on top of that came other important conditions: the lack of value types, type erasure, and checked exceptions. If Java had the first and lacked the other two, the design of JDK 8 would probably have been different. So, we all must understand that these were difficult problems with lots of tradeoffs, and the EG had to draw a line somewhere and make decisions.

Borehole answered 7/4, 2014 at 17:25 Comment(5)
When I saw the Java 8 features a couple of months ago, I thought "Well, eight years later than the rest of the world, but seems like Java is becoming modern: Lambdas, first class functions, a LINQ-like api..." But now... This answer shows why Java has no future at all, and has to be redesigned entirely. Almost all Java 8 "modern" features suffer from problems related to the stupid design decisions which Java has: Checked exceptions, everything is an interface/class hierarchy, no value types, type erasure, etc. Sincerely, Java sounds to me like a joke. - Claman
@Claman I think you'll find this interesting: A Discussion With Neal Gafter on the Future of Java - Borehole
@Claman I daresay 8 years later is not accurate. For that to be true, lambda expressions should have been invented 8 years ago, but they are the oldest trick in the book. We could either say ~84 years after Alonzo Church's Lambda Calculus or ~56 years since McCarthy et al. developed Lisp, which was probably the first to have them :-) - Borehole
Thanks for the great explanation. Being new to this type of java problem, and from a long .net and scripting background, I couldn't really understand why some of these choices had to be made even though I understand the basic constraints java is laboring under. I am a little more willing to live with the daily pain now ;) The one thing that would be nice is extension methods, again like c#, so I could wrap this up and hide it in a repeatable way. But then, I probably couldn't because of type erasure. Man, they really have to fix that. Until then, I think I'm sticking to a foreach :D - Netta
This is over-complicating things. There are stream ops for int[], long[] and double[]. So yes, naturally there should be ops for the other primitive types. I find it rather humorous actually - Java at its finest! LOL. - Tweed
