How does the Java compiler choose the runtime type for a parameterized type with multiple bounds?

I would like to understand better what happens when the Java compiler encounters a call to a method like the one below.

<T extends AutoCloseable & Cloneable>
void printType(T... args) {
    System.out.println(args.getClass().getComponentType().getSimpleName());
}

// printType() prints "AutoCloseable"

It is clear to me that there is no type <T extends AutoCloseable & Cloneable> at runtime, so the compiler does the least wrong thing it can and creates an array whose component type is one of the two bounding interfaces, discarding the other one.

However, if the order of the interfaces is switched, the result is still the same.

<T extends Cloneable & AutoCloseable>
void printType(T... args) {
    System.out.println(args.getClass().getComponentType().getSimpleName());
}

// printType() prints "AutoCloseable"

This led me to do some more investigation and see what happens when the interfaces change. It seems to me that the compiler uses some kind of strict order rule to decide which interface is the most important, and the order the interfaces appear in code plays no role.

<T extends AutoCloseable & Runnable>                             // "AutoCloseable"
<T extends Runnable & AutoCloseable>                             // "AutoCloseable"
<T extends AutoCloseable & Serializable>                         // "Serializable"
<T extends Serializable & AutoCloseable>                         // "Serializable"
<T extends SafeVarargs & Serializable>                           // "SafeVarargs"
<T extends Serializable & SafeVarargs>                           // "SafeVarargs"
<T extends Channel & SafeVarargs>                                // "Channel"
<T extends SafeVarargs & Channel>                                // "Channel"
<T extends AutoCloseable & Channel & Cloneable & SafeVarargs>    // "Channel"

Question: How does the Java compiler determine the component type of a varargs array of a parameterized type when there are multiple bounds?

I'm not even sure if the JLS says anything about this, and none of the information I found by googling covers this particular topic.

Anis answered 4/5, 2018 at 14:28 Comment(7)
More info: intersection types in the JLS.Lecia
Part of the answer to this is found in §15.12.4.2, which is that the actual type of the array created is the erasure of T[], where T is the actual type inferred for the invocation of printType(). This would normally be straightforward, since the erasure of a type variable is its left-most bound, but I think that the bounds are being reordered inadvertently at some point by the processes described in Chapter 18 (type inference).Hatcher
@Hatcher wouldn't this be a bug? docs.oracle.com/javase/specs/jls/se9/html/jls-4.html#jls-4.4 says The order of types in a bound is only significant in that the erasure of a type variable is determined by the first type in its boundLong
@Long My reasoning is that since inference works with bound "sets", the ordering is probably unspecified. This could happen as early as creating the initial bound set from type parameters where "for each type T delimited by & in the TypeBound, the bound αl <: T[P1:=α1, ..., Pp:=αp] appears in the set". The phrase "appears in the set" doesn't seem to specify any sort of indexing scheme. I think that personally I would have to review the entire chapter to feel comfortable being sure about this, though.Hatcher
@Hatcher I don't know if you see the irony here, one part of the JLS says leftmost while some other says appears in the set (and you are right about no indexes what-so-ever)Long
@Long Yeah, if my theory is correct, then I think it's poorly defined. "Erasure of an inference variable" seems like a pretty rare thing to worry about, but this could cause some pretty incomprehensible errors. ideone.com/B4oNznHatcher
@Hatcher the reason why a simple (Runnable[])arr doesn’t throw in your example while (Runnable[])(Object[])arr does, is that the compiler removes the cast in the former expression, as the formal type already is Runnable[]. Casting to a wider type first makes the narrowing cast necessary in the compiled code. As my answer shows, you can get even worse things than an ArrayStoreException (and I’m currently investigating whether there are more things to exploit)…Consolidation

Typically, when the compiler encounters a call to a parameterized method, it can infer the type (JLS 18.5.2) and create a correctly typed vararg array in the caller.

The rules are mostly technical ways of saying "find all possible input types and check them" (cases like void, the ternary operator, or lambdas). The rest is common sense, such as using the most specific common base class (JLS 4.10.4). Example:

public class Test {
   private static class A implements AutoCloseable, Runnable {
         @Override public void close () throws Exception {}
         @Override public void run () {} }
   private static class B implements AutoCloseable, Runnable {
         @Override public void close () throws Exception {}
         @Override public void run () {} }
   private static class C extends B {}

   private static <T extends AutoCloseable & Runnable> void printType( T... args ) {
      System.out.println( args.getClass().getComponentType().getSimpleName() );
   }

   public static void main( String[] args ) {
      printType( new A() );          // A[] created here
      printType( new B(), new B() ); // B[] created here
      printType( new B(), new C() ); // B[] which is the common base class
      printType( new A(), new B() ); // AutoCloseable[] - well...
      printType();                   // AutoCloseable[] - same as above
   }
}
  • JLS 18.2 dictates how to process the constraints for type inference; for example, AutoCloseable & Channel is reduced to just Channel. But the rules do not help answer this question.

Getting AutoCloseable[] from the call may look weird, of course, because we can't do that with Java code. But in reality the actual type doesn't matter. At the language level, args is T[], where T is a "virtual type" that is both A and B (JLS 4.9).

The compiler just needs to make sure its usages meet all constraints, and then it knows the logic is sound and there will be no type error (this is how Java generics are designed). Of course the compiler still needs to make a real array, and for that purpose it creates a "generic array". Hence the warning "unchecked generic array creation" (JLS 15.12.4.2).

In other words, as long as you pass in only AutoCloseable & Runnable, and call only Object, AutoCloseable, and Runnable methods in printType, the actual array type does not matter. In fact, printType's bytecode would be the same regardless of what kind of array is passed in.

Since printType doesn't care about the vararg array type, getComponentType() doesn't and shouldn't matter. If you want to get the interfaces, try getGenericInterfaces(), which returns an array.

  • Because of type erasure (JLS 4.6), the order of the interfaces of T does affect (JLS 13.1) the compiled method signature and bytecode. The first interface, AutoCloseable, will be used, e.g. no type check will be done when AutoCloseable.close() is called in printType.
  • But this is unrelated to the type inference of the method calls in the question, i.e. why AutoCloseable[] is created and passed. Many type-safety checks happen before erasure, so the order does not affect type safety. This, I think, is part of what the JLS means by "The order of types... is only significant in that the erasure ... is determined by the first type" (JLS 4.4). It means the order is otherwise insignificant.
  • Regardless, this erasure rule does cause corner cases: adding printType(AutoCloseable[]) triggers a compile error, whereas adding printType(Runnable[]) does not. I believe this is an unexpected side effect and is really out of scope.
  • P.S. Digging too deep may cause insanity, considering that I think I am Ovis aries, view source into assembly, and struggles to answer in English instead of J̶́S͡L̴̀. My sanity score is b҉ȩyon̨d͝ r̨̡͝e̛a̕l̵ numb͟ers͡. T͉͎̫͠u͍r̟̦͝n̪͓͓̭̯̕ ̱̱̞̠̬ͅb̯̠̞̩͎a̘̜̯c̠̮k. ̠̝͕b̭̳͠͡ͅẹ̡̬̦̙f͓͉̼̻o̼͕̱͎̬̟̪r҉͏̛̣̼͙͍͍̠̫͙ȩ̵̮̟̱̫͚ ̢͚̭̹̳̣̩̱͠..t̷҉̛̫͔͉̥͎̬ò̢̱̪͉̲͎͜o̭͈̩̖̭̬.. ̮̘̯̗l̷̞͍͙̻̻͙̯̣͈̳͓͇a̸̢̢̰͓͓̪̳͉̯͉̼͝͝t̛̥̪̣̹̬͔̖͙̬̩̝̰͕̖̮̰̗͓̕͢ę̴̹̯̟͉̲͔͉̳̲̣͝͞.̬͖͖͇͈̤̼͖́͘͢.͏̪̱̝̠̯̬͍̘̣̩͉̯̹̼͟͟͠.̨͠҉̬̘̹ͅ
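The erasure point in the first bullet can be checked with reflection. The following is a minimal sketch, not part of the original answer: the class name ErasedSignature and the helper declares are made up for illustration, and printType is re-declared here with the same bounds as in the example above.

```java
final class ErasedSignature {
    // Same shape as printType in the answer; the leftmost bound is AutoCloseable.
    static <T extends AutoCloseable & Runnable> void printType(T... args) { }

    // Hypothetical helper: true if a printType method with exactly this
    // (erased) parameter type is declared on the class.
    static boolean declares(Class<?> parameterType) {
        try {
            ErasedSignature.class.getDeclaredMethod("printType", parameterType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] unused) {
        System.out.println(declares(AutoCloseable[].class)); // true: leftmost bound
        System.out.println(declares(Runnable[].class));      // false: second bound
    }
}
```

This only shows what the declaration is erased to. It says nothing about which array type a particular call site creates, which is the separate inference question discussed above.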
Underlaid answered 7/5, 2018 at 7:4 Comment(6)
how about this part? The order of types in a bound is only significant in that the erasure of a type variable is determined by the first type in its bound from docs.oracle.com/javase/specs/jls/se9/html/jls-4.html#jls-4.4. I've read your answer multiple times, but can't see it explaining that the leftmost part is not taken..Long
@Long Yes. When type erasure happens, T is erased to the leftmost type. I updated the answer to explain some of its effects including a corner case. Please understand that type erasure enters the picture only when a step specifically calls for it, which does happen quite a lot if you follow all the little steps, often recursively, sometimes infinitely (see the last note of JLS 4.10.4). You may want to open a new question if you have another specific generic or erasure question.Underlaid
read your update, you made it even worse IMO (besides the disturbing PS). The question OP has is very simple, why is "leftmost" rule not respected - and asks for coverage against JLS. Your answer simply throws different JLS parts without making any sense (at least to me). This "leftmost" rule is widespread (hey, until reading this I would have said that T extends Runnable & AutoCloseable would be erased Runnable), but apparently not that correct. I really hoped that your answer would cover the details of this, but it does not IMHOLong
@Long I don't think that is what the OP ask, since it should be obvious that type erasure/leftmost rule is not deciding the vararg base class here. There are literally hundreds of steps involved (thus the jumping around) and I am not sure why you insist on this one. But yes the answer is a bit out of control, I'll try to find time to reorganise it. May take a while. Toddler is clinging on me, crying, refusing to eat…Underlaid
I was actually not informed about the leftmost rule when I asked. If @Long is correct and that rule applies to vararg arrays, and if leftmost means first in declaration order, then the compiler is violating the spec.Anis
@Underlaid off-the-record I've got an advice from the user that posted another answer here :first the kid, then SO, I use it to day. I would suggest the same thing...Long

This is a very interesting question. The relevant part of the specification is §15.12.4.2. Evaluate Arguments:

If the method being invoked is a variable arity method m, it necessarily has n > 0 formal parameters. The final formal parameter of m necessarily has type T[] for some T, and m is necessarily being invoked with k ≥ 0 actual argument expressions.

If m is being invoked with k ≠ n actual argument expressions, or, if m is being invoked with k = n actual argument expressions and the type of the k'th argument expression is not assignment compatible with T[], then the argument list (e1, ..., en-1, en, ..., ek) is evaluated as if it were written as (e1, ..., en-1, new |T[]| { en, ..., ek }), where |T[]| denotes the erasure (§4.6) of T[].
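To make the two branches of that rule concrete, here is a small sketch using an unbounded T, where the inferred type is unambiguous. The class VarargsArrayCreation and the method capture are hypothetical names introduced only for this illustration.

```java
final class VarargsArrayCreation {
    // Hypothetical helper: hands back the array that the caller passed,
    // or that the compiler created at the call site. Returning the varargs
    // array is done here only so it can be inspected.
    static <T> T[] capture(T... args) {
        return args;
    }

    public static void main(String[] unused) {
        String[] existing = {"a", "b"};

        // k = n and the argument is assignment compatible with T[]:
        // the array is passed through and no new array is created.
        System.out.println(capture(existing) == existing); // true

        // k != n: evaluated as if written capture(new String[] {"x", "y", "z"}).
        String[] created = capture("x", "y", "z");
        System.out.println(created.getClass().getComponentType().getSimpleName()); // String
    }
}
```

With a single unbounded type variable and arguments of one class, the array type is not in doubt. The open question in this thread only arises once T has several bounds and inference has to settle on an intersection.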

It’s interestingly vague about what “some T” actually is. The simplest and most straight-forward solution would be the declared parameter type of the invoked method; that would be assignment compatible and there is no actual advantage of using a different type. But, as we know, javac doesn’t go that route and uses some sort of common base type of all arguments or picks some of the bounds according to some unknown rule for the array’s element type. Nowadays you might even find some applications in the wild relying on this behavior, assuming to get some information about the actual T at runtime by inspecting the array type.

This leads to some interesting consequences:

static AutoCloseable[] ARR1;
static Serializable[]  ARR2;
static <T extends AutoCloseable & Serializable> void method(T... args) {
    ARR1 = args;
    ARR2 = args;
}
public static void main(String[] args) throws Exception {
    method(null, null);
    ARR2[0] = "foo";
    ARR1[0].close();
}

javac decides to create an array of the actual type Serializable[] here, even though the method’s parameter type is AutoCloseable[] after applying type erasure, which is why the assignment of a String is possible at runtime. So it will only fail at the last statement, when attempting to invoke the close() method on it, with

Exception in thread "main" java.lang.IncompatibleClassChangeError: Class java.lang.String does not implement the requested interface java.lang.AutoCloseable

It’s blaming the class String here, though we could have put any Serializable object into the array as the actual issue is that a static field of the formal declared type AutoCloseable[] refers to an object of the actual type Serializable[].

That we got this far at all is specific to the HotSpot JVM: its verifier does not check assignments when interface types are involved (including arrays of interface types), but defers the check of whether the actual class implements the interface to the last possible moment, when an interface method is actually invoked on it.

Interestingly, type casts are strict, when they appear in the class file:

static <T extends AutoCloseable & Serializable> void method(T... args) {
    AutoCloseable[] a = (AutoCloseable[])args; // actually removed by the compiler
    a = (AutoCloseable[])(Object)args; // fails at runtime
}
public static void main(String[] args) throws Exception {
    method();
}

While javac’s decision for Serializable[] in the above example seems arbitrary, it should be clear that regardless of which type it chooses, one of the field assignments would only be possible in a JVM with lax type checking. We could also highlight the more fundamental nature of the problem:

// erased to method1(AutoCloseable[])
static <T extends AutoCloseable & Serializable> void method1(T... args) {
    method2(args); // valid according to generic types
}
// erased to method2(Serializable[])
static <T extends Serializable & AutoCloseable> void method2(T... args) {
}
public static void main(String[] args) throws Exception {
    // whatever array type the compiler picks, it would violate one of the erased types
    method1();
}

While this doesn’t actually answer the question of which rule javac uses (besides that it uses “some T”), it emphasizes the importance of treating arrays created for a varargs parameter as intended: temporary storage (don’t assign it to fields) of an arbitrary type that you had better not care about.
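If code really does need a predictable component type, one option (a sketch that goes beyond what this answer proposes) is to take the choice away from inference, either with an explicit type argument or by passing an array yourself. The names ExplicitComponentType, Resource, and componentType are made up for this example.

```java
final class ExplicitComponentType {
    // Hypothetical class that satisfies both bounds.
    static final class Resource implements AutoCloseable, Runnable {
        @Override public void close() { }
        @Override public void run() { }
    }

    // Reports the component type of whatever array reaches the method.
    static <T extends AutoCloseable & Runnable> Class<?> componentType(T... args) {
        return args.getClass().getComponentType();
    }

    public static void main(String[] unused) {
        // Explicit type argument: T is Resource, so the call site creates a Resource[].
        System.out.println(ExplicitComponentType.<Resource>componentType().getSimpleName()); // Resource

        // Explicit array: passed through unchanged, so nothing is left to inference.
        System.out.println(componentType(new Resource[0]).getSimpleName()); // Resource
    }
}
```

Neither variant changes the erased parameter type of the method, so the caveat about not storing the array in fields still applies.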

Consolidation answered 18/5, 2018 at 13:42 Comment(0)
