Why doesn't this generic cast fail?

I'd expect this code to throw a ClassCastException:

public class Generics {
    public static void main(String[] args) {
        method(Integer.class);
    }

    public static <T> T method(Class<T> t) {
        return (T) new String();
    }
}

But it doesn't. Casting String to T doesn't fail, until I use the returned object somehow, like:

public class Generics {
    public static void main(String[] args) {
        method(Integer.class).intValue();
    }

    public static <T> T method(Class<T> t) {
        return (T) new String();
    }
}

Background: I created a class that uses JAXB to unmarshal an XML file. Its method looks like this:

public static <T> T unmarshal(File file, Class<? extends T> clazz)

Depending on whether the root element is an anonymous type or not, either a T or a JAXBElement is returned. A JAXBElement, of course, can't be cast to T. In my unit test, where I only called unmarshal() without doing anything with the result, everything worked fine. In the actual code, it failed.

Why doesn't it fail directly? Is this a bug? If not, I'd like to understand why.

Fanchie answered 19/11, 2010 at 12:41 Comment(1)
Well, it does produce a warning. - Melancholia
A
2

Basically, because of type erasure, Java will perform a type-check at the call-site whenever you make use of the fact that T is something specific.
But things aren't quite that simple, unfortunately.


The other answers are incorrect when they say T being Object is the reason you don't get a ClassCastException.

Let's test the theory and manually choose T to be Integer:

Generics.<Integer>method(Integer.class);

When I run this, it still doesn't fail.

Java does infer T to be Integer. If it didn't, method(Integer.class).intValue() would be a compile-time error, because chained calls do not inform type inference.


So what is going on?

Note that when it does fail, it never fails inside method; it always fails inside main.

Due to type erasure, method basically ends up without any generics information after compilation. The return type ends up being Object, the parameter type is the raw type Class and the cast inside the method is simply removed because a cast would be a no-op in the absence of any generics information.

You can see this when checking the bytecode of the callsite:

   0: ldc           #7   // class java/lang/Integer
   2: invokestatic  #9   // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
                                                               ^^^^^^^^^^^^^^^^
                                                               return type

When calling a method in the bytecode, the return type ends up as part of the method's "name", if you will.
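If you'd rather not read bytecode, the same erased signature can be observed with reflection. This is only a sketch: ErasureDemo and its helper names are made up for illustration, and method is copied from the question.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class ErasureDemo {
    @SuppressWarnings("unchecked")
    static <T> T method(Class<T> t) {
        return (T) new String(); // unchecked cast, same as in the question
    }

    // Hypothetical helper: looks up the method by its erased parameter type.
    private static Method lookup() {
        try {
            return ErasureDemo.class.getDeclaredMethod("method", Class.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds the JVM descriptor from the runtime (erased) types.
    static String erasedDescriptor() {
        Method m = lookup();
        return MethodType.methodType(m.getReturnType(), m.getParameterTypes())
                .toMethodDescriptorString();
    }

    // The generic signature survives only as metadata.
    static String genericReturnType() {
        return lookup().getGenericReturnType().getTypeName();
    }

    public static void main(String[] args) {
        System.out.println(erasedDescriptor());  // (Ljava/lang/Class;)Ljava/lang/Object;
        System.out.println(genericReturnType()); // T
    }
}
```

The descriptor is the same string that appears in the invokestatic comment above, while T is still visible through the reflection metadata.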

Exploring further with Compiler Explorer, we find that a modified main method produces the following bytecode [1] for its first four lines:

Main.<Integer>method(Integer.class);
   0: ldc           #7   // class java/lang/Integer
   2: invokestatic  #9   // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
   5: pop
Object o = Main.<Integer>method(Integer.class);
   6: ldc           #7   // class java/lang/Integer
   8: invokestatic  #9   // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
  11: astore_1
Main.<Integer>method(Integer.class).intValue();
  12: ldc           #7   // class java/lang/Integer
  14: invokestatic  #9   // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
  17: checkcast     #7   // class java/lang/Integer
  20: invokevirtual #15  // Method java/lang/Integer.intValue:()I
  23: pop
Integer i = Main.<Integer>method(Integer.class);
  24: ldc           #7   // class java/lang/Integer
  26: invokestatic  #9   // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
  29: checkcast     #7   // class java/lang/Integer
  32: astore_2

For each line, I have added the corresponding bytecode interspersed with the Java code.

Compare the bytecode for the different lines. Note how Java inserts a checkcast instruction after the call to method, i.e. after invokestatic, in the last two cases. This performs a type-check on the returned value, which is on top of the stack at that point. Since it's a String and it's being cast to Integer, you get a ClassCastException.

It does not do that for the first two lines which don't use the result.

This is why your code fails only when you actually use the result like you do.
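The four call sites can also be tried at runtime. The following is a rough sketch, assuming javac behaves as in the listing above (other compilers might not); CallSiteCheckDemo and the helper names are invented for this example.

```java
class CallSiteCheckDemo {
    @SuppressWarnings("unchecked")
    static <T> T method(Class<T> t) {
        return (T) new String(); // never fails here, the cast is erased
    }

    // Each helper mirrors one of the four lines from the listing.
    static void discard()        { CallSiteCheckDemo.<Integer>method(Integer.class); }
    static void storeAsObject()  { Object o = CallSiteCheckDemo.<Integer>method(Integer.class); }
    static void callIntValue()   { CallSiteCheckDemo.<Integer>method(Integer.class).intValue(); }
    static void storeAsInteger() { Integer i = CallSiteCheckDemo.<Integer>method(Integer.class); }

    // Hypothetical helper: reports whether the call site throws ClassCastException.
    static boolean throwsCce(Runnable callSite) {
        try {
            callSite.run();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("discard:        " + throwsCce(CallSiteCheckDemo::discard));
        System.out.println("storeAsObject:  " + throwsCce(CallSiteCheckDemo::storeAsObject));
        System.out.println("callIntValue:   " + throwsCce(CallSiteCheckDemo::callIntValue));
        System.out.println("storeAsInteger: " + throwsCce(CallSiteCheckDemo::storeAsInteger));
    }
}
```

With javac, the first two should report false and the last two true, matching where the checkcast instructions appear.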


I would have assumed that Java inserts this cast whenever you make use of the fact that T is Integer, verifying as best it can that method really did return a T so that it fails early.

Here's another example:

Main.<Integer>method(Integer.class).toString();
  33: ldc           #7   // class java/lang/Integer
  35: invokestatic  #9   // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
  38: checkcast     #7   // class java/lang/Integer
  41: invokevirtual #19  // Method java/lang/Integer.toString:()Ljava/lang/String;
  44: pop

The compiler knows that the .toString() call is being placed on something of type Integer, so it emits a virtual call directly to Integer's version of this method. Of course, the compiler then needs to ensure that the returned value (which could be anything at runtime) conforms to Integer, so it inserts another checkcast instruction.

However, even when using a class that doesn't override Object's toString, Java still inserts a checkcast:

Main.<Main>method(Main.class).toString();
  45: ldc           #6   // class Main
  47: invokestatic  #3   // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
  50: checkcast     #6   // class Main
  53: invokevirtual #7   // Method java/lang/Object.toString:()Ljava/lang/String;
  56: pop

Even though the call targets a method that exists on every object, effectively choosing Object as the static receiver type, Java still inserts a checkcast.

When we cast the returned value to Object by ourselves, however, Java does not add any checkcast whatsoever and the call can go through.
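A small sketch of that last case, again assuming javac and using invented names (ObjectCastDemo, direct, viaObject); the returned String is given a value here only so that there is something to print.

```java
class ObjectCastDemo {
    @SuppressWarnings("unchecked")
    static <T> T method(Class<T> t) {
        return (T) new String("hello"); // a String pretending to be a T
    }

    // Static receiver type is Integer, so javac guards the call with a checkcast.
    static String direct() {
        return ObjectCastDemo.<Integer>method(Integer.class).toString();
    }

    // Explicit cast to Object: no checkcast to Integer is emitted.
    static String viaObject() {
        return ((Object) ObjectCastDemo.<Integer>method(Integer.class)).toString();
    }

    public static void main(String[] args) {
        System.out.println(viaObject()); // goes through and prints the String's value
        try {
            direct();
        } catch (ClassCastException e) {
            System.out.println("direct() failed: " + e.getMessage());
        }
    }
}
```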


Let's back off a little and think about what we've been doing. We're not looking at Java per se, we've been looking at bytecode.

Java is defined by the Java Language Specification. I'd expect to find some kind of rule there that describes when this type check is done and when it isn't.

Unfortunately, I've been unable to find anything about these inserted type-checks in the spec.

Others have looked, too, several years after you stumbled across this.

If it is truly unspecified, whenever I said "Java does/doesn't insert a checkcast" above, I should probably have said "this particular compiler" instead of "Java" and what we've been looking at might technically just be an implementation detail (as of yet).

[1] Running some variant of JDK 17.0.0

Alluring answered 8/3, 2022 at 20:33 Comment(0)
B
3

If T is not explicitly specified, type erasure will treat it as Object. Therefore, your String object can be cast...

Bartram answered 19/11, 2010 at 12:51 Comment(0)
G
3

You didn't specify T explicitly, so T is Object.

So the lookup looks like this:

public class Generics {

    public static void main(String[] args) {
        Generics.method(Integer.class).intValue();
    }

    public static Object method(Class<Object> t) {
        return (Object) new String();
    }
}

If you specify the generic parameter:

public class Generics {

    public static void main(String[] args) {
        Generics.<Integer>method(Integer.class).intValue();
    }

    public static <T> T method(Class<T> t) {
        return (T) new String();
    }
}

You will get that exception.

Groceryman answered 19/11, 2010 at 12:55 Comment(0)
C
1

I think you can make the method definition stricter, like this:

public static <T extends Number> T method(Class<T> t) {
    return //// some code.
}

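In this case, a line such as return (T) new String(); simply does not compile, because a String can never be a Number.

A line such as return (T) new Integer(123); does compile and works when T is Integer, but it still needs the (unchecked) cast: a plain return new Integer(123); is rejected, because the compiler cannot prove that T is Integer.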
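A bound only narrows what compiles. Since the method already receives a Class<T>, a check at runtime inside the method is also possible via Class.cast. The following is just a sketch under that assumption: CheckedCastDemo and produce() are hypothetical stand-ins, not the asker's JAXB code.

```java
class CheckedCastDemo {
    // Hypothetical stand-in for a value whose runtime type is only known late.
    static Object produce() {
        return "not a number";
    }

    static <T> T method(Class<T> t) {
        // Class.cast checks against the real Class object, so a mismatch
        // throws ClassCastException here instead of at the call site.
        return t.cast(produce());
    }

    // Reports whether the call fails even though the result is discarded.
    static boolean failsInsideMethod() {
        try {
            method(Integer.class);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsInsideMethod()); // true
        System.out.println(method(String.class)); // not a number
    }
}
```

No unchecked warning is produced either, because no cast to a type variable is involved.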

Cloister answered 19/11, 2010 at 13:44 Comment(0)
