When is the generic return value of a function cast after type erasure?

This question was prompted by this StackOverflow question about unsafe casts: Java Casting method without knowing what to cast to. While answering it, I encountered behaviour that I couldn't explain based purely on the specification.

I found the following statement in The Java Tutorials in the Oracle docs:

Insert type casts if necessary to preserve type safety.

It is not explained what "if necessary" means exactly, and I found no mention of these casts in the Java Language Specification at all, so I started to experiment.

Let's look at the following piece of code:

// Java source
public static <T> T identity(T x) {
    return x;
}
public static void main(String args[]) {
    String a = identity("foo");
    System.out.println(a.getClass().getName());
    // Prints 'java.lang.String'

    Object b = identity("foo");
    System.out.println(b.getClass().getName());
    // Prints 'java.lang.String'
}

Compiled with javac and decompiled with the Java Decompiler:

// Decompiled code
public static void main(String[] paramArrayOfString)
{
    // The compiler inserted a cast to String to ensure type safety
    String str = (String)identity("foo");
    System.out.println(str.getClass().getName());

    // The compiler omitted the cast, as it is not needed
    // in terms of runtime type safety, but it actually could
    // do an additional check. Is it some kind of optimization
    // to decrease overhead? Where is this behaviour specified?
    Object localObject1 = identity("foo");
    System.out.println(localObject1.getClass().getName());
}

I can see that there is a cast which ensures type safety in the first case, but in the second case it is omitted. That is fine, of course, because I want to store the return value in an Object-typed variable, so the cast is not strictly necessary for type safety. However, it leads to interesting behaviour with unsafe casts:

public class Erasure {
    public static <T> T unsafeIdentity(Object x) {
        return (T) x;
    }

    public static void main(String args[]) {
        // I would expect c to be either an Integer after this
        // call, or a ClassCastException to be thrown when the
        // return value is not Integer
        Object c = Erasure.<Integer>unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
        // but Prints 'java.lang.String'
    }
}

Compiled and decompiled, I see no type cast to ensure correct return type at runtime:

// The type of the return value of unsafeIdentity is not checked,
// just as in the second example.
Object localObject2 = unsafeIdentity("foo");
System.out.println(localObject2.getClass().getName());

This means that if a generic function should return an object of a given type, it is not guaranteed it will return that type ultimately. An application using the above code will fail at the first point where it tries to cast the return value to an Integer if it does so at all, so I feel like it breaks the fail-fast principle.
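To make the delayed failure concrete, here is a minimal sketch. The class name DelayedFailureDemo and the helper whereItFails are made up for illustration; only unsafeIdentity comes from the code above. Assuming javac behaves as in the decompiled output above, the String is accepted into a List<Integer> without complaint, and the ClassCastException only surfaces when an element is read back as an Integer:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only unsafeIdentity mirrors the question's code.
class DelayedFailureDemo {
    @SuppressWarnings("unchecked")
    static <T> T unsafeIdentity(Object x) {
        return (T) x; // unchecked cast: erased to a plain return
    }

    // Reports where the ClassCastException surfaces: "write", "read" or "never".
    static String whereItFails() {
        List<Integer> numbers = new ArrayList<>();
        try {
            // add(E) erases to add(Object), so the erased call compiles
            // without any Integer check on the argument.
            numbers.add(DelayedFailureDemo.<Integer>unsafeIdentity("foo"));
        } catch (ClassCastException e) {
            return "write";
        }
        try {
            // get() erases to Object; unboxing to int requires a real Integer.
            int first = numbers.get(0);
            return "never";
        } catch (ClassCastException e) {
            return "read";
        }
    }

    public static void main(String[] args) {
        System.out.println(whereItFails());
    }
}
```

The point where the exception appears can be arbitrarily far from the call that introduced the wrong value, which is what makes this feel like a violation of fail-fast.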

What are the exact rules of the compiler inserting this cast during compilation that ensures type safety and where are those rules specified?

EDIT:

I see that the compiler will not dig into the code and try to prove that the generic code really returns what it should, but it could insert an assertion, or at least a type cast (which it already does in specific cases, as seen in the first example), to ensure the correct return type, so that the latter example would throw a ClassCastException:

// It could compile to this, throwing ClassCastException:
Object localObject2 = (Integer)unsafeIdentity("foo");
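Until then, a fail-fast variant can be written by hand if the caller passes a Class<T> token. This is only a sketch of one possible workaround, not something the compiler does on its own; checkedIdentity and CheckedIdentityDemo are hypothetical names. It relies on Class.cast, which checks the runtime type and throws a ClassCastException on mismatch:

```java
// Hypothetical workaround: carry the type to run time explicitly.
class CheckedIdentityDemo {
    static <T> T checkedIdentity(Class<T> type, Object x) {
        // Class.cast performs the runtime check that erasure removed
        // (null passes the check, as with an ordinary cast).
        return type.cast(x);
    }

    // Returns true if the mismatch is detected at the call itself,
    // even though the result is only stored in an Object variable.
    static boolean rejectsString() {
        try {
            Object c = checkedIdentity(Integer.class, "foo");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object ok = checkedIdentity(Integer.class, 42);
        System.out.println(ok.getClass().getName()); // java.lang.Integer
        System.out.println(rejectsString());         // true
    }
}
```

The cost is an extra parameter at every call site, which is presumably part of why this is not the default.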
Gusty answered 2/1, 2016 at 3:54 Comment(10)
I'm not an expert on this topic, but I don't think the compiler can do any checking in this case because (1) when it sees the line return (T) x;, it has no way to know statically that x can't be converted to T; and (2) when you actually call unsafeIdentity, the compiler can't know that this will fail because it will not delve into the code of the method and look for statements that will fail. Basically, I think this means that the cast to (T) in the method is useless. - Paulo
Thanks @ajb, of course that cast to (T) is useless, it is really a minimalist example. But it could easily compile the outer function to Object o = (Integer)unsafeIdentity("foo");, and that would throw a ClassCastException or am I missing something? - Gusty
I don't think that the compiler will / must ever insert an assertion unless you code it so why should it do it here? But the remainder of the question is interesting, +1. - Chimene
OK, I see--the method is declared as returning a T, so I can see how the compiler might be able to add this check without reading the code of the method. But that would add unnecessary overhead in the vast majority of cases, including lots of Collections classes where, say, a get() method returns a generic type. That's probably an unacceptable tradeoff. - Paulo
Did you check if the compiler creates a bridge method? - Ursine
I don't fully understand the question. Why should it insert a cast? It clearly warns about an unchecked cast when compiling the unsafeIdentity method, and from that on, there are no guarantees about the type anyhow. However, I think that hg.openjdk.java.net/jdk8/jdk8/langtools/file/756ae3791c45/src/… might be relevant here, as it clearly says that it simply does not insert the cast when it is not necessary (and in fact, this may even be the answer to your question - but I'm not sure) - Bakeman
"An application using the above code will fail at the first point where it tries to cast the return value to an Integer if it does so at all, so I feel like it breaks the fail-fast principle." When a cast fails at runtime, it always fails only when it actually tries the cast. When they intend to detect it early, they do so during compile-time. - Rabassa
@Bakeman I'm not saying it should, I just say it could. The piece of source is promising, I have never dug into the OpenJDK source until now. So you say that this piece of behaviour is not specified and is implementation dependent? - Gusty
@Rabassa Exactly! That's why I think the compiler could enforce an implicit cast on generic return values. - Gusty
I added a reference to another stack overflow question, which made me think about this topic originally - Gusty

If you can't find it in the specification, that means it's not specified, and it is up to the compiler implementation to decide whether and where to insert casts, as long as the erased code meets the type safety rules of non-generic code.

In this case, the compiler's erased code looks like this:

public static Object identity(Object x) {
    return x;
}
public static void main(String args[]) {
    String a = (String)identity("foo");
    System.out.println(a.getClass().getName());

    Object b = identity("foo");
    System.out.println(b.getClass().getName());
}

In the first case, the cast is necessary in the erased code, because if you removed it, the erased code wouldn't compile. This is because Java guarantees that what is held at runtime in a reference variable of reifiable type must be an instance of that reifiable type, so a runtime check is necessary here.

In the second case, the erased code compiles without a cast. It would also compile if you added a cast, so the compiler can decide either way. In this case, the compiler decided not to insert a cast. That is a perfectly valid choice. You should not rely on the compiler to decide either way.
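A rough way to see this rule at work is to try the same unsafe call in different contexts. In the sketch below (UseSiteCastDemo and its helper methods are hypothetical names), the first two contexts need an Integer for the erased code to compile, so a check has to be there. The third reflects what javac happens to do today and, as said above, should not be relied upon:

```java
// Hypothetical demo: same call, three different use sites.
class UseSiteCastDemo {
    @SuppressWarnings("unchecked")
    static <T> T unsafeIdentity(Object x) {
        return (T) x;
    }

    // Target type Integer: the erased code would not compile without a cast.
    static boolean assignToIntegerThrows() {
        try {
            Integer i = UseSiteCastDemo.<Integer>unsafeIdentity("foo");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Member access: intValue() is not a member of Object, so a cast is required.
    static boolean callIntegerMethodThrows() {
        try {
            UseSiteCastDemo.<Integer>unsafeIdentity("foo").intValue();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Target type Object: the erased code compiles as is; javac adds no cast.
    static boolean assignToObjectThrows() {
        try {
            Object o = UseSiteCastDemo.<Integer>unsafeIdentity("foo");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(assignToIntegerThrows());
        System.out.println(callIntegerMethodThrows());
        System.out.println(assignToObjectThrows());
    }
}
```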

Mania answered 2/1, 2016 at 20:33 Comment(1)
Thank you very much, this clearly answers my question. I hoped somebody could find something relevant in the spec that I couldn't, but it looks like the only mentions of this are in the Java Tutorials article about type erasure, a mention in JLS 15.5 (found by @HopefullyHelpful), and the OpenJDK source (found by @Marco13). - Gusty

Version 1 is preferable because it fails at compile time.

Typesafe version 1 non-legacy code:

class Erasure {
    public static <T> T unsafeIdentity(T x) {
        // No cast necessary, type checked in the parameters at compile time
        return x;
    }

    public static void main(String args[]) {
        // This will fail at compile time and you should use Integer c = ... in real code
        Object c = Erasure.<Integer>unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
    }
}

Typesafe version 2 legacy code (A run-time type error [...] In an automatically generated cast introduced to ensure the validity of an operation on a non-reifiable type and reference type casting):

class Erasure {
    public static <T> T unsafeIdentity(Object x) {
        return (T) x;
        // Compiled version: return (Object) x;
        // Optimised version: return x;
    }

    public static void main(String args[]) {
        // This fails on assignment: the erased method returns Object but an Integer is
        // expected, so the compiler inserts a cast that throws a ClassCastException:
        Integer c = Erasure.<Integer>unsafeIdentity("foo");
        // Compiled version: Integer c = (Integer)Erasure.unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
    }
}

Typesafe version 3 legacy code, for methods where you always know a supertype (JLS: The erasure of a type variable (§4.4) is the erasure of its leftmost bound.):

class Erasure {
    public static <T extends Integer> T unsafeIdentity(Object x) {
        // This fails at run time: T erases to its bound Integer, and "foo" is not an Integer
        return (T) x;
        // Compiled version: return (Integer) x;
    }

    public static void main(String args[]) {
        // You should use Integer c = ...
        Object c = Erasure.<Integer>unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
    }
}

Object was only used to illustrate that Object is a valid assignment target in versions 1 and 3, but you should use the real type or the generic type if possible.
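To illustrate how the bound changes what is checked, here is a small sketch. It uses T extends Number instead of T extends Integer purely for illustration, and BoundedErasureDemo and its methods are hypothetical names. Since the cast erases to the leftmost bound, a String is rejected inside the method, but any Number passes, even when the caller asked for Integer:

```java
// Hypothetical demo: the cast to T is only as precise as T's bound.
class BoundedErasureDemo {
    // Erases to: return (Number) x;
    @SuppressWarnings("unchecked")
    static <T extends Number> T bounded(Object x) {
        return (T) x;
    }

    // A String fails the erased (Number) cast inside bounded(),
    // even though the result is only assigned to Object.
    static boolean rejectsString() {
        try {
            Object o = BoundedErasureDemo.<Integer>bounded("foo");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // A Double passes the (Number) check although Integer was requested.
    static boolean acceptsDouble() {
        Object o = BoundedErasureDemo.<Integer>bounded(3.5);
        return o instanceof Double;
    }

    public static void main(String[] args) {
        System.out.println(rejectsString()); // true
        System.out.println(acceptsDouble()); // true
    }
}
```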

If you use another version of Java, you should look at the corresponding pages of the specification; I don't expect any changes.

Shupe answered 2/1, 2016 at 5:2 Comment(3)
All that you write is in fact true, but it still doesn't answer the question - Gusty
"What are the exact rules of the compiler inserting this cast during compilation that ensures type safety and where are those rules specified?" I expect to see either a reference to the corresponding JLS paragraph if any, or some educated guess deduced from decompilation of multiple (occasionally edge-case) scenarios - Gusty
I added references, I only found an indirect reference to the automatic introduction, but after type erasure the type of the right-hand statement in version 2 is Object, due to type erasure. This is also stated in both generics Tutorials. - Shupe

I can't explain it very well, but a comment can't hold code as well as I'd like, so I'm adding this answer. I hope it helps your understanding.

In your code:

public class Erasure {
    public static <T> T unsafeIdentity(Object x) {
        return (T) x;
    }

    public static void main(String args[]) {
        // I would expect it to fail:
        Object c = Erasure.<Integer>unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
        // but Prints 'java.lang.String'
    }
}

Generics are erased after compile time. At compile time, Erasure.unsafeIdentity has no errors. The JVM erases generics depending on the generic parameters you give (Integer). After that, is the function like this?

public static Integer unsafeIdentity(Object x) {
    return x;
}

In fact, covariant returns add bridge methods:

public static Object unsafeIdentity(Object x) {
    return x;
}

If the function looked like the last one, do you think the code in your main method would fail to compile? It has no errors. Generic erasure does not add a cast in this function, and the return type is not part of a Java method's identity.

My explanation is a bit far-fetched, but I hope it helps you understand.

Edit:

After googling the topic, I guess your problem is covariant return types using bridge methods. BridgeMethods
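For comparison, bridge methods can be observed with reflection via Method.isBridge(). In the sketch below (Box, StringBox and countBridges are hypothetical names), a bridge shows up when a subclass overrides an inherited generic method with a more specific return type. A static generic method like unsafeIdentity produces none, which suggests that bridge methods are a separate mechanism from the call-site casts discussed in the question:

```java
import java.lang.reflect.Method;

// Hypothetical demo: where bridge methods do and do not appear.
class BridgeMethodDemo {
    static class Box<T> {
        T get() { return null; }
    }

    // Overrides get() with return type String; the compiler adds a
    // synthetic bridge "Object get()" that delegates to it.
    static class StringBox extends Box<String> {
        @Override
        String get() { return "foo"; }
    }

    static class Erasure {
        @SuppressWarnings("unchecked")
        static <T> T unsafeIdentity(Object x) { return (T) x; }
    }

    // Counts the bridge methods declared directly in the given class.
    static int countBridges(Class<?> type) {
        int n = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.isBridge()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countBridges(StringBox.class)); // 1
        System.out.println(countBridges(Erasure.class));   // 0
    }
}
```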

Greenock answered 2/1, 2016 at 6:56 Comment(0)
