Strange "!*" entry in LocalVariableTypeTable when compiling with Eclipse compiler

Let's compile the following code with the ECJ compiler from the Eclipse Mars.2 bundle:

import java.util.stream.*;

public class Test {
    String test(Stream<?> s) {
        return s.collect(Collector.of(() -> "", (a, t) -> {}, (a1, a2) -> a1));
    }
}

The compilation command is the following:

$ java -jar org.eclipse.jdt.core_3.11.2.v20160128-0629.jar -8 -g Test.java

After successful compilation, let's inspect the resulting class file with javap -v -p Test.class. The most interesting part is the synthetic method generated for the (a, t) -> {} lambda:

  private static void lambda$1(java.lang.String, java.lang.Object);
    descriptor: (Ljava/lang/String;Ljava/lang/Object;)V
    flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
    Code:
      stack=0, locals=2, args_size=2
         0: return
      LineNumberTable:
        line 5: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       1     0     a   Ljava/lang/String;
            0       1     1     t   Ljava/lang/Object;
      LocalVariableTypeTable:
        Start  Length  Slot  Name   Signature
            0       1     1     t   !*

I was quite surprised to see this !* entry in the LocalVariableTypeTable. The JVM specification covers the LocalVariableTypeTable attribute and says:

The constant_pool entry at that index must contain a CONSTANT_Utf8_info structure (§4.4.7) representing a field signature which encodes the type of a local variable in the source program (§4.7.9.1).

§4.7.9.1 defines a grammar for field signatures which, if I understand correctly, does not cover anything similar to !*.
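To make the grammar argument concrete, here is a minimal sketch of a recursive-descent check for field signatures, written from my reading of the JVMS §4.7.9.1 grammar (ClassTypeSignature, TypeVariableSignature, ArrayTypeSignature). The class name FieldSignatureCheck and its methods are hypothetical, and the Identifier rule is simplified to "any non-empty run of characters other than . ; [ / < > :", so treat it as an illustration rather than a reference implementation.

```java
// Hypothetical helper: checks a string against the field-signature grammar
// (ReferenceTypeSignature) of JVMS 4.7.9.1, as understood by the author of this sketch.
final class FieldSignatureCheck {
    private final String s;
    private int pos;

    private FieldSignatureCheck(String s) { this.s = s; }

    static boolean isValid(String signature) {
        FieldSignatureCheck p = new FieldSignatureCheck(signature);
        return p.referenceType() && p.pos == signature.length();
    }

    // Consumes the next character if it equals c.
    private boolean eat(char c) {
        if (pos < s.length() && s.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    // Simplified Identifier: one or more characters that are not grammar delimiters.
    private boolean identifier() {
        int start = pos;
        while (pos < s.length() && ".;[/<>:".indexOf(s.charAt(pos)) < 0) pos++;
        return pos > start;
    }

    private boolean referenceType() {
        if (eat('L')) return classTypeRest();            // ClassTypeSignature
        if (eat('T')) return identifier() && eat(';');   // TypeVariableSignature
        if (eat('[')) return javaType();                 // ArrayTypeSignature
        return false;
    }

    private boolean javaType() {
        if (pos < s.length() && "BCDFIJSZ".indexOf(s.charAt(pos)) >= 0) { pos++; return true; }
        return referenceType();
    }

    private boolean classTypeRest() {
        // Package segments and the simple class name are identifiers separated by '/'.
        do {
            if (!identifier()) return false;
        } while (eat('/'));
        if (!optionalTypeArguments()) return false;
        while (eat('.')) {                               // inner class suffixes
            if (!identifier() || !optionalTypeArguments()) return false;
        }
        return eat(';');
    }

    private boolean optionalTypeArguments() {
        if (!eat('<')) return true;
        do {
            if (!typeArgument()) return false;
        } while (pos < s.length() && s.charAt(pos) != '>');
        return eat('>');
    }

    private boolean typeArgument() {
        if (eat('*')) return true;                       // unbounded wildcard
        if (!eat('+')) eat('-');                         // optional WildcardIndicator
        return referenceType();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Ljava/util/stream/Stream<*>;")); // true
        System.out.println(isValid("!*"));                           // false
    }
}
```

Under these assumptions no production of the grammar can start with !, which is why !* is rejected before the wildcard is even looked at.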

It should also be noted that neither the javac compiler nor older ECJ 3.10.x versions generate this LocalVariableTypeTable entry. Is !* some non-standard Eclipse extension, or am I missing something in the JVM spec? Does this mean that ECJ does not conform to the JVM spec? What does !* actually mean, and are there any other similar strings which could appear in the LocalVariableTypeTable attribute?

Department answered 17/5, 2016 at 6:9 Comment(1)
It might be linked to those bugs #429264 and #425183.Mungo

The token ! is used by ecj to encode a capture type in generic signatures. Hence !* signifies a capture of an unbounded wildcard.
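As a rough illustration, a tool that wants to display such entries could decode them as sketched below. This assumes the part after ! follows the ordinary wildcard notation of type arguments (* for unbounded, + for an upper bound, - for a lower bound); that is an assumption about ecj's output and not something any specification guarantees. The class and method names are made up for this sketch.

```java
// Hypothetical decoder for ecj's non-standard capture token.
final class EcjCaptureToken {
    /** Returns a readable description, or the input unchanged if it is not a capture token. */
    static String describe(String signature) {
        if (signature.length() < 2 || signature.charAt(0) != '!') {
            return signature; // not an ecj capture token; leave untouched
        }
        String bound = signature.substring(2); // bound signature, empty for "!*"
        switch (signature.charAt(1)) {
            case '*': return "capture of ?";
            case '+': return "capture of ? extends " + bound;
            case '-': return "capture of ? super " + bound;
            default:  return signature; // unknown shape; do not guess
        }
    }
}
```

With this sketch, describe("!*") yields "capture of ?", and a bounded form such as "!+Ljava/lang/Number;" yields "capture of ? extends Ljava/lang/Number;".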

Internally, ecj uses two flavours of CaptureBinding: one to implement what JLS 18.4 calls "fresh type variables", the other to implement captures a la JLS 5.1.10 (which uses the same lingo of "fresh type variables"). Both produce a signature using !. At a closer look, in this example we have an "old-style" capture: t has type capture#1-of ?, capturing the <T> in Stream<T>.

The problem is that JVMS 4.7.9.1 doesn't seem to define an encoding for such fresh type variables (which, among other properties, have no correspondence in source code and hence no name).

I couldn't get javac to emit any LocalVariableTypeTable for the lambda, so they might simply avoid answering this question.

Given that both compilers agree on inferring t to a capture, why does one compiler generate an LVTT where the other does not? JVMS 4.7.14 has this to say:

This difference is only significant for variables whose type uses a type variable or parameterized type.

According to JLS, captures are fresh type variables, so an LVTT entry is significant, and it is an omission in JVMS not to specify a format for this type.

Consequences

The above only describes and explains the status quo, demonstrating that no specification tells a compiler to behave differently from the current status. Obviously, this is not an entirely desirable situation.

  1. Someone may want to contact Oracle, mentioning that Java 8 introduces a situation that is not covered by parts of the JVMS. This situation may become even more relevant once local variables also become subject to type inference.
  2. Anybody observing negative impact of the current situation is invited to chime in on RFE 494198 (ecj), which otherwise has low priority.

Update: Meanwhile, someone has reported an example where a regular Signature attribute (which cannot be opportunistically omitted) is required to encode a type which cannot be encoded according to JVMS. In that case javac, too, creates unspecified byte code. According to a follow-up, no variable should ever have such a type, but I don't think that this discussion is over yet (and admittedly JLS doesn't yet ensure this goal).

Update 2: After receiving advice from a spec author I see three parts to the ultimate solution:

(1) Every type signature in any bytecode attribute must adhere to the grammar in JVMS 4.7.9.1. Neither ecj's ! nor javac's <captured wildcard> is legal.

(2) Compilers should approximate type signatures where no legal encoding exists, e.g., by using the erasure instead of a capture. For an LVTT entry, such approximation should be considered as legitimate.

(3) JLS must ensure that only types encodable using JVMS 4.7.9.1 appear in positions where generating a Signature attribute is mandatory.

For future versions of ecj, items (1) and (2) have been resolved. I cannot speak about schedules for when javac and JLS will be fixed accordingly.
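The approximation described in item (2) can also be applied defensively on the reading side, for class files produced by compiler versions that still emit the capture token. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, the descriptor is taken from the matching LocalVariableTable entry, and the test for an unencodable signature is simply "contains !", which relies on ! never occurring in a signature derived from a legal Java source name.

```java
// Hypothetical reader-side fallback: prefer the erased descriptor when the
// generic signature cannot be expressed in the JVMS 4.7.9.1 grammar.
final class LvttFallback {
    /**
     * @param descriptor    erased type from the LocalVariableTable, e.g. "Ljava/lang/Object;"
     * @param lvttSignature generic signature from the LocalVariableTypeTable, or null if absent
     * @return the signature a tool should work with
     */
    static String effectiveSignature(String descriptor, String lvttSignature) {
        // Assumption: '!' only appears in ecj's non-standard capture encoding.
        if (lvttSignature == null || lvttSignature.indexOf('!') >= 0) {
            return descriptor; // approximate by the erasure
        }
        return lvttSignature;
    }
}
```

For the lambda parameter t from the question, effectiveSignature("Ljava/lang/Object;", "!*") falls back to "Ljava/lang/Object;", which matches what a reader sees for javac output that carries no LVTT entry at all.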

Wine answered 19/5, 2016 at 21:52 Comment(21)
Thank you for the answer. Why does ecj create a LocalVariableTypeTable for lambdas? It would probably be better to skip it, as javac does. Is it useful for something?Department
All we need to get a lambda with a LocalVariableTypeTable is to use a parameterized type in the lambda's body and, of course, compile with javac -g:vars. Here is an example: gist.github.com/Maccimo/c881bb71f1e9d14853de3a0e8a5ab077Balderas
@TagirValeev, ecj creates a LocalVariableTypeTable for every method in the byte code that contains at least one local variable with a "generic" type (parameterized or type variable). At the bytecode level the lambda method is a regular (synthetic) method, no reason to exclude this from the spec'd behaviour. Without the LVTT argument t appears to be of type Object, which isn't the full answer, anybody reading the bytecode who is aware of generics needs the LVTT for the full answer.Wine
These “fresh type variables” help to verify the correctness of the generic invocation of the Collector.of method, but I don’t see any reason why they should appear in a local variable table. The parameter t is not generic, it’s just Object, as that’s the only type that is valid in that context. It’s worth noting that t is a parameter of the synthetic method, so if t were something other than plain Object, ECJ would have had to provide a generic method signature describing the generic parameter, but it didn’t.Nub
On closer inspection, we have a regular capture here, not a fresh type variable from 18.4., but the observable effect is the same. How to represent captures in LVTT is not specified in JVMS.Wine
Capture enters the picture right at the start because the expression s has type Stream<capture#1-of ?>. This capture percolates all the way into inference of the middle lambda, to finally become the type of t. @Holger, can you show, why this inference result would be wrong, why instead Object should be inferred? For me the difference between compilers very much sounds like bugs.openjdk.java.net/browse/JDK-8016207Wine
Interestingly, during type checking even javac knows the correct type of t: add t = new Object(); into the middle lambda, and javac will correctly complain: "Object cannot be converted to CAP#1".Wine
See JLS§15.27.3: “If T is a wildcard-parameterized functional interface type and the lambda expression is implicitly typed, then the ground target type is the non-wildcard parameterization (§9.9) of T” and at the end of §9.9: “Sometimes, it is possible to know from the context, such as the parameter types of a lambda expression, which function type is intended (§15.27.3). Other times, it is necessary to pick one; in these circumstances, the bounds are used.”Nub
In general, variables have a real type, not something like CAP#1 that exists only within the compiler. So it’s not surprising that the JVMS has no way to encode such non-type things. There is no other purpose of the local variable tables than debugging anyway, so what would be the point of telling a debugger that a variable’s type is CAP#1 rather than ? or just Object?Nub
@Holger, sure ecj implements and applies JLS 15.27.3 where appropriate. However, when resolving the lambda, the target type BiConsumer is parameterized with a capture not a wildcard. Please see that also javac resolves t to CAP#1 as shown in another comment.Wine
@Holger: your view about real-typed local variables only holds for explicitly typed variables. Lambda arguments (and in the future inferred local variables) can have any type that is a possible result of inference.Wine
Since discussing the formal specification really exceeds the scope of SO, let’s end this by focusing on the one practical question: since the LocalVariableTypeTable merely exists for debugging purposes: what will the Eclipse debugger show, when it encounters a !* in said table? Will it be in any way more useful than what it will show when encountering an equivalent javac compiled lambda expression not even having that table?Nub
I cannot answer the "why" of this LVTT entry, this decision was basically made 11 years ago (and only incidentally surfaces now via lambdas). NB: While JVMS only mentions debugging, nowadays there are plenty more tools that read byte code. I can only answer the original question: "Does this mean that ECJ does not conform to JVM spec?" by saying: I don't see any violation of JLS nor JVMS.Wine
@StephanHerrmann ECJ's !* does not comply with JVMS 4.7.9.1, does it?Balderas
@user882813, the ! token fills a gap in the spec. So it neither fulfils nor violates the spec. I could even argue that the compiler would still be correct if it crashes in this situation. Answering ! is done to avoid that crash, though. I could only see one reason for changing this behavior: if it breaks downstream tools that consume the LVTT. Does it?Wine
@StephanHerrmann The only way to fill a gap in the spec is to issue a revised version of the spec. JVMS 4.7.9.1 defines a grammar that a signature should comply with, and !* violates that grammar. So ECJ violates JVMS by emitting !* in the LVTT. BTW, there are other examples where ECJ's behavior differs from javac's in emitting the LVTT. Try to compile the source from the gist I posted above: javac will generate an LVTT and ECJ will not generate one at all.Balderas
To conform to the spec, ecj would use the rule TypeVariableSignature to emit T Identifier ; and crash because Identifier is null. Better? Tell me: is the current behavior causing any harm?Wine
@StephanHerrmann If ECJ is unable to figure out what to emit, then it should not emit anything, since the LVTT is not a mandatory attribute. And, as I stated before, ECJ in fact doesn't emit an LVTT entry for a local variable of type List<String> while javac does. So it's not a problem for ECJ to avoid generating meaningless garbage.Balderas
@Balderas thanks for the additional test case. Here it turns out the LVTT entry is just "optimized" out, since the local variable is unused. If the variable is used, the LVTT entry is correctly generated. You can follow progress on this issue via bugs.eclipse.org/494225Wine
@StephanHerrmann well, it causes harm for me: the procyon compiler tools library dies when it tries to parse such an entry. I might convince its author to ignore this particular string, but it would be better if I could refer to some specification or whatever...Department
I also see the syntax !+. This currently chokes bcel.Loggins
