Is there a Java bytecode optimizer that removes useless gotos?
Problem: I have a method that compiles to over 8000 bytes of Java bytecode. HotSpot has a magic limit that makes the JIT not kick in for methods that exceed 8000 bytes. (Yes, it is reasonable to have a huge method. This is a tokenizer loop.) The method is in a library and I don't want to require users of the library to have to configure HotSpot to deactivate the magic limit.

Observation: Decompiling the bytecode shows that Eclipse Java Compiler generates a lot of pointless gotos. (javac is even worse.) That is, there are gotos that are only reachable from jumps. Obviously, the jump that jumps to the goto should instead jump directly where the goto jumps and the goto should be eliminated.

Question: Is there a bytecode optimizer for Java 5 class files that flattens pointless jump chains and then removes unnecessary gotos?

Edit: I mean patterns like:

8698:   goto    8548
8701:   goto    0

Obviously, the second goto can only be reached by a jump to 8701 which might as well be a direct jump to 0.
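The rewrite such an optimizer would perform is simple enough to sketch on a toy model. The array encoding below is purely illustrative (not real bytecode): each goto is redirected to the end of its chain, with a hop counter guarding against cyclic chains.

```java
// Toy model of goto-chain flattening (illustrative only, not real bytecode):
// target[i] >= 0 means instruction i is "goto target[i]"; -1 means a normal instruction.
public class GotoChains {
    // Follow a chain of gotos from `start` to its final destination,
    // bailing out if the chain turns out to be cyclic.
    static int resolve(int[] target, int start) {
        int t = start;
        for (int hops = 0; hops <= target.length; hops++) {
            if (target[t] < 0) {
                return t; // landed on a real instruction
            }
            t = target[t];
        }
        return start; // cyclic chain: leave it alone
    }

    // Rewrite every goto so it points directly at its final destination.
    static void flatten(int[] target) {
        for (int i = 0; i < target.length; i++) {
            if (target[i] >= 0) {
                target[i] = resolve(target, target[i]);
            }
        }
    }
}
```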

On closer investigation, this questionable pattern is more common:

4257:   if_icmpne   4263
4260:   goto    8704
4263:   aload_0

Here, obviously, one would like the compiler to reverse the "not equal" comparison into an "equal" comparison that jumps directly to 8704, eliminating the goto.
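After that transformation, the sequence would read (offsets are hypothetical, since removing the goto shifts all subsequent code):

```
4257:   if_icmpeq   8704
4260:   aload_0
```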

Hazaki answered 3/6, 2009 at 12:37 Comment(5)
Some architectures have a limit on how far a relative branch can go (because they hold the offset in an 8- or 16-bit field), so they often got around that with a relative branch to a non-relative branch that used the full program counter size. Is the JVM like that?Analytic
Do you mean LABELS only reachable from jumps?Shrub
A runtime annotation to give a hint to the JVM sure would seem to be nice in this case....but I don't think such a thing exists (and a quick google doesn't turn up anything.)Skinnydip
Interesting question. Why does the JIT not kick in for methods that exceed a magical limit? Is there a good engineering reason for it?Aglow
I guess I was wrong about the relative jumps - I just looked, and if_icmpne and goto both take 16-bit signed offsets.Analytic
I feel your pain. I once had to write a parser with around 5kloc of if(str.equals(...)) code. I broke it into several methods along the lines of parse1, parse2, etc. If parse1 didn't produce a parsed result, parse2 was called, and so on. This isn't necessarily best practice, but it does do what you need it to.
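A minimal sketch of that cascade (all names and token kinds here are hypothetical):

```java
// Sketch of the parse1/parse2/... cascade described above; names are hypothetical.
// Each stage handles one slice of the grammar and returns null on no match,
// so control falls through to the next stage.
public class Cascade {
    static String parse(String input) {
        String result = parse1(input);
        if (result == null) result = parse2(input);
        if (result == null) result = "UNKNOWN";
        return result;
    }

    static String parse1(String s) { return s.startsWith("<") ? "TAG" : null; }
    static String parse2(String s) { return s.startsWith("&") ? "CHAR_REF" : null; }
}
```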

Neutralize answered 3/6, 2009 at 12:55 Comment(4)
btw: in case you didn't happen to know, you've implemented the "Chain of Responsibility" pattern with the parse1, parse2 etc approach (not that that's a bad or good thing for this problem; just wanted to mention it...)Castled
I used to have multiple methods instead of one huge switch in a loop. However, the huge switch structure fits the spec better and is faster when the JIT does kick in.Hazaki
Well you could always do a smaller switch() and the default case calls another method with a small switch and so on until all of your larger switch cases are covered. This wouldn't be as nice as one large switch table, but it would work. In my case I couldn't use switch since it doesn't work on Strings.Neutralize
Yeah, I've considered having two different switch loops and shipping locals between the two. That's ugly though, and I want to keep the overall structure such that it's easy to mechanically compile the whole thing into gotoful C++ with even less switch. (Yes, I have unusual requirements. :-)Hazaki
One method compiling to over 8000 bytes? Does anybody understand that code? Is it testable? Try to split it up into multiple (private?) methods with meaningful names instead of hassling with the optimizer!

OK, maybe there are legitimate cases for large methods. But sorry, there are no hints of that in the question.

Hymenium answered 3/6, 2009 at 12:49 Comment(2)
Sounds like he's writing a parser -- it is actually very reasonable for a tokenizer loop to get big, and it's actually more readable to keep it together. (I've had folks force me to break up such a method just because it was over n lines and the next month they complained because it was harder to follow...)Castled
Yes, I understand the code. It also maps more closely to spec than any reformulation as recursive descent methods. Yes, it is testable with a large set of unit tests.Hazaki
Does it make a difference if you don't compile with debug symbols (i.e. the -g flag in javac)? That might bring the method down below the magic limit.

Euraeurasia answered 3/6, 2009 at 12:57 Comment(1)
At least in the Eclipse Java Compiler this has no effect on the number of bytecodes in the compiled method. (I assume the debug table is separate.)Hazaki
Would it be impossible to refactor the method into submethods? Modern JITs inline those calls anyway.

Mefford answered 3/6, 2009 at 13:6 Comment(7)
if you've not hand-written a parser before, sometimes the scanner (lexer/tokenizer) can get big, but readability can go way down when you split it up. of course I haven't seen his code, but I've been there...Castled
I've already split the actual tokenizer actions that can be split out without affecting control flow. The control structure itself is huge.Hazaki
@scott, I've hand-written a parser before. If it is too complex (for any particular reason) it might be a reasonable time to consider a suitable parser generator.Shrub
@hsivonen, well, I believe it is time you show us some code then :)Shrub
@Thorbjørn, Sure - most of the time I prefer a parser generator; sometimes the parser grammar is very trivial and it's all about lexing, which often is very straightforward but can grow long quicklyCastled
@Thorbjørn Current rev in the middle of a refactoring: hsivonen.iki.fi/Tokenizer.java There used to be a per code unit read() method. I wanted to inline it manually in order to hoist crucial state variables from fields to locals. As far as I can tell, the only substantial piece eligible for extraction is stuff in CHARACTER_REFERENCE_LOOPHazaki
Based on the snippet, I can see why you have problems with this. It is basically a maze of gotos all alike, with too many for(;;) loops and fallthroughs for my taste. I would suggest rewriting so there is no "break foo" or "continue foo", which will most likely require boolean flags that in turn go into the various for loops. Once you are there, you should be able to break up the logic easily.Shrub
If it's a tokenizer loop, would it be better to do it with a data-driven set of mappings and a bit of reflection as appropriate?

So you would store your token matches in a structure that maps them to data about that token's syntax and methods implementing associated functions. Lookup can be optimised over the structure and you avoid the big loop.

That introduces the issue of keeping the data and implementation in sync, but you could generate the data from your codebase with a doclet or possibly with annotations.

Without knowing exactly what your big method does, we're limited to trying to optimise it the way that you're assuming is best (and which is apparently not possible anyway).
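A minimal sketch of that table-driven dispatch (the token spellings and handler bodies are hypothetical placeholders):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the data-driven dispatch this answer suggests:
// token spellings map to handler functions instead of branches in one big method.
public class Dispatch {
    static final Map<String, Function<String, String>> HANDLERS = new HashMap<>();
    static {
        HANDLERS.put("<", s -> "TAG_OPEN");
        HANDLERS.put("&", s -> "CHAR_REF");
    }

    static String handle(String token, String input) {
        Function<String, String> h = HANDLERS.get(token);
        return h != null ? h.apply(input) : "UNKNOWN";
    }
}
```

Lookup cost then depends on the map rather than on branch layout, which is exactly the trade-off debated in the comments below.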

Demography answered 3/6, 2009 at 14:34 Comment(2)
No, ideally you want the state transitions to compile down to jumps in code—not data structure lookups.Hazaki
Surely that rather depends on whether you can optimise the lookup - both through efficient searching of the match space and through caching of the results. A large, hard-coded series of token matches might be quite resistant to optimisation (which I presume is why you're looking for JIT help). In a previous project I used cached method references with quite good results. Changes in the 'source' only needed to invalidate the reference to prompt a fresh lookup, and normal execution could run at good speed, firing off the method calls without any performance hit from token matching.Demography
Does your performance increase if you run a bytecode shrinker/obfuscator on your class? E.g., yGuard, ProGuard, ...

Maybe you can write a class-file postprocessor using ASM, since your use case is so specific.
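A sketch of what such a postprocessor might look like with the ASM tree API (assumes the asm and asm-tree artifacts on the classpath; this is an illustration, not a tested optimizer):

```java
import org.objectweb.asm.Opcodes;
import org.objectweb.asm.tree.AbstractInsnNode;
import org.objectweb.asm.tree.JumpInsnNode;
import org.objectweb.asm.tree.MethodNode;

// Redirects any jump whose target instruction is a goto so that it jumps
// straight to the goto's own destination; repeats until a fixed point.
public class JumpFlattener {
    static void flatten(MethodNode mn) {
        boolean changed;
        do {
            changed = false;
            for (AbstractInsnNode insn = mn.instructions.getFirst();
                 insn != null; insn = insn.getNext()) {
                if (!(insn instanceof JumpInsnNode)) continue;
                JumpInsnNode jump = (JumpInsnNode) insn;
                // Skip labels, frames and line numbers (opcode < 0) after the target label.
                AbstractInsnNode target = jump.label;
                while (target != null && target.getOpcode() < 0) target = target.getNext();
                // Redirect if the target is another goto (guarding against goto-to-self).
                if (target instanceof JumpInsnNode && target.getOpcode() == Opcodes.GOTO
                        && ((JumpInsnNode) target).label != jump.label) {
                    jump.label = ((JumpInsnNode) target).label;
                    changed = true;
                }
            }
        } while (changed);
    }
}
```

After the rewrite, serializing through a ClassWriter recomputes all branch offsets, which is what actually shrinks the method; dead gotos that are no longer jumped to would still need a separate dead-code pass to be removed.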

Even if you remove all the pointless gotos, does that bring you under the magic limit?

Hormuz answered 30/6, 2009 at 12:29 Comment(0)
A list of bytecode libraries mentions BCEL and ASM, which I'd heard of before, along with many others that do various things.

Abdella answered 30/6, 2009 at 12:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.