Just in Time Compilation - Storing vs Doing always [duplicate]

T

5

3

Possible Duplicate:
Why doesn't the JVM cache JIT compiled code?

I understand that JIT compilation is compilation to native code using HotSpot mechanisms, which can be very fast because the code is optimized for the OS, hardware, etc.

My question is: why does Java not store that JIT-compiled code in a file somewhere and reuse it in future runs? This could reduce the 'initial warm-up' time as well.

Please let me know what I am missing here.

To add to my question: why doesn't Java compile the complete code to native code and always use that (for a specific JVM, OS, and platform)? Why JIT?

Typhus answered 25/7, 2012 at 14:51 Comment(0)

B

2

If I remember correctly, caching and sharing of JIT-compiled code has been tried, and found to be not a good idea.

On the one hand, a modern HotSpot JIT compiler generates and optimizes code in the context of the current CPU model, and the usage patterns of the current execution. If it were to cache compiled code, then there is a good chance that the code would not be optimal.

On the other hand, there are apparently a variety of tricky technical problems. For instance, the cached code becomes a potential security hole: the code area needs to be writable by all applications / users that share it, which means that one user could potentially interfere with the running of another user's applications.

Bathelda answered 25/7, 2012 at 15:29 Comment(4)

Why does the cached optimized code have to be shared between users? Usually there is a .java directory in the user home directory to store such data. – Margaritamargarite 25/7, 2012 at 16:12

Code optimized from data gathered from previous runs is much more likely to be better than non-optimized code, I'd guess. – Margaritamargarite 25/7, 2012 at 16:13

@Margaritamargarite - Not if the presence of the poorly optimized code inhibits the JVM from gathering usage stats, or the JIT compiler from trying to optimize. This is a complicated problem area where "intuitively obvious" things are not necessarily correct. – Bathelda 26/7, 2012 at 1:54

@Margaritamargarite - "Why does the cached optimized code have to be shared between users?" - It doesn't. I raised this as just one of the examples why this is technically tricky ... – Bathelda 26/7, 2012 at 1:59

F

3

While there is a guarantee that you will always be using a JVM, there is no guarantee that you will always be using the same JVM. The HotSpot-optimized code is only valid for your machine.

With Java, there is no guarantee that the code is local to the JVM. Applets are a perfect example, and Web Start also illustrates this point. A generic "keep the optimization" policy would only clutter up caches with seldom-run code, and would create issues about where to keep the optimized extensions.

It would also create quite a puzzle in knowing how long to keep the on-disk cache, and wouldn't you have to recompile the 'class' file to verify that the cache was for the right "release" of the class file? Java doesn't have a "this version" of the same class file designator, with the exception of the optional serial version uid.

Perhaps there's a workaround by checksumming the class file and placing the checksum in a field of the compiled class, but I'd hate to consider the start-up times of a JVM tasked with scanning all cached machine-specific code, building a table, intervening in the class loader, and comparing the checksum of the loaded class with that of the optimized code.
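To make the checksum idea concrete, here is a minimal sketch in plain Java. It is not how any real JVM does it: the `ClassChecksumCache` class, its `store` and `lookup` methods, and the choice of SHA-256 are all assumptions made for illustration, and the "native code" is just an opaque byte array.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache of compiled code, keyed by class name plus a
// checksum of the class file bytes (illustration only).
class ClassChecksumCache {
    private final Map<String, byte[]> compiledCode = new HashMap<>();

    // Hex-encoded SHA-256 digest of the class file contents.
    static String checksum(byte[] classBytes) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(classBytes)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    private static String key(String className, byte[] classBytes) {
        return className + ":" + checksum(classBytes);
    }

    // Remember the compiled code for exactly this version of the class.
    void store(String className, byte[] classBytes, byte[] nativeCode) {
        compiledCode.put(key(className, classBytes), nativeCode);
    }

    // Returns the cached code, or null if the class file has changed.
    byte[] lookup(String className, byte[] classBytes) {
        return compiledCode.get(key(className, classBytes));
    }

    public static void main(String[] args) {
        ClassChecksumCache cache = new ClassChecksumCache();
        byte[] v1 = "class A, release 1".getBytes(StandardCharsets.UTF_8);
        byte[] v2 = "class A, release 2".getBytes(StandardCharsets.UTF_8);
        cache.store("A", v1, new byte[] {1, 2, 3});
        System.out.println("same bytes hit:    " + (cache.lookup("A", v1) != null));
        System.out.println("changed bytes hit: " + (cache.lookup("A", v2) != null));
    }
}
```

Note that every lookup has to hash the full class file, which is exactly the start-up cost I am worried about.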

Freemason answered 25/7, 2012 at 15:2 Comment(0)

F

2

I have asked this question myself. The impression I get is that it's very hard to get right and to avoid having a store which contains out-of-date code.

One way around this issue is to run with -XX:+PrintCompilation to see which methods get compiled, and then write a short warm-up routine to warm these methods up.
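A warm-up routine along these lines could be sketched as follows. The `WarmUp` class, `hotMethod`, and the count of 20,000 iterations are placeholders I made up; the number of calls needed before the JIT kicks in depends on the JVM and its flags, so check the -XX:+PrintCompilation output rather than trusting any fixed number.

```java
import java.util.function.IntUnaryOperator;

class WarmUp {
    // Hypothetical stand-in for a method that PrintCompilation showed as hot.
    static int hotMethod(int x) {
        return x * 31 + 7;
    }

    // Calls fn repeatedly so the JIT has a chance to compile it.
    // The results are summed and returned so the calls are not dead code.
    static long warmUp(IntUnaryOperator fn, int iterations) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += fn.applyAsInt(i);
        }
        return sink;
    }

    public static void main(String[] args) {
        // Run as: java -XX:+PrintCompilation WarmUp
        long sink = warmUp(WarmUp::hotMethod, 20_000);
        System.out.println("warm-up done, sink=" + sink);
        // ... the latency-sensitive work would start here ...
    }
}
```

The warm-up should exercise the methods with realistic arguments, otherwise the JIT may optimize for the wrong usage pattern.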

Fall answered 25/7, 2012 at 14:56 Comment(0)

E

2

It exists in .NET (which is similar to Java in many ways), where it's called NGEN. So I don't see why it couldn't exist in Java.

I can see two reasons why it wasn't done:

Java doesn't have a good ID mechanism like the one .NET has for its assemblies. But hashing could indeed be used (at the jar or class level).
It would mostly (only?) benefit application startup. And since JRE 6, startup has become a lot faster.

Existential answered 25/7, 2012 at 16:55 Comment(0)

X

2

Some JVMs (like the IBM one) do have "ahead of time shared JIT code". It's quite difficult to do (as other answers point out) because the class files used at one time by one JVM may not be the same ones as used next time, even if they have the same names. Thus there's a lot of logic required to prove that "class A I saw earlier is really the same as class A I now have".

The other issue is that JITed code very often includes address-space-specific values (e.g. the address of a given static variable, or the entry point for another JITed method), and those can (and certainly will!) change on every JVM invocation, so again, care must be taken in dealing with those issues.

The performance wins that AOT code provides are real, and the feature is very much worth using in the right circumstances (specifically, when things won't be changing from run to run, such as invoking the same version of an app server, or Eclipse).

Xavierxaviera answered 26/7, 2012 at 3:15 Comment(2)

"Thus there's a lot of logic required to prove that "class A I saw earlier is really the same as class A I now have"." -> look at the dates? – Inkblot 14/9, 2013 at 3:1

So yes, dates are the first thing looked at, but it's often not sufficient. For a simple example, 2 As (with same size & timestamp) might be in 2 different jar files and the search order was reversed between runs. You're also assuming simple classloaders that have file-based .class representations (which the core JVM itself doesn't actually really know about after bootstrap). – Xavierxaviera 3/10, 2013 at 2:0
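The two-jars case can be shown with a small sketch. Everything here (the `ClassIdentity` class, the `Entry` type, and its fields) is made up for illustration and says nothing about how a real JVM tracks classes: it only shows that a name, size, and timestamp match does not imply the bytes match.

```java
import java.util.Arrays;

class ClassIdentity {
    // Hypothetical description of a class file found in some jar.
    static final class Entry {
        final String name;
        final long timestamp;
        final byte[] bytes;

        Entry(String name, long timestamp, byte[] bytes) {
            this.name = name;
            this.timestamp = timestamp;
            this.bytes = bytes;
        }
    }

    // Cheap check: name, size, and timestamp only.
    static boolean sameByMetadata(Entry a, Entry b) {
        return a.name.equals(b.name)
                && a.bytes.length == b.bytes.length
                && a.timestamp == b.timestamp;
    }

    // Expensive check: name plus a full comparison of the contents.
    static boolean sameByContent(Entry a, Entry b) {
        return a.name.equals(b.name) && Arrays.equals(a.bytes, b.bytes);
    }

    public static void main(String[] args) {
        // Two different classes named "A" from two jars, same size and timestamp.
        Entry fromFirstJar = new Entry("A", 1000L, new byte[] {1, 2, 3, 4});
        Entry fromSecondJar = new Entry("A", 1000L, new byte[] {4, 3, 2, 1});
        System.out.println("metadata says same: " + sameByMetadata(fromFirstJar, fromSecondJar));
        System.out.println("content says same:  " + sameByContent(fromFirstJar, fromSecondJar));
    }
}
```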
