Are there any examples of multithreaded Tracing JIT compilers?

Both the JVM and the .NET CLR include Just-In-Time compilers which support multiple user threads. However, I believe these are method-at-a-time JITs.

All of the tracing JITs I am aware of, for example LuaJIT and PyPy, are only single-threaded.

Are there any examples of tracing JITs which support multiple user threads? If not, are there any technical reasons why these do not exist?

Cum answered 6/9, 2016 at 3:10 Comment(2)
.NET supports multicore JIT. But it is in general not exactly a universal solution; it can only ever have any noticeable effect when the cores are kept busy jitting. That requires a time machine: it has to know what method is likely to be used next. Solved in .NET by recording profile data, the order in which methods are executed, so the next time the program runs they can be jitted ahead of time. Time machines are tricky, they tell you that people have hoverboards and flying cars in 2015. Well, the hoverboards turned out to be true.Delvecchio
@HansPassant - Using background threads to JIT-compile code is interesting, I didn't know that .NET had that feature. However, even before this feature was added, users could create multiple threads - so the .NET JIT was already required to compile code on multiple threads. AFAIK though, .NET still JIT-compiles one method at a time. My question is specifically about Tracing JITs: are there any technical impediments to a Tracing JIT which compiles on multiple threads (either in the background or to support multiple user threads)?Cum

Profiling (tracing) a running multi-threaded program is a lot harder, but also not impossible. The whole point of tracing is to make the code run better than what an optimizing compiler produced the first time around. If the threads are interlinked, then the JIT that is going to modify the code needs to understand not just how the code is executed, but what the side effects are on other threads.

When thread one accesses a big file in memory, does it cause a level-two cache flush that makes thread two stall for a reason external to the code thread two is running? The JIT has to understand these interactions. Otherwise it might spend a lot of time trying to optimize thread two, when the real improvement would come from realizing that thread one's code is adversely affecting thread two and eliminating the cache flush.
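
The cache interaction described above can be made concrete with a small sketch (this is an illustrative example, not code from the answer; all names are made up). Two threads run loops whose per-thread traces look perfectly healthy, yet each slows the other down because their counters likely sit on the same cache line ("false sharing") - an effect a single-threaded tracer examining either loop in isolation cannot see:

```java
// Two counters declared as adjacent fields will likely share a cache line,
// so writes from one thread repeatedly invalidate the line the other
// thread is using, even though neither loop touches the other's data.
class SharedCounters {
    volatile long a;   // updated only by thread one
    volatile long b;   // updated only by thread two
}

public class FalseSharingDemo {
    static SharedCounters runBoth(int iterations) throws InterruptedException {
        SharedCounters c = new SharedCounters();
        Thread t1 = new Thread(() -> { for (int i = 0; i < iterations; i++) c.a++; });
        Thread t2 = new Thread(() -> { for (int i = 0; i < iterations; i++) c.b++; });
        t1.start(); t2.start();
        t1.join(); t2.join();
        return c;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        runBoth(1_000_000);
        // The per-thread trace of either loop is identical to the
        // single-threaded case; the slowdown is caused by the other thread.
        System.out.println("elapsed ns: " + (System.nanoTime() - start));
    }
}
```

The actual slowdown is hardware- and layout-dependent, which is exactly the point: nothing in either thread's own trace explains it.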

Are you considering trying to write your own tracing multi-threaded JIT? It could be done, but it is involved.

Merrygoround answered 13/9, 2016 at 17:34 Comment(3)
AFAIK LuaJIT, for example, decides which code to JIT based only on how many times it is run, not on timing information. Therefore I believe its algorithms should be immune to the inter-thread caching issues you mention. It sounds to me like there is no theoretical reason why tracing JITs could not support multiple threads, but perhaps such a JIT has never been implemented? "Are you considering trying to write your own tracing multi-threaded JIT?" I am considering writing a tracing JIT, yes. I am trying to work out whether support for multithreading is a plausible goal for the project.Cum
There is no question you could optimize just on how many times certain code is run, but that would be sub-optimal from a global perspective in multi-threaded applications. In a past job we manually optimized by turning certain cores off in a processor so that more L2 and L3 cache space was available without flushing. Where profiling showed that updates and reads rarely conflicted, we would also eliminate mutexes and instead verify that inputs had not been updated after a calculation completed, because that kind of "snooping" is much cheaper than actually locking a mutex.Merrygoround
Support for multithreading in a tracing JIT is plausible. The only question is how good you can make the optimizations.Merrygoround
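
The count-based trace selection mentioned in the comments needs no timing information and is easy to make thread-safe. Here is a minimal hypothetical sketch in that spirit (the names and structure are illustrative, not LuaJIT's actual implementation; the threshold of 56 is, I believe, LuaJIT's default hotloop parameter):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Each loop (identified here by its bytecode position) gets a countdown;
// every backward branch decrements it, and when it reaches zero the loop
// is considered "hot" and trace recording would begin. Because the
// decision depends only on execution counts, it is unaffected by
// cross-thread timing effects such as cache contention.
class HotLoopDetector {
    static final int HOT_THRESHOLD = 56; // believed to match LuaJIT's default

    private final ConcurrentHashMap<Integer, AtomicInteger> counters =
            new ConcurrentHashMap<>();

    /** Called on every backward branch; returns true when the loop turns hot. */
    boolean onBackwardBranch(int loopPc) {
        AtomicInteger c = counters.computeIfAbsent(
                loopPc, k -> new AtomicInteger(HOT_THRESHOLD));
        return c.decrementAndGet() == 0; // fires exactly once per counter
    }
}
```

Note that LuaJIT itself uses a small hashed counter table rather than a map, and decrements without atomics; the atomic version above is one way the same scheme could tolerate multiple user threads.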

Your question is moot due to its wrong premise. The HotSpot optimizer of Oracle’s JVM/OpenJDK is not a “method-at-a-time JIT”. One of its fundamental technologies is its inlining capability, often called “aggressive inlining” as it speculatively inlines the methods assumed to be the most likely targets of a dynamic method dispatch, based on profiling of the current execution (and other hints). This even includes the capability to de-optimize if the runtime behavior of the program changes and it no longer executes the optimized code path.
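
The speculation described above can be illustrated with a tiny example (class names are made up; whether and when HotSpot actually inlines depends on its heuristics and warm-up):

```java
// While a virtual call site has only ever seen one receiver type,
// HotSpot can inline the target directly (a monomorphic call site).
// Observing a second receiver type later invalidates that speculation,
// forcing de-optimization and recompilation with a type guard.
interface Shape { double area(); }
class Square implements Shape { double s = 2; public double area() { return s * s; } }
class Circle implements Shape { double r = 1; public double area() { return Math.PI * r * r; } }

public class InlineDemo {
    static double sum(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) total += s.area(); // the speculated call site
        return total;
    }

    public static void main(String[] args) {
        Shape[] warm = new Shape[10_000];
        for (int i = 0; i < warm.length; i++) warm[i] = new Square();
        sum(warm);                          // call site sees only Square
        sum(new Shape[] { new Circle() }); // new type: speculation invalidated
    }
}
```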

The inlining is fundamental, as most other code optimizations only develop their real potential when a method’s code is inlined into its caller’s, providing the necessary context.

So with the HotSpot JVM, you already have a multi-threaded optimizing environment that utilizes known execution paths. This information doesn’t need to be gathered in the way described as “tracing”, though. Since this JVM can create a snapshot of a thread’s stack trace at any time, it can also sample the trace at defined intervals, which gives it more control over the profiling overhead than adding a recording feature to every method invocation. So the JVM can limit the acquisition of traces to threads actually consuming significant CPU time, and it will intrinsically get an actual call chain, even if the involved methods are contained in multiple call chains of different threads.
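
The sampling approach described above can be sketched at the application level with the standard `java.lang.management` API (this is an illustration of the idea, not HotSpot's internal profiler, which works below the Java level):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;

// Take one sample: snapshot every live thread's stack, but only keep
// threads that have actually consumed CPU time, mirroring the idea of
// restricting trace acquisition to busy threads.
public class SamplingProfiler {
    public static void sampleOnce() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            long cpuNs = mx.getThreadCpuTime(e.getKey().getId());
            if (cpuNs > 0) { // -1 means CPU time measurement is unsupported
                System.out.println(e.getKey().getName()
                        + " stack depth=" + e.getValue().length);
            }
        }
    }
}
```

Calling `sampleOnce()` from a timer thread at a fixed interval yields a statistical profile whose overhead is bounded by the sampling rate, independent of how often methods are invoked.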

Samella answered 14/9, 2016 at 9:51 Comment(1)
As an addendum, you can use the options -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to let the JVM print the call trees it will inline.Samella
