How can a JVMTI agent retransform an executing method that has no further invocations?

I am instrumenting a class file at runtime for various purposes, using a JVMTI agent. My strategy for instrumenting a method is to call the RetransformClasses function, which triggers the ClassFileLoadHook event. This works for every method that is invoked again after instrumentation, because the new bytecode only takes effect on the next invocation. It does not work for a method that is never invoked again, such as the main function of a program.

I want to instrument a method on the fly, while it is executing. I am looking for something like On-Stack Replacement (OSR) for the instrumented code. Does JVMTI offer a way to do this, or is there any other approach?

PS: I'm open to editing/patching OpenJDK source code if that can help.
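
For readers who want to reproduce the behaviour, the retransformation strategy described above can be sketched with the Java-level java.lang.instrument API, which, as noted in the comments below, follows the same strategy as a native JVMTI agent. This is only a minimal sketch, not the asker's actual agent: the names RetransformSketch and LoggingTransformer are hypothetical, the transformer leaves the bytecode unchanged and merely counts retransformation callbacks, and the agent jar's manifest is assumed to declare Agent-Class and Can-Retransform-Classes: true.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical agent class; all names here are illustrative assumptions.
public final class RetransformSketch {

    /** Counts retransformation callbacks for one class and changes nothing. */
    public static final class LoggingTransformer implements ClassFileTransformer {
        private final String targetInternalName; // e.g. "com/example/Main"
        private final AtomicInteger calls = new AtomicInteger();

        public LoggingTransformer(String targetInternalName) {
            this.targetInternalName = targetInternalName;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // classBeingRedefined is non-null only for redefinition/retransformation.
            if (classBeingRedefined != null && targetInternalName.equals(className)) {
                calls.incrementAndGet();
                // A real agent would rewrite classfileBuffer here (e.g. with ASM).
            }
            return null; // null tells the JVM "no change"
        }

        public int calls() {
            return calls.get();
        }
    }

    /** Entry point when the agent is attached to a running JVM. */
    public static void agentmain(String targetClassName, Instrumentation inst)
            throws UnmodifiableClassException {
        if (targetClassName == null || !inst.isRetransformClassesSupported()) {
            return;
        }
        LoggingTransformer t =
                new LoggingTransformer(targetClassName.replace('.', '/'));
        inst.addTransformer(t, true); // true = may be used for retransformation
        try {
            for (Class<?> c : inst.getAllLoadedClasses()) {
                if (c.getName().equals(targetClassName) && inst.isModifiableClass(c)) {
                    inst.retransformClasses(c); // triggers transform(...) above
                }
            }
        } finally {
            inst.removeTransformer(t);
        }
    }
}
```

Even after retransformClasses returns, a stack frame that is already executing the old version of a method keeps running the old bytecode. Only new invocations see the change, which is exactly the limitation this question is about.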

Capernaum answered 23/1, 2017 at 8:6 Comment(6)
What I don't get: given that such a method will never be called again, what is the point of instrumenting it? Isn't instrumentation about giving you insight when the method is invoked later on? - Aquaplane
You're correct as far as instrumentation for profiling is concerned. I'm instrumenting my code to parallelize long-running loops in a method. So if you have a tedious loop in your main, I would like to instrument it to spawn some threads and join them (if it is parallelizable, of course). That's how I ran into the problem of instrumenting functions that are invoked only once. - Capernaum
Have you looked into javaagent? - Daniels
@DennisC Yes. I started off with javaagent and tried ASM and Javassist. A javaagent uses the same strategy behind the scenes: for dynamic runtime instrumentation, it only affects subsequent invocations, if there are any. - Capernaum
BTW, don't trust OSR too much. I have read that the Lucene project usually uncovers a few bugs in HotSpot caused by OSR and the optimizer every time a new JDK is released. - Daniels
Well, I guess that might be a reason why there is no OSR equivalent for instrumentation. 😊 - Capernaum
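
The rewrite Capernaum describes in the comments above (turning a long-running loop into threads that are spawned and then joined) can be illustrated at source level. This is a minimal sketch under stated assumptions, not the asker's tool: the class LoopSplitSketch and the method work are hypothetical, and the loop is assumed to be a pure reduction with no dependencies between iterations, which is precisely the property a bytecode analysis would have to prove first.

```java
// Hypothetical example; names and the loop body are illustrative assumptions.
public final class LoopSplitSketch {

    /** Stand-in for an expensive, side-effect-free loop body. */
    static long work(int i) {
        return ((long) i * i) % 7;
    }

    /** The original sequential loop. */
    public static long sequential(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += work(i);
        }
        return sum;
    }

    /** The same loop split into contiguous chunks, one thread per chunk. */
    public static long parallel(int n, int workers) {
        long[] partial = new long[workers];
        Thread[] threads = new Thread[workers];
        int chunk = (n + workers - 1) / workers; // ceiling division

        for (int w = 0; w < workers; w++) {
            final int id = w;
            final int from = Math.min(n, w * chunk);
            final int to = Math.min(n, from + chunk);
            threads[w] = new Thread(() -> {
                long s = 0;
                for (int i = from; i < to; i++) {
                    s += work(i);
                }
                partial[id] = s; // each thread writes only its own slot
            });
            threads[w].start();
        }

        long sum = 0;
        try {
            for (int w = 0; w < workers; w++) {
                threads[w].join(); // join makes partial[w] visible here
                sum += partial[w];
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining workers", e);
        }
        return sum;
    }
}
```

The two methods return the same value only because the iterations are independent and the addition can be regrouped freely. A loop that reads values written by earlier iterations cannot be split this way.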

After some further thinking, I believe you are asking for something that might (maybe!) be technically possible but would require a lot of effort, and that is conceptually not a good approach.

I assume your actual requirement is to instrument any kind of application thrown at you in order to improve its performance by parallelizing it "under the covers".

So, instead of having a real solution, I mainly have a list of concerns:

  • First of all, if you want to modify methods that have already been invoked and are currently executing, you are no longer talking about instrumentation alone. What you actually want is to provide your own "JIT" mechanism, while the JVM's JIT is also there, doing its job.
  • So if you are really serious about this and want to make sure that even the code in any main() can benefit from your optimizations, then I think that, conceptually, you are better off designing and implementing your own JVM.
  • Then I am wondering: you say you want to cover main() methods that are already running long loops. That sounds like you intend to fix bad designs by throwing your instrumentation at them. I think the saner approach is to look into such applications and improve their design.
  • In the sense of: if parallelizing arbitrary applications were "that easy", it would be part of the JVM already. The JVM doesn't do this kind of optimization for a good reason: it is probably super hard to get correct and robust.

In other words: I guess you have an XY problem, and the X problem is that the application(s) you are dealing with could benefit from parallelization. But that is something that is very hard to do "in general".

In that sense, I would rather define some kind of architecture (which might include specific, well-defined steps for how an application should "start up", so that your instrumentation can do its work successfully) and gain experience with that approach first. Meaning: tell your folks not to put long-running loops into their main() in the first place (as said, that alone sounds like pretty bad design to me!).

Aquaplane answered 2/2, 2017 at 8:16 Comment(8)
You got it, more or less. My main concern is to exploit implicit parallelism in Java code, with the assumption that I don't have the Java source of the application. As far as analysis is concerned, I analyze the bytecode for parallelizability and that works perfectly. Obviously it is really difficult to analyze, but that's not my main concern here. I'm simply concerned with redefining my classes on the fly. I'm already instrumenting my classes at class load time and I'm getting my desired results. Now I want to use the JIT profiler to sift out some hotspots for me and parallelize only those. - Capernaum
The reason to use hotspots is also obvious. The axiom is: "Not every loop is worth parallelizing. Only the hotspots are." Keeping this in mind, I'm naturally inclined to instrument executing hotspots in the code. And that is where runtime dynamic instrumentation comes in. - Capernaum
"I analyze the bytecode for parallelizability and that works perfectly" ... is a pretty bold statement ;-) ... I am wondering what you discovered that zillions of people doing research in this area have overlooked so far. - Aquaplane
Yeah. I'm sorry, it gives that impression. :-) What I meant is that my system is working so far and I have an end-to-end prototype. I didn't mean to say that I have found some novel approach to analyzing parallelizability. And I have discovered something called JAVAB. It's a bytecode parallelization tool. - Capernaum
Still ... sounds interesting. Will this be open source at some point? In any case, I had hoped for my first bounty win, but so far the answer seems not even worth upvoting for the people coming by ... but I will keep thinking, and maybe I will have some more helpful content later on. - Aquaplane
FYI: I put some content into that chat. - Aquaplane
The problem is, OSR is done by generating optimized code in a restricted way so that there is a common point at which the control can be transferred from the old to the new code. That’s relying on the fact that the optimized code still does the same as the old code, at least semantically. Instrumentation allows you to replace the code with some arbitrary code, not required to do the same, not even semantically. Parallelized code is unlikely to have something in common with the sequential code, not to speak of a point for safe transfer of control (in the middle of a loop). - Papp
Thanks. I can't promise I will have much time to look into that, but thanks again! - Aquaplane
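
Papp's point about a "common point" for transferring control can be made concrete with an application-level analogy. The sketch below is not how HotSpot implements OSR and does not involve JVMTI. It is a hypothetical example (HandOffSketch, run, and continueInNewCode are made-up names) in which the old loop hands its live state, the index i and the accumulator acc, to a replacement at an iteration boundary. The hand-off is only valid because the replacement computes the same result from that exact state.

```java
// Hypothetical analogy for an OSR-style hand-off; not the JVM's mechanism.
public final class HandOffSketch {

    /**
     * "Old" code: sums 0..n-1 one step at a time. At the iteration boundary
     * where i == switchAt it transfers control to the "new" code, passing
     * along the live state (i, acc). A switchAt outside 0..n-1 never fires.
     */
    public static long run(int n, int switchAt) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            if (i == switchAt) {
                return continueInNewCode(i, n, acc); // the common transfer point
            }
            acc += i;
        }
        return acc;
    }

    /**
     * "New" code: finishes the same computation from the given state using
     * the closed form for i + (i+1) + ... + (n-1).
     */
    static long continueInNewCode(int i, int n, long acc) {
        long remaining = (long) (n - i) * (i + n - 1) / 2;
        return acc + remaining;
    }
}
```

If the replacement were a parallelized loop with a different iteration order and different local variables, there would be no obvious (i, acc) pair to hand over, which is the difficulty the comment describes.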
