Which parts of the code should you warm up?
Usually, you don't have to do anything. However, for a low latency application, you should warm up the critical path in your system. You should have unit tests, so I suggest you run those on startup to warm up the code.
Even once your code is warmed up, you have to ensure your CPU caches stay warm as well. You can see a significant slowdown after a blocking operation, e.g. network IO, lasting up to 50 microseconds. Usually this is not a problem, but if you are trying to stay under, say, 50 microseconds most of the time, it will be a problem most of the time.
Note: Warming up can allow escape analysis to kick in and place some objects on the stack, so you don't need to optimise those allocations away by hand. It is better to memory profile your application before optimising your code.
Even if I warm up some parts of the code, how long does it remain warm (assuming this term only means how long your class objects remain in-memory)?
There is no time limit. The code stays compiled until the JIT detects that an assumption it made when optimising the code has turned out to be incorrect.
How does it help if I have objects which need to be created each time I receive an event?
If you want low latency or high performance, you should create as few objects as possible. I aim to allocate less than 300 KB/sec. With this allocation rate you can have an Eden space large enough to minor collect once a day.
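To see what "once a day" implies for Eden sizing, here is a rough back-of-the-envelope calculation. It assumes a perfectly steady allocation rate and 1 KB = 1,000 bytes; the class and method names are only for illustration.

```java
import java.util.Locale;

public class AllocationBudget {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    /** Bytes allocated in one day at a steady rate given in KB per second (assuming 1 KB = 1,000 bytes). */
    static long bytesPerDay(long kbPerSecond) {
        return kbPerSecond * 1_000L * SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        long bytes = bytesPerDay(300);
        // 300 KB/s * 86,400 s = 25,920,000,000 bytes
        System.out.println(String.format(Locale.ROOT, "%.1f GB per day", bytes / 1e9));
    }
}
```

So at 300 KB/sec, Eden would need to hold roughly 26 GB to get through a full day without a minor collection. Real allocation rates are bursty, so treat this as an order-of-magnitude estimate.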
Consider, for example, an application that is expected to receive messages over a socket, where the transactions could be New Order, Modify Order, Cancel Order or transaction confirmed.
I suggest you re-use objects as much as possible, though if it's under your allocation budget, it may not be worth worrying about.
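One way to re-use objects is to keep a single mutable event per handler and overwrite it for every message. The sketch below is only an illustration: `OrderEvent`, its fields and the `decode` signature are hypothetical, and it assumes the event is consumed on the same thread before the next message is decoded.

```java
public class ReusedEventSketch {
    enum Type { NEW_ORDER, MODIFY_ORDER, CANCEL_ORDER }

    // Cached once, because Type.values() allocates a new array on every call.
    private static final Type[] TYPES = Type.values();

    /** Mutable event that is overwritten for each message instead of being reallocated. */
    static final class OrderEvent {
        Type type;
        long orderId;
        long price;
        int quantity;

        void set(Type type, long orderId, long price, int quantity) {
            this.type = type;
            this.orderId = orderId;
            this.price = price;
            this.quantity = quantity;
        }
    }

    // The single instance handed out by every decode() call.
    private final OrderEvent event = new OrderEvent();

    /** Fills in and returns the shared event; no allocation happens on this path. */
    OrderEvent decode(int typeOrdinal, long orderId, long price, int quantity) {
        event.set(TYPES[typeOrdinal], orderId, price, quantity);
        return event;
    }

    public static void main(String[] args) {
        ReusedEventSketch decoder = new ReusedEventSketch();
        OrderEvent first = decoder.decode(0, 1L, 10_050L, 100);
        OrderEvent second = decoder.decode(2, 1L, 0L, 0);
        System.out.println("same instance: " + (first == second) + ", type now " + second.type);
    }
}
```

The trade-off is that callers must not hold on to the returned event after the next `decode` call, since its contents will have changed.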
Note that the application is about High Frequency Trading (HFT) so performance is of extreme importance.
You might be interested in our open source software, which is used for HFT systems at various investment banks and hedge funds.
http://chronicle.software/
My production application is used for high frequency trading and every bit of latency can be an issue. It is fairly clear that if you don't warm up your application at startup, it will lead to high latency of a few milliseconds.
In particular you might be interested in https://github.com/OpenHFT/Java-Thread-Affinity as this library can help reduce scheduling jitter in your critical threads.
It is also said that the critical sections of code which require warm-up should be run (with fake messages) at least 12K times for them to work in an optimized manner. Why, and how does it work?
Code is compiled by background thread(s). This means that even though a method is eligible for compilation to native code, it hasn't necessarily been compiled yet, especially on startup when the compiler is already pretty busy. 12K is not unreasonable, but it could be higher.
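A minimal sketch of such a warm-up loop is shown here. The `handle` method is a hypothetical stand-in for your real critical path, and the fake inputs are made up; in a real system you would feed fake messages through the actual handler while making sure they can't reach the market.

```java
public class WarmupSketch {
    /** Hypothetical stand-in for the latency-critical message handler. */
    static long handle(long orderId, long price, int quantity) {
        return orderId * 31 + price * quantity;
    }

    /**
     * Calls the handler the given number of times with fake input.
     * The checksum is returned so the JIT cannot discard the work as dead code.
     */
    static long warmUp(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            checksum += handle(i, 100 + (i % 10), 1 + (i % 5));
        }
        return checksum;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long checksum = warmUp(12_000);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println("warm-up took " + micros + " us, checksum " + checksum);
    }
}
```

Because compilation happens in the background, finishing the loop doesn't guarantee the method has been compiled by then; the iteration count is a starting point to tune, not a guarantee.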
-XX:+PrintCompilation is useful for tracing compilation behavior. You also might want to contact Oracle; they're working on an AOT compiler, currently only offered to commercial customers AIUI. I think some other JVM vendors also offer AOT. – Irish