What are safe points and safe point polling in context of profiling?

I am facing a situation where some method calls are not being recorded by the VisualVM application. I wanted to find out the reason and came across this answer on SO. The third point mentions a potential issue with the sampling method (which is the only option I see enabled, probably because I am doing remote profiling). It mentions safe points in the code and safe-point polling by the code itself. What do these terms mean?

Lightweight answered 24/7, 2013 at 16:38 Comment(0)

The inaccuracy of Java sampling profilers and its relation to safe points is discussed very well in Evaluating the Accuracy of Java Profilers (PLDI'10).

Essentially, Java profilers may produce inaccurate results because samples can only be taken at safe points. And since the placement of safe points can be changed by the compiler, some methods may never be sampled at all. The profiler is scheduled to record a sample (the time interval is up), but it must wait for a safe point to be reached; and if the safe point has, for example, been moved around by the compiler, the method that would ideally be sampled is never observed.
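
To make the failure mode concrete, here is a minimal, hedged sketch (class and method names are mine, not from the paper): HotSpot typically does not place a safepoint poll inside a counted int loop, so a safepoint-biased sampler tends to attribute time spent in such a loop to the surrounding poll sites (the method return or the caller) rather than to the hot loop itself.

```java
public class SafepointBiasSketch {
    // Hot counted loop: once JIT-compiled, HotSpot usually emits no safepoint
    // poll inside it, so a safepoint-biased sampler rarely "catches" threads here.
    static long hotSum(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {   // counted int loop, typically poll-free
            acc += i * 31L;
        }
        return acc;                     // a poll normally sits at the method return
    }

    public static void main(String[] args) {
        long total = 0;
        for (int round = 0; round < 1_000; round++) {
            total += hotSum(1_000_000);
        }
        System.out.println(total);      // keep the result alive so the loop isn't eliminated
    }
}
```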

As already explained in the previous answer, a safepoint is an event or a position in the code where the compiler interrupts execution in order to run some internal VM code (for example, GC).

Safe-point polling is a way of implementing the safepoint, or a safepoint trigger: the code being executed regularly checks a flag to see whether a safepoint is required; if it is (e.g. because a GC has been triggered), the thread is interrupted and the safepoint is executed. See e.g. GC safe-point (or safepoint) and safe-region
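
In the real JVM the poll is a machine-level instruction (a read of a guard page) that the JIT emits at method returns and loop back-edges; the sketch below only illustrates the polling idea in plain Java, with made-up names, and is not the actual VM mechanism.

```java
public class SafepointPollSketch {
    // Flag the "VM" would set when it needs threads to stop (e.g. for GC).
    private static volatile boolean safepointRequested = false;

    // Hypothetical stand-in for parking the thread while the VM operation runs.
    private static void enterSafepoint() {
        System.out.println(Thread.currentThread().getName() + " stopped at safepoint");
        // ... in a real VM the thread would block here until GC/stack walking completes ...
        safepointRequested = false;    // "VM operation" done, resume normal execution
    }

    static long workLoop(long iterations) {
        long acc = 0;
        for (long i = 0; i < iterations; i++) {
            acc += i;                      // application work
            if (safepointRequested) {      // the "poll": a cheap flag check
                enterSafepoint();          // cooperate with the VM, then continue
            }
        }
        return acc;                        // method exit is another natural poll site
    }

    public static void main(String[] args) {
        // Simulate the "VM" requesting a safepoint shortly after the work starts.
        new Thread(() -> {
            try { Thread.sleep(10); } catch (InterruptedException ignored) {}
            safepointRequested = true;
        }).start();
        System.out.println(workLoop(2_000_000_000L));
    }
}
```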

Herrod answered 25/7, 2013 at 0:29 Comment(1)
As I commented on the blog: "The Mytkowicz and Diwan paper really bothers me. For example, its definition of "hotness" seems to mean "self time percent". The whole reason gprof was invented 30 years ago was that self time is an inadequate diagnostic. Another way it bothers me is that it concentrates on methods, rather than lines of code. Yet another way is its shallow understanding of sampling statistics." Does anyone actually read these things? Look here:Goodbye

This blog post discusses safe points. Basically they are points in the code where the JITter allows interruptions for GC, stack traces etc.

The post also says that safe points, by delaying stack samples, keep the samples from landing in the places where you might like them to, and that's a problem.

In my opinion, that's a small problem. The whole reason you take a stack sample (as opposed to just a program-counter sample) is to show you all the call-sites leading to the current state, because those are likely to be much more juicy sources of slowness than whatever the program counter is doing. (If it's doing anything. You might be in the middle of I/O, where the PC is meaningless, but the call-sites are still just as important.) If the stack sample has to wait a few cycles to get to a safe point, all that means is it happens at the end of a block of instructions, not in the middle. If you examine the sample you can still get a good idea what's happening.
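
For what it's worth, a bare-bones version of this kind of stack sampling can be done in plain Java; the sketch below (my names, not from the answer) uses Thread.getAllStackTraces(), which itself brings threads to a safepoint, so every sample lands at the end of a block of instructions exactly as described above.

```java
import java.util.Map;

public class StackSampleSketch {
    // Take one "poor man's" stack sample of every live thread.
    static void sampleOnce() {
        Map<Thread, StackTraceElement[]> samples = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> e : samples.entrySet()) {
            System.out.println("--- " + e.getKey().getName());
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("    at " + frame);  // the call-sites leading to the current state
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 5; i++) {   // a handful of samples is often enough
            sampleOnce();
            Thread.sleep(200);          // space the samples out a bit
        }
    }
}
```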

I'm hoping profiler writers come to realize they don't need to sweat the small stuff. What's more important is not to miss the big stuff.

Goodbye answered 24/7, 2013 at 18:8 Comment(7)
I disagree based on experience using both kinds of profilers. A biased sampler can waste your time on optimizing completely the wrong thing...Limbert
@Nitsan: What you are concerned about is "false positives" - false problems found. There's a bigger issue - "false negatives" - true problems not found. The usual assumption that if the profiler can't find anything to optimize then there is nothing to optimize is neither a theorem nor well founded in practice. It is all explored here. The method I mentioned does have a small probability of false positives. In my experience profilers have a high probability of false negatives. The latter are far more costly.Goodbye
Your assumption "the stack sample has to wait a few cycles to get to a safe point" is what your argument hangs on. The JIT compiler can inline methods (thus removing the safepoint on return) to the point where there is a significant amount of code between safepoints.Limbert
@Nitsan: That's OK, because it's at the program-counter level. In big software the call tree can be 20-30 levels deep, so that's the depth of a stack sample. Every call on the stack that you see on more than one stack sample is something that, if you can get rid of it, will give a significant speedup, regardless of safepoints, inlines, and system blocking. The amount of speedup is shown by the distribution next to my name.Goodbye
My use case is low latency applications, so the stack is not so deep and the code you look to optimize is in the order of less than 1us. It's a real world use case is all I'm saying.Limbert
@Nitsan: That makes sense. What I've done in situations like that is single-step the assembly code, looking for operations that could be excised. Of course it's hard to do, what with jitting etc., but it works.Goodbye
1us is a lot of assembly to work through. An unbiased profiler will do as good a job as a native profiler in finding the hot spots in the code. Saves me a lot of assembly reading :). There are good profilers for Java out there, so given the choice, why not choose better?Limbert