What is the performance difference between a JVM method call and a remote call?
G

3

6

I'm gathering some data about the difference in performance between a JVM method call and a remote method call using a binary protocol (in other words, not SOAP). I am developing a framework in which a method call may be local or remote at the discretion of the framework, and I'm wondering at what point it's "worth it" to evaluate the method remotely, either on a much faster server or on a compute grid of some kind. I know that a remote call is going to be much, much slower, so I'm mostly interested in understanding the order-of-magnitude difference. Is it 10 times slower, or 100, or 1,000? Does anyone have any data on this? I'll write my own benchmarks if necessary, but I'm hoping to re-use some existing knowledge. Thanks!

Galaxy answered 24/9, 2011 at 12:18 Comment(2)
I already don't understand the person who voted this as "not a real question"... But the one who voted this as "not constructive" should head to the nearest asylum ; ) Thanks to the OP for the cool question and thanks to Peter Lawrey, as usual, for his great answer that said.Pyotr
@Pyotr it is not a real question for the reasons stated by duffymo and mikera. What remote method? Over what network? With what size parameters? The difference between local and remote is asymptotically zero the longer the remote method takes to execute.Demonic
M
4

I have developed a low-latency RMI (~20 microseconds minimum), and it is still about 1,000x slower than a direct call. If you use plain Java RMI (~500 microseconds minimum), it can be about 25,000x slower.

NOTE: This is only a very rough estimate to give you a general idea of the difference you might see. There are many complex factors which could change these numbers dramatically. Depending on what the method does, the difference could be much lower, especially if you perform RMI to the same process. If the network is relatively slow, the difference could be much larger.
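To make the arithmetic behind these ratios explicit, here is a minimal Java sketch. It assumes a direct call costs about 20 ns, which is not a measured figure but the value implied by "20 microseconds is 1,000x slower". The break-even formula is also an assumption of this sketch, not something from the answer: it models a remote call as a fixed overhead plus the local work divided by a hypothetical server speedup. The class and method names are illustrative only.

```java
final class RemoteCallBreakEven {

    /** How many times slower the remote call is than the local one. */
    static double slowdownFactor(double localNanos, double remoteNanos) {
        return remoteNanos / localNanos;
    }

    /**
     * Smallest amount of local work (in ns) for which offloading pays off,
     * under the assumed model: remoteTotal = overhead + localWork / speedup.
     * Solving localWork > overhead + localWork / speedup gives
     * localWork > overhead * speedup / (speedup - 1).
     */
    static double breakEvenWorkNanos(double overheadNanos, double speedup) {
        if (speedup <= 1.0) {
            // A server that is no faster can never win back the overhead.
            return Double.POSITIVE_INFINITY;
        }
        return overheadNanos * speedup / (speedup - 1.0);
    }

    public static void main(String[] args) {
        double directCallNanos = 20.0;      // assumption implied by the ratios above
        double lowLatencyRmiNanos = 20_000; // ~20 microseconds
        double plainRmiNanos = 500_000;     // ~500 microseconds

        System.out.println(slowdownFactor(directCallNanos, lowLatencyRmiNanos));
        System.out.println(slowdownFactor(directCallNanos, plainRmiNanos));
        // Work needed before a hypothetical 2x faster server wins over plain RMI:
        System.out.println(breakEvenWorkNanos(plainRmiNanos, 2.0));
    }
}
```

Under this model the fixed overhead only matters for short methods: as the method's own execution time grows, the relative penalty of going remote shrinks towards nothing.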

Additionally, even when there is a very large relative difference, it may be that it won't make much difference across your whole application.


To elaborate on my last comment...

Let's say you have a GUI which has to poll some data every second, and it uses a background thread to do this. Let's say that using RMI takes 50 ms, and the alternative, a direct method call to a local copy of a distributed cache, takes 0.0005 ms. That would appear to be an enormous difference, 100,000x. However, the RMI call could start 50 ms earlier and still poll every second, so the difference to the user is next to nothing.
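The polling scenario above could be sketched roughly as follows. This is only an illustration under assumptions: `BackgroundPoller` and its `slowFetch` parameter are hypothetical names, and the supplier stands in for whatever remote call (RMI or otherwise) fetches the data. A background thread refreshes a cached value on a fixed period, so the reader of `latest()` never waits for the remote call.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class BackgroundPoller<T> implements AutoCloseable {

    // Most recent successfully fetched value; null until the first fetch completes.
    private final AtomicReference<T> latest = new AtomicReference<>();

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "background-poller");
                t.setDaemon(true); // do not keep the JVM alive just for polling
                return t;
            });

    /** slowFetch is a stand-in for the remote call; periodMillis would be 1000 in the example. */
    BackgroundPoller(Supplier<T> slowFetch, long periodMillis) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                latest.set(slowFetch.get());
            } catch (RuntimeException e) {
                // Keep the previous value; an uncaught exception would cancel the schedule.
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Returns immediately with the last polled value, regardless of how slow the fetch is. */
    T latest() {
        return latest.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In a Swing application the value read from `latest()` would still have to be applied on the event dispatch thread; that part is left out here.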

What could be much more important is whether RMI is much simpler than the other approaches (if it's the right tool for the job).

An alternative to RMI is JMS. Which is best depends on your situation.

Murphy answered 24/9, 2011 at 13:23 Comment(2)
This is much too sweeping a generalization. See my comment under the OP.Demonic
@EJP, Sometimes it is better to have some idea, than give no idea. However, a little knowledge can be dangerous without context. The numbers deserve to come with a big warning.Murphy
M
3

It's impossible to answer your question precisely. The ratio of execution times depends on factors like:

  • The size / complexity of the parameters and return values that need to be serialized for the remote call.
  • The execution time of the method itself
  • The bandwidth / latency of the network connection

But in general, direct JVM method calls are very fast, and any kind of serialization coupled with the network delay caused by RMI is going to add significant overhead. Have a look at these numbers to get a rough estimate of the overhead:

http://surana.wordpress.com/2009/01/01/numbers-everyone-should-know/

Apart from that, you'll need to benchmark.
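As a starting point for such a benchmark, here is a deliberately naive timing sketch. It is an assumption-laden illustration, not a rigorous method: the JIT compiler can inline or reorder the measured call, so the numbers are only indicative, and a dedicated harness such as JMH is the safer choice for anything you intend to publish. `NaiveCallTimer` and its parameters are hypothetical names.

```java
import java.util.function.IntUnaryOperator;

final class NaiveCallTimer {

    // Results are accumulated here so the JIT cannot discard the calls as dead code.
    static long sink;

    /** Average wall-clock nanoseconds per invocation of `call`, after a warm-up phase. */
    static double averageNanosPerCall(IntUnaryOperator call,
                                      int warmupIterations,
                                      int measuredIterations) {
        long acc = 0;
        // Warm-up: give the JIT a chance to compile the call before timing it.
        for (int i = 0; i < warmupIterations; i++) {
            acc += call.applyAsInt(i);
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredIterations; i++) {
            acc += call.applyAsInt(i);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return (double) elapsed / measuredIterations;
    }

    public static void main(String[] args) {
        double local = averageNanosPerCall(x -> x + 1, 100_000, 1_000_000);
        System.out.println("local call, ns per invocation (rough): " + local);
    }
}
```

To compare against a remote call, the same method can be passed a lambda that invokes your remote stub, with far fewer iterations.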

One piece of advice: make sure you use a really good binary serialization library (Avro, Protocol Buffers, Kryo, etc.) coupled with a decent communications framework (e.g. Netty). These tools are far better than the standard Java serialization/IO facilities, and probably better than anything you can code yourself in a reasonable amount of time.
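To get a feel for why the choice of serialization matters, the following sketch compares the byte count of standard Java serialization with a hand-written fixed layout for the same tiny object. The hand-written encoding is only a stand-in for a compact binary format; it does not use or represent the API of Avro, Protocol Buffers, or Kryo. The `Quote` class and its fields are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationSize {

    /** Hypothetical payload: two ints, i.e. 8 bytes of actual data. */
    static final class Quote implements Serializable {
        private static final long serialVersionUID = 1L;
        final int instrumentId;
        final int priceCents;

        Quote(int instrumentId, int priceCents) {
            this.instrumentId = instrumentId;
            this.priceCents = priceCents;
        }
    }

    /** Standard Java serialization: writes a stream header and class descriptor plus the fields. */
    static byte[] javaSerialize(Quote q) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(q);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fixed binary layout: just the two ints, nothing else. */
    static byte[] manualEncode(Quote q) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(q.instrumentId);
                out.writeInt(q.priceCents);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Quote q = new Quote(42, 1999);
        System.out.println("java serialization bytes: " + javaSerialize(q).length);
        System.out.println("manual encoding bytes:    " + manualEncode(q).length);
    }
}
```

Size is only one part of the cost (CPU time for encoding and decoding matters too), but for small messages the fixed metadata of standard serialization dominates the payload.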

Montes answered 24/9, 2011 at 12:39 Comment(0)
A
3

No one can tell you the answer, because the decision of whether or not to distribute is not about speed. If it were, you would never make a distributed call, because it will always be slower than the same call made in-memory.

You distribute components so multiple clients can share them. If the sharing is what's important, it outweighs the speed hit.

Your break-even point has to do with how valuable it is to share functionality, not method call speed.

Acklin answered 24/9, 2011 at 12:59 Comment(6)
You make a good point. However, a remote call has an advantage in that it introduces a new thread of execution that can run in parallel with the current thread, so the current thread can carry on (maybe invoking more remote calls) and then assemble the results later. It's also possible to do this with local threads, but the number of available threads runs out much sooner. But I do agree, the execution of a single thread will always suffer from a remote call; it only makes sense if we're running lots of things in parallel.Galaxy
It can if the call is asynchronous, but not if it's synchronous. Your app isn't going anywhere if it calls and blocks. And then there's the issue of finding out whether the process is done and how to get the results. I don't agree with your point about local threads; it's still an option to do async locally. I don't think any of this comment is correct.Acklin
You can invoke methods asynchronously within the local JVM, but there are just simply fewer cores available on the local machine than there are available on the potentially hundreds of remote hosts on which the method might conceivably run.Galaxy
You can still make multiple calls on one core; you're just time slicing in that case. Potentially hundreds of remote hosts? Yes. In practice, no.Acklin
There's no point in launching more parallel method invocations than there are cores, assuming that each one is CPU-bound. Yes, they'll time-slice, but the work will take longer overall because there will be more context-switching and management overhead. As for hundreds of remote hosts, at my company it's not at all uncommon to have pools of hundreds of hosts for running computations. So I don't know what you mean by "In practice, no." It seems like you're just trying to find something wrong with everything I write, which is odd, given that I basically agreed with your initial answer.Galaxy
The existence of the Swing event thread says you're wrong. Long-running processes do make sense to farm out to existing threads, because it allows the user to move on and makes the UI experience more responsive.Acklin
