Abstract implementation of non-blocking MPI calls

Asked 12/1, 2014 at 7:35 Answered 13/1, 2014 at 19:30

Non-blocking sends/recvs in MPI return immediately, and the operation completes in the background. The only way I can see that happening is that the current process/thread creates another process/thread, loads an image of the send/recv code into it, and then returns. The new process/thread then completes the operation and sets a flag somewhere, which Wait/Test checks. Am I correct?

Abnormal answered 12/1, 2014 at 7:35 Comment(0)

There are three ways that progress can happen:

1. In a separate thread. This is usually an option in most MPI implementations (usually at configure/compile time). In this version, as you speculated, the MPI implementation has another thread that runs a separate progress engine. That thread manages all of the MPI messages and sending/receiving data. This way works well if you're not using all of the cores on your machine as it makes progress in the background without adding overhead to your other MPI calls.
2. Inside other MPI calls. This is the more common way of doing things and is the default for most implementations I believe. In this version, non-blocking calls are started when you initiate the call (MPI_I<something>) and are essentially added to an internal queue. Nothing (probably) happens on that call until you make another call to MPI later that actually does some blocking communication (or waits for the completion of previous non-blocking calls). When you enter that future MPI call, in addition to doing whatever you asked it to do, it will run the progress engine (the same thing that's running in a thread in version #1). Depending on what the MPI call that's supposed to be happening is doing, the progress engine may run for a while or may just run through once. For instance, if you called MPI_WAIT on an MPI_IRECV, you'll stay inside the progress engine until you receive the message that you're waiting for. If you are just doing an MPI_TEST, it might just cycle through the progress engine once and then jump back out.
3. More exotic methods. As Jeff mentions in his post, there are more exotic methods that depend on the hardware on which you're running. You may have a NIC that will do some magic for you in terms of moving your messages in the background or some other way to speed up your MPI calls. In general, these are very specific to the implementation and hardware on which you're running, so if you want to know more about them, you'll need to be more specific in your question.
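To make the "inside other MPI calls" model concrete, here is a toy sketch in Python. It is not how any real MPI implementation is written, and it does not use MPI at all: `ToyEngine`, `Request`, `CHUNK`, and the method names are hypothetical stand-ins chosen to mirror MPI_Isend, MPI_Test, and MPI_Wait. The only point it illustrates is that a queued request makes progress when some later call polls the engine.

```python
from collections import deque


class Request:
    """Handle returned by a non-blocking call (hypothetical, not an MPI type)."""

    def __init__(self, nbytes):
        self.remaining = nbytes  # bytes still to be "transferred"
        self.done = False


class ToyEngine:
    """Toy progress engine: work is only done when some call polls it."""

    CHUNK = 4  # bytes moved per pending request per poll (arbitrary value)

    def __init__(self):
        self.pending = deque()

    def isend(self, nbytes):
        # Stand-in for MPI_Isend in this model: enqueue and return at once.
        req = Request(nbytes)
        self.pending.append(req)
        return req

    def progress(self):
        # One pass over the queue, advancing every pending request a little.
        for _ in range(len(self.pending)):
            req = self.pending.popleft()
            req.remaining = max(0, req.remaining - self.CHUNK)
            if req.remaining == 0:
                req.done = True
            else:
                self.pending.append(req)

    def test(self, req):
        # Stand-in for MPI_Test: poll the engine once, then report status.
        self.progress()
        return req.done

    def wait(self, req):
        # Stand-in for MPI_Wait: keep polling until this request completes.
        while not req.done:
            self.progress()
```

In this model, calling `test` on one request also advances every other queued request, which is why a call that seems unrelated can complete an earlier non-blocking operation.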

All of this is specific to your implementation, but most of them work in some way similar to this.
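For comparison, the separate-thread model can be sketched the same way. Again, this is only a toy model under stated assumptions, not real MPI code: `ThreadedEngine`, the `deliver` callback, and the use of a `threading.Event` as the request handle are all hypothetical choices made for illustration. A background thread drains the queue, so the operation completes even if the caller never polls.

```python
import queue
import threading


class ThreadedEngine:
    """Toy progress thread: a helper thread completes queued operations."""

    def __init__(self):
        self._work = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def isend(self, payload, deliver):
        # Enqueue the operation and return a completion flag immediately.
        done = threading.Event()
        self._work.put((payload, deliver, done))
        return done

    def _run(self):
        # Progress loop running in the background thread.
        while True:
            item = self._work.get()
            if item is None:  # shutdown sentinel
                break
            payload, deliver, done = item
            deliver(payload)  # "transfer" the data
            done.set()        # the flag that test/wait look at

    def test(self, req):
        # Non-blocking check of the completion flag.
        return req.is_set()

    def wait(self, req):
        # Block until the background thread sets the flag.
        req.wait()

    def shutdown(self):
        self._work.put(None)
        self._thread.join()
```

A caller would do something like `req = engine.isend("hello", inbox.append)`, carry on with other work, and later call `engine.wait(req)`. This matches the flag-setting scheme described in the question, with a thread inside the same process rather than a newly spawned process.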

Lemon answered 13/1, 2014 at 19:30 Comment(1)

Thanks for the elaborate answer! – Abnormal 14/1, 2014 at 8:21

Are you asking whether a separate thread for message processing is the only way to implement non-blocking operations?

If so, the answer is no. I think many setups actually use a different strategy: usually, message processing makes progress during any MPI call. I'd recommend having a look at this blog entry by Jeff Squyres.

See the answer by Wesley Bland for a more complete answer.

Embrasure answered 12/1, 2014 at 23:56 Comment(3)

It's a complicated post, but yes, he does point out that hardware assistance for passing messages is a feasible solution. That answers my question. Thank you! – Abnormal 13/1, 2014 at 9:44

But it would be great if someone could explain in plain language the different 'methods' by which this 'progress' is implemented. – Abnormal 13/1, 2014 at 9:46

Well, I'm not sure I'm the right person to answer this, as I don't work on MPI implementations, but the basic point is that work on pending communications is done during any MPI call. Thus, whenever you call into MPI, it may progress pending communication operations that are unrelated to the call at hand. – Embrasure 13/1, 2014 at 13:51
