Implied synchronization with MPI_BCAST for both sender and receivers?

When calling MPI_BCAST, is there any implied synchronization? For example, if the sender process were to reach the MPI_BCAST before the others, could it perform the broadcast and then continue without any acknowledgements? Some recent tests with code like the following:

program test
include 'mpif.h'

integer ierr, tid, tmp

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, tid, ierr)

tmp = tid

if(tid.eq.0) then
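  ! only rank 0 calls MPI_BCAST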
  call MPI_BCAST(tmp,1,MPI_INTEGER,MPI_ROOT,MPI_COMM_WORLD, ierr)
else
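  ! the other ranks make no matching MPI_BCAST call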

endif

write(*,*) tid,'done',tmp
call MPI_FINALIZE(ierr)

end

show that, with two processes, both reach completion, even though only the sender makes a call to MPI_BCAST.

Output:

1 done           0
0 done           0

Could this be a problem with the MPI installation I'm working with (MPICH), or is this standard behavior for MPI?

Sphygmomanometer answered 11/7, 2011 at 16:6 Comment(0)

Bcast is a collective communication call, and as such blocks. More precisely, it blocks until all processes in the specified communicator have made a matching call to Bcast, at which point communication occurs and execution continues.
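
For comparison, a minimal sketch of the compliant pattern (assuming the same mpif.h-style Fortran setup as the question; the program name is just illustrative), in which every rank makes the matching call and the root is passed as an ordinary rank number:

program bcast_all
include 'mpif.h'

integer ierr, tid, tmp

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, tid, ierr)

tmp = tid

! every rank calls MPI_BCAST with the same root (rank 0 here);
! on return, tmp holds rank 0's value on every rank
call MPI_BCAST(tmp, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)

write(*,*) tid, 'done', tmp
call MPI_FINALIZE(ierr)

end

Run with two ranks, this should print 0 for tmp on both, since both participate in the broadcast.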

Your code is too simplified for debugging purposes. Can you post a working minimal example that demonstrates the problem?

Kinshasa answered 11/7, 2011 at 16:31 Comment(6)
OK, I've put up the full program that I'm working with. I would expect it to block indefinitely, waiting for the others to call MPI_BCAST, if mpirun is launched with more than 1 processor, but on my machine it exits with both processes making the call to write, with the value 0 in tmp.Sphygmomanometer
It does not need to block -- the root doesn't need any responses from the other ranks to continue, so it may not wait for them. In particular, in an eager message protocol, the root will send its message immediately and it will sit in a buffer on (some of) the other ranks until they call MPI_Bcast.Exothermic
@Jeremiah: not true. The MPI standard requires that by the time Bcast returns, the contents of the root's buffer have been copied to all processes. @user631027: In your program, process 0 blocks on Bcast, but process 1 immediately reaches Finalize, reducing the size of MPI_COMM_WORLD to 1. As such, process 0 is now free to complete Bcast since he's the only one to broadcast to. If process 1 were to call a Bcast that doesn't match that of process 0, the program would hang.Kinshasa
@suszterpatt: I think what you are saying only says that when the non-root processes return from MPI_Bcast, they have the data from the root; the root can have MPI_Bcast return as soon as it is safe to overwrite the buffer being sent.Exothermic
This answer is not completely correct. From the MPI Standard v3 section 5.1: "It is dangerous to rely on synchronization side-effects of the collective operations for program correctness. For example, even though a particular implementation may provide a broadcast routine with a side-effect of synchronization, the standard does not require this, and a program that relies on this will not be portable.". Bcast is not required to block on all processes until the operation has finished.Thorazine
@Kinshasa - This answer is not correct, according to the MPI standard. Blocking refers to the fact that the call should return only after it is safe to modify the buffer. The return does not indicate that other processes in the group have completed or even started the operation (see Section 5.1, MPI Standard v3). Except for MPI_Barrier, completion of a collective call does not imply synchronization.Roommate

I can attest that MPI_Bcast does NOT block, at least for the root (sending) process. You should call MPI_Barrier immediately afterward if you want to be certain that your program blocks. I know this because I recently called MPI_Bcast by accident on the root process only (instead of collectively), and execution continued as normal until much later, at the NEXT, unrelated call to MPI_Bcast, where the old buffer was received into new, different buffers. The mismatch in buffer data type/length produced garbage data, and it took me a while to find that bug.
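
For what it's worth, here is a minimal sketch of the Bcast-then-Barrier pattern described above (the program and variable names are just illustrative):

program bcast_then_barrier
include 'mpif.h'

integer ierr, tid, buf

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, tid, ierr)

if (tid .eq. 0) buf = 42

! all ranks participate in the broadcast from rank 0
call MPI_BCAST(buf, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)

! explicit synchronization: no rank leaves the barrier until
! every rank in MPI_COMM_WORLD has entered it
call MPI_BARRIER(MPI_COMM_WORLD, ierr)

write(*,*) tid, 'has', buf
call MPI_FINALIZE(ierr)

end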

Aw answered 19/10, 2015 at 10:5 Comment(3)
Hi MasterHD, I believe the reasons for the behavior you observed and the behavior that caused me to post the question originally are the same. MPI_Bcast is a collective MPI operation, which means that the MPI standard says all ranks must participate. If they do, then MPI_Bcast is a blocking operation. If they do not, then the MPI program is not compliant with the standard and you will see undefined behavior. In both of our cases, that undefined behavior was that the program proceeded merrily along its way as if nothing was wrong.Sphygmomanometer
The point I was trying to make is that if MPI_Bcast was truly a blocking method then calling it on ONLY the root process would immediately hang that root process without allowing it to continue. Was that your question or something else?Aw
Also, @MasterHD, if your MPI_Bcast call had sufficiently large buffers that it was trying to send, it probably would have blocked on the root process as well. It only completed because you were probably sending small amounts of data that were buffered internally.Sprage
