OpenMPI MPI_Barrier problems

I am having some synchronization issues using the OpenMPI implementation of MPI_Barrier:

int rank;
int nprocs;

int rc = MPI_Init(&argc, &argv);

if(rc != MPI_SUCCESS) {
    fprintf(stderr, "Unable to set up MPI");
    MPI_Abort(MPI_COMM_WORLD, rc);
}

MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);


printf("P%d\n", rank);
fflush(stdout);

MPI_Barrier(MPI_COMM_WORLD);

printf("P%d again\n", rank);

MPI_Finalize();

When run with mpirun -n 2 ./a.out,

the output should be: P0 P1 ...

but the output is sometimes: P0 P0 again P1 P1 again

What's going on?

Surfboarding answered 3/3, 2011 at 14:30 Comment(0)
N
16

The order in which your printed lines appear on your terminal is not necessarily the order in which they were printed. All processes write to a shared resource (stdout), so an ordering problem can always occur. (And fflush doesn't help here; stdout is line-buffered anyhow.)

You could try to prefix your output with a timestamp and save all of this to different files, one per MPI process.

Then to inspect your log you could merge the two files together and sort according to the timestamp.

Your problem should disappear, then.
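
The merge-and-sort step could look roughly like the sketch below. It assumes every log line starts with a timestamp in seconds followed by a space (a layout the answer does not specify) and that the clocks of all processes are comparable, which, as the comments on this answer point out, is not guaranteed across nodes. The names format_log_line, compare_by_timestamp and sort_log_lines are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed line layout (not specified in the answer): "<seconds> <text>",
 * for example "12.003417 P1 again". */

/* Format one log line with a leading timestamp; returns snprintf's result. */
int format_log_line(char *buf, size_t len, double seconds, const char *text)
{
    return snprintf(buf, len, "%.6f %s", seconds, text);
}

/* qsort comparator: order lines by their leading numeric timestamp. */
int compare_by_timestamp(const void *a, const void *b)
{
    const char *line_a = *(const char *const *)a;
    const char *line_b = *(const char *const *)b;
    double ta = strtod(line_a, NULL);
    double tb = strtod(line_b, NULL);
    return (ta > tb) - (ta < tb);
}

/* Sort the merged lines of all per-process logs in place by timestamp.
 * Lines with equal timestamps end up in an unspecified relative order. */
void sort_log_lines(const char **lines, size_t count)
{
    qsort(lines, count, sizeof *lines, compare_by_timestamp);
}
```

Each process would write lines built with format_log_line to its own file; afterwards the lines of all files are read into one array and passed to sort_log_lines. On the command line, running sort -n over the concatenated files does the same job.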

Neu answered 3/3, 2011 at 16:33 Comment(4)
Relying on timestamps may not be ideal if the MPI processes are running on different nodes, unless you can guarantee that the clocks are synced. - Coastguardsman
@Shawn: There's MPI_Wtime() for that. - Nadya
@suszterpatt: MPI_Wtime() is usually NOT a global/synchronized clock! (It only is if MPI_WTIME_IS_GLOBAL is defined and true.) - Motorcycle
AFAIK, MPI_Wtime in OpenMPI is not synchronised. - Coastguardsman
C
12

There is nothing wrong with MPI_Barrier().

As Jens mentioned, the reason you are not seeing the output you expected is that stdout is buffered on each process. There is no guarantee that prints from multiple processes will be displayed on the calling process in order. (If stdout from each process were transferred to the main process for printing in real time, that would lead to lots of unnecessary communication!)

If you want to convince yourself that the barrier works, you could try writing to a file instead. Having multiple processes write to a single file may lead to extra complications, so you could have each proc write to its own file and then, after the barrier, swap the files they write to. For example:

    Proc-0           Proc-1
      |                 |
 f0.write(..)     f1.write(...) 
      |                 |
      x  ~~ barrier ~~  x
      |                 |
 f1.write(..)     f0.write(...) 
      |                 |
     END               END

Sample implementation:

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

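int main(int argc, char **argv) {
    char filename[32];
    int rank, size;
    FILE *fp;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank < 2) { /* proc 0 and 1 only: write to own file */
        snprintf(filename, sizeof filename, "file_%d.out", rank);
        fp = fopen(filename, "w");
        if (fp == NULL) {
            perror(filename);
            MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
        }
        fprintf(fp, "P%d: before Barrier\n", rank);
        fclose(fp); /* closing flushes the line before the barrier */
    }

    MPI_Barrier(MPI_COMM_WORLD);

    if (rank < 2) { /* proc 0 and 1 only: append to the other proc's file */
        snprintf(filename, sizeof filename, "file_%d.out", (rank == 0) ? 1 : 0);
        fp = fopen(filename, "a");
        if (fp == NULL) {
            perror(filename);
            MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
        }
        fprintf(fp, "P%d: after Barrier\n", rank);
        fclose(fp);
    }

    MPI_Finalize();
    return 0;
}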

After running the code, you should get the following results:

[me@home]$ cat file_0.out
P0: before Barrier
P1: after Barrier

[me@home]$ cat file_1.out
P1: before Barrier
P0: after Barrier

In every file, the "after Barrier" line will always appear after the "before Barrier" line.

Coastguardsman answered 3/3, 2011 at 17:38 Comment(0)
T
5

Output ordering is not guaranteed in MPI programs.

This is not related to MPI_Barrier at all.

Also, I would not spend too much time on worrying about output ordering with MPI programs.

The most elegant way to achieve this, if you really want to, is to let the processes send their messages to one rank, say, rank 0, and let rank 0 print the output in the order it received them or ordered by ranks.

Again, don't spend too much time trying to order the output from MPI programs. It is not practical and is of little use.

Theorist answered 3/3, 2011 at 22:42 Comment(0)
V
0

Adding to the previous answers here, your MPI_Barrier works fine.

If you just want to see it working, though, you can pause execution for a moment (for example, with sleep(1)) to let the output catch up.

Violence answered 29/7, 2020 at 20:27 Comment(0)
