MPI_Waitall is failing

I wonder if anyone can shed some light on the MPI_Waitall function for me. I have a program passing information using MPI_Isend and MPI_Irecv. After all the sends and receives are complete, one process (in this case, process 0) will print a message. My Isend/Irecvs are working, but the message prints at some random point in the program, so I am trying to use MPI_Waitall to wait until all the requests are done before printing. I receive the following error message:

Fatal error in PMPI_Waitall: Invalid MPI_Request, error stack:
PMPI_Waitall(311): MPI_Waitall(count=16, req_array=0x16f70d0, status_array=0x16f7260) failed
PMPI_Waitall(288): The supplied request in array element 1 was invalid (kind=0)

Here is some relevant code:

MPI_Status *status;
MPI_Request *request;

MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
status = (MPI_Status *) malloc(numtasks * sizeof(MPI_Status));
request = (MPI_Request *) malloc(numtasks * sizeof(MPI_Request));

/* Generate Data to send */
//Isend/Irecvs look like this:
MPI_Isend(&data, count, MPI_INT, dest, tag, MPI_COMM_WORLD, &request[taskid]);
MPI_Irecv(&data, count, MPI_INT, source, tag, MPI_COMM_WORLD, &request[taskid]);
MPI_Wait(&request[taskid], &status[taskid]);

/* Calculations and such */

if (taskid == 0) {
        MPI_Waitall (numtasks, request, status);
        printf ("All done!\n");
}
MPI_Finalize();

Without the call to MPI_Waitall, the program runs cleanly, but the "All done" message prints as soon as process 0's Isend/Irecv messages complete, instead of after all Isend/Irecvs complete.

Thank you for any help you can provide.

Rom answered 15/3, 2013 at 20:1

You are only setting one element of the request array, namely request[taskid] (and, by the way, you overwrite the send request handle with the receive one, irrevocably losing the former). The remaining elements are never initialised and contain garbage, which is why MPI_Waitall complains about an invalid request. Remember, MPI is used to program distributed-memory machines and each MPI process has its own copy of the request array. Setting one element in rank taskid does not magically propagate the value to the other ranks, and even if it did, requests have only local validity. The proper implementation would be:

MPI_Status status[2];
MPI_Request request[2];

MPI_Init(&argc, &argv);
MPI_Comm_rank (MPI_COMM_WORLD, &taskid);
MPI_Comm_size (MPI_COMM_WORLD, &numtasks);

/* Generate Data to send */
//Isend/Irecvs look like this:
MPI_Isend (&data, count, MPI_INT, dest, tag, MPI_COMM_WORLD, &request[0]);
//          ^^^^
//           ||
//      data race !!
//           ||
//          vvvv
MPI_Irecv (&data, count, MPI_INT, source, tag, MPI_COMM_WORLD, &request[1]);
// Wait for both operations to complete
MPI_Waitall(2, request, status);

/* Calculations and such */

// Wait for all processes to reach this line in the code
MPI_Barrier(MPI_COMM_WORLD);

if (taskid == 0) {
  printf ("All done!\n");
}
MPI_Finalize();
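
If a rank genuinely needs a request array with unused slots, initialise every entry to MPI_REQUEST_NULL first; MPI_Waitall treats null requests as already complete, so the garbage values that triggered your "Invalid MPI_Request" error never arise. A minimal sketch, reusing the names from your code (senddata and recvdata are hypothetical separate buffers):

// Null requests are legal and MPI_Waitall simply skips them
for (int i = 0; i < numtasks; i++)
    request[i] = MPI_REQUEST_NULL;

MPI_Isend (&senddata, count, MPI_INT, dest, tag, MPI_COMM_WORLD, &request[0]);
MPI_Irecv (&recvdata, count, MPI_INT, source, tag, MPI_COMM_WORLD, &request[1]);

// Only the two real requests are actually waited on
MPI_Waitall (numtasks, request, status);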

By the way, there is a data race in your code. Both MPI_Isend and MPI_Irecv are using the same data buffer, which is incorrect. If you are simply trying to send the content of data to dest and then receive into it from source, then use MPI_Sendrecv_replace instead and forget about the non-blocking operations:

MPI_Status status;

MPI_Init(&argc, &argv);
MPI_Comm_rank (MPI_COMM_WORLD, &taskid);
MPI_Comm_size (MPI_COMM_WORLD, &numtasks);

/* Generate Data to send */
MPI_Sendrecv_replace (&data, count, MPI_INT, dest, tag, source, tag,
                      MPI_COMM_WORLD, &status);

/* Calculations and such */

// Wait for all processes to reach this line in the code
MPI_Barrier(MPI_COMM_WORLD);

if (taskid == 0) {
  printf ("All done!\n");
}
MPI_Finalize();
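
For completeness, if dest and source form a ring (each rank shifting its data to a neighbour), the call might look like the sketch below; the neighbour arithmetic is an assumption, since your code does not show how dest and source are computed:

int dest   = (taskid + 1) % numtasks;             // hypothetical right neighbour
int source = (taskid + numtasks - 1) % numtasks;  // hypothetical left neighbour
MPI_Sendrecv_replace (&data, count, MPI_INT, dest, tag, source, tag,
                      MPI_COMM_WORLD, &status);

MPI_Sendrecv_replace buffers the outgoing data internally, so it is safe to receive into the same data array, which is exactly what your non-blocking pair could not guarantee.
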
Centripetal answered 15/3, 2013 at 20:38
