I am trying to send data from process 0 to process 1. The program succeeds when the buffer is smaller than 64 KB, but hangs once the buffer grows past roughly that size.
The following code should reproduce the issue (it hangs with n = 8200), but it succeeds if n is reduced to 8000 or less.
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]){
        int world_size, world_rank, count;
        MPI_Status status;

        MPI_Init(NULL, NULL);
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
        if(world_size < 2){
            printf("Please add another process\n");
            exit(1);
        }

        int n = 8200;  /* hangs; succeeds with n = 8000 */
        double *d = malloc(sizeof(double)*n);  /* receive buffer */
        double *c = malloc(sizeof(double)*n);  /* send buffer (contents do not matter here) */
        printf("malloc results %p %p\n", d, c);

        if(world_rank == 0){
            printf("sending\n");
            MPI_Send(c, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            printf("sent\n");
        }
        if(world_rank == 1){
            printf("recv\n");
            MPI_Recv(d, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Get_count(&status, MPI_DOUBLE, &count);
            printf("recved, count:%d source:%d tag:%d error:%d\n", count, status.MPI_SOURCE, status.MPI_TAG, status.MPI_ERROR);
        }

        free(d);
        free(c);
        MPI_Finalize();
        return 0;
    }
Output with n = 8200 (hangs after "sending"):

    malloc results 0x1cb05f0 0x1cc0640
    recv
    malloc results 0x117d5f0 0x118d640
    sending

Output with n = 8000:

    malloc results 0x183c5f0 0x184c000
    recv
    malloc results 0x1ea75f0 0x1eb7000
    sending
    sent
    recved, count:8000 source:0 tag:0 error:0
I found this question and this question, which are similar, but I believe the issue in those cases is a deadlock. I would not expect a similar issue here because each process performs only a single send or a single receive.
EDIT: Added status checking.
EDIT2: It seems the issue was that I have OpenMPI installed, but installing MKL also installed Intel's MPI implementation. My code was being compiled against the OpenMPI headers and libraries, but launched with Intel's mpirun. Everything works as expected when I make sure to run with the mpirun executable from OpenMPI.