In-place mpi_reduce crashes with Open MPI

Whenever I try to call mpi_reduce with mpi_in_place as the send buffer it crashes. A trawl of Google reveals this to have been a problem on Mac OS for OMPI 1.3.3, but I'm on CentOS with OMPI 1.6.3 (with gfortran 4.4.6).

The following program crashes:

PROGRAM reduce

  USE mpi

  IMPLICIT NONE

  REAL, DIMENSION(2, 3) :: buffer, gbuffer

  INTEGER :: ierr, me_world
  INTEGER :: buf_shape(2), counts

  CALL mpi_init(ierr)
  CALL mpi_comm_rank(mpi_comm_world, me_world, ierr)

  buffer = 1.
  IF (me_world .EQ. 0) PRINT*, "buffer: ", buffer

  buf_shape = SHAPE(buffer)
  counts = buf_shape(1)*buf_shape(2)

  CALL mpi_reduce(MPI_IN_PLACE, buffer, counts, mpi_real, mpi_sum, 0, mpi_comm_world, ierr)
  IF (me_world .EQ. 0) PRINT*, "buffer: ", buffer

  CALL mpi_finalize(ierr)

END PROGRAM reduce

The MPI error is:

MPI_ERR_ARG: invalid argument of some other kind

which is not very helpful.

Am I missing something as to how mpi_reduce should be called? Does this work with other compilers/MPI implementations?

Plebeian answered 19/7, 2013 at 8:36 Comment(0)

You are missing a very important part of how the in-place reduction operation works in MPI. The standard's description of MPI_REDUCE says (note the part about the root process):

When the communicator is an intracommunicator, you can perform a reduce operation in-place (the output buffer is used as the input buffer). Use the variable MPI_IN_PLACE as the value of the root process sendbuf. In this case, the input data is taken at the root from the receive buffer, where it will be replaced by the output data.

The other processes still have to supply their local buffers as sendbuf, not MPI_IN_PLACE:

IF (me_world == 0) THEN
  CALL mpi_reduce(MPI_IN_PLACE, buffer, counts, MPI_REAL, MPI_SUM, 0, MPI_COMM_WORLD, ierr)
ELSE
  CALL mpi_reduce(buffer, buffer, counts, MPI_REAL, MPI_SUM, 0, MPI_COMM_WORLD, ierr)
END IF

You can safely pass buffer as both sendbuf and recvbuf in non-root processes since MPI_REDUCE does not write to recvbuf in those processes.
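For reference, this is roughly what the original program looks like with that fix applied (a sketch only, keeping the question's style; the unused gbuffer is dropped):

PROGRAM reduce_fixed

  USE mpi

  IMPLICIT NONE

  REAL, DIMENSION(2, 3) :: buffer

  INTEGER :: ierr, me_world
  INTEGER :: buf_shape(2), counts

  CALL mpi_init(ierr)
  CALL mpi_comm_rank(MPI_COMM_WORLD, me_world, ierr)

  buffer = 1.

  buf_shape = SHAPE(buffer)
  counts = buf_shape(1)*buf_shape(2)

  ! Only the root passes MPI_IN_PLACE as the send buffer;
  ! every other rank passes its local buffer instead.
  IF (me_world .EQ. 0) THEN
    CALL mpi_reduce(MPI_IN_PLACE, buffer, counts, MPI_REAL, MPI_SUM, 0, MPI_COMM_WORLD, ierr)
  ELSE
    CALL mpi_reduce(buffer, buffer, counts, MPI_REAL, MPI_SUM, 0, MPI_COMM_WORLD, ierr)
  END IF

  IF (me_world .EQ. 0) PRINT*, "buffer: ", buffer

  CALL mpi_finalize(ierr)

END PROGRAM reduce_fixed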

Califate answered 19/7, 2013 at 11:17 Comment(4)
Thanks, that fixed it! I misread the MPI_IN_PLACE documentation; I thought collective communications had to be called by all processes with exactly the same arguments. – Plebeian
Oops, are you really sure? I just looked into my production code: it has many instances of the wrong usage, and I have not encountered a problem so far. – Exploratory
@VladimirF, there are some collective operations where MPI_IN_PLACE has to be specified as the send buffer by all ranks, e.g. MPI_ALLTOALL or MPI_ALLREDUCE. The standard lists the proper usage for each operation separately. – Califate
@HristoIliev Thanks, you are right! They are mostly Allreduce, and the few instances of Reduce were treated correctly. – Exploratory
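To illustrate the distinction raised in the comments: with MPI_ALLREDUCE the in-place form is requested by every rank, not just the root. A minimal sketch, reusing buffer and counts from the program above:

! Unlike MPI_REDUCE, the in-place form of MPI_ALLREDUCE is used by every
! rank: all of them pass MPI_IN_PLACE as sendbuf, and buffer both supplies
! the input and receives the result everywhere.
CALL mpi_allreduce(MPI_IN_PLACE, buffer, counts, MPI_REAL, MPI_SUM, MPI_COMM_WORLD, ierr)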
