How do I retain precision for a Fortran MPI program in a portable way?

I have a Fortran program where I specify the kind of the numeric data types in an attempt to retain a minimum level of precision, regardless of what compiler is used to build the program. For example:

integer, parameter :: rsp = selected_real_kind(4)
...
real(kind=rsp) :: real_var

The problem is that I have used MPI to parallelize the code, and I need to make sure the MPI communications specify the same type with the same precision. I was using the following to stay consistent with the kind declarations in my program:

call MPI_Type_create_f90_real(4,MPI_UNDEFINED,rsp_mpi,mpi_err)
...
call MPI_Send(real_var,1,rsp_mpi,dest,tag,MPI_COMM_WORLD,err)

However, I have found that this MPI routine is not particularly well supported across MPI implementations, so it is actually making my program non-portable. If I omit the MPI_Type_create routine, then I'm left to rely on the standard MPI_REAL and MPI_DOUBLE_PRECISION data types, but what if those types are not consistent with what selected_real_kind picks as the real kind that will ultimately be passed around by MPI? Am I stuck just using the standard real declaration, with no kind attribute, and, if I do that, am I guaranteed that MPI_REAL and real will always have the same precision, regardless of compiler and machine?

UPDATE:

I created a simple program that demonstrates the issue I see when my internal reals have higher precision than what is afforded by the MPI_DOUBLE_PRECISION type:

program main

   use mpi

   implicit none

   integer, parameter :: rsp = selected_real_kind(16)
   integer :: err
   integer :: rank

   real(rsp) :: real_var

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD,rank,err)

   if (rank.eq.0) then
      real_var = 1.123456789012345
      call MPI_Send(real_var,1,MPI_DOUBLE_PRECISION,1,5,MPI_COMM_WORLD,err)
   else
      call MPI_Recv(real_var,1,MPI_DOUBLE_PRECISION,0,5,MPI_COMM_WORLD,&
         MPI_STATUS_IGNORE,err)
   end if

   print *, rank, real_var

   call MPI_Finalize(err)

end program main

If I build and run with 2 cores, I get:

       0   1.12345683574676513672      
       1   4.71241976735884452383E-3998

Now change the 16 to a 15 in selected_real_kind and I get:

       0   1.1234568357467651     
       1   1.1234568357467651  

Is it always going to be safe to use selected_real_kind(15) with MPI_DOUBLE_PRECISION no matter what machine/compiler is used to do the build?

Demi answered 1/11, 2013 at 12:11 Comment(9)
The code I'm using defines its working precision as wp = selected_real_kind(14,40) and then uses MPI_DOUBLE_PRECISION in the MPI calls. As far as I can tell, there is no issue with it.Nonobservance
I added a simple example of what problem I need to avoid. So I'm looking to find out, will my reals always be compatible with MPI_DOUBLE_PRECISION if I define their kind explicitly?Demi
Last I checked, using 16 signifies quad precision (hence the -3998 range) which should not work with MPI_DOUBLE_PRECISION. As long as you use between 7 and 15 (inclusive), you ought to be fine to use MPI_DOUBLE_PRECISION.Nonobservance
Yes, you're right. But will MPI_DOUBLE_PRECISION always mean that the MPI communications are retaining between 7 and 15 digits of precision, regardless of system/compiler? In other words, if I specify my reals to have 14 digits precision, and then do my MPI communications with MPI_DOUBLE_PRECISION, am I ever going to have to be concerned about getting a result like the one I posted, where the number comes out as garbage on the receiving end?Demi
To my knowledge, you shouldn't have those concerns.Nonobservance
Instead of using selected_real_kind, you could use ISO_FORTRAN_ENV and its kind constants real32 and real64 (named for the number of bits) to portably specify single and double precision reals.Amphictyony
I believe that came in with the 2008 standard, which I'm unable to use at the moment. However, if I were able to use this standard, would I be guaranteed that real32 will always be compatible with MPI_REAL and real64 will always be compatible with MPI_DOUBLE_PRECISION?Demi
Only on CPUs supporting the IEEE floating-point standard, and provided no compiler switch changes the default precision. That means you are quite safe. Fortunately, for reals it is much safer than for integers. I believe most CPUs and coprocessors of the last 30 years support IEEE single and double precision, and compilers use them as the default real and double precision kinds.Conventioner
Thanks Vladimir. With that said, do you believe it is necessary to go through the trouble of explicitly setting the kind of a variable in Fortran in the first place? Is there much to be concerned about if I just declare my variable as real :: var as opposed to real(rsp) :: var?Demi
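For reference, a minimal sketch of the ISO_FORTRAN_ENV approach suggested in the comments above; it assumes a Fortran 2008 compiler on an IEEE machine, where real64 corresponds to MPI_DOUBLE_PRECISION as discussed:

program iso_kinds_demo

   use, intrinsic :: iso_fortran_env, only: real64
   use mpi

   implicit none

   ! Assumption: on an IEEE machine, real64 matches MPI_DOUBLE_PRECISION.
   integer, parameter :: wp = real64

   integer :: err
   integer :: rank
   real(wp) :: real_var

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD,rank,err)

   if (rank.eq.0) then
      real_var = 1.123456789012345_wp
      call MPI_Send(real_var,1,MPI_DOUBLE_PRECISION,1,5,MPI_COMM_WORLD,err)
   else
      call MPI_Recv(real_var,1,MPI_DOUBLE_PRECISION,0,5,MPI_COMM_WORLD,&
         MPI_STATUS_IGNORE,err)
   end if

   print *, rank, real_var

   call MPI_Finalize(err)

end program iso_kinds_demo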

Use the Fortran 2008 intrinsic STORAGE_SIZE to determine the number of bytes each number requires and send the data as MPI_BYTE. Note that STORAGE_SIZE returns the size in bits, so you will need to divide by 8 to get the size in bytes.

This solution works for moving data but does not help you use reductions. For that you will have to implement a user-defined reduction operation. If that's important to you, I will update my answer with the details.

For example:

program main

   use mpi

   implicit none

   integer, parameter :: rsp = selected_real_kind(16)
   integer :: err
   integer :: rank

   real(rsp) :: real_var

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD,rank,err)

   if (rank.eq.0) then
      real_var = 1.123456789012345
      call MPI_Send(real_var,storage_size(real_var)/8,MPI_BYTE,1,5,MPI_COMM_WORLD,err)
   else
      call MPI_Recv(real_var,storage_size(real_var)/8,MPI_BYTE,0,5,MPI_COMM_WORLD,&
         MPI_STATUS_IGNORE,err)
   end if

   print *, rank, real_var

   call MPI_Finalize(err)

end program main

I confirmed that this change corrects the problem and the output I see is:

   0   1.12345683574676513672      
   1   1.12345683574676513672  
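As noted above, this does not cover reductions; for those, one option is a user-defined operation created with MPI_Op_create. A minimal sketch, assuming the same rsp kind and MPI_BYTE counts as above (the rsp_ops module, the rsp_sum callback and its byte-to-element conversion are illustrative assumptions, not part of any library):

module rsp_ops

   implicit none

   integer, parameter :: rsp = selected_real_kind(16)

contains

   ! User-defined reduction callback: sums real(rsp) values that were
   ! passed to the reduction as MPI_BYTE. The length argument is the
   ! count from the reduction call, i.e. a number of bytes here, so it
   ! is converted back to a number of reals before summing.
   subroutine rsp_sum(invec, inoutvec, length, datatype)
      real(rsp) :: invec(*), inoutvec(*)
      integer :: length, datatype
      integer :: i, n
      n = length/(storage_size(invec(1))/8)
      do i = 1, n
         inoutvec(i) = inoutvec(i) + invec(i)
      end do
   end subroutine rsp_sum

end module rsp_ops

program reduce_demo

   use mpi
   use rsp_ops

   implicit none

   integer :: err, rank, op_sum
   real(rsp) :: x, total

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD,rank,err)

   ! Register the callback as a commutative user-defined operation.
   call MPI_Op_create(rsp_sum,.true.,op_sum,err)

   x = real(rank+1,rsp)
   call MPI_Allreduce(x,total,storage_size(x)/8,MPI_BYTE,op_sum,&
      MPI_COMM_WORLD,err)

   print *, rank, total

   call MPI_Op_free(op_sum,err)
   call MPI_Finalize(err)

end program reduce_demo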
Oyster answered 13/3, 2015 at 23:44 Comment(2)
Wouldn't this solution be problematic for data communication between systems with different endianness?Hermilahermina
Yes, but that situation is not supported by most implementations and is almost entirely irrelevant at this point. Do you know anyone trying to run MPI jobs across PowerPC and x86 or AArch64 nodes?Oyster

Not really an answer, but we have the same problem and use something like this:

!> Number of digits for single precision numbers
integer, parameter, public :: single_prec = 6
!> Number of digits for double precision numbers
integer, parameter, public :: double_prec = 15
!> Number of digits for extended double precision numbers
integer, parameter, public :: xdble_prec = 18
!> Number of digits for quadruple precision numbers
integer, parameter, public :: quad_prec = 33

integer, parameter, public :: rk_prec = double_prec

!> The kind to select for default reals
integer, parameter, public :: rk = selected_real_kind(rk_prec)

And then have an initialization routine where we do:

! What we would like to do (not reliably supported by all MPI implementations):
!call mpi_type_create_f90_real(rk_prec, MPI_UNDEFINED, rk_mpi, iError)
!call mpi_type_create_f90_integer(long_prec, long_k_mpi, iError)
! Workaround for MPI implementations that do not support these routines:
select case(rk_prec)
case(single_prec)
  rk_mpi = MPI_REAL
case(double_prec)
  rk_mpi = MPI_DOUBLE_PRECISION
case(quad_prec)
  rk_mpi = MPI_REAL16
case default
  write(*,*) 'unknown real type specified for mpi_type creation'
end select
long_k_mpi = MPI_INTEGER8

While this is not nice, it works reasonably well and seems to be usable on Cray, IBM BlueGene and conventional Linux clusters. The best thing to do is to push sites and vendors to properly support this in MPI. As far as I know, it has been fixed in OpenMPI and is planned to be fixed in MPICH 3.1.1. See OpenMPI tickets 3432 and 3435 as well as MPICH tickets 1769 and 1770.
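Once rk_mpi has been set, it can be passed wherever an MPI real datatype handle is expected, for example (a sketch; buffer is just a placeholder variable):

real(kind=rk) :: buffer(100)
...
call MPI_Bcast(buffer, size(buffer), rk_mpi, 0, MPI_COMM_WORLD, iError)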

Oddson answered 2/11, 2013 at 11:57 Comment(1)
You are right, my code worked fine using OpenMPI 1.4.3, but it is giving me problems with MPICH. I think I'm going to end up just falling back to using the default MPI Fortran data types, rather than trying to set the precision manually.Demi

How about:

integer, parameter :: DOUBLE_PREC = kind(0.0d0)
integer, parameter :: SINGLE_PREC = kind(0.0e0)

integer, parameter :: MYREAL = DOUBLE_PREC

integer :: MPIREAL

if (MYREAL .eq. DOUBLE_PREC) then
   MPIREAL = MPI_DOUBLE_PRECISION
else if (MYREAL .eq. SINGLE_PREC) then
   MPIREAL = MPI_REAL
else
   print *, "Error: can't figure out MPI precision."
   STOP
end if

and use MPIREAL instead of MPI_DOUBLE_PRECISION from then on.
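For instance, the send from the question's example (rank 0 to rank 1) would then read (a sketch assuming real_var is declared with kind MYREAL and err is the error integer):

real(MYREAL) :: real_var
...
call MPI_Send(real_var, 1, MPIREAL, 1, 5, MPI_COMM_WORLD, err)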

Dorothy answered 1/11, 2014 at 12:52 Comment(0)
