I have a large, MPI-parallel simulation application that produces large amounts of data, which I evaluate with a Python script.
I now need to run this application a large number of times (>1000) and calculate statistical properties from the resulting data.
My approach so far is to have a Python script running in parallel (via mpi4py, on e.g. 48 nodes) that calls the simulation code using subprocess.check_call.
I need this call to run my MPI simulation application in serial; the simulation itself does not need to run in parallel in this case.
The Python script can then analyze the data in parallel and, after finishing, start the next simulation run, until a large number of runs has been accumulated.
Goals are:
- not saving the whole data set from 2000 runs
- keeping intermediate data in memory (see the sketch below for the kind of accumulation I mean)
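To make the second goal concrete, the analysis step on each rank is meant to do something along these lines (a sketch only; load_run_result, the sample counts, and the reduction to rank 0 are illustrative placeholders, not code from my actual project):

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
nr_samples_per_rank = 10    # placeholder count
n_observables = 100         # placeholder size of one run's condensed output

def load_run_result():
    # stand-in for reading and condensing one simulation run's output
    return np.random.rand(n_observables)

# per-rank running sums: only these stay in memory, never all raw data sets
local_sum = np.zeros(n_observables)
local_sqsum = np.zeros(n_observables)
for sample in range(nr_samples_per_rank):
    data = load_run_result()
    local_sum += data
    local_sqsum += data * data

# combine across ranks only once, at the very end
total_sum = comm.reduce(local_sum, op=MPI.SUM, root=0)
total_sqsum = comm.reduce(local_sqsum, op=MPI.SUM, root=0)
if comm.Get_rank() == 0:
    n_total = nr_samples_per_rank * comm.Get_size()
    mean = total_sum / n_total
    var = total_sqsum / n_total - mean * mean
    print "mean[0] = %g, var[0] = %g" % (mean[0], var[0])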
Stub MWE:
file multi_call_master.py:
from mpi4py import MPI
import subprocess
print "Master hello"
call_string = 'python multi_call_slave.py'
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print "rank %d of size %d in master calling: %s" % (rank, size, call_string)
std_outfile = "./sm_test.out"
nr_samples = 1
for samples in range(0, nr_samples):
    with open(std_outfile, 'w') as out:
        subprocess.check_call(call_string, shell=True, stdout=out)
    # analyze_data()
    # communicate_results()
file multi_call_slave.py (standing in for the C simulation code):
from mpi4py import MPI
print "Slave hello"
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print "rank %d of size %d in slave" % (rank, size)
This does not work. Resulting output on stdout:
Master hello
rank 1 of size 2 in master calling: python multi_call_slave_so.py
Master hello
rank 0 of size 2 in master calling: python multi_call_slave_so.py
[cli_0]: write_line error; fd=7 buf=:cmd=finalize
:
system msg for write_line failure : Broken pipe
Fatal error in MPI_Finalize: Other MPI error, error stack:
MPI_Finalize(311).....: MPI_Finalize failed
MPI_Finalize(229).....:
MPID_Finalize(150)....:
MPIDI_PG_Finalize(126): PMI_Finalize failed, error -1
[cli_1]: write_line error; fd=8 buf=:cmd=finalize
:
system msg for write_line failure : Broken pipe
Fatal error in MPI_Finalize: Other MPI error, error stack:
MPI_Finalize(311).....: MPI_Finalize failed
MPI_Finalize(229).....:
MPID_Finalize(150)....:
MPIDI_PG_Finalize(126): PMI_Finalize failed, error -1
Resulting output in sm_test.out:
Slave hello
rank 0 of size 2 in slave
The reason is that the subprocess assumes it is being run as part of a parallel application, whereas I intend to run it as a serial application. As a very "hacky" workaround I did the following:
- Compile all needed MPI-aware libraries against one specific MPI distribution, e.g. Intel MPI
- Compile the simulation code against a different MPI library, e.g. OpenMPI
If I now start my parallel Python script using Intel MPI, the underlying simulation is not aware of the surrounding parallel environment, since it uses a different library.
This worked fine for a while, but unfortunately it is not very portable and is difficult to maintain on different clusters for various reasons.
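(What makes the subprocess believe it is part of the parallel run is, as far as I can tell, the process-manager environment it inherits from the launcher. Adding something like the following to the slave prints the PMI_* variables handed down; the PMI_ prefix is specific to MPICH-derived implementations such as Intel MPI, while OpenMPI uses OMPI_* instead.)

import os
# diagnostic only: show the process-manager variables inherited from the launcher
for key in sorted(os.environ):
    if key.startswith('PMI_'):
        print "%s=%s" % (key, os.environ[key])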
I could
- put the subprocess calling loop into a shell script using srun
  - would mandate buffering results on disk
- use some kind of MPI_Comm_spawn technique in Python
  - not meant to be used like that
  - difficult to find out whether the subprocess has finished
  - probably changes to the C code necessary
- somehow trick the subprocess into not forwarding the MPI information
  - tried manipulating the environment variables to no avail (roughly as sketched below)
  - also not meant to be used like that
- using mpirun -n 1 or srun for the subprocess call does not help either
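The environment manipulation I tried was along these lines (a reconstructed sketch; the exact set of prefixes to strip is implementation-specific, and for me it did not lead to a reliably serial child):

import os
import subprocess

call_string = 'python multi_call_slave.py'   # same call as in the MWE above

# copy of the environment with the launcher/process-manager variables removed,
# hoping the child's MPI_Init then comes up as an independent singleton run
# instead of attaching itself to the surrounding job
stripped = dict((k, v) for k, v in os.environ.items()
                if not k.startswith(('PMI_', 'MPICH_', 'OMPI_', 'HYDRA_')))

with open('./sm_test.out', 'w') as out:
    subprocess.check_call(call_string, shell=True, stdout=out, env=stripped)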
Is there any elegant, official way of doing this? I am really out of ideas and appreciate any input!
I went down the MPI_Comm_spawn route yesterday and had the I/O problem, which I will solve using an intermediate file-piping bash script. For finish detection I will be forced to modify the C code, as an intermediate bash script does not work: it would involve multiple MPI_Init calls or multiple MPI_Comm_spawn calls... This is all really messy, and I do not understand why this is not properly defined in the MPI interface standard. – Ollie
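For reference, the Spawn-based variant mentioned above would look roughly like the following in mpi4py (file names and the token-based finish detection are illustrative only; in reality the child is the C simulation, which is why changes to the C code seem unavoidable):

file spawn_master.py:
from mpi4py import MPI
import sys

# each master rank spawns its own single-process child job
child = MPI.COMM_SELF.Spawn(sys.executable, args=['spawn_child.py'], maxprocs=1)

# finish detection: wait for a token the child sends back over the
# intercommunicator right before it disconnects
result = child.recv(source=0, tag=0)
child.Disconnect()
print "child finished: %r" % (result,)

file spawn_child.py (Python stand-in for the C simulation):
from mpi4py import MPI

parent = MPI.Comm.Get_parent()
# ... run the actual simulation here ...
parent.send('done', dest=0, tag=0)
parent.Disconnect()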