Python with embedded call to mpirun

I am trying to run a parallel optimization using pyOpt. The tricky part is that within my objective function, I also want to run an MPI-parallel C++ code.

My python script is the following:

#!/usr/bin/env python    
# Standard Python modules
import os, sys, time, math
import subprocess


# External Python modules
try:
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    myrank = comm.Get_rank()
except:
    raise ImportError('mpi4py is required for parallelization')

# Extension modules
from pyOpt import Optimization
from pyOpt import ALPSO

# Predefine the BashCommand
RunCprogram = "mpirun -np 2 CProgram" # Parallel C++ program


######################### 
def objfunc(x):

    f = -(((math.sin(2*math.pi*x[0])**3)*math.sin(2*math.pi*x[1]))/((x[0]**3)*(x[0]+x[1])))

    # Run CProgram 
    os.system(RunCprogram) #where the mpirun call occurs

    g = [0.0]*2
    g[0] = x[0]**2 - x[1] + 1
    g[1] = 1 - x[0] + (x[1]-4)**2

    time.sleep(0.01)
    fail = 0
    return f,g, fail

# Instantiate Optimization Problem 
opt_prob = Optimization('Thermal Conductivity Optimization',objfunc)
opt_prob.addVar('x1','c',lower=5.0,upper=1e-6,value=10.0)
opt_prob.addVar('x2','c',lower=5.0,upper=1e-6,value=10.0)
opt_prob.addObj('f')
opt_prob.addCon('g1','i')
opt_prob.addCon('g2','i')

# Solve Problem (DPM-Parallelization)
alpso_dpm = ALPSO(pll_type='DPM')
alpso_dpm.setOption('fileout',0)
alpso_dpm(opt_prob)
print opt_prob.solution(0)

I run that code using:

mpirun -np 20 python Script.py

However, I am getting the following error:

[user:28323] *** Process received signal ***
[user:28323] Signal: Segmentation fault (11)
[user:28323] Signal code: Address not mapped (1)
[user:28323] Failing at address: (nil)
[user:28323] [ 0] /lib64/libpthread.so.0() [0x3ccfc0f500]
[user:28323] *** End of error message ***

I figure that the two different mpirun calls (the one launching the Python script and the one inside the script) are conflicting with each other. Any lead on how to solve that?

Thank you!!

Asked 21/5, 2015 at 21:26

Comment: Do you exchange data between the Python processes using MPI communications, or are you just using mpi4py to run multiple isolated instances? If it is the latter, you may consider using the subprocess module in Python to spawn multiple threads, each of which can call an mpirun instance (using subprocess.Popen). I do this frequently and have had no problems. If you're running Script.py across multiple machines, this may not be possible.

See Calling mpi binary in serial as subprocess of mpi application: the safest way to go is to use MPI_Comm_spawn(). Take a look at this manager-worker example, for instance.
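
For reference, a minimal sketch of that approach on the Python side with mpi4py could look like the following (the wrapper name run_cprogram_spawned is just illustrative, and the spawned C++ program would additionally have to retrieve the parent communicator with MPI_Comm_get_parent() and disconnect from it before MPI_Finalize()):

from mpi4py import MPI

# Sketch: replace the nested "mpirun" call with MPI_Comm_spawn.
# Assumes ./CProgram is the MPI-enabled C++ binary from the question.
def run_cprogram_spawned(nworkers=2):
    # Spawn nworkers copies of the worker binary; Spawn returns an intercommunicator.
    child = MPI.COMM_SELF.Spawn('./CProgram', args=[], maxprocs=nworkers)
    # ...data could be exchanged with the workers over `child` here...
    # Both sides have to disconnect (the C++ side via MPI_Comm_disconnect on its parent comm).
    child.Disconnect()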

A quick fix would be to use subprocess.Popen as suggested by @EdSmith in the comment above. Notice, however, that by default subprocess.Popen uses the parent's environment (my guess is that the same holds for os.system()). Unfortunately, mpirun adds some environment variables, depending on the MPI implementation, such as OMPI_COMM_WORLD_RANK or OMPI_MCA_orte_ess_num_procs. To see these environment variables, type import os ; print os.environ in an mpi4py code and in a basic Python shell (see also the inspection sketch after the snippet below). These variables may cause the subprocess to fail, so I had to add a line to get rid of them... which is rather dirty... It boils down to:

    # Split the command so it can be passed to Popen without a shell
    args = shlex.split(RunCprogram)

    # Copy the environment, dropping every variable added by mpirun
    # (anything with "MPI" in its name)... rather dirty...
    new_env = {k: v for k, v in os.environ.iteritems() if "MPI" not in k}

    p = subprocess.Popen(args, env=new_env,
                         stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    p.wait()
    result = "process myrank " + str(myrank) + " got " + p.stdout.read()
    print result
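
To see concretely which variables your mpirun adds, one option is to print a filtered view of the environment from within the mpirun-launched Python process (the OMPI_ prefix is specific to Open MPI; the PMI prefix below is only an assumption for other launchers):

import os

# Print the environment variables that look MPI-related
for k in sorted(os.environ):
    if "MPI" in k or k.startswith("PMI"):
        print k, "=", os.environ[k]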

Complete test code, run with mpirun -np 2 python opti.py:

#!/usr/bin/env python    
# Standard Python modules
import os, sys, time, math
import subprocess
import shlex


# External Python modules
try:
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    myrank = comm.Get_rank()
except:
    raise ImportError('mpi4py is required for parallelization')

# Predefine the BashCommand
RunCprogram = "mpirun -np 2 main" # Parallel C++ program


######################### 
def objfunc(x):

    f = -(((math.sin(2*math.pi*x[0])**3)*math.sin(2*math.pi*x[1]))/((x[0]**3)*(x[0]+x[1])))

    # Run CProgram 
    #os.system(RunCprogram) #where the mpirun call occurs
    # Split the command so it can be passed to Popen without a shell
    args = shlex.split(RunCprogram)
    # Copy the environment, dropping every variable added by mpirun
    new_env = {k: v for k, v in os.environ.iteritems() if "MPI" not in k}

    p = subprocess.Popen(args, env=new_env,
                         stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    p.wait()
    result = "process myrank " + str(myrank) + " got " + p.stdout.read()
    print result



    g = [0.0]*2
    g[0] = x[0]**2 - x[1] + 1
    g[1] = 1 - x[0] + (x[1]-4)**2

    time.sleep(0.01)
    fail = 0
    return f,g, fail

print objfunc([1.0,0.0])

Basic worker, compiled with mpiCC main.cpp -o main:

#include "mpi.h"
#include <iostream>

int main(int argc, char* argv[]) { 
    int rank, size;

    MPI_Init (&argc, &argv);    
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);  
    MPI_Comm_size (MPI_COMM_WORLD, &size);  

    if(rank==0){
        std::cout<<" size "<<size<<std::endl;
    }
    MPI_Finalize();

    return 0;

}
Answered 24/5, 2015 at 13:19
