I have to implement an MPI program. There are some global variables (4 arrays of float numbers and 6 other single float variables) which are first initialized by the main process, which reads the data from a file. Then I call MPI_Init and, while the process of rank 0 waits for results, the other processes (ranks 1, 2, 3, 4) work on the arrays, etc. The problem is that those arrays no longer seem to be initialized; everything is set to 0. I tried moving the global variables inside the main function, but the result is the same. When MPI_Init() is called, all processes are created by fork, right? So every one of them has a copy of the parent's memory, so why do they see uninitialized arrays?
I fear you have misunderstood.
It is probably best to think of each MPI process as an independent program, albeit one with the same source code as every other process in the computation. Operations that process 0 carries out on variables in its address space have no impact on the contents of the address spaces of other processes.
I'm not sure that the MPI standard even requires process 0 to have values for variables which were declared and initialised prior to the call to mpi_init, that is, before process 0 really exists.
Whether it does or not, you will have to write code to get the values into the variables in the address space of the other processes. One way to do this would be to have process 0 send the values to the other processes, either one by one or using a broadcast. Another way would be for all processes to read the values from the input files; if you choose this option, watch out for contention over I/O resources.
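As a minimal sketch of the broadcast approach (the names a, b, c, d, s0..s5, N and the read_input_file stand-in are placeholders, not anything from your actual program): rank 0 reads the file after MPI_Init, and then MPI_Bcast distributes the values to every rank.

    #include <stdio.h>
    #include <mpi.h>

    #define N 1024              /* placeholder array length */

    /* Placeholder names for the data described in the question. */
    float a[N], b[N], c[N], d[N];
    float s0, s1, s2, s3, s4, s5;

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* read_input_file() is a stand-in for whatever file reading
               the original program does; only rank 0 has the data here. */
            /* read_input_file(a, b, c, d, &s0, &s1, &s2, &s3, &s4, &s5); */
        }

        /* Every rank calls MPI_Bcast; rank 0 sends, the others receive. */
        MPI_Bcast(a, N, MPI_FLOAT, 0, MPI_COMM_WORLD);
        MPI_Bcast(b, N, MPI_FLOAT, 0, MPI_COMM_WORLD);
        MPI_Bcast(c, N, MPI_FLOAT, 0, MPI_COMM_WORLD);
        MPI_Bcast(d, N, MPI_FLOAT, 0, MPI_COMM_WORLD);

        /* Pack the six scalars into one buffer to broadcast them together. */
        float scalars[6] = { s0, s1, s2, s3, s4, s5 };
        MPI_Bcast(scalars, 6, MPI_FLOAT, 0, MPI_COMM_WORLD);
        s0 = scalars[0]; s1 = scalars[1]; s2 = scalars[2];
        s3 = scalars[3]; s4 = scalars[4]; s5 = scalars[5];

        /* ... every rank now holds the initialised data ... */

        MPI_Finalize();
        return 0;
    }

Note that MPI_Bcast is a collective call: every rank in the communicator executes the same line, and rank 0 acts as the root that supplies the data.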
In passing, I don't think it is common for MPI implementations to create processes by forking at the call to mpi_init. Most MPI implementations actually create the processes when you launch the program with mpiexec; the call to mpi_init is the formality which announces that your program is starting its parallel computations.
When MPI_Init() is called all processes are created by fork right?
Wrong.
MPI spawns multiple instances of your program. These instances are separate processes, each with its own memory space. Each process has its own copy of every variable, including globals. MPI_Init() only initializes the MPI environment so that other MPI functions can be called.
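A tiny illustration of this, using a made-up global counter: a change that rank 0 makes after MPI_Init is invisible to every other rank, because each rank holds its own copy.

    #include <stdio.h>
    #include <mpi.h>

    int counter = 0;   /* a global: every rank gets its own copy */

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0)
            counter = 42;          /* only rank 0's copy changes */

        /* Every rank other than 0 still prints 0. */
        printf("rank %d sees counter = %d\n", rank, counter);

        MPI_Finalize();
        return 0;
    }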
As the other answers say, that's not how MPI works. Data is unique to each process and must be explicitly transferred between processes using the API available in the MPI specification.
However, there are programming models that allow this sort of behavior. If, when you say parallel computing, you mean multiple cores on one processor, you might be better served by using something like OpenMP to share your data between threads.
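For instance, a minimal OpenMP sketch in C (the array name and size are arbitrary): the array is initialised once, and all threads read it directly, with no copies and no messages.

    #include <stdio.h>
    #include <omp.h>

    #define N 1000

    int main(void)
    {
        static float data[N];          /* one array, shared by all threads */

        for (int i = 0; i < N; i++)    /* initialise once, before the parallel region */
            data[i] = (float)i;

        float sum = 0.0f;
        /* All threads see the same data[]; the reduction combines their partial sums. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += data[i];

        printf("sum = %f\n", sum);
        return 0;
    }

Compile with your compiler's OpenMP flag (e.g. -fopenmp with gcc).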
Alternatively, if you do in fact need to use multiple processors (either because your data is too big to fit in one processor's memory, or some other reason), you can take a look at one of the Partitioned Global Address Space (PGAS) languages. In those models, you have memory that is globally available to all processes in an execution.
Last, there is a part of MPI that does allow you to expose memory from one process to other processes. It's the Remote Memory Access (RMA) or One-Sided chapter. It can be complex, but powerful if that's the kind of computing model you need.
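As a rough sketch of the one-sided style (the buffer name and size are made up): rank 0 exposes a buffer through an MPI window, and the other ranks pull its contents with MPI_Get between two fences instead of waiting for a matching send.

    #include <stdio.h>
    #include <mpi.h>

    #define N 8

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        float buf[N];
        for (int i = 0; i < N; i++)
            buf[i] = (rank == 0) ? (float)i : 0.0f;   /* only rank 0 has data */

        /* Expose buf as a window; rank 0's window is the one the others read. */
        MPI_Win win;
        MPI_Win_create(buf, N * sizeof(float), sizeof(float),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);
        if (rank != 0)
            /* Pull N floats from rank 0's window into the local buffer. */
            MPI_Get(buf, N, MPI_FLOAT, 0, 0, N, MPI_FLOAT, win);
        MPI_Win_fence(0, win);

        printf("rank %d: buf[1] = %f\n", rank, buf[1]);   /* 1.0 on every rank */

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }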
All of these models will require changing the way your application works, but it sounds like they might map to your problem better.