Classify node processes together with MPI and Fortran

I am trying to make an implementation using MPI and Fortran that separates the processes which are on the same node into groups. Does MPI have a routine that can identify that?

I had the idea of separating these processes by their hostnames, which are identical for processes on the same node of the machine I am using. But I don't know whether that holds for all clusters.

Tommietommy answered 8/8, 2013 at 15:33 Comment(0)

You probably want to check out MPI_COMM_SPLIT_TYPE. It will allow you to split an existing communicator based on the split_type you pass in as a parameter:

int MPI_Comm_split_type(MPI_Comm comm, int split_type, int key,
                        MPI_Info info, MPI_Comm *newcomm)
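
Since the question is about Fortran, the matching Fortran binding from the MPI-3 standard (mpi module or mpif.h) is:

MPI_COMM_SPLIT_TYPE(COMM, SPLIT_TYPE, KEY, INFO, NEWCOMM, IERROR)
    INTEGER    COMM, SPLIT_TYPE, KEY, INFO, NEWCOMM, IERROR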

Right now, the only split_type is MPI_COMM_TYPE_SHARED, defined in the standard as:

This type splits the communicator into subcommunicators, each of which can create a shared memory region.

In practice that usually means exactly the set of processes running on the same node, but you'll have to double-check that it's true on your machine.
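
For illustration, here is a minimal Fortran sketch of how you might call it; the program name and the printed output are my own, and it assumes an MPI-3 library providing the mpi module:

program split_by_node
   use mpi
   implicit none
   integer :: ierr, world_rank, node_comm, node_rank, node_size

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, world_rank, ierr)

   ! key = world_rank keeps the original rank order inside each group
   call MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, &
                            world_rank, MPI_INFO_NULL, node_comm, ierr)

   call MPI_Comm_rank(node_comm, node_rank, ierr)
   call MPI_Comm_size(node_comm, node_size, ierr)
   print '(3(a,i0))', 'world rank ', world_rank, &
      ' is node-local rank ', node_rank, ' of ', node_size

   call MPI_Comm_free(node_comm, ierr)
   call MPI_Finalize(ierr)
end program split_by_node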

The other thing you need to know is that this is a new function in MPI-3, so it might not be available in all implementations of MPI. I know that it's available in MPICH and its derivatives. AFAIK, it's not available in the latest release of Open MPI. So make sure you have a version of MPI that actually supports it.

Boonie answered 8/8, 2013 at 16:33 Comment(5)
Wesley, thank you. It's a shame that this is not available for Open MPI. Still, let me make an example to see if I understood how this split type works. Suppose I have 32 processes on 4 8-core nodes. My intention is to create 4 groups of 8 processes, based on the shared-memory processes that are on the same node. For that I could use MPI_Comm_split_type. But using it I couldn't, for example, create 8 groups of 4 (putting 2 groups on each node), could I?Tommietommy
Yes, you can do that. If you want to make 8 groups of 4 where each group is local to a node, you'll need a second step: after the node-level split, call the existing MPI_COMM_SPLIT on each node communicator with a colour that puts half of the processes in one group and the other half in another (see the sketch after these comments). MPI_COMM_SPLIT has been around since MPI-1, so there should be plenty of tutorials out there for you to follow. You'll just need to add a bit for the new type additions.Boonie
OpenMPI sets environment variables OMPI_COMM_WORLD_LOCAL_RANK and OMPI_COMM_WORLD_LOCAL_SIZE that you can use to find out how many processes share a node ("local size").Georgianngeorgianna
I'm pretty sure that you can use this function now in Open MPI too.Boonie
You can use it with Open MPI 1.8.x. Things have improved since 2013 :)Rutland
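
To make the two-step split from the comments above concrete, here is a hedged Fortran sketch; the program name is my own, and it assumes 8 processes per node and an MPI-3 library:

program two_groups_per_node
   use mpi
   implicit none
   integer :: ierr, world_rank, node_comm, node_rank, colour, sub_comm

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, world_rank, ierr)

   ! Step 1: one communicator per shared-memory node (4 groups of 8)
   call MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, &
                            world_rank, MPI_INFO_NULL, node_comm, ierr)

   ! Step 2: halve each node communicator (8 groups of 4); the colour
   ! selects the group, node_rank keeps the rank ordering within it
   call MPI_Comm_rank(node_comm, node_rank, ierr)
   colour = node_rank / 4        ! 0 for local ranks 0-3, 1 for ranks 4-7
   call MPI_Comm_split(node_comm, colour, node_rank, sub_comm, ierr)

   ! ... work with sub_comm ...

   call MPI_Comm_free(sub_comm, ierr)
   call MPI_Comm_free(node_comm, ierr)
   call MPI_Finalize(ierr)
end program two_groups_per_node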

I've implemented a similar split function for systems whose environment does not provide MPI 3.0, and it works quite well on several clusters. It uses MPI_GET_PROCESSOR_NAME and relies on the fact that most cluster MPI implementations return the FQDN of the node as the result. Tested with Open MPI and Intel MPI (which is based on MPICH, so similar behaviour is to be expected from other MPICH derivatives). In pseudocode it works like this:

rank := MPI_COMM_RANK(communicator)
prev_rank := rank - 1; IF (prev_rank < 0) prev_rank := MPI_PROC_NULL
next_rank := rank + 1; IF (next_rank >= num_procs) next_rank := MPI_PROC_NULL

proc_name := MPI_GET_PROCESSOR_NAME

list := MPI_RECV(from prev_rank)
IF (list does not contain proc_name) THEN
   list := list + proc_name
END IF

colour := index of proc_name in list
key := rank

MPI_SEND(list to next_rank)

MPI_COMM_SPLIT(communicator, colour, key, newcomm)

This code basically builds a list of the unique MPI processor names (host names), and each process uses the position of its own processor name in that list as the colour for the usual split function. (A send to or receive from MPI_PROC_NULL is a no-op, so rank 0 starts with an empty list.) In my C implementation of the algorithm the list is simply a string: a concatenation of all items with a zero byte as separator. In Fortran one could use any character that is not normally allowed in a host name, e.g. ;. The string is then simply passed along as an array of MPI_CHAR (C) or MPI_CHARACTER (Fortran).
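
A Fortran rendering of the ring algorithm above might look like the following sketch; the subroutine name and the buffer handling are mine, and it assumes ';' never occurs in a host name:

subroutine split_by_hostname(comm, newcomm)
   use mpi
   implicit none
   integer, intent(in)  :: comm
   integer, intent(out) :: newcomm
   integer :: ierr, rank, nprocs, prev, next
   integer :: namelen, listlen, colour, idx, pos, sep
   integer :: status(MPI_STATUS_SIZE)
   character(len=MPI_MAX_PROCESSOR_NAME) :: myname
   character(len=:), allocatable :: list

   call MPI_Comm_rank(comm, rank, ierr)
   call MPI_Comm_size(comm, nprocs, ierr)
   prev = rank - 1; if (prev < 0) prev = MPI_PROC_NULL
   next = rank + 1; if (next >= nprocs) next = MPI_PROC_NULL

   call MPI_Get_processor_name(myname, namelen, ierr)

   ! Big enough to hold one ';'-terminated name per process
   allocate(character(len=nprocs*(MPI_MAX_PROCESSOR_NAME+1)) :: list)

   ! A receive from MPI_PROC_NULL completes immediately with a
   ! zero-length message, so rank 0 starts with an empty list
   call MPI_Recv(list, len(list), MPI_CHARACTER, prev, 0, comm, status, ierr)
   call MPI_Get_count(status, MPI_CHARACTER, listlen, ierr)

   ! colour = index of our host name in the ';'-separated list,
   ! appending the name first if it is not there yet
   colour = -1
   idx = 0
   pos = 1
   do while (pos <= listlen .and. colour < 0)
      sep = pos + index(list(pos:listlen), ';') - 1
      if (list(pos:sep-1) == myname(1:namelen)) colour = idx
      idx = idx + 1
      pos = sep + 1
   end do
   if (colour < 0) then
      list(listlen+1:listlen+namelen+1) = myname(1:namelen) // ';'
      listlen = listlen + namelen + 1
      colour = idx
   end if

   call MPI_Send(list(1:listlen), listlen, MPI_CHARACTER, next, 0, comm, ierr)

   call MPI_Comm_split(comm, colour, rank, newcomm, ierr)
end subroutine split_by_hostname

Calling split_by_hostname(MPI_COMM_WORLD, node_comm) then gives every process a communicator spanning exactly the ranks on its own node.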

Rutland answered 11/8, 2013 at 14:45 Comment(3)
This mechanism works on all machines I've encountered except a Blue Gene/Q. There, the processor name includes the core number inside a node. You can construct your own processor name like this: <bitbucket.org/cactuscode/cactusutils/src/…>Georgianngeorgianna
On the BG/Q, processes cannot migrate between the cores, therefore it makes perfect sense to include the core ID in the processor name. What about Cray? I haven't had the chance to use one in quite some time.Rutland
The Crays I've used behave like Linux.Georgianngeorgianna
