MPI: merge multiple intercomms into a single intracomm

I'm trying to set up a bunch of spawned processes in a single intracomm. I need to spawn separate processes into unique working directories, since these subprocesses will write out a bunch of files. After all the processes are spawned, I want to merge them into a single intra-communicator. To try this out, I set up a simple test program.

#include <mpi.h>
#include <stdio.h>

int main(int argc, const char * argv[]) {
    int rank, size;

    const int num_children = 5;
    int error_codes;

    MPI_Init(&argc, (char ***)&argv);

    MPI_Comm parentcomm;
    MPI_Comm childcomm;
    MPI_Comm intracomm;
    MPI_Comm_get_parent(&parentcomm);

    if (parentcomm == MPI_COMM_NULL) {
        printf("Current Command %s\n", argv[0]);

        for (size_t i = 0; i < num_children; i++) {
            MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &childcomm, &error_codes);
            MPI_Intercomm_merge(childcomm, 0, &intracomm);
            MPI_Barrier(childcomm);
        }
    } else {
        MPI_Intercomm_merge(parentcomm, 1, &intracomm);
        MPI_Barrier(parentcomm);
    }

    printf("Test\n");

    MPI_Barrier(intracomm);

    printf("Test2\n");

    MPI_Comm_rank(intracomm, &rank);
    MPI_Comm_size(intracomm, &size);

    printf("Rank %d of %d\n", rank + 1, size);

    MPI_Barrier(intracomm);
    MPI_Finalize();
    return 0;
}

When I run this I get all 6 processes but my intracomm is only speaking between the parent and the last child spawned. The resulting output is

Test
Test
Test
Test
Test
Test
Test2
Rank 1 of 2
Test2
Rank 2 of 2

Is there a way to merge multiple communicators into a single communicator? Also note that I'm spawning these one at a time since I need each subprocess to execute in a unique working directory.

Overstudy answered 17/7, 2014 at 14:48 Comment(0)

If you are going to do this by calling MPI_COMM_SPAWN multiple times, then you'll have to do it more carefully. After you call SPAWN the first time, the spawned process also needs to take part in the next call to SPAWN; otherwise it will be left out of the communicator you're merging. Your current approach ends up looking like this:

[Figure: Individual Spawns]

The problem is that only two processes participate in each MPI_INTERCOMM_MERGE, and you can't merge three communicators, so you'll never end up with one big communicator that way.

If you instead have each process participate in the merge as it goes, you end up with one big communicator in the end:

[Figure: Group Spawns]

Of course, you can just spawn all of your extra processes at once, but it sounds like you might have other reasons for not doing that.

Inglebert answered 17/7, 2014 at 15:55 Comment(2)
So you're saying process 1 spawns 2, then 2 spawns 3, etc.? I was trying to avoid MPI_Comm_spawn_multiple so that I could avoid the mess of setting up arrays of commands, info, etc., since I will ultimately end up doing this in Fortran. - Overstudy
No, 0 spawns 1, then 0 & 1 spawn 2, etc. - Inglebert

I realize I'm a year out of date with this answer, but I thought maybe other people might want to see an implementation of this. As the original respondent said, there is no way to merge three (or more) communicators. You have to build up the new intra-comm one spawned process at a time. Here is the code I use. This version deletes the original intra-comm; you may or may not want to do that depending on your particular application:

#include <mpi.h>


// The Borg routine: given
//   (1) a (quiesced) intra-communicator with one or more members, and
//   (2) a (quiesced) inter-communicator with exactly two members, one
//       of which is rank zero of the intra-communicator, and
//       the other of which is an unrelated spawned rank,
// return a new intra-communicator which is the union of both inputs.
//
// This is a collective operation.  All ranks of the intra-
// communicator, and the remote rank of the inter-communicator, must
// call this routine.  Ranks that are members of the intra-comm must
// supply the proper value for the "intra" argument, and MPI_COMM_NULL
// for the "inter" argument.  The remote inter-comm rank must
// supply MPI_COMM_NULL for the "intra" argument, and the proper value
// for the "inter" argument.  Rank zero (only) of the intra-comm must
// supply proper values for both arguments.
//
// N.B. It would make a certain amount of sense to split this into
// separate routines for the intra-communicator processes and the
// remote inter-communicator process.  The reason we don't do that is
// that, despite the relatively few lines of code,  what's going on here
// is really pretty complicated, and requires close coordination of the
// participating processes.  Putting all the code for all the processes
// into this one routine makes it easier to be sure everything "lines up"
// properly.
MPI_Comm
assimilateComm(MPI_Comm intra, MPI_Comm inter)
{
    MPI_Comm peer = MPI_COMM_NULL;
    MPI_Comm newInterComm = MPI_COMM_NULL;
    MPI_Comm newIntraComm = MPI_COMM_NULL;

    // The spawned rank will be the "high" rank in the new intra-comm
    int high = (MPI_COMM_NULL == intra) ? 1 : 0;

    // If this is one of the (two) ranks in the inter-comm,
    // create a new intra-comm from the inter-comm
    if (MPI_COMM_NULL != inter) {
        MPI_Intercomm_merge(inter, high, &peer);
    } else {
        peer = MPI_COMM_NULL;
    }

    // Create a new inter-comm between the pre-existing intra-comm
    // (all of it, not only rank zero), and the remote (spawned) rank,
    // using the just-created intra-comm as the peer communicator.
    int tag = 12345;
    if (MPI_COMM_NULL != intra) {
        // This task is a member of the pre-existing intra-comm
        MPI_Intercomm_create(intra, 0, peer, 1, tag, &newInterComm);
    }
    else {
        // This is the remote (spawned) task
        MPI_Intercomm_create(MPI_COMM_SELF, 0, peer, 0, tag, &newInterComm);
    }

    // Now convert this inter-comm into an intra-comm
    MPI_Intercomm_merge(newInterComm, high, &newIntraComm);


    // Clean up the intermediaries
    if (MPI_COMM_NULL != peer) MPI_Comm_free(&peer);
    MPI_Comm_free(&newInterComm);

    // Delete the original intra-comm
    if (MPI_COMM_NULL != intra) MPI_Comm_free(&intra);

    // Return the new intra-comm
    return newIntraComm;
}
Fad answered 30/6, 2015 at 21:25 Comment(0)
