MPI Spawn: root process does not communicate to child processes

(Beginner question) I'm trying to spawn processes dynamically using MPI_Comm_spawn and then broadcast a message to the child processes, but the program hangs in the broadcast from the root process to the children. I'm following the documentation from http://www.mpi-forum.org/docs/docs.html but I can't make it work. Can anybody help me, please?

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    MPI_Comm parentcomm;

    MPI_Comm_get_parent( &parentcomm );

    if (parentcomm == MPI_COMM_NULL) {
        MPI_Comm intercomm;
        MPI_Status status;
        char msg_rec[1024];
        char msg_send[1024];
        int size, i;

        int np = (argc > 0) ? atoi(argv[1]) : 3;

        printf("Spawner will spawn %d processes\n", np);
        MPI_Comm_spawn( argv[0], MPI_ARGV_NULL, np, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE );
        MPI_Comm_size(intercomm, &size);

        sprintf(msg_send, "Hello!");
        printf("Spawner will broadcast '%s'\n", msg_send);
        MPI_Bcast( (void*)msg_send, 1024, MPI_CHAR, 0, intercomm);

        printf("Spawner will receive answers\n");
        for (i=0; i < size; i++) {
            MPI_Recv( (void*)msg_rec, 1024, MPI_CHAR, i, MPI_ANY_TAG, intercomm, &status);
            printf("Spawner received '%s' from rank %d\n", msg_rec, i);
        };       

    } else {
        int rank, size;
        char msg_rec[1024];
        char msg_send[1024];

        MPI_Comm_rank(parentcomm, &rank);
        MPI_Comm_size(parentcomm, &size);

        printf("  Rank %d ready\n", rank);

        MPI_Bcast( (void*)msg_rec, 1024, MPI_CHAR, 0, parentcomm);

        printf("  Rank %d received '%s' from broadcast!\n", rank, msg_rec);
        sprintf(msg_send, "Hi there from rank %d!\n", rank);
        MPI_Send( (void*)msg_send, 1024, MPI_CHAR, 0, rank, parentcomm);
    };
    MPI_Finalize();
    return 0;
};

I don't know if it matters, but I'm using Ubuntu 11.10 and the Hydra process manager.

Doting answered 2/4, 2012 at 3:1 Comment(0)
B
3

As @suszterpatt pointed out, you are working with an intercommunicator (not an intracommunicator). Knowing this and looking at the description of MPI_Bcast, we see:

If comm is an intercommunicator, then the call involves all processes in the intercommunicator, but with one group (group A) defining the root process. All processes in the other group (group B) pass the same value in argument root, which is the rank of the root in group A. The root passes the value MPI_ROOT in root. All other processes in group A pass the value MPI_PROC_NULL in root. Data is broadcast from the root to all processes in group B. The receive buffer arguments of the processes in group B must be consistent with the send buffer argument of the root.

This means that you need only replace the broadcast call in the parent with:

MPI_Bcast( (void*)msg_send, 1024, MPI_CHAR, MPI_ROOT, intercomm);

A few other bugs:

  • The check on the number of arguments should be argc > 1.
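  • MPI_Comm_size(intercomm, &size) returns the size of the local group, which is 1 here because the spawner is alone in MPI_COMM_SELF. To get the number of children, use MPI_Comm_remote_size(intercomm, &size) instead.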
Bountiful answered 21/8, 2012 at 16:15 Comment(0)
H
3

If you don't want to deal with an intercommunicator after you've spawned your child processes, you can use MPI_Intercomm_merge to create an intracommunicator from your intercommunicator. Essentially, it would look like this:

Spawner:

MPI_Comm_spawn( argv[0], MPI_ARGV_NULL, np, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE );
MPI_Intercomm_merge(intercomm, 0, &intracomm);

Spawnee:

MPI_Intercomm_merge(parentcomm, 1, &intracomm);

After that, you can continue to use intracomm (or whatever you want to call it) as if it were a regular intracommunicator. In this instance, the spawning process gets the lowest rank and the new processes get the higher ranks, but you can reverse that ordering with the second argument (high).

Harts answered 21/8, 2013 at 13:21 Comment(1)
How exactly would this work? I just posted a related question and this merge statement might work but I am confused about "inter", "intra" in the different parts of the code. Cordalia
S
-1

Collective communication calls such as Bcast() require an intracommunicator: you're trying to use an intercommunicator (both intercomm and parentcomm). You will have to use the group creation methods to define a group that encompasses the parent process and all child processes, then create a new intracommunicator over that group.

Shipman answered 3/4, 2012 at 13:29 Comment(1)
That's not accurate as @Bountiful points out. You can do a collective call with an intercommunicator. You just need to be careful about it and read the separate specification. Harts
