MPI global execution time

I'm working on a little application that multiplies an array by a matrix. It works without any problem. I'd like to measure the execution time of the application. I can find the individual execution time of each process (its start and end), but I need the global time.

This is my code:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv){
    int rang, procesus;
    MPI_Status statut;
    double start, end, max_end = 0, min_start = 10000;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rang);
    MPI_Comm_size(MPI_COMM_WORLD, &procesus);
    MPI_Barrier(MPI_COMM_WORLD);

    start = MPI_Wtime();
    printf("Starting time of process n. %d %f\n",rang, start);
    if(rang==0){
        //Master work
    }else{
        //slaves work
    }
    MPI_Barrier(MPI_COMM_WORLD);
    end = MPI_Wtime();
    printf("Ending time of process n.%d %f\n\n\n",rang, end);

    MPI_Finalize();
//Out of the Parallelized task

    if(min_start > start){
        min_start = start;
        printf("New minumum starting time %f\n", min_start);
    }

    if(max_end < end){
        max_end = end;
        printf("New maximum ending time %f\n", max_end);
    }

    if(rang == 0){
        printf("Start %f\n", min_start);
        printf("End %f\n", max_end);
    }
    return 0;
}

I use the variables min_start and max_end as "global" variables to try to capture the minimum and maximum times of all the processes, but I always get the starting and ending time of the last process to execute. The ending time is OK, but the starting time is wrong because the last process was not the first to start. What am I doing wrong? Can I use a really global variable in MPI for all the processes, and if I can, how?

This is the output I get:

Starting time of process n.2. 0.101562
Ending time of process n.2. 0.105469
New minimum starting time 0.101562
New maximum ending time 0.105469

Starting time of process n.3. 0.058594
Ending time of process n.3. 0.062500
New minimum starting time 0.058594
New maximum ending time 0.062500

Starting time of process n. 4. 0.007812
Ending time of process n. 4. 0.011719
New minimum starting time 0.007812
New maximum ending time 0.011719

Starting time of process n.1. 0.148438
Ending time of process n.1. 0.152344
New minimum starting time 0.148438
New maximum ending time 0.152344

Starting time of process n.0. 0.207031 
Ending time of process n.0. 0.210938
New minimum starting time 0.207031
New maximum ending time 0.210938

Start 0.207031
End 0.210938
Kero answered 14/3, 2011 at 12:44 Comment(0)

MPI_Init() and MPI_Finalize() don't mark the beginning and end of parallel execution, only the beginning and end of where MPI calls are allowed. In MPI, all your processes run in parallel from beginning to end and share no global data whatsoever.

You can use MPI_Reduce() to find the minimum starting time and maximum ending time across all processes.
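
A minimal sketch of that approach, assuming start and end have already been recorded with MPI_Wtime() on each rank as in the question (names are illustrative):

double min_start, max_end;

/* rank 0 receives the earliest start and the latest end */
MPI_Reduce(&start, &min_start, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
MPI_Reduce(&end, &max_end, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

if (rang == 0)
    printf("Global time: %f\n", max_end - min_start);

Note that these calls must come before MPI_Finalize(), not after it as in the question's code.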

Apocynaceous answered 14/3, 2011 at 13:18 Comment(3)
Calculating runtime using start/end times from different nodes may give invalid results if the clocks are not synchronised.Sills
Hi... yes... To solve the problem I add the MPI_Reduce to find the max and the min of the starting and ending time of all the processes... Thanks for the tip... :)Kero
@ShawnChin, what do you mean? If the clocks are not synchronised, the start and end times are going to be (possibly completely) different, but their difference should still be the elapsed time for the current process, so either I'm saying something wrong or you mean something else. Can you explain? Thank you in advance.Adim

In most cases, it is enough to simply keep track of the start and end time on the master node and derive the global run time on the master only.

One thing worth noting is that you must place a barrier before collecting the start time (to make sure all nodes are ready to proceed) and another before collecting the end time (to make sure all nodes are done).

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank;
    double start, end;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD); /* IMPORTANT */
    start = MPI_Wtime();

    /* ... do work ... */

    MPI_Barrier(MPI_COMM_WORLD); /* IMPORTANT */
    end = MPI_Wtime();

    MPI_Finalize();

    if (rank == 0) { /* use time on master node */
        printf("Runtime = %f\n", end - start);
    }
    return 0;
}

Doing the same on all nodes will give almost the same results, with small deviations depending on how quickly each node returns from the MPI_Barrier call. This deviation is usually very small relative to most practical runs and can be discounted.

Trying to derive a time using start/end times from different nodes is not worth the effort, and can give you wrong answers if MPI_Wtime does not use a globally synchronised clock. Note that a synced Wtime is not supported by some MPI implementations (check MPI_WTIME_IS_GLOBAL).
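
If you need to check at run time whether the clock is global, you can query the attribute with MPI_Comm_get_attr. A minimal sketch:

int *is_global, found;
MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_WTIME_IS_GLOBAL, &is_global, &found);
if (found && *is_global)
    printf("MPI_Wtime is synchronised across processes\n");
else
    printf("MPI_Wtime clocks are local to each process\n");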

Sills answered 14/3, 2011 at 16:24 Comment(8)
Hi, thanks for the answer. Actually I'm still confused because of my output. Process 0 is my master process, but it starts after the other processes (let's say the work in the other processes needs some information sent from the master). I can easily find which process starts first (#4 in my output), but I don't know how to save this value after that process finishes and retrieve it in the last process to finish.Kero
@jomaora. In MPI, your variables are not shared across processes. Every process sees its own version of start, end, min_start, max_end, which is why in your code every process will say that min_start is its start time and max_end is its end time. The last block of prints will always be p0 printing out its own start and end time. If you want to find the min/max of a value across all processes, do as @suszterpatt mentioned and use MPI_Reduce.Sills
Shawn.. you saved my life!Syncrisis
hey, in regards to MPI_Barrier, do you have any idea why it behaves like this #29758317Someone
Any intention of using %d for printing? It generates a warning. Is it because we want to keep the result simple? Or maybe it doesn't matter if we count in something other than seconds in real experiments?Fuddyduddy
@Fuddyduddy probably a typo. Sorry.Sills
Better late than never I guess to correct it, I took the initiative to do so. :)Fuddyduddy
Can you elaborate on why the second barrier is important? I can understand the first, but the second... isn't the master going to wait for the slaves to finish before the second MPI_Wtime call anyway?Etching

You can use MPI_Reduce, as in this code snippet:

#include <stdio.h>
#include <mpi.h>

void work(int rank); /* need to create */

int main(int argc, char **argv) {
    int myrank, numprocs;
    double mytime, maxtime, mintime, avgtime;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    mytime = MPI_Wtime();           /* get time just before work section */
    work(myrank);
    MPI_Barrier(MPI_COMM_WORLD);
    mytime = MPI_Wtime() - mytime;  /* get time just after work section */

    /* compute max, min, and average timing statistics */
    MPI_Reduce(&mytime, &maxtime, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Reduce(&mytime, &mintime, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
    MPI_Reduce(&mytime, &avgtime, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (myrank == 0) {
        avgtime /= numprocs;
        printf("Min: %lf Max: %lf Avg: %lf\n", mintime, maxtime, avgtime);
    }

    MPI_Finalize();
    return 0;
}

MPI_Reduce takes an array of input elements on each process and returns an array of output elements to the root process. The output elements contain the reduced result.

  • MPI_MAX - Returns the maximum element.
  • MPI_MIN - Returns the minimum element.
  • MPI_SUM - Sums the elements.
  • MPI_PROD - Multiplies all elements.
  • MPI_LAND - Performs a logical AND across the elements.
  • MPI_LOR - Performs a logical OR across the elements.
  • MPI_BAND - Performs a bitwise AND across the bits of the elements.
  • MPI_BOR - Performs a bitwise OR across the bits of the elements.
  • MPI_MAXLOC - Returns the maximum value and the rank of the process that owns it.
  • MPI_MINLOC - Returns the minimum value and the rank of the process that owns it.

MPI_Reduce(
    void* send_data,
    void* recv_data,
    int count,
    MPI_Datatype datatype,
    MPI_Op op,
    int root,
    MPI_Comm communicator)
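
As an example of the MAXLOC/MINLOC operations, MPI_MAXLOC paired with the MPI_DOUBLE_INT datatype reports both the slowest time and the rank that produced it. A minimal sketch, reusing mytime and myrank from the snippet above:

struct { double time; int rank; } local, slowest;
local.time = mytime;
local.rank = myrank;
MPI_Reduce(&local, &slowest, 1, MPI_DOUBLE_INT, MPI_MAXLOC, 0, MPI_COMM_WORLD);
if (myrank == 0)
    printf("Slowest rank: %d (%lf seconds)\n", slowest.rank, slowest.time);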
Outstare answered 22/6, 2020 at 22:3 Comment(0)
