Which profiler do you use for a Fortran code base that uses MPI? gprof doesn't seem to be working correctly. Sun Studio Analyzer only returns the timings for the C/C++ system calls, and none of the Fortran functions appear.
There are a number of performance analysis tools specialized for Parallel/MPI Programs, such as:
- Score-P, which works with a number of different Analysis tools, e.g. Cube, Vampir
- HPCToolkit uses sampling only, so you do not have to recompile your application
- Tau
At first they may not be as simple to use, but they provide much more help in investigating the performance of parallel applications.
When the questioner says "gprof doesn't seem to be working correctly", perhaps he's referring to the fact that N MPI processes might clobber the gmon.out file. In that case, the (undocumented) GMON_OUT_PREFIX environment variable might make gprof more useful:
$ export GMON_OUT_PREFIX=gmon.out
$ mpiexec -np 4 cpi
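After such a run, each process should have left its own profile file, which gprof can sum into one report. The following is only a sketch under a few assumptions: that glibc names the files `<prefix>.<pid>` (so `gmon.out.<pid>` here), that gprof is on the PATH, and that the executable is the `cpi` example above. The function name `count_rank_profiles` and the output file `profile.txt` are made up for illustration.

```shell
#!/bin/sh
# Sketch: merge the per-process profiles written when GMON_OUT_PREFIX is set.
# Assumptions: files are named <prefix>.<pid>; gprof is installed;
# count_rank_profiles and profile.txt are hypothetical names.

# Print how many regular files match <prefix>.* (one per MPI process).
count_rank_profiles() {
    n=0
    for f in "$1".*; do
        [ -f "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

prefix="${GMON_OUT_PREFIX:-gmon.out}"
exe="./cpi"    # the executable from the mpiexec line above

if [ "$(count_rank_profiles "$prefix")" -gt 0 ] && command -v gprof >/dev/null 2>&1; then
    gprof -s "$exe" "$prefix".*          # sum all per-process files into gmon.sum
    gprof "$exe" gmon.sum > profile.txt  # flat profile and call graph for the sum
fi
```

If you would rather look at one process at a time, passing a single `gmon.out.<pid>` file to gprof instead of `gmon.sum` should work the same way.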
Allinea MAP is a profiler that is simple and straightforward but very powerful.
It is designed to show the performance problems in Fortran, C and C++ MPI applications, and requires very little effort to get started and get profiling.
It is graphical, has an integrated source code browser that shows performance against lines of code, and is able to analyse bad MPI behaviour, poor work balance or poor vectorization.
I am one of the team behind the product, so am a little biased. It is commercial - there are evaluation licences available from the website.
gprof is a good profiler for Fortran and other languages built with GNU-based compilers.
You can use Intel Trace Analyzer to profile MPI communication and Intel VTune to obtain a profile of a single MPI task. Both tools are extensively documented on the Intel website.
I would like to add two more profilers: (1) mpiP is a lightweight profiler that can produce textual output but measures only MPI functions. (2) Scalasca produces more sophisticated output that can also point to synchronisation imbalances (late sender / late receiver), as opposed to TAU, which does not point to synchronisation imbalances.
gprof? I use it to profile my MPI programs without problems. Did you compile the objects you want to profile with -pg? – Saith
gprof is OK if your call tree is pretty shallow, and it is blind to time spent in I/O, if you have any. I use this method which works with GDB in Fortran. I turn off the MPI, do the performance tuning, then turn MPI back on. – Roy