Maximum amount of data that can be sent using MPI::Send
P

4

6

With the syntax for MPI::Isend as

MPI::Request MPI::Comm::Isend(const void *buf, int count, 
              const MPI::Datatype& datatype, 
              int dest, int tag) const;

is the amount of data that can be sent limited by

std::numeric_limits<int>::max()?

Many other MPI functions also take int parameters. Is this a limitation of MPI?

Pitta answered 26/11, 2012 at 4:53 Comment(1)
On a side note, the C++ syntax of MPI is deprecated; you should use the C syntax. - Nympholepsy
K
12

MPI-2.2 defines data length parameters as int. This can be, and usually is, a problem on most 64-bit Unix systems, since int is still 32 bits wide there. Such systems are referred to as LP64, which means that long and pointers are 64 bits wide while int is 32 bits wide; Linux on 64-bit x86 CPUs is one example of such a Unix-like system. In contrast, Windows x64 is an LLP64 system, which means that both int and long are 32 bits wide while long long and pointers are 64 bits wide.
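To check which data model a particular compiler uses, something like the following can help. It is only a sketch: the names `max_bytes_per_call` and `print_data_model` are made up for illustration and are not part of MPI, and the widths printed depend entirely on the platform.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <limits>

// Hypothetical helper (not an MPI function): the largest payload, in bytes,
// that one call with an `int` count can describe when every element is
// `elem_size` bytes wide.
inline std::uint64_t max_bytes_per_call(std::size_t elem_size) {
    assert(elem_size > 0);
    return static_cast<std::uint64_t>(std::numeric_limits<int>::max()) *
           static_cast<std::uint64_t>(elem_size);
}

// Prints the type widths that distinguish LP64 from LLP64.
inline void print_data_model() {
    std::printf("int: %zu bits, long: %zu bits, long long: %zu bits, void*: %zu bits\n",
                sizeof(int) * CHAR_BIT, sizeof(long) * CHAR_BIT,
                sizeof(long long) * CHAR_BIT, sizeof(void*) * CHAR_BIT);
}
```

Calling `print_data_model()` on an LP64 system should report a 64-bit long next to a 32-bit int, while an LLP64 system should report both as 32 bits.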

Given all of the above, MPI_Send in MPI-2.2 implementations has a message size limit of 2^31-1 elements. One can overcome the limit by constructing a user-defined type (e.g. a contiguous type), which reduces the number of data elements passed to the call. For example, if you register a contiguous type of 2^10 elements of some basic MPI type and then use MPI_Send to send 2^30 elements of this new type, the result is a message of 2^40 elements of the basic type. Some MPI implementations may still fail in such cases if they use int to handle element counts internally. It also breaks MPI_Get_elements and MPI_Get_count, as their output count argument is of type int.
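The bookkeeping behind the contiguous-type trick can be sketched without MPI itself. The helper below is hypothetical (`SplitCount` and `split_count` are names invented for this example): it splits a 64-bit element count into a number of fixed-size blocks plus a remainder, each of which fits in an int. The comment at the end shows roughly where the real C-binding calls would go; it assumes a buffer of MPI_DOUBLE and a receiver that posts matching receives.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical helper type: `total` basic elements expressed as `blocks`
// chunks of `block_len` elements plus `remainder` leftover elements.
struct SplitCount {
    int blocks;
    int block_len;
    int remainder;
};

// Splits a 64-bit count so that every piece fits into an `int` argument.
inline SplitCount split_count(std::uint64_t total, int block_len) {
    assert(block_len > 0);
    const std::uint64_t len = static_cast<std::uint64_t>(block_len);
    const std::uint64_t blocks = total / len;
    const std::uint64_t rem = total % len;   // always < block_len, so it fits in int
    assert(blocks <= static_cast<std::uint64_t>(INT_MAX));
    return SplitCount{static_cast<int>(blocks), block_len, static_cast<int>(rem)};
}

// With a real MPI library the pieces would be used roughly like this:
//   SplitCount s = split_count(total, 1 << 10);
//   MPI_Datatype chunk;
//   MPI_Type_contiguous(s.block_len, MPI_DOUBLE, &chunk);
//   MPI_Type_commit(&chunk);
//   MPI_Send(buf, s.blocks, chunk, dest, tag, comm);
//   MPI_Send(buf + (std::uint64_t)s.blocks * s.block_len, s.remainder,
//            MPI_DOUBLE, dest, tag, comm);
//   MPI_Type_free(&chunk);
```

Sending the remainder as a second message is just one possible design; whether an implementation handles the large first message correctly still depends on its internal count handling.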

MPI-3.0 addresses some of these issues. For example, it provides the MPI_Get_elements_x operation, which uses the MPI_Count typedef for its count argument. MPI_Count is defined so as to be able to hold pointer values, which makes it 64 bits wide on most 64-bit systems. There are other extended calls (all end in _x), such as MPI_Type_size_x, that take MPI_Count instead of int. The old MPI_Get_elements / MPI_Get_count operations are retained, but they now return MPI_UNDEFINED if the count is larger than what the int output argument can hold (this clarification is not present in the MPI-2.2 standard, and using very large counts is undefined behaviour there).
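The behaviour of the retained int-based query calls can be modelled in a few lines. This is only an illustration of the rule, not how any MPI library is written: `narrow_count` is a hypothetical function, and `kUndefinedCount` is a stand-in sentinel, since the actual value of MPI_UNDEFINED comes from the implementation's mpi.h.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-in for MPI_UNDEFINED in this sketch only; the real constant is
// defined by the MPI implementation.
constexpr int kUndefinedCount = -1;

// Models reporting a 64-bit element count through an `int` output argument:
// counts that do not fit are reported as the sentinel instead of wrapping.
inline int narrow_count(std::int64_t count) {
    if (count < 0 || count > std::numeric_limits<int>::max()) {
        return kUndefinedCount;
    }
    const int result = static_cast<int>(count);
    assert(static_cast<std::int64_t>(result) == count);  // no information lost
    return result;
}
```

Code that may receive very large messages would check for the sentinel and fall back to a 64-bit query in that case.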

As pyCthon has already noted, the C++ bindings are deprecated in MPI-2.2 and were removed from MPI-3.0 as no longer supported by the MPI Forum. You should either use the C bindings or resort to 3rd party C++ bindings, e.g. Boost.MPI.

Kacey answered 26/11, 2012 at 12:20 Comment(3)
Many thanks for the answer :) Is there any reason why the C++ bindings are deprecated? - Pitta
@SumanVajjala, it's all in the MPI Forum tickets. This one contains most of the reasons why the C++ bindings were deprecated in MPI-2.2, this one contains the rationale behind removing them in MPI-3.0, and this one makes for some funny reading :) - Kacey
Thanks @Hristo for the links!! - Pitta
A
1

I haven't done MPI, however, int is the usual limiting size of an array, and I would suspect that is where the limitation comes from.

In practice, this is a fairly high limit: about 2 billion elements per call, which is already 2 GiB even if each element is a single byte. Do you need to send more than that in a single Isend?

For more information, please see Is there a max array length limit in C++?

Do note that the linked question refers to size_t rather than int (which, for all intents and purposes, allowed almost unlimited data, at least in 2012). In the past, however, int was the usual type for such counts, and although size_t should be used, a lot of code in practice still uses int.

Ambulant answered 26/11, 2012 at 5:0 Comment(0)
N
0

The maximum size of an MPI_Send will be limited by the maximum amount of memory you can allocate

and most MPI implementations support sizeof(size_t)

Nympholepsy answered 26/11, 2012 at 5:7 Comment(5)
Feel free to copy mine into your answer so that there's one clear 'good answer' - you seem to know MPI a bit better than I do. :) - Ambulant
@Nympholepsy It is not reflected in the syntax then? Or is it? I am confused :) I have an MPI code where I pack data using MPI::Pack, and the datatype for the position in the buffer works only with int. void MPI::Datatype::Pack(const void* inbuf, int incount, void* outbuf, int outsize, int& position, const MPI::Comm& comm) const; - Pitta
@SumanVajjala: pyCthon has already pointed out that the C++ interface is deprecated (I haven't confirmed this myself, but I'd listen to the guy who has done MPI rather than to me!). As a tip, C and C++ are distinct languages, and if you are using namespaces, such as that 'MPI::' thing you have there, you are most certainly NOT using C. - Ambulant
@Ambulant The MPI standards committee did a survey a while back of who actually used the C++ syntax, and it turned out that hardly anyone was using it at the time, so they stopped updating it. I know for a fact that Open MPI supports size_t; I'm not sure about other implementations. The C syntax works fine in a C++ program. - Nympholepsy
This is either inadequate or wrong. If you mean to say that MPI_Send can in theory send a buffer requiring the entire allocatable memory, then this is true for more than 2^31 elements if and only if one uses user-defined datatypes carefully. At the time your post was written, MPICH and Open MPI did not fully support 64-bit counts internally and thus would break with this usage. Only very recently did MPICH become "count-safe". - Lassa
L
0

This issue and a number of workarounds (with code) are discussed on https://github.com/jeffhammond/BigMPI. In particular, this project demonstrates how to send more than INT_MAX elements via user-defined datatypes.

Lassa answered 9/9, 2014 at 7:23 Comment(0)
