I am unaware of any existing wrappers that handle this, but you could write your own. Most MPI implementations provide an additional layer intended for profiling (PMPI): every MPI function has a PMPI_-prefixed equivalent, and by default the MPI version simply calls the PMPI version. You can exploit this layer for other purposes, in this case splitting a message: provide your own definition of the MPI function that splits the message and calls the PMPI version once per piece. Here is an extremely simple example I wrote long ago for splitting MPI_Bcast:
#include <mpi.h>

int MPI_Bcast(void* buffer, int count, MPI_Datatype datatype,
              int root, MPI_Comm comm) {
    /*
    This function is a simple attempt at automatically splitting MPI
    messages, in this case MPI_Bcast. By utilizing the profiling interface
    of MPI, this function is able to intercept a call to MPI_Bcast. Then,
    instead of the typical profiling, the message size is checked. If the
    message is larger than the maximum allowable size, it is split into
    multiple messages, each of which is sent individually. This function
    is not intended for high performance; it is intended to add capability
    without requiring access to the source code of either the MPI
    implementation or the program using MPI. The intent is to compile this
    as a shared library and preload that library to catch MPI calls.
    */
    int result;
    int typesize;
    long totalsize;
    long maxsize = 1;
    // Set the maximum size (in bytes) of a single message: 2^31 - 1
    maxsize = (maxsize << 31) - 1;
    // Get the size of the message to be sent
    MPI_Type_size(datatype, &typesize);
    totalsize = static_cast<long>(typesize) * static_cast<long>(count);
    // Check the size
    if (totalsize > maxsize) {
        // The message is too large, split it
        /*
        Ideally, this should be tailored to the system, possibly split into
        a minimum number of equally sized messages that fit within the
        maximum message size. However, this is a very simple implementation,
        focused on proof of concept, not efficiency.
        */
        int elementsPerChunk = maxsize / typesize;  // Number of elements per chunk
        int remCount = count;                       // Remaining number of elements
        char *address = static_cast<char*>(buffer); // Starting address,
                                                    // cast to char* for byte arithmetic
        int nChunks = count / elementsPerChunk;     // How many chunks to send
        if (count % elementsPerChunk != 0) nChunks++; // One more for any remaining elements
        int chunkCount; // Number of elements in the current chunk
        // Send one chunk at a time
        for (int i = 0; i < nChunks; i++) {
            // Determine how many elements to send
            if (remCount > elementsPerChunk) {
                chunkCount = elementsPerChunk;
            } else {
                chunkCount = remCount;
            }
            // Decrement the remaining elements
            remCount -= chunkCount;
            // Send the message chunk
            /*
            There is room for improvement here as well. One key concern is
            the return value. Normally, there would be a single return value
            for the entire operation. However, as the operation is split into
            multiple operations, each with its own return value, a decision
            must be made as to what to return. I have chosen to simply use
            the return value from the last call. This skips over some error
            checking, but that is not critical at present.
            */
            result = PMPI_Bcast(static_cast<void*>(address), chunkCount,
                                datatype, root, comm);
            // Update the address for the next chunk (use long arithmetic so
            // chunkCount*typesize cannot overflow int near 2^31 bytes)
            address += static_cast<long>(chunkCount) * typesize;
        }
    } else {
        // The message is small enough, just send it as is
        result = PMPI_Bcast(buffer, count, datatype, root, comm);
    }
    // Pass the return value back to the caller
    return result;
}
You can write something similar for MPI_Send (and MPI_Recv) to get the functionality you want. But if this is only for one program, you might be better off just modifying that program to send in chunks.