Doing a cast is not the solution as it will simply truncate the long count. There are two obstacles to overcome here - an easy one and a hard one.
The easy obstacle is the int type for the count argument. You can get past it simply by creating a contiguous type of smaller size and then sending the data as multiples of the new datatype. Example code follows:
// Data to send
int data[1000];
// Create a contiguous datatype of 100 ints
MPI_Datatype dt100;
MPI_Type_contiguous(100, MPI_INT, &dt100);
MPI_Type_commit(&dt100);
// Send the data as 10 elements of the new type
MPI_Send(data, 10, dt100, ...);
Since the count argument of MPI_Type_contiguous is int, with this technique you can send up to (2^31 - 1)^2 = 2^62 - 2^32 + 1 elements. If this is not enough, you can create a new contiguous datatype from the dt100 datatype, e.g.:
// Create a contiguous datatype of 100 dt100's (effectively 100x100 elements)
MPI_Datatype dt10000;
MPI_Type_contiguous(100, dt100, &dt10000);
MPI_Type_commit(&dt10000);
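As a quick sanity check of the element limit quoted above, here is a minimal sketch in plain C (no MPI calls) that multiplies the two int-limited counts in 64-bit arithmetic. The function name max_elements is made up for illustration, and the result for INT_MAX assumes a 32-bit int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: number of primitive elements covered by `count`
 * instances of a contiguous type that holds `block_len` elements.
 * Both arguments are int because that is the type of the count arguments. */
static int64_t max_elements(int count, int block_len)
{
    assert(count >= 0 && block_len >= 0);
    /* Widen before multiplying so the product cannot overflow int. */
    return (int64_t)count * (int64_t)block_len;
}
```

For example, max_elements(10, 100) corresponds to the 1000 ints sent with dt100, and max_elements(INT_MAX, INT_MAX) gives the upper bound of the technique.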
If your original data size is not a multiple of the size of the new datatype, you could create a structure datatype whose first element is an array of int(data_size / cont_type_length) elements of the contiguous datatype and whose second element is an array of data_size % cont_type_length elements of the primitive datatype. An example follows:
// Data to send
int data[260];
// Create a structure type
MPI_Datatype dt260;
int blklens[2];
MPI_Datatype oldtypes[2];
MPI_Aint offsets[2];
blklens[0] = 2; // That's int(260 / 100)
offsets[0] = 0;
oldtypes[0] = dt100;
blklens[1] = 60; // That's 260 % 100
offsets[1] = blklens[0] * 100L * sizeof(int); // Offsets are in BYTES!
oldtypes[1] = MPI_INT;
MPI_Type_create_struct(2, blklens, offsets, oldtypes, &dt260);
MPI_Type_commit(&dt260);
// Send the data
MPI_Send(data, 1, dt260, ...);
MPI_Aint is an integer type large enough to hold offsets beyond what int can represent on LP64 systems. Note that the receiver must construct the same datatype and use it in the same way in the MPI_Recv call. Receiving an arbitrary, non-integer number of elements of the contiguous datatype is a bit problematic, though.
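The block lengths and the byte offset used for the structure type can be computed for any element count. The following is only a sketch in plain C: split_plan and plan_split are hypothetical names, and ptrdiff_t stands in for MPI_Aint so that the snippet compiles without an MPI installation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical description of the two blocks of the structure datatype. */
typedef struct {
    int       full_blocks;  /* blklens[0]: whole contiguous-type instances */
    int       remainder;    /* blklens[1]: leftover primitive elements */
    ptrdiff_t tail_offset;  /* offsets[1]: where the leftovers start, in BYTES */
} split_plan;

/* Split data_size elements into whole blocks of cont_type_length elements
 * plus a remainder; elem_size is the size of one primitive element. */
static split_plan plan_split(size_t data_size, int cont_type_length,
                             size_t elem_size)
{
    size_t len = (size_t)cont_type_length;
    split_plan p;

    assert(cont_type_length > 0);
    /* The block count must itself still fit into an int. */
    assert(data_size / len <= (size_t)INT_MAX);

    p.full_blocks = (int)(data_size / len);
    p.remainder   = (int)(data_size % len);
    p.tail_offset = (ptrdiff_t)((size_t)p.full_blocks * len * elem_size);
    return p;
}
```

With data_size = 260, cont_type_length = 100 and elem_size = sizeof(int), this reproduces the values hard-coded in the example: 2 blocks, 60 leftover elements, and an offset of 200 * sizeof(int) bytes.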
That's the easy obstacle. The not-so-easy one arises when your MPI implementation does not use long counts internally. In that case MPI would usually crash, send only part of the data, or behave in some other unexpected way. Such an MPI implementation could be crashed even without constructing a special datatype, simply by sending INT_MAX elements of type MPI_INT, as the total message size would be (2^31 - 1) * 4 = 2^33 - 4 bytes. If that is the case, your only escape is to split the message manually and send/receive it in a loop.
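The manual splitting could be organised roughly as follows. This is a sketch in plain C rather than a drop-in solution: chunk_fn, for_each_chunk, chunk_stats and record_chunk are hypothetical names, and the callback stands in for whatever wrapper around MPI_Send or MPI_Recv you would write for the sub-buffer starting at element `first`. The sender and the receiver would have to use the same chunk size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: transfer `count` elements starting at element
 * index `first`. Returns 0 on success, non-zero on failure. */
typedef int (*chunk_fn)(size_t first, int count, void *ctx);

/* Walk `total` elements in pieces of at most `max_chunk` elements. */
static int for_each_chunk(size_t total, int max_chunk, chunk_fn fn, void *ctx)
{
    size_t done = 0;

    assert(max_chunk > 0);
    while (done < total) {
        size_t left = total - done;
        int count = left < (size_t)max_chunk ? (int)left : max_chunk;
        int rc = fn(done, count, ctx);
        if (rc != 0)
            return rc;  /* stop at the first failure */
        done += (size_t)count;
    }
    return 0;
}

/* Illustration-only callback: records what it was asked to transfer
 * and checks that the chunks are contiguous and in order. */
typedef struct {
    size_t chunks;
    size_t elements;
} chunk_stats;

static int record_chunk(size_t first, int count, void *ctx)
{
    chunk_stats *s = ctx;

    if (first != s->elements)
        return 1;  /* a gap or overlap between chunks */
    s->chunks += 1;
    s->elements += (size_t)count;
    return 0;
}
```

A conservative choice, assuming the implementation's limit is on the message size in bytes, is to pick max_chunk so that max_chunk times the element size still fits in an int.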