MPI: are there MPI libraries capable of message compression?
Sometimes MPI is used to send low-entropy data in messages, so it can be useful to try to compress messages before sending them. I know that MPI can run on very fast networks (10 Gbit/s and more), but many MPI programs are used with cheap networks like 0.1 or 1 Gbit/s Ethernet, and with cheap (slow, low-bisection-bandwidth) network switches. There is a very fast compression algorithm, Snappy (wikipedia), which claims:

Compression speed is 250 MB/s and decompression speed is 500 MB/s

so on compressible data and a slow network it should give some speedup.
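A quick back-of-envelope check of that claim (a sketch; the 2x compression ratio is an assumption for low-entropy data, and the numbers are the Snappy figures and a ~1 Gbit/s link):

```python
# Does compression pay off on a slow link? All numbers in MB/s.
NET = 125.0    # ~1 Gbit/s Ethernet payload rate
COMP = 250.0   # Snappy-class compression speed
RATIO = 2.0    # assumed compression ratio on low-entropy data

def serial_rate():
    # Compress first, then send: the times add up.
    t = 1.0 / COMP + (1.0 / RATIO) / NET  # seconds per MB of payload
    return 1.0 / t

def pipelined_rate():
    # Compression overlapped with sending: slowest stage dominates.
    return min(COMP, RATIO * NET)

print(serial_rate())     # ~125 MB/s: break-even with sending raw
print(pipelined_rate())  # 250 MB/s: 2x over the raw link rate
```

So with these numbers, compression only helps if it is overlapped with the network transfer (or the ratio/codec speed is better than assumed).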

Is there any MPI library that can compress MPI messages (at the MPI layer, not compression of IP packets as in PPP)?

MPI messages are also structured, so there could be special methods, such as compressing the exponent parts of an array of doubles.
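The structure-exploiting idea can be illustrated with a byte-shuffle filter (the same trick HDF5/Blosc use): regrouping the i-th byte of every double puts the nearly-constant sign/exponent bytes together, so a generic codec compresses them well. This is only a sketch in Python, with zlib standing in for a fast codec; it is not something any MPI library provides:

```python
import struct
import zlib

def shuffle_doubles(values):
    """Group byte i of every 8-byte double together ('shuffle' filter):
    sign/exponent bytes of nearby values tend to be equal, so the
    reordered stream compresses better with a generic codec."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return b"".join(raw[i::8] for i in range(8))

def unshuffle_doubles(shuffled, n):
    """Inverse of shuffle_doubles for n doubles."""
    planes = [shuffled[i * n:(i + 1) * n] for i in range(8)]
    raw = bytes(planes[i][j] for j in range(n) for i in range(8))
    return list(struct.unpack(f"<{n}d", raw))

# Smoothly varying data: the exponent byte-planes are nearly constant.
xs = [1.0 + 1e-6 * i for i in range(1000)]
plain = struct.pack("<1000d", *xs)
print(len(zlib.compress(plain)), len(zlib.compress(shuffle_doubles(xs))))
assert unshuffle_doubles(shuffle_doubles(xs), len(xs)) == xs
```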

PS: There is also the LZ4 compression method, with comparable speed.

Othilia answered 1/6, 2012 at 12:8 Comment(3)
I wish it could be possible to compress the network latency... :)Palisade
@Hristo: Have you tried using shorter cables?Yun
Our cables are already as short as possible I think.Palisade

I won't swear that there's none out there, but there's none in common use.

There are a couple of reasons why it's not common:

MPI is often used for sending lots of floating point data which is hard (but not impossible) to compress well, and often has relatively high entropy after a while.

In addition, MPI users are often as concerned with latency as bandwidth, and adding a compression/decompression step into the message-passing critical path wouldn't be attractive to those users.

Finally, some operations (like reduction collectives, or scatter/gather) would be very hard to implement efficiently with compression.

However, it sounds like your use case could benefit from this for point-to-point communications, so there's no reason you couldn't do it yourself. If you were going to send a message of size N and the receiver expected it, then:

  • the sender calls the compression routine and gets back a buffer and a new length M;
  • if M >= N, it sends the original data with an initial flag byte of (say) 0, as N+1 bytes;
  • otherwise, it sends an initial flag byte of 1 followed by the compressed data;
  • the receiver receives into a buffer of length N+1;
  • if the first byte is 1, it calls MPI_Get_count to determine the amount of data received and calls the decompression routine;
  • otherwise, it uses the uncompressed data directly.
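The steps above can be sketched as follows. This is a hedged sketch in Python: zlib stands in for Snappy/LZ4, and plain byte strings stand in for the MPI send/receive buffers (in real MPI code the receiver would post an (N+1)-byte buffer and use MPI_Get_count for the actual length):

```python
import zlib

RAW, COMPRESSED = b"\x00", b"\x01"

def pack_message(payload: bytes) -> bytes:
    """Sender side: try to compress; if that does not shrink the
    message, fall back to sending the original bytes with flag 0."""
    comp = zlib.compress(payload)  # zlib as a stand-in for Snappy/LZ4
    if len(comp) >= len(payload):
        return RAW + payload       # N+1 bytes on the wire
    return COMPRESSED + comp       # 1 flag byte + M compressed bytes

def unpack_message(wire: bytes) -> bytes:
    """Receiver side: inspect the flag byte; the actual received
    length (MPI_Get_count in real code) is just len(wire) here."""
    flag, body = wire[:1], wire[1:]
    return zlib.decompress(body) if flag == COMPRESSED else body

# A low-entropy payload takes the compressed path and round-trips.
msg = b"ab" * 4096
wire = pack_message(msg)
print(len(msg), "->", len(wire), "bytes on the wire")
assert unpack_message(wire) == msg
```

You could also skip compression entirely for messages under some size threshold, so the latency of short messages is untouched.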

I can't give you much guidance as to the compression routines, but it does look like people have tried this before, e.g. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.91.7936 .

Ehr answered 1/6, 2012 at 12:38 Comment(6)
there is some paper with evaluation on NAS parallel: civ.cvut.cz/others/konference_supercomputing/…Othilia
Compression can be limited to messages longer than some threshold, so the latency of short messages stays the same; the latency of long messages goes down (on a slow network with compressible data).Othilia
This is a better answer than mine. Sigh.Economist
Looks like the two papers are by the same two people (with a different third author on each). It's interesting stuff, but there are some weird things in the results. E.g., sPPM not scaling to >64 processors on their system? That suggests their Ethernet is badly overloaded; sPPM is a nearest-neighbour code and should scale really well. I also wish they had a latency-dominated code like a particle code, or a collectives-dominated code (like an optimization problem), to add to the mix. Still, interesting stuff.Ehr
Collectives in platform-agnostic libraries like Open MPI or MPICH(2) are usually implemented internally using point-to-point communication routines (and thus can be reimplemented using normal MPI_* calls). But compression would mess with those clever hierarchical algorithms that try to minimise the latency overhead for small messages.Palisade
Well, exactly. It would mean that every step of a logarithmic reduction would involve a decompress-reduce-compress step. Yuck.Ehr

I'll be happy to be told otherwise, but I don't think many of us MPI users are concerned with having a transport layer that compresses data.

Why the heck not?

1) We already design our programs to do as little communication as possible, so we (like to think we) are sending the bare minimum across the interconnect.

2) The bulk of our larger messages comprise arrays of floating-point numbers which are relatively difficult (and therefore relatively expensive in time) to compress to any degree.

Economist answered 1/6, 2012 at 12:28 Comment(2)
This answer is more concise and doesn't need compression.Grosvenor
The first point is an especially good one; the expectation is we're already sending the bare minimum data. For that reason, I would have thought that there was no real chance of improving things with compression for the standard kind of applications, but the papers listed by @Othilia are pretty interesting.Ehr

There's an ongoing project at the University of Edinburgh: http://link.springer.com/chapter/10.1007%2F978-3-642-32820-6_72?LI=true

Ashaashamed answered 1/4, 2013 at 11:29 Comment(2)
I will add the article title, "An Adaptive, Scalable, and Portable Technique for Speeding Up MPI-Based Applications"; it was published in "Euro-Par 2012 Parallel Processing, Lecture Notes in Computer Science Volume 7484, 2012, pp 729-740", doi 10.1007/978-3-642-32820-6_72. Their project is named PRAcTICaL-MPI and there is a page about it here: research.nesc.ac.uk/node/725Othilia
And there is an older project by the same authors, CoMPI, e.g. here: dl.acm.org/citation.cfm?id=1943326.1943336 or here: arcos.inf.uc3m.es/~rosaf/europvm-v3.pptxOthilia
