Cisco Blogs

Unexpected messages = evil

June 11, 2011

Another term that is not infrequently used when discussing message-passing applications is “unexpected messages.”

What are they, and why are they (usually) bad?

The quick definition is that an unexpected message is one that arrives before a corresponding MPI receive has been posted.  In more concrete terms: an MPI process has sent a message to a process that hadn’t yet called some flavor of MPI_RECV to receive the message.

Why is this a Bad Thing?

Because unexpected messages consume RAM on the receiver.

[Figure: unexpected message queue]

As described in prior posts, MPI implementations typically send short messages “eagerly,” without waiting for the receiver to ask for them.  If the message is unexpected at the receiver, most MPI implementations just allocate a temporary buffer and receive the message into it.

This temp buffer is then placed on an unexpected message queue that is searched when the application posts a new MPI_RECV.  If a matching message is found on the unexpected message queue, the message is copied to the MPI_RECV buffer and dequeued from the unexpected message queue, and its temp buffer is freed.
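The bookkeeping described above can be sketched as a toy model. This is only an illustration: `ToyReceiver`, `arrive`, and `post_recv` are made-up names, not part of any MPI implementation, and the matching here uses just (source, tag), whereas real MPI matching also involves the communicator and wildcards.

```python
from collections import deque


class ToyReceiver:
    """Toy model of one receiver's matching logic (hypothetical, not an MPI API)."""

    def __init__(self):
        self.unexpected = deque()  # messages that arrived before a matching receive
        self.posted = deque()      # receives posted before a matching message arrived
        self.delivered = []        # (source, tag, payload) handed to the application

    def arrive(self, source, tag, payload):
        """A short, eagerly-sent message shows up from the network."""
        for recv in self.posted:
            if recv == (source, tag):
                # A receive was already waiting: deliver straight into it.
                self.posted.remove(recv)
                self.delivered.append((source, tag, payload))
                return "expected"
        # Nobody asked for this yet: keep a temporary copy on the queue.
        self.unexpected.append((source, tag, bytes(payload)))
        return "unexpected"

    def post_recv(self, source, tag):
        """The application posts a receive (think: some flavor of MPI_RECV)."""
        for msg in self.unexpected:
            if msg[0] == source and msg[1] == tag:
                # Found a buffered match: hand it over and drop the temp copy.
                self.unexpected.remove(msg)
                self.delivered.append(msg)
                return "matched from unexpected queue"
        self.posted.append((source, tag))
        return "posted"


rx = ToyReceiver()
first = rx.arrive(source=1, tag=7, payload=b"hello")   # no receive posted yet
second = rx.post_recv(source=1, tag=7)                 # finds the buffered copy
third = rx.post_recv(source=2, tag=7)                  # receive posted first...
fourth = rx.arrive(source=2, tag=7, payload=b"world")  # ...so no temp buffer needed
```

In the second pair of calls the receive is posted before the message arrives, so nothing ever lands on the unexpected queue.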

That’s all fine and good.

But what happens if unexpected messages start to pile up?  E.g., what if a lot of senders are sending oodles of short, unexpected messages to a single receiving process?

If senders overwhelm a receiver like this, the memory consumed by the unexpected message queue can actually grow without bound.  Imagine a very large value for N (the number of messages sitting on the queue) in the image above.  This means that the MPI implementation might effectively be stealing RAM from the application (remember: many HPC environments have page swapping disabled; applications are limited to the physical RAM on the machine).
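To get a feel for how quickly this adds up, here is some back-of-the-envelope arithmetic. Every number in it is an assumption picked for illustration (the 4 KB message size, the 64 bytes of per-message bookkeeping, the sender and message counts); none of them come from a particular MPI implementation.

```python
def unexpected_queue_bytes(num_messages, payload_bytes, overhead_bytes=64):
    """Rough RAM held by queued unexpected messages.

    overhead_bytes is an assumed per-message bookkeeping cost (queue entry,
    match information); real implementations will differ.
    """
    return num_messages * (payload_bytes + overhead_bytes)


# Assumed scenario: 1,000 senders each fire 1,000 short 4 KB messages at one
# receiver that has not posted any matching receives yet.
pending = 1000 * 1000
total = unexpected_queue_bytes(pending, payload_bytes=4096)
print(f"{total:,} bytes held on the unexpected queue")  # 4,160,000,000 bytes
```

That is roughly 4 GB that the application cannot use, on a single process, from messages that are each individually "short."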

This is clearly Bad.

The MPI specification actually says that an MPI implementation should never let this be a problem.  But the exact definition of “problem” in this context can vary quite a bit: what value of N is “too large”?  That’s very much an application-specific question; it can depend on many different factors (amount of RAM available, number of MPI processes running locally, burstiness of communication patterns, etc.).

MPI application developers should therefore strive to make their messages “expected” — or, put differently, “not unexpected.”

  • Ensure that you have receives posted before messages will arrive, whenever possible.
  • Ensure that “fast” senders don’t overwhelm “slow” receivers (e.g., have some kind of flow control where receivers can tell senders to slow down).

It isn’t always possible to pre-post receives before messages arrive, but try to do it whenever you can.
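One common way to get the flow control mentioned above is a credit scheme: each sender may only have a fixed number of messages outstanding, and the receiver hands a credit back every time it consumes one. The simulation below is a minimal sketch of that idea, not real MPI code; `simulate`, its parameters, and the one-message-per-step timing are all assumptions made for illustration.

```python
from collections import deque


def simulate(num_senders, msgs_per_sender, credits_per_sender, drain_per_step):
    """Return the peak number of messages queued at a single receiver.

    credits_per_sender=None models no flow control at all.
    """
    remaining = [msgs_per_sender] * num_senders
    credits = [credits_per_sender] * num_senders
    queue = deque()  # stands in for the receiver's unexpected message queue
    peak = 0
    while any(remaining) or queue:
        # Each sender tries to send one message per step.
        for s in range(num_senders):
            if remaining[s] == 0:
                continue
            if credits[s] is not None:
                if credits[s] == 0:
                    continue  # out of credits: this sender has to wait
                credits[s] -= 1
            remaining[s] -= 1
            queue.append(s)
        peak = max(peak, len(queue))
        # The slow receiver consumes a few messages and returns their credits.
        for _ in range(min(drain_per_step, len(queue))):
            s = queue.popleft()
            if credits[s] is not None:
                credits[s] += 1
    return peak


# 8 fast senders, 100 messages each, receiver drains 1 message per step.
unbounded = simulate(8, 100, credits_per_sender=None, drain_per_step=1)
bounded = simulate(8, 100, credits_per_sender=2, drain_per_step=1)
print(unbounded, bounded)
```

With credits, the queue can never hold more than num_senders × credits_per_sender messages (16 here), no matter how many messages each sender ultimately wants to send. Without them, the peak in this toy run is 701 and grows with the message count. The price is that fast senders stall while they wait for credits, which is exactly the "slow down" signal the receiver needs to send.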
