High Performance Computing Networking

Pop quiz, hotshot: how many types of sends are there in MPI?

Most people will immediately think of MPI_SEND.  A few of you will remember the non-blocking variant, MPI_ISEND (where I = “immediate”).

But what about the rest — can you name them?

Here’s a hint: if I run “ls -1 *send*c | wc -l” in Open MPI’s MPI API source code directory, the result is 14.  MPI_SEND and MPI_ISEND are two of those 14.  Can you name the other 12?

Here’s a complete list:

  1. MPI_SEND: Standard send (a.k.a. “blocking”… but that’s not always true!)
  2. MPI_ISEND: Immediate (a.k.a. “non-blocking”) standard send
  3. MPI_SEND_INIT: Persistent standard send
  4. MPI_SSEND: Synchronous send
  5. MPI_ISSEND: Immediate synchronous send
  6. MPI_SSEND_INIT: Persistent synchronous send
  7. MPI_BSEND: Buffered send
  8. MPI_IBSEND: Immediate buffered send
  9. MPI_BSEND_INIT: Persistent buffered send
  10. MPI_RSEND: Ready send
  11. MPI_IRSEND: Immediate ready send
  12. MPI_RSEND_INIT: Persistent ready send
  13. MPI_SENDRECV: Standard send and receive
  14. MPI_SENDRECV_REPLACE: Standard send and receive into the same buffer
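
Before we sort them out, here's the everyday case for reference: a minimal sketch of a plain MPI_SEND matched by a receive (the ranks, message size, and tag are arbitrary).

    /* Rank 0 sends 4 ints to rank 1 with a plain standard send. */
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, buf[4] = { 1, 2, 3, 4 };

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (0 == rank) {
            /* "Blocking" only means buf is safe to reuse when this returns;
               the message may or may not have reached the receiver yet. */
            MPI_Send(buf, 4, MPI_INT, 1, 99, MPI_COMM_WORLD);
        } else if (1 == rank) {
            MPI_Recv(buf, 4, MPI_INT, 0, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }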

If you look closely, there are really only 4 types of sends: standard, synchronous, buffered, and ready.  Each of those 4 types comes in 3 variants:

  • The plain variant: exactly as described above.
  • The immediate variant: also known as “non-blocking,” immediate calls return (more-or-less) immediately.  The application will get an MPI_Request back that must be tested or waited on for completion.  Progress on the send is occurring “in the background.”
  • The persistent variant: a persistent send can be executed multiple times; they’re good for applications that need to send the same buffer repeatedly (e.g., sending boundary information in an iterative algorithm).  The rationale is that the MPI implementation can incur the setup overhead for this particular send once, and then quickly initiate the actual sending mechanism each time after that.
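
To make the immediate and persistent variants concrete, here's a minimal sketch of the standard send in both forms (the peer rank, message size, and iteration count are made up for illustration; the buffer must not be touched until each request completes).

    #include <mpi.h>

    void send_variants(int *buf, int count, int peer)
    {
        MPI_Request req;

        /* Immediate ("non-blocking"): returns right away; test or wait later. */
        MPI_Isend(buf, count, MPI_INT, peer, 0, MPI_COMM_WORLD, &req);
        /* ... do other work while the send progresses ... */
        MPI_Wait(&req, MPI_STATUS_IGNORE);

        /* Persistent: pay the setup cost once, then start/complete each time. */
        MPI_Send_init(buf, count, MPI_INT, peer, 0, MPI_COMM_WORLD, &req);
        for (int i = 0; i < 10; ++i) {
            MPI_Start(&req);                    /* begin this iteration's send */
            MPI_Wait(&req, MPI_STATUS_IGNORE);  /* complete before reusing buf */
        }
        MPI_Request_free(&req);                 /* release the persistent request */
    }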

The last 2 of the 14 are SENDRECV and SENDRECV_REPLACE.  These are not really new modes / variants in themselves; they are actually both forms of the plain standard send.  The interesting part about these two calls is that they combine the send with a receive, and the receive does not necessarily come from the same peer that the send goes to!

For example, you can “send to the left” and “receive from the right” in a single call.  Cool!
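
Here's a minimal sketch of that pattern, assuming the ranks in MPI_COMM_WORLD form a periodic ring (the direction of the shift and the tag are arbitrary).

    #include <mpi.h>

    /* Each rank sends its value to the left neighbor and receives a value
       from the right neighbor in a single MPI_Sendrecv call. */
    void shift_left(int *send_val, int *recv_val)
    {
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int left  = (rank - 1 + size) % size;   /* where our send goes */
        int right = (rank + 1) % size;          /* where our receive comes from */

        MPI_Sendrecv(send_val, 1, MPI_INT, left,  0,
                     recv_val, 1, MPI_INT, right, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }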

The REPLACE version allows you to use the same buffer to send and receive; MPI guarantees that the correct message will be sent and the receive buffer will contain the entire received message.  It’s a good optimization for large messages; an application doesn’t need to use 2x the buffer space.
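
The same shift with the REPLACE version might look like the sketch below: the single buffer holds the outgoing value when the call starts and the received value when it returns (the neighbor ranks and tag are again just illustrative).

    #include <mpi.h>

    /* Send buf to "left" and overwrite buf with the message received from
       "right" -- one buffer, no second copy held by the application. */
    void shift_left_in_place(int *buf, int count, int left, int right)
    {
        MPI_Sendrecv_replace(buf, count, MPI_INT,
                             left,  0,   /* destination rank and send tag */
                             right, 0,   /* source rank and receive tag */
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }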


4 Comments.


  1. Why are buffered sends evil?

    • They are evil for the following reasons:

      * Buffered sends *force* an extra copy of the outgoing message (i.e., a copy from your buffer to the internal MPI buffer).
      * This copy not only takes up resources (memory), it also takes *time*.
      * The internal buffer is memory that was allocated by the user (via MPI_BUFFER_ATTACH), which may not be optimal for the network transport that will be used for the message transfer.
      * Consider if you use a buffered send for a very large message. The MPI must first copy that message before it can send it.
      * Some MPI implementations may do copies anyway (depending on the underlying network transport), but in that case they'll likely have a good scheme for pipelining those copies with the sending.

      In short, memory copies are not necessarily evil, but if they’re employed, it’s much better to let the underlying MPI implementation *choose* to do that, and therefore use all of its available optimizations (e.g., using special memory, pipelined copies overlapping with sending, using its own limits for buffered resources, etc.).
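
      A minimal sketch of the machinery in question (buffer sizing and the peer rank here are just for illustration): the application attaches its own buffer, and every MPI_BSEND is copied into that buffer before anything moves on the network.

          #include <mpi.h>
          #include <stdlib.h>

          void bsend_example(int *msg, int count, int peer)
          {
              int pack_size, buf_size;
              void *buf;

              /* Size the attached buffer for one message plus MPI's bookkeeping. */
              MPI_Pack_size(count, MPI_INT, MPI_COMM_WORLD, &pack_size);
              buf_size = pack_size + MPI_BSEND_OVERHEAD;
              buf = malloc(buf_size);
              MPI_Buffer_attach(buf, buf_size);

              /* Completes locally as soon as the copy into the attached buffer finishes. */
              MPI_Bsend(msg, count, MPI_INT, peer, 0, MPI_COMM_WORLD);

              /* Detach blocks until the buffered messages have been transmitted. */
              MPI_Buffer_detach(&buf, &buf_size);
              free(buf);
          }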

  2. There are other issues outside of performance why buffered sends are evil. A buffered send can complete to the application immediately, having been successfully copied to the internal buffer space specified via MPI_BUFFER_ATTACH. If a matching receive is never posted at the peer, there is no way for the application to reclaim that used buffer space – there’s no way of cancelling the request since it has already completed locally. There is also no way for the application to detect when it is safe to call MPI_BUFFER_DETACH. The normal options for apps to track (and cancel) their non-blocking send requests don’t work when using MPI_BSEND. Note that I single out MPI_BSEND here because there are cancelation semantics defined for the non-blocking variants.
