High Performance Computing Networking

Do you know what an iBarrier is?

  1. A joke in the MPI Forum
  2. A useful synchronization technique
  3. Waiting in line at the Apple store for an iPhone

For a long time, the answer was #1 — we jokingly referred to “non-blocking barriers” in the same breath as MPI_SEND_ANY and MPI_ESP(do_what_i_meant_not_what_i_coded).  But recently, the answer has become #2.

Non-blocking (a.k.a. “fuzzy”) barriers seem weird at first, but they’re really just a mechanism for knowing that all members of a communicator have reached a specific synchronization point.  Loosely put, you can know that everyone got there, but you don’t have to block waiting for it.  Other non-blocking collectives, such as non-blocking broadcasts, are fairly intuitive.  But the fuzzy barrier takes a minute to get used to, probably just because the name “barrier” implies “blocking.”
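To make the "know that everyone got there without blocking" idea concrete, here is a minimal toy model in Python. It is only a sketch of the semantics, not MPI code: the names FuzzyBarrier, BarrierRequest, enter, and test are made up for illustration, and the participants are simulated in a single process. In the interface that MPI-3 eventually standardized, the same two roles are played by MPI_Ibarrier (start the barrier, get a request back) and MPI_Test or MPI_Wait (check or wait for completion).

```python
import threading


class FuzzyBarrier:
    """Toy one-shot split-phase barrier for `size` participants (hypothetical)."""

    def __init__(self, size):
        self.size = size
        self._arrived = 0
        self._lock = threading.Lock()

    def enter(self):
        """Record one participant's arrival and return at once, without blocking."""
        with self._lock:
            self._arrived += 1
        return BarrierRequest(self)

    def everyone_arrived(self):
        """True once all `size` participants have called enter()."""
        with self._lock:
            return self._arrived == self.size


class BarrierRequest:
    """Handle returned by enter(); test() polls for completion without blocking."""

    def __init__(self, barrier):
        self._barrier = barrier

    def test(self):
        return self._barrier.everyone_arrived()


def demo():
    barrier = FuzzyBarrier(size=3)
    first = barrier.enter()   # participant 0 reaches the sync point early
    early = first.test()      # the others have not arrived yet
    barrier.enter()           # participant 1 arrives
    barrier.enter()           # participant 2 arrives
    late = first.test()       # now everyone has reached the sync point
    return early, late


if __name__ == "__main__":
    print(demo())  # prints (False, True)
```

The point to notice is that participant 0 never blocks: between the two test() calls it is free to do other useful work, which is exactly what a blocking barrier would not allow.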

MPI-3 will have non-blocking variants of all of the MPI-2 collectives.  At the Forum meeting this past week, we had a formal reading of the new non-blocking collectives text, meaning that the text now needs only two formal votes before it becomes an official part of what will eventually become MPI-3.  Unless something goes drastically wrong, this will all come to pass over the next few months.

But wait, there’s more.

A new proposal was floated at this past meeting for more new non-blocking operations: all the MPI-2 dynamics operations.  This includes MPI_ICOMM_SPAWN, MPI_ICOMM_CONNECT, MPI_ICOMM_ACCEPT, and MPI_ICOMM_JOIN.  Yowzers!  Also included in this proposal was the idea of adding timeouts to MPI functions (imagine allowing MPI_SEND or MPI_ICOMM_CONNECT to time out).  Holy schnikies!

The reasons for proposing this new stuff are actually quite sound.  You might not need this functionality today, but you’ll likely need it for fault tolerance as the world scales up to larger and larger systems.  But…  ay caramba!  Such things are going to be a bear to implement.

Unlike the non-blocking collectives, the non-blocking dynamics and timeout stuff is still only a proposal at this point.  Who knows where it will go, but it sure is interesting!
