Crazy ideas to revamp MPI_INIT and MPI_FINALIZE

October 3, 2015

I recently had the pleasure of attending EuroMPI 2015, hosted by INRIA in Bordeaux, France (…hey, you should attend EuroMPI 2016 in Edinburgh, Scotland!).

I gave two mini-talks during my speaking slot, the first of which was entitled: Crazy ideas about revamping MPI_INIT and MPI_FINALIZE.

This first mini-talk is essentially a small taste of the fun kinds of discussions that we have at the MPI Forum.

In particular: there are many known limitations of MPI_INIT / MPI_FINALIZE as defined by MPI-1/2/3.  How can we overcome them?  Here are some ideas that have been bumping around in my head for a while about MPI_INIT / MPI_FINALIZE:

Keep in mind: at this point, these are still crazy (and incomplete!) ideas.  The MPI Forum may end up using some (or none!) of them in a future version of the standard.  For now, it's all just talk.

If you’re interested in helping define the next generation of parallel computing, you should attend upcoming MPI Forum meetings.  The meetings are public and open to all.



  1. It’s easy to solve MPI_COMM_WORLD. Just define a global MPI_SESSION_DEFAULT and get the communicator from that 😉

  2. I like how you switched from your generic slide template full of memes to one that obviously mocks overzealous corporate marketing. Oh, wait…

  3. How does one session ‘target’ another – does this become another vector in the rank/tag space? Can sessions communicate with other process-local sessions (aka loopback)?

    I think a ref-counted init/finalize is immediately useful and fairly straightforward. You’re still limited by whatever threading level the first caller sets, but at least it’s discoverable. I’d also remove the limitation that you can’t init after finalize (though if init/finalize are ref-counted, how do you know the last finalize was called?)

    • I amorphously referred to this in slide 23: “What are the arguments to SESSION_CREATE?” I think I said during the talk that I assumed that there would have to be a string or integer tag argument so that multiple threads invoking SESSION_CREATE concurrently could do so safely (i.e., match their peers in other processes and create a unique communicator).

      If you don’t do something like this, having only a single session wouldn’t be super-useful — it would be effectively the same as the current INIT / FINALIZE scheme.

  4. IMO the root of all these problems is that the MPI standard is based on a lot of magic global state, and thus libraries that use MPI do not compose.

    The right way to fix this is to kill this global state.

    That would mean allowing a single MPI_Session per process, and recommending that libraries using MPI take a pointer/reference/… to this single MPI session instead of creating it themselves (which is akin to calling Init/Finalize).

    • I’m afraid I don’t understand one part of your comment: why allow only one MPI_Session? Isn’t that the equivalent of global state (albeit explicit)? The idea is that the MPI library can use the MPI_Session to stash whatever per-entity state is required on the session (I don’t think it will ever be possible to eliminate 100% of the global state; some of it relates to the hardware, the OS, etc.). In this way, each entity can have its own private state, or at least as much private state as possible, excluding whatever global state must remain global.

      • > Isn’t that the equivalent of global state (albeit explicit)?

        Yes, I’d just rather have it be explicit instead of implicit. Making all the state of MPI explicit has some advantages.

        For example, the MPI standard could say “all MPI functions are thread-unsafe; users can call them from different threads, but only from one thread at a time.” That way, people who don’t need multi-threaded support don’t pay for it; those who do can choose which kind of support they need (since the state is explicit, they can decide how they want to synchronize access to it, e.g., via a mutex, timed mutex, …); and MPI implementations wouldn’t have to deal with it.

        One could also use this to build communication libraries on top of MPI that, e.g., use a thread pool to handle progress of asynchronous tasks.

    • Libraries that use MPI compose beautifully, so long as they are compiled against the same MPI implementation. It’s important to understand the difference between composability via the API and ABI compatibility. Like C++ and Fortran, MPI has implementation-specific quirks. Not everything can be as amazing as C (which provides ABI compatibility).