High Performance Computing Networking

It’s the beginning of a new year, so let’s take a step back and talk about what MPI is and why it is a Good Thing.

I’m periodically asked what exactly MPI is.  Those asking come from many different backgrounds: network administrators, systems programmers, application programmers, web developers, server and network hardware designers, … the list goes on.  Most have typically heard about this “MPI” thing as part of “high performance computing” (HPC), and think that it’s some kind of parallel programming model.

Technically, it’s not.  MPI — or, more specifically, message passing — implies a class of parallel programming models.  But at its heart, MPI is about simplified inter-process communication (IPC).

I wrote a magazine column about “What is MPI?” a few years ago; let’s recap a few salient points:

  - MPI stands for the Message Passing Interface.
  - At its core, MPI is about passing messages between processes: simplified inter-process communication (IPC).
  - Messages are typed, tagged and matched, reliable, and ordered.
  - Applications just send and receive the data they need; they don’t deal with the underlying network’s addresses, transports, or connections.

There are lots of other details, but that’s the core of MPI.

“So why is MPI associated with HPC and supercomputers?” people ask me.  “MPI is simply message passing, usually over a network of some kind.  What’s the correlation?”

Take a step back: parallel computing means that there are multiple execution agents operating simultaneously, working on a common problem.  In some cases, that’s just multiple threads working together in a single program.  In other cases, it’s multiple independent processes working together.  In a multi-threaded application, the threads can use shared variables to communicate and synchronize with each other.  But when multiple processes are working together — particularly when those processes are not necessarily located on the same physical server — they need another form of communication for data transfer and synchronization.

This is where MPI fits in.

MPI is a simplified API that allows scientists and engineers to write network code.  They can just call MPI_SEND(…) and not have to worry about IP addresses, network transports, discovering or opening connections, …or any of the other idiosyncrasies of the underlying network.  They just send and receive the data that they need to run their code.  And the data is contained in atomic messages.  And typed.  And tagged / matched.  And reliable.  And ordered.  And …
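
Here’s a minimal sketch of what that looks like in practice: plain C with standard MPI calls, where rank 0 sends a small, typed, tagged message to rank 1, with no sockets, addresses, or connection setup in sight.  The data values are made up for illustration, and error handling is omitted for brevity:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank;
        double data[4] = { 1.0, 2.0, 3.0, 4.0 };

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* Send 4 doubles to rank 1, tagged with 42 */
            MPI_Send(data, 4, MPI_DOUBLE, 1, 42, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* Receive the matching message from rank 0 */
            MPI_Recv(data, 4, MPI_DOUBLE, 0, 42, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("Rank 1 received %g %g %g %g\n",
                   data[0], data[1], data[2], data[3]);
        }

        MPI_Finalize();
        return 0;
    }

Notice what isn’t there: no IP addresses, no connection management, no framing of a byte stream into messages.  That bookkeeping is the MPI implementation’s job, not the application author’s.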

These are all good, high-level network abstractions that make sense to users.  It sometimes takes a little work to explain such concepts to serious network wonks who think in terms of streams of octets and checksums.

Writing MPI-parallel code typically means that you’re scaling your code to be “larger” than one compute server — maybe you need more RAM; maybe you need more compute power.  Regardless, you need to run across multiple machines with some kind of network in between.  MPI provides the IPC between the processes of your parallel application running on multiple servers.
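
To make that concrete, here’s a small sketch of the usual pattern: the same program is started as N processes (by a launcher such as mpirun, possibly spread across several servers), and each process uses its rank to claim a slice of the overall problem.  The problem size and the slicing scheme here are hypothetical, purely for illustration:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        const int total_work = 1000000;   /* hypothetical problem size */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I? */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many of us are there? */

        /* Divide the work evenly among all processes, wherever they run */
        int chunk = total_work / size;
        int start = rank * chunk;
        int end   = (rank == size - 1) ? total_work : start + chunk;

        printf("Rank %d of %d handles items [%d, %d)\n",
               rank, size, start, end);

        MPI_Finalize();
        return 0;
    }

Launching it with something like “mpirun -np 8 ./my_app” (plus a hostfile listing several servers) runs eight cooperating copies; any data they need to exchange travels through MPI calls like the send/receive shown earlier, over whatever network happens to connect those servers.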

Could you use sockets (or OpenFabrics verbs or …)?  Sure; absolutely.  But MPI is a whole lot simpler for application developers to use than the underlying network APIs.  Plus, when you use MPI, your application becomes portable to other kinds of systems and networks.

That’s MPI in a nutshell.
