“Eager Limits”, part 2

May 31, 2011 at 7:30 am PST

Open MPI actually has multiple different protocols for sending messages — not just eager / rendezvous.

Our protocols were originally founded on the ideas described in this paper.  Many things have changed since that 2004 paper, but some of the core ideas are still the same.

The picture to the right shows how Open MPI divides an MPI message up into segments and sends them in three phases.  Open MPI’s specific definition of the “eager limit” is the maximum payload size that is sent, along with the MPI match information, to the receiver as the first part of the transfer.  If the entire message fits in the eager limit, no further transfers are needed and no CTS is required.
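
To make that concrete, here is a rough sketch of the sender-side decision in plain C.  This is not Open MPI’s actual code: the helper functions and the eager limit value are invented for illustration, and the real protocol has more phases and per-transport tuning.  The idea is simply that a payload at or under the eager limit goes out in one shot with the match information, while a larger payload sends only an eager-limit-sized first fragment and waits for the receiver’s CTS (“clear to send”) before shipping the rest.

    /* Conceptual sketch only: not Open MPI code.  The helper functions below
     * are invented stand-ins for the transport layer, and EAGER_LIMIT is an
     * arbitrary example value (the real limit varies per transport). */
    #include <stdio.h>
    #include <stddef.h>

    #define EAGER_LIMIT 12288

    static void send_with_match_info(const char *buf, size_t len) {
        (void)buf;
        printf("eager fragment: %zu bytes, plus MPI match information\n", len);
    }
    static void wait_for_cts(void) {
        printf("waiting for CTS from the receiver\n");
    }
    static void send_remaining(const char *buf, size_t len) {
        (void)buf;
        printf("remaining payload: %zu bytes\n", len);
    }

    static void send_message(const char *buf, size_t len)
    {
        if (len <= EAGER_LIMIT) {
            /* Whole message fits under the eager limit: no CTS needed. */
            send_with_match_info(buf, len);
        } else {
            /* Send the first fragment eagerly, then wait for the receiver's
             * CTS before sending the rest of the payload. */
            send_with_match_info(buf, EAGER_LIMIT);
            wait_for_cts();
            send_remaining(buf + EAGER_LIMIT, len - EAGER_LIMIT);
        }
    }

    int main(void)
    {
        static char big[100000];
        send_message(big, 100);          /* fits: single eager send */
        send_message(big, sizeof(big));  /* does not fit: eager + rendezvous */
        return 0;
    }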

What is an MPI “eager limit”?

May 28, 2011 at 7:30 am PST

Technically speaking, the MPI standard does not define anything called an “eager limit.”

An “eager limit” is a term describing a method that many MPI implementations use to send short messages.  That is, it’s an implementation technique — it’s not part of the MPI standard at all.  And since it’s not standardized, it also tends to be different in each MPI implementation.  More specifically: if you write your MPI code to rely on a specific implementation’s “eager limit” behavior, your code may not perform well (or may even deadlock!) with other MPI implementations.
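
Here is the classic example of code that leans on eager behavior without realizing it: both ranks call a blocking MPI_Send and then an MPI_Recv.  If the message fits under the implementation’s eager limit, the sends complete locally and everything proceeds; if it does not (or you switch to an MPI with a smaller limit), both ranks sit in MPI_Send waiting for a receive that never gets posted.  The COUNT value below is arbitrary; this is a sketch of the anti-pattern, not a recommendation.

    /* Sketch of an unsafe send/recv ordering that "works" only because of
     * eager buffering.  Run with exactly 2 ranks; COUNT is arbitrary. */
    #include <stdio.h>
    #include <mpi.h>

    #define COUNT 1024   /* try raising this well past the eager limit */

    int main(int argc, char **argv)
    {
        int rank, size, peer;
        static int sendbuf[COUNT], recvbuf[COUNT];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size != 2) {
            if (rank == 0) {
                fprintf(stderr, "run this sketch with exactly 2 ranks\n");
            }
            MPI_Finalize();
            return 1;
        }
        peer = 1 - rank;

        /* Unsafe: both ranks send first, then receive.  This only completes
         * if the MPI implementation buffers the outgoing message (i.e., it
         * fits under the eager limit).  Otherwise, deadlock. */
        MPI_Send(sendbuf, COUNT, MPI_INT, peer, 0, MPI_COMM_WORLD);
        MPI_Recv(recvbuf, COUNT, MPI_INT, peer, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

        if (rank == 0) {
            printf("completed (thanks to eager buffering)\n");
        }
        MPI_Finalize();
        return 0;
    }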

So — what exactly is an “eager limit”?

A bucket full of new MPI Fortran features

May 23, 2011 at 6:46 am PST

Over this past weekend, I had the motivation and time to overhaul Open MPI’s Fortran support for the better.  Points worth noting:

  • The “use mpi” module now includes all MPI subroutines.  Strict type checking for everything!
  • Open MPI now only uses a single Fortran compiler — there’s no more artificial division between “f77” and “f90”

There’s still work to be done, of course (this is still off in a Mercurial bitbucket repo — not in the Open MPI mainline SVN trunk yet), but the results of this weekend’s code sprint are significantly simpler Open MPI Fortran plumbing behind the scenes and a much, much better implementation of the MPI-2 “use mpi” Fortran bindings.

User-level timers for MPI

May 13, 2011 at 9:45 am PST

Fab Tillier (Microsoft MPI) and I recently proposed a set of user-level timers for MPI.  The following slides are an example of what the interface could be:

Can we count on MPI to handle large datasets?

April 22, 2011 at 2:25 pm PST

(today’s entry is guest-written by Fab Tillier, Microsoft MPI engineer extraordinaire)

When you send data in MPI, you specify how many items of a particular datatype you want to send in your call to an MPI send routine.  Likewise, when you read data from a file, you specify how many datatype elements to read.

This “how many” value is referred to in MPI as a count parameter, and all of MPI’s functions define count parameters as integers: int in C, INTEGER in Fortran.  This definition often limits users to 2³¹ elements (i.e., roughly two billion elements) because int and INTEGER default to 32 bits on many of today’s platforms.

That may sound pretty big, but consider that a 2³¹-byte file is not really that large by today’s standards — especially in HPC, where datasets can sometimes be terabytes in size.  Reading a ~2 gigabyte file can take (far) less than a second.
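
One common workaround today is to bundle a large, fixed-size chunk of data into a derived datatype, so that the 32-bit count counts datatypes rather than individual elements.  The sketch below is just that, a sketch: the chunk size and element type are arbitrary, remainder handling is omitted, and it assumes each node has a few spare gigabytes of memory.  It moves more than 2³¹ chars in a single send, though whether a given implementation handles such a transfer gracefully is a separate question.

    /* Sketch of the derived-datatype workaround for 32-bit counts.  One
     * "bigtype" element is CHUNK contiguous chars, so a count of NCHUNKS
     * moves NCHUNKS * CHUNK chars: more than 2^31 elements in one call.
     * Run with at least 2 ranks. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    #define CHUNK   (1024 * 1024 * 1024)   /* 2^30 chars per derived datatype */
    #define NCHUNKS 3                      /* 3 * 2^30 > 2^31 total elements  */

    int main(int argc, char **argv)
    {
        int rank;
        char *buf;
        MPI_Datatype bigtype;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Type_contiguous(CHUNK, MPI_CHAR, &bigtype);
        MPI_Type_commit(&bigtype);

        buf = malloc((size_t)NCHUNKS * CHUNK);   /* ~3 GB per rank */
        if (buf == NULL) {
            fprintf(stderr, "rank %d: allocation failed\n", rank);
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        /* The count parameter is only NCHUNKS, but the payload is
         * NCHUNKS * CHUNK chars, which exceeds the 2^31 element limit. */
        if (rank == 0) {
            MPI_Send(buf, NCHUNKS, bigtype, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(buf, NCHUNKS, bigtype, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        free(buf);
        MPI_Type_free(&bigtype);
        MPI_Finalize();
        return 0;
    }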
