Cisco Blog: High Performance Computing Networking

Multi / many / mucho cores

April 9, 2010 at 12:00 pm PST

I’ve briefly mentioned the idea of dedicating some cores to MPI communication tasks before (remember: using dedicated communication co-processors isn’t a new idea).  I thought I’d explore it in a bit more detail in today’s entry.

Two networking vendors (I can’t say the vendor names or networking technologies here because they’re competitors, but let’s just say that the technology rhymes with “schminfiniband”) recently announced products that utilize communication processing offload for MPI collective communications.  Interestingly enough, they use different approaches.  Let’s look at both.
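For flavor, here’s a minimal sketch of the software-only version of this idea — my illustration, not either vendor’s product: a helper thread pinned to a core reserved for communication polls MPI to keep a nonblocking operation progressing while the main thread computes.  The core number, message size, and pairwise exchange pattern are arbitrary choices for the example, and it assumes a Linux MPI with full MPI_THREAD_MULTIPLE support.

    /* A minimal sketch of the "dedicated communication core" idea, assuming
     * Linux, pthreads, and an MPI that supports MPI_THREAD_MULTIPLE.  A helper
     * thread pinned to a reserved core polls MPI to keep a nonblocking
     * operation progressing while the main thread computes.  Run with an even
     * number of processes; the core number and message size are arbitrary. */
    #define _GNU_SOURCE
    #include <mpi.h>
    #include <pthread.h>
    #include <sched.h>

    static MPI_Request req;
    static volatile int done = 0;  /* real code would use proper atomics */

    static void *progress_thread(void *arg) {
        int core = *(int *) arg;
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        /* Pin this thread to the core reserved for communication. */
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

        int flag = 0;
        while (!flag) {
            MPI_Test(&req, &flag, MPI_STATUS_IGNORE);  /* drive MPI progress */
            if (!flag) sched_yield();
        }
        done = 1;
        return NULL;
    }

    int main(int argc, char **argv) {
        int provided, rank, comm_core = 1;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE) MPI_Abort(MPI_COMM_WORLD, 1);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Pair up ranks: 0 <-> 1, 2 <-> 3, etc. */
        int peer = rank ^ 1;
        double buf[1024] = { 0 };
        if (rank % 2 == 0)
            MPI_Isend(buf, 1024, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &req);
        else
            MPI_Irecv(buf, 1024, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &req);

        pthread_t tid;
        pthread_create(&tid, NULL, progress_thread, &comm_core);

        while (!done) {
            /* ...application computation, overlapped with communication... */
        }
        pthread_join(tid, NULL);
        MPI_Finalize();
        return 0;
    }

The vendor products offload this kind of work to the network hardware itself — and for entire collective operations, not a single point-to-point exchange.  The point of the sketch is just to show why burning a core on progress can pay off: the computation loop never has to stop and call MPI_Test itself.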


Open Source MPI Implementations

March 29, 2010 at 12:00 pm PST

People periodically ask about my opinions of closed source forking from the open source project that I work on (Open MPI).  “Doesn’t it bother you that others are making money off the software you wrote?” they ask.  “Aren’t they taking credit that belongs to you?”  And my personal favorite: “Don’t you worry about losing control of the Open MPI project?”

My answers to these particular questions are:

  • No.  And to be clear, I’m part of a community that wrote the software — I didn’t write (anywhere close to) all of it.
  • No, they’re not.  They’re exercising the license that we chose to use (BSD).
  • No.  There are good reasons both to extend our code base and to fork from it.

To be clear: I think that all the work — both open and closed source — surrounding the project and community that I am fortunate enough to be a part of is GREAT.


MPI User Survey: Fun Results

March 19, 2010 at 12:00 pm PST

Here are some fun results that we gleaned from the MPI user community survey…

Respondents were asked how much they valued each of the following in MPI on a scale from 1=most important to 5=least important (each item could be rated individually):

  • Runtime performance (e.g., latency, bandwidth, resource consumption, etc.)
  • Feature-rich API
  • Run-time reliability
  • Scalability to large numbers of MPI processes
  • Integration with other middleware, communication protocols, etc.

The first item in the list — runtime performance — may seem silly.  After all, this is high performance computing.  Many on the Forum assumed that everyone would rank runtime performance as the most important thing.  They were wrong (!).


MPI User Survey: Raw Data

March 13, 2010 at 12:00 pm PST

Earlier this week, Josh Hursey and I presented some in-depth analysis of the MPI user community survey results at the MPI Forum meeting in San Jose, CA (hosted by Cisco — yay!).  Remember that the survey is intended to help the MPI Forum guide the MPI-3 standardization process.  We had a fabulous response rate: 1,401 respondents started the survey and 838 completed it (almost 60%).

Some of the results were actually quite fascinating (I’ll talk about them in future blog entries), but Josh and I need to give the following disclaimers:

  1. We are not statisticians.  We tried to be accurate, rigorous, and unbiased in our analysis, but we may not have done it right.
  2. We only presented the answers to a specific set of questions posed to us by the Forum at the January meeting.

As such, we have decided to release the raw data of the survey to the general HPC community.  It is our hope that others will also analyze this data and share their findings with the community. 


Why MPI is Good for You

March 6, 2010 at 12:00 pm PST

If I ever doubted that MPI was good for the world, I think all I would need to do is remind myself of the commit that I made to the Open MPI source code repository today.  It was a single-character change — changing a 0 to a 1.  But the commit log message was Tolstoyan in length:

  • 87 lines of text
  • 736 words
  • 4225 characters

Go ahead — read the commit message.  I double-dog dare you.

That tome of a commit message represents several months of on-and-off work on a single bug, and it details the hard-won knowledge that was required to understand why changing a 0 to a 1 fixed it.

Ouch.

