
MCAPI and MPI: take two

December 15, 2011 at 7:17 pm PST

My last blog post about MCAPI and MPI is worth some further explanation…

There were a number of good questions raised (both publicly in comments, and privately to me via email).

I ended up chatting with some MCAPI people from PolyCore Software: Sven Brehmer and Ted Gribb.  We had a very interesting discussion which I won’t try to replicate here.  Instead, we ended up recording an RCE-Cast today about MCAPI and MPI.  It’ll be released in a few weeks (Brock already had one teed up to be released this weekend).

The main idea is that Sven and Ted were not trying to say that MCAPI is faster/better than MPI.

MCAPI is squarely aimed at a different market than MPI — the embedded market.  Think: accelerators, DSPs, FPGAs, etc.  And although MCAPI can be used for larger things (e.g., multiple x86-type servers on a network), there are already well-established, high-quality tools for that (e.g., MPI).

So perhaps it might be interesting to explore the realm of MPI + MCAPI in some fashion.

There are a bunch of different forms that (MPI + MCAPI) could take — which one(s) would be useful?  I cited a few forms in my prior blog post; we talked about a few more on the podcast.

But it’s hard to say without someone committing to doing some research, or a customer saying “I want this.”  Talk is cheap — execution requires resources.

Would this be something that you, gentle reader, would be interested in?  If so, let me know in the comments or drop me an email.


MCAPI and MPI

December 9, 2011 at 11:15 am PST

From @softtalkblog, I was recently directed to an article about the Multicore Communication API (MCAPI) and MPI.  Interesting stuff.

The main sentiments expressed in the article seem quite reasonable:

  1. MCAPI plays better in the embedded space than MPI (that’s what MCAPI was designed for, after all).  Simply put: MPI is too feature-rich (read: big) for embedded environments, reflecting the different design goals of MCAPI vs. MPI.
  2. MCAPI + MPI might be a useful combination.  The article cites a few examples of using MCAPI to wrap MPI messages.  Indeed, I agree that MCAPI seems like it may be a useful transport in some environments.

One thing that puzzled me about the article, however, is that it states that MPI is terrible at moving messages around within a single server.

Huh.  That’s news to me…
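
For the record, moving messages around within a single server is a case that MPI implementations optimize heavily; Open MPI, for one, has a shared-memory transport precisely for on-node traffic.  Here's a minimal illustrative sketch of an on-node ping-pong using the plain C API (built as C++ here); the file name, iteration count, and payload size are arbitrary choices of mine, not anything from the article.  Run both ranks on one host and the messages never touch a network.

    // pingpong.cpp: minimal on-node ping-pong (plain C MPI API, built as C++).
    // Build: mpicxx pingpong.cpp -o pingpong
    // Run:   mpirun -np 2 ./pingpong        (both ranks on the same host)
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int iters = 1000;   // arbitrary iteration count
        double payload = 0.0;     // tiny 8-byte message

        double start = MPI_Wtime();
        for (int i = 0; i < iters; ++i) {
            if (0 == rank) {
                MPI_Send(&payload, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&payload, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (1 == rank) {
                MPI_Recv(&payload, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(&payload, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
            }
        }
        double elapsed = MPI_Wtime() - start;

        if (0 == rank) {
            // Half the round trip is the one-way latency; on a single server
            // this traffic goes through shared memory, not the network stack.
            std::printf("average one-way latency: %g us\n",
                        elapsed / iters / 2.0 * 1.0e6);
        }

        MPI_Finalize();
        return 0;
    }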


Many Pairs of Eyes

December 1, 2011 at 7:00 am PST

Let me tell you a reason why open source and open communities are great: information sharing.

Let me explain…

I am Cisco’s representative to the Open MPI project, a middleware implementation of the Message Passing Interface (MPI) standard that facilitates big number crunching and parallel programming.  It’s a fairly large, complex code base: Ohloh says that there are over 674,000 lines of code.  Open MPI is portable to a wide variety of platforms and network types.

However, supporting all the things that MPI is supposed to support, and providing the same experience on every platform and network, can be quite challenging.  For example, a user posted a problem to our mailing list the other day about a specific feature not working properly on OS X.


The MPI C++ Bindings

October 31, 2011 at 6:06 am PST

What a strange position I find myself in: the C++ bindings have become something of a divisive issue in the MPI Forum.  There are basically three groups in the Forum:

  1. Those who want to keep the C++ bindings deprecated.  Meaning: do not delete them, but do not add any C++ bindings for new MPI-3 functions.
  2. Those who want to un-deprecate the C++ bindings.  Meaning: add C++ bindings for all new MPI-3 functions.
  3. Those who want to delete the C++ bindings.  Meaning: kill.  Axe.  Demolish.  Remove.  Never speak of them again.

Let me explain.
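
To make the debate concrete, here's a small illustrative fragment of my own (not anything from an actual Forum proposal) showing the same rank/size query written with the deprecated C++ bindings and with the plain C bindings.  It assumes an MPI implementation that still builds the MPI-2 C++ bindings.

    // rank_query.cpp: the same "who am I?" query, two ways.
    // Build: mpicxx rank_query.cpp -o rank_query
    // (Requires an MPI that still ships the C++ bindings.)
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        // The deprecated MPI-2 C++ bindings: namespaces, objects, methods.
        MPI::Init(argc, argv);
        int cxx_rank = MPI::COMM_WORLD.Get_rank();
        int cxx_size = MPI::COMM_WORLD.Get_size();

        // The C bindings, which are perfectly callable from C++ as well.
        int c_rank = 0, c_size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &c_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &c_size);

        std::printf("C++ bindings: rank %d of %d; C bindings: rank %d of %d\n",
                    cxx_rank, cxx_size, c_rank, c_size);

        MPI::Finalize();
        return 0;
    }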


Shared Receive Queues

October 25, 2011 at 5:00 am PST

In my last post, I talked about the so-called eager RDMA optimization, and its effects on resource consumption vs. latency optimization.

Let’s talk about another optimization: shared receive queues.

Shared receive queues are not a new idea, and certainly not exclusive to MPI implementations.  They’re a way for multiple senders to send to a single receiver while only consuming resources from a common pool.
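
To show the idea in code, here's a minimal sketch using the InfiniBand verbs API.  This is not Open MPI's actual code, just the bare verbs calls that express the idea: one SRQ is created, and every per-peer QP points at it, so receive buffers come out of a single common pool instead of being dedicated to each connection.  Connection setup, memory registration, and receive posting are omitted; the queue depths and peer count are arbitrary; and it assumes a machine with libibverbs and an RDMA-capable device.

    // srq_sketch.cpp: one shared receive queue (SRQ) feeding several
    // connections.  QP connection setup, memory registration, and receive
    // posting (ibv_post_srq_recv) are omitted.
    // Build: g++ srq_sketch.cpp -o srq_sketch -libverbs
    #include <infiniband/verbs.h>
    #include <cstdio>

    int main() {
        int n = 0;
        ibv_device **devs = ibv_get_device_list(&n);
        if (!devs || 0 == n) {
            std::fprintf(stderr, "no RDMA devices found\n");
            return 1;
        }

        ibv_context *ctx = ibv_open_device(devs[0]);
        ibv_pd *pd = ibv_alloc_pd(ctx);
        ibv_cq *cq = ibv_create_cq(ctx, 1024, nullptr, nullptr, 0);

        // The common pool: receive buffers are posted here, and an incoming
        // send from *any* attached connection consumes one of them.
        ibv_srq_init_attr srq_attr = {};
        srq_attr.attr.max_wr  = 1024;   // depth of the shared receive pool
        srq_attr.attr.max_sge = 1;
        ibv_srq *srq = ibv_create_srq(pd, &srq_attr);

        // Each remote peer still gets its own queue pair, but instead of a
        // private receive queue per QP, every QP points at the single SRQ.
        const int num_peers = 4;        // arbitrary peer count for the sketch
        ibv_qp *qps[num_peers];
        for (int i = 0; i < num_peers; ++i) {
            ibv_qp_init_attr qp_attr = {};
            qp_attr.send_cq          = cq;
            qp_attr.recv_cq          = cq;
            qp_attr.srq              = srq;   // <-- the sharing happens here
            qp_attr.qp_type          = IBV_QPT_RC;
            qp_attr.cap.max_send_wr  = 64;
            qp_attr.cap.max_send_sge = 1;
            qps[i] = ibv_create_qp(pd, &qp_attr);
        }

        std::printf("%d QPs sharing one %d-entry receive pool\n",
                    num_peers, (int) srq_attr.attr.max_wr);

        for (int i = 0; i < num_peers; ++i) ibv_destroy_qp(qps[i]);
        ibv_destroy_srq(srq);
        ibv_destroy_cq(cq);
        ibv_dealloc_pd(pd);
        ibv_close_device(ctx);
        ibv_free_device_list(devs);
        return 0;
    }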
