Cisco Blog: High Performance Computing Networking

Happy 1-year anniversary, RCE-Cast!


We were recording an RCE-Cast with the PETSc guys when we realized that we had just about hit our 1-year anniversary; the first recording was posted on January 17, 2009.  Wow!  I had no idea we had been doing this for so long.  Brock and I are both very pleasantly surprised that we’ve managed to keep it going.

If you’re unaware of RCE-Cast, it’s a podcast about “Research Computing and Engineering” that Brock Palen and I record every two weeks.  We talk to people from a variety of software and hardware projects, and cover just about any other topic that seems related to HPC- or RCE-like things.

Here’s an experiment for our next interview with the Condor folks: “tweet @brockpalen questions for #condor next guest on #RCE”.


MPI-3 User survey: thank you!

We had an astonishing 837 responses to the MPI User Survey.  Many thanks to all of you who filled out the survey!

The MPI Forum minions are busy analyzing the data — there’s a lot!  We’ll have more definitive results later, but for now, see below the jump for a few quickie facts from the results.


Network hardware offload

Sorry for the lack of activity here this month, folks.  As usual, December is the month to recover from SC and catch up on everything else you were supposed to be doing.  So I’ll try to make up for it with a small-but-tasty Christmas morsel.  Then I’ll disappear for a long winter’s nap; you likely won’t see me until January (shh! don’t tell my wife that I’m working today!).

My musing today is on a topic that has come up multiple times in conversation over the past two weeks.  Although I’m certainly not the only guy to talk about it on the interwebs, the subject is server-side hardware offload of network communications.
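
To put that in concrete terms: the big promise of offload is that the NIC can progress a message while the host CPU keeps computing.  Here’s a minimal sketch of the classic MPI overlap pattern that offload is supposed to make truly asynchronous (my own sketch, not from any particular vendor’s stack):

    /* Communication/computation overlap: the pattern that NIC offload
       is supposed to make truly asynchronous.  Run with an even number
       of processes, e.g.: mpirun -np 2 ./overlap */
    #include <mpi.h>
    #include <stdio.h>

    #define N (1 << 20)

    static double sendbuf[N], recvbuf[N];

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Request reqs[2];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        int peer = rank ^ 1;  /* pair up ranks: 0<->1, 2<->3, ... */

        /* Start a large exchange... */
        MPI_Isend(sendbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Irecv(recvbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[1]);

        /* ...then compute while the transfer is (hopefully) progressed
           by the NIC rather than by this CPU. */
        double sum = 0.0;
        for (int i = 0; i < N; ++i)
            sum += sendbuf[i] * 0.5;

        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
        printf("rank %d done (sum = %f)\n", rank, sum);
        MPI_Finalize();
        return 0;
    }

With a NIC that fully offloads the protocol, the compute loop and the transfer genuinely run in parallel; with a host-driven stack, much of that “overlap” quietly evaporates because the CPU has to drive the wire protocol itself.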


Random Tidbits

Here are some random quick notes:

  • Brock posted the MPI-3 podcast yesterday.  Have a listen if you’d like to hear about some of the new/upcoming efforts in MPI-3.
  • I saw a post on the MVAPICH list the other day saying that some random user picked up hwloc and submitted a patch to integrate it into MVAPICH (see the hwloc sketch after this list).  Huzzah!
  • I hear quite a bit about MPI being run on Intel’s prototype 48-core chip.  This is an interesting subject, but much remains to be seen about the programming models for these manycore chips.  The Intel press releases state that there is hardware support for message passing on the silicon, but what exactly does that mean?  Do we have direct access to it from user space?  That and many other questions will be discussed over time.
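
Since hwloc came up, here’s a taste of what it provides.  This is not the MVAPICH patch itself (I haven’t seen its details); it’s just a minimal sketch of the basic topology-discovery calls that an MPI library would build on:

    /* Minimal hwloc usage: discover the machine topology and count
       cores and hardware threads. */
    #include <hwloc.h>
    #include <stdio.h>

    int main(void)
    {
        hwloc_topology_t topo;

        hwloc_topology_init(&topo);   /* allocate a topology context */
        hwloc_topology_load(topo);    /* probe the actual machine */

        int ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
        int npus   = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU);
        printf("%d cores, %d hardware threads\n", ncores, npus);

        hwloc_topology_destroy(topo);
        return 0;
    }

Once a library knows the topology, it can make smarter decisions, e.g., binding processes to cores with hwloc_set_cpubind() so that ranks don’t wander across sockets mid-run.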

Who’s going to SC10 in New Orleans next year?


Open Resilient Cluster Manager (ORCM)

This past weekend, Cisco announced a new open source effort being launched under the Open MPI project umbrella: the Open Resilient Cluster Manager (or “OpenRCM”, or, my personal favorite, “ORCM”.  Say it 10 times fast!).

The Open MPI community is pleased to announce the establishment of a new subproject built upon the Open MPI code base. Using work initially contributed by Cisco Systems, the Open Resilient Cluster Manager is an open source project released under the Open MPI [BSD] license focused on development of an “always on” resource manager for systems spanning the range from embedded to very large clusters.

The ORCM web site neatly lays out the project goals:

  • Maintain operation of running applications in the face of single or multiple failures of any given process within that application.
  • Proactively detect incipient failures (hardware and/or software) and respond appropriately to maintain overall system operation.
  • Support both MPI and non-MPI applications.
  • Provide a research platform for exploring new concepts and methods in resilient systems.

“That’s great,” you say.  “But why on earth do we need yet another cluster resource manager?”
