Cisco Blogs

Lawrence Berkeley Labs talk: Cisco Userspace NIC (usNIC)

Here are the slides from my second talk, a deep technical dive into how the usNIC technology works and how we use that technology in the BTL plugin that we wrote for Open MPI (which is upstream starting with Open MPI v1.7.3).

Lawrence Berkeley Labs talk: (Open) MPI, Parallel Computing, Life, the Universe, and Everything

Many thanks to the crew at LBL for hosting my talks yesterday.  There were many insightful questions and comments throughout both talks.

Here are the slides from my first talk, entitled “(Open) MPI, Parallel Computing, Life, the Universe, and Everything.” It is a general MPI/Open MPI talk in which I discuss the current state of Open MPI and then go into detail about two of Open MPI’s newest features: the MPI-3 “MPI_T” tools interface, and Open MPI’s flexible process affinity system.

My new favorite Open MPI mpirun feature: tab completion

Today’s guest author is Nathan Hjelm, a Scientist 2 at Los Alamos National Laboratory.

We recently added scripts to the Open MPI development trunk that support tab completion of mpirun flags and run-time MCA configuration variables. The scripts support both bash and zsh and have a number of useful features (depending on the shell).

Can’t remember how to spell that MCA parameter name? Just hit <TAB>.
Can’t remember which transports are available? Just hit <TAB>.
Can’t remember the name of that mpirun CLI option? Just hit <TAB>.

Speaking at Lawrence Berkeley National Lab next week

Are you in the Northern California Bay Area and want to hear about Open MPI and/or Cisco’s usNIC technology next week?

If so, you’re in luck!

I’ll be speaking at Lawrence Berkeley Lab (LBL) next Thursday, November 7, 2013, at 2:30pm.  Click through to see the location and directions and whatnot (LBL requests that you RSVP if you plan to attend).

Hardware and software queuing

I’ve talked before about how getting high performance in MPI is all about offloading to dedicated hardware.  You want to get software out of the way as soon as possible and let the underlying hardware progress the message passing at max speed.

But here’s the funny thing about networking hardware: it tends to have limited resources. You might have incredibly awesome NICs in your HPC cluster, but they only have a finite (and small) supply of resources such as RAM, queues, queue depth, descriptors (for queue entries), etc.
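
To make the resource limit concrete, here is a toy Python sketch of one common way to cope with it: hand messages to a fixed-depth hardware queue while slots are free, and park the overflow in a software queue until the hardware catches up. This is only an illustration under stated assumptions. The `SendQueue` class, its `hw_depth` parameter, and the method names are hypothetical, and it does not describe how Open MPI or any particular NIC actually works.

```python
from collections import deque


class SendQueue:
    """Toy model: a fixed-depth hardware queue backed by an unbounded software queue."""

    def __init__(self, hw_depth):
        self.hw_depth = hw_depth   # hypothetical number of hardware queue slots
        self.hw_queue = deque()    # messages the "NIC" is currently working on
        self.sw_queue = deque()    # messages waiting in software for a free slot

    def post(self, msg):
        """Hand a message to hardware if a slot is free, else queue it in software."""
        # Only bypass the software queue when it is empty, so ordering is preserved.
        if not self.sw_queue and len(self.hw_queue) < self.hw_depth:
            self.hw_queue.append(msg)
            return "hardware"
        self.sw_queue.append(msg)
        return "software"

    def complete_one(self):
        """Simulate the hardware finishing its oldest message, then refill the slot."""
        done = self.hw_queue.popleft()
        if self.sw_queue:
            self.hw_queue.append(self.sw_queue.popleft())
        return done


q = SendQueue(hw_depth=2)
placements = [q.post(f"msg{i}") for i in range(4)]
first_done = q.complete_one()
```

With a hardware depth of 2, the first two messages go straight to hardware and the next two wait in software. Each completion frees a slot, which the oldest waiting message then takes.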
