Cisco Blogs

Cisco ultra low latency support for MPI

November 16, 2012 - 2 Comments

My team demonstrated our new ultra-low latency Ethernet solution in the Cisco booth at SC this past week (it was so busy that I didn’t get to post this until it was all over!).

The short version is that we have implemented operating system bypass and NIC hardware offload via the Linux OpenFabrics verbs API stack. We call it “userspace NIC”, or “USNIC”.

But let’s cut to the chase — what’s the performance? Let’s break it down:

  • The test setup: Cisco M3 servers with Intel “Sandy Bridge” processors and Cisco 2nd-generation Virtual Interface Cards (VICs), each with 2x10Gbps ports.
  • With VICs hooked up back-to-back, the verbs half-round-trip ping pong latency is 1.7us.
  • Then you add in 190ns port-to-port latency from Cisco’s newest ultra-high performance switch, the Nexus 3548.
  • Then you add in 300-400ns latency for the prototype Open MPI plugin that we have written.
  • Total MPI HRT pingpong latency is therefore ~2.3us.
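The latency budget above can be checked with quick arithmetic. This is just a sketch adding up the figures from the bullets; the variable names are mine:

```python
# Latency components quoted in the post, in microseconds.
verbs_hrt_us = 1.7               # back-to-back VIC verbs ping-pong HRT
switch_us = 0.190                # Nexus 3548 port-to-port
mpi_plugin_us = (0.300, 0.400)   # prototype Open MPI plugin overhead range

# Sum each end of the plugin-overhead range.
total = tuple(verbs_hrt_us + switch_us + m for m in mpi_plugin_us)
print(f"Total MPI HRT latency: {total[0]:.2f}-{total[1]:.2f} us")
# prints "Total MPI HRT latency: 2.19-2.29 us" -- i.e., ~2.3us
```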

We expect to bring the MPI latency down a little further, but this is a good starting point.
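For readers unfamiliar with how these numbers are obtained: half-round-trip (HRT) ping-pong latency is measured by bouncing a tiny message back and forth many times and halving the average round-trip time. Here is a minimal sketch of that methodology using a localhost socket pair as a stand-in transport (real benchmarks use verbs or MPI, and loopback sockets will report far higher latencies than the NIC figures above):

```python
import socket
import threading
import time

ITERATIONS = 1000
MSG = b"x"  # 1-byte payload, as in a typical small-message latency test

def echo_server(listener):
    # Bounce every received byte straight back to the sender.
    conn, _ = listener.accept()
    with conn:
        for _ in range(ITERATIONS):
            conn.sendall(conn.recv(1))

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=echo_server, args=(listener,), daemon=True).start()

client = socket.create_connection(("127.0.0.1", port))
start = time.perf_counter()
for _ in range(ITERATIONS):
    client.sendall(MSG)  # ping
    client.recv(1)       # pong
elapsed = time.perf_counter() - start

# Average round trip, divided by 2, gives the half-round-trip latency.
hrt_us = elapsed / ITERATIONS / 2 * 1e6
print(f"HRT latency: {hrt_us:.1f} us")
```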

All the software above the VIC firmware will be open source. We still have some work to do to finish and clean up the code, but we hope to push it all upstream in 1H2013.

The fun part is that this is a 100% software upgrade to existing Cisco hardware. So even if you already have Cisco M3 servers with the 2nd generation VIC, you can upgrade once the software becomes available.



  1. That certainly is one way to look at it. 🙂

    It’s a bit more complicated than that, though. When we sold IB products (and I know this, because I was in the IB group), we re-sold Mellanox ASICs. This inherently limited the directions of our innovation. Plus, we only worked on the network side of HPC.

    But now we sell servers, and we control an entire solution from the BIOS to the network switches. This is a fundamentally different playground than designing around someone else’s network ASIC — there’s soooo much that we can do. Put it this way: low latency is the first (and probably most obvious) HPC-related feature that we could do. But if you think about it a little, the possibilities of what can be done are endless. You’ll see more from us in this area.

    Put differently: back in the IB days, we had many, many customers tell us, “There are a few things I like about IB. But I *really* like Ethernet. So take those few things that I like about IB and give me better Ethernet.” Ultra-low latency is just the latest feature toward the goal of delivering better Ethernet.

  2. So basically you’re at the performance point of the IB lineup you killed.
    But, since UCS looks so much more sexy, I’ll be raving about it anyway.