Cisco ultra low latency support for MPI
My team demonstrated our new ultra-low latency Ethernet solution in the Cisco booth at SC this past week (it was so busy that I didn’t get to post this until it was all over!).
The short version is that we have implemented operating system bypass and NIC hardware offload via the Linux OpenFabrics verbs API stack. We call it “userspace NIC”, or “USNIC”.
But let’s cut to the chase: what’s the performance? Let’s break it down:
- The test setup is Cisco M3 servers featuring the Intel “Sandy Bridge” processor series, using Cisco 2nd generation Virtual Interface Cards (VICs), each with 2x10Gbps ports.
- With the VICs hooked up back-to-back, the verbs half-round-trip ping-pong latency is 1.7us.
- Then you add in 190ns port-to-port latency from Cisco’s newest ultra-high performance switch, the Nexus 3548.
- Then you add in 300-400ns latency for the prototype Open MPI plugin that we have written.
- Total MPI half-round-trip ping-pong latency is therefore ~2.3us.
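If you want to check the arithmetic, here is a small Python sketch that simply adds up the components listed above. The variable and function names are mine, purely for illustration, and the only inputs are the numbers quoted in the list; it is a back-of-the-envelope sum, not a measurement.

```python
# Latency budget, in nanoseconds, using only the figures quoted above.
# All names here are illustrative, not part of any Cisco or Open MPI API.
VERBS_HRT_NS = 1700          # back-to-back verbs half-round-trip latency
SWITCH_NS = 190              # Nexus 3548 port-to-port latency
MPI_PLUGIN_NS = (300, 400)   # prototype Open MPI plugin overhead (low, high)


def total_hrt_us(plugin_ns):
    """Sum the components and convert nanoseconds to microseconds."""
    return (VERBS_HRT_NS + SWITCH_NS + plugin_ns) / 1000.0


low_us = total_hrt_us(MPI_PLUGIN_NS[0])
high_us = total_hrt_us(MPI_PLUGIN_NS[1])
print(f"MPI half-round-trip latency: {low_us:.2f}-{high_us:.2f} us")
```

The sum lands between 2.19us and 2.29us, which is where the ~2.3us figure comes from.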
We expect to bring the MPI latency down a little further, but this is a good starting point.
All the software above the VIC firmware will be open source. We still have some work to do to finish and clean up the code, but we hope to push it all upstream in the first half of 2013.
The fun part is that this is a 100% software upgrade to existing Cisco hardware. So if you already have Cisco M3 servers with the 2nd generation VIC, you can upgrade once the software becomes available.