Cisco Blog: High Performance Computing Networking

Why is SR-IOV relevant in the HPC world?

One feature of the usNIC ultra-low latency Ethernet solution for the Cisco UCS VIC that we find interesting is that it is based on SR-IOV.

What is SR-IOV, and why is it relevant in the HPC world?

SR-IOV (Single Root I/O Virtualization) is commonly used in the server virtualization world. Its most commonly described purpose in the hypervisor world is to allow a device partition, called a Virtual Function (VF), to be mapped into a guest operating system's address space. This lets the guest operating system enjoy higher I/O performance and lower CPU utilization than the alternative: the software-emulated devices traditionally implemented in hypervisors.
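
As a concrete (and deliberately simplified) illustration, on Linux a physical function advertises how many VFs it can expose through standard sysfs attributes, and writing to sriov_numvfs asks the kernel to create them. This is a minimal sketch, not usNIC-specific code; the PCI address below is a made-up example, and everything else follows the documented sysfs-bus-pci interface:

```c
/* Sketch: enabling SR-IOV Virtual Functions on a Linux physical
 * function via the standard sysfs interface (kernel 3.8+).
 * "0000:0b:00.0" is a hypothetical example PCI address -- substitute
 * your adapter's. */
#include <stdio.h>

int main(void)
{
    const char *totalvfs =
        "/sys/bus/pci/devices/0000:0b:00.0/sriov_totalvfs";
    const char *numvfs =
        "/sys/bus/pci/devices/0000:0b:00.0/sriov_numvfs";

    /* How many VFs can this physical function expose? */
    FILE *f = fopen(totalvfs, "r");
    if (!f) { perror("sriov_totalvfs"); return 1; }
    int total = 0;
    if (fscanf(f, "%d", &total) != 1) total = 0;
    fclose(f);
    printf("device supports up to %d VFs\n", total);

    /* Ask the kernel to create 4 VFs; each shows up as its own PCI
       function that can be mapped into a guest -- or, in the usNIC
       case, straight into a userspace MPI process. */
    f = fopen(numvfs, "w");
    if (!f) { perror("sriov_numvfs"); return 1; }
    fprintf(f, "%d\n", 4);
    fclose(f);
    return 0;
}
```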

Compared to the old world before hypervisors came along, that use of SR-IOV regains some of the performance lost to hypervisor software intervention in the I/O data path. But why should I care about SR-IOV in the world of my network-latency-bound HPC applications running on common operating systems on bare-metal servers?


Short message latency and NUMA effects

July 23, 2013 at 5:00 am PST

I’ve previously written a bunch about the effects of location, Location, LOCATION! on MPI applications.

Here’s another subtle NUMA effect that a well-tuned MPI implementation can hide from you: intelligently distributing traffic between multiple network interfaces.

Yeah, yeah, most MPI implementations have had so-called “multi-rail” support for a long time (i.e., using multiple network interfaces for MPI traffic).  But there’s more to it than that.
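
To make the idea concrete, here is a toy sketch (not any real MPI implementation's code) of the kind of decision a NUMA-aware MPI library makes on your behalf: given some notion of distance between NUMA nodes and NICs, prefer the closest NIC. The distance numbers below are invented; a real library would discover the topology via hwloc or sysfs:

```c
/* Toy sketch of NUMA-aware NIC selection.  The distance table is
 * hypothetical example data, not a real machine. */
#include <stdio.h>

#define NUM_NUMA_NODES 2
#define NUM_NICS       2

/* dist[node][nic]: relative cost of reaching each NIC from each
   NUMA node (made-up numbers for illustration). */
static const int dist[NUM_NUMA_NODES][NUM_NICS] = {
    { 10, 21 },   /* node 0: NIC 0 is local, NIC 1 is remote */
    { 21, 10 },   /* node 1: the reverse */
};

/* Pick the cheapest NIC for a process pinned to my_node. */
static int pick_nic(int my_node)
{
    int best = 0;
    for (int nic = 1; nic < NUM_NICS; ++nic)
        if (dist[my_node][nic] < dist[my_node][best])
            best = nic;
    return best;
}

int main(void)
{
    for (int node = 0; node < NUM_NUMA_NODES; ++node)
        printf("MPI process on NUMA node %d -> prefer NIC %d\n",
               node, pick_nic(node));
    return 0;
}
```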


Ultra low latency Ethernet (UCS “usNIC”): questions and answers

July 17, 2013 at 5:00 am PST

I have previously written a few details about our upcoming ultra-low latency solution for High Performance Computing (HPC).  Since my last blog post, a few of you have sent me emails asking for more technical details about it.

So let’s just put it all out there.


Why MPI is Good for You (part 3)

June 24, 2013 at 1:05 pm PST

I’ve previously posted on “Why MPI is Good for You” (blog tag: why-mpi-is-good-for-you).  The short version is that it hides lots and lots of underlying network stuff from the typical application programmer; stuff that they really, really don’t want to be involved in.

Here’s another case study…

Cisco’s upcoming ultra-low latency MPI transport is implemented over an “unreliable” transport: raw Ethernet L2 frames. For latency reasons, it’s using the OpenFabrics verbs operating-system bypass API. These two facts mean that a) userspace is directly talking to the NIC hardware, and b) we don’t have a driver thread running down in the kernel that can service incoming frames regardless of what the MPI application is doing.
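
Here is a minimal sketch of what such a userspace progress loop looks like with the verbs API. To be clear, this is not usNIC's actual code; it just shows the shape of the pattern: create a completion queue and poll it from userspace, because no kernel thread will do it for you. It assumes libibverbs is installed (compile with gcc file.c -libverbs):

```c
/* Sketch of a userspace progress loop over an OpenFabrics verbs
 * completion queue -- illustrative only. */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices = 0;
    struct ibv_device **devs = ibv_get_device_list(&num_devices);
    if (!devs || num_devices == 0) {
        fprintf(stderr, "no verbs-capable devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    if (!ctx) {
        fprintf(stderr, "ibv_open_device failed\n");
        return 1;
    }

    /* The NIC deposits work completions in this queue, entirely in
       userspace memory -- no kernel involvement on the fast path. */
    struct ibv_cq *cq = ibv_create_cq(ctx, 64, NULL, NULL, 0);
    if (!cq) {
        fprintf(stderr, "ibv_create_cq failed\n");
        return 1;
    }

    /* The "progress loop": with no driver thread in the kernel to
       service incoming frames, the MPI library itself must poll the
       CQ to notice completed sends and receives. */
    for (int i = 0; i < 1000000; ++i) {
        struct ibv_wc wc;
        int n = ibv_poll_cq(cq, 1, &wc);  /* non-blocking */
        if (n > 0 && wc.status == IBV_WC_SUCCESS) {
            /* hand the completion to the MPI progression engine */
        }
    }

    ibv_destroy_cq(cq);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```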


Ain’t your father’s TCP

February 15, 2013 at 5:00 am PST

TCP?  Who cares about TCP in HPC?

More and more people, actually.  With the commoditization of HPC, lots of newbie HPC users are intimidated by the special, one-off networks traditional in HPC and opt for the simplicity and universality of Ethernet.

And it turns out that TCP doesn’t suck nearly as much as most (HPC) people think, particularly on modern servers, Ethernet fabrics, and powerful Ethernet NICs.
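
One small, concrete example of why TCP latency can be better than its reputation: by default the kernel may coalesce small writes (Nagle's algorithm), which is terrible for short-message latency, and a single setsockopt turns that off. A minimal sketch, not tied to any particular MPI implementation:

```c
/* Sketch: one common low-latency TCP tweak -- disabling Nagle's
 * algorithm with TCP_NODELAY so small, MPI-sized messages go on the
 * wire immediately instead of being coalesced. */
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    /* Without this, the kernel may delay small writes while waiting
       to coalesce them -- disastrous for short-message latency. */
    int one = 1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
                   &one, sizeof(one)) < 0) {
        perror("setsockopt(TCP_NODELAY)");
        close(fd);
        return 1;
    }
    printf("TCP_NODELAY set; small sends will not be coalesced\n");
    close(fd);
    return 0;
}
```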
