Cisco Blog > High Performance Computing Networking

MPI + VFIO on InsideHPC slidecast

Welcome to 2012!  I’m finally about caught up from the Christmas holidays, last week’s travel to the MPI Forum, etc.  It’s time to finally get my blogging back on.

Let’s start with a short one…

Rich Brueckner from InsideHPC interviewed me right before the Christmas break about the low Ethernet MPI latency demo that I gave at SC’11.  I blogged about this stuff before, but in the slidecast that Rich posted, I provide a bit more detail about how this technology works.

Remember that this is Cisco’s 1st generation virtualized NIC; our 2nd generation is coming “soon,” and will have significantly lower MPI latency (I hate being fuzzy and not quoting the exact numbers, but the product is not yet released, so I can’t comment on it yet.  I’ll post the numbers when the product is actually available).


SC’11 Cisco booth demo: Open MPI over Linux VFIO

Linux VFIO (Virtual Function IO) is an emerging technology that allows direct access to PCI devices from userspace.  Although primarily designed as a hypervisor-bypass technology for virtualization uses, it can also be used in an HPC context.

Think of it this way: hypervisor bypass is somewhat similar to operating system (OS) bypass.  And OS bypass is a characteristic sought in many HPC low-latency networks these days.

Drop by the Cisco SC’11 booth (#1317), where we’ll be showing a technology preview demo of Open MPI utilizing Linux VFIO over the Cisco “Palo” family of first-generation hardware virtualized NICs (specifically, the P81E PCI form factor).  VFIO + hardware virtualized NICs enable benefits such as:

  • Low half round-trip (HRT) ping-pong MPI latency over Ethernet (4.88us), via direct access to L2 from userspace
  • Hardware steering of inbound and outbound traffic to individual MPI processes

Let’s dive into these technologies a bit and explain how they benefit MPI.



LISPmob, a new open source project for network mobility

What if your mobile device let you roam seamlessly across any network in the world, regardless of location or operator, while keeping the attributes you expect, such as security and privacy?  With LISPmob, we may have taken a giant step closer: we have open sourced a network stack for network mobility on Linux platforms, an implementation of basic LISP mobile node functionality.

LISP is the Locator/Identifier Separation Protocol, an IETF open standard that separates the IPv4 and IPv6 address space into endpoint identifiers and routing locators, using a network-based map-and-encapsulate scheme.

We hope this will become a project and a community that many will find not just interesting and vibrant, but necessary — and fun to engage with, collaborate on, and contribute to.

How will this help your plans to deal with all these amazing possibilities of mobile access to an ever-growing Internet?


More on Memory Affinity

There was a great comment chain on my prior post (“Unexpected Linux Memory Migration”), which brought out a number of good points.  Let me clarify a few things from my post:

  • My comments were definitely about HPC types of applications, which are admittedly a small subset of applications that run on Linux.  It is probably a fair statement to say that the OS’s treatment of memory affinity will be just fine for most (non-HPC) applications.
  • Note, however, that Microsoft Windows and Solaris do retain memory affinity information when pages are swapped out.  When the pages are swapped back in, if they were bound to a specific locality before swapping, they are restored to that same locality.  This is why I was a bit surprised by Linux’s behavior.
  • More specifically, Microsoft Windows and Solaris seem to treat memory locality as a binding decision — Linux treats it as a hint.
  • Many (most?) HPC applications are designed not to cause paging.  However, at least some do.  A side point of this blog is that HPC is becoming commoditized — not everyone is out at the bleeding edge (meaning: some people willingly violate the “do not page” HPC mantra and are willing to give up a little performance in exchange for the other benefits that swapping provides).

To be clear, Open MPI has a few cases where it has very specific memory affinity needs that almost certainly fall outside the realm of just about all OS’s default memory placement schemes.  My point is that other applications may also have similar requirements, particularly as core counts are going up, and therefore communication between threads / processes on different cores will become more common.



Unexpected Linux memory migration

I learned something disturbing earlier this week: if you allocate memory in Linux bound to a particular NUMA location and that memory is later paged out, it loses that binding when it is paged back in.

Core counts are going up, and server memory networks are getting more complex; we’re effectively increasing the NUMA-ness of memory.  The specific placement of your data in memory is becoming (much) more important; it’s all about location, Location, LOCATION!

But unless you are very, very careful, your data may not be in the location that you think it is — even if you thought you had bound it to a specific NUMA node.

