Cisco Blogs



Special RCE podcast: Fukushima reactor

March 26, 2011 at 8:00 am PST

Given the seriousness of the issues surrounding the Fukushima, Japan reactors, Brock and I decided to reach out through our HPC contacts to find some experts to discuss the situation.  We found Drs. Kim Kearfott and Mike Hartman (Dr. Hartman is one of Brock’s HPC users at UM); both are on the faculty of the nuclear engineering department at the University of Michigan.

Our conversation with the good Doctors provided a wealth of accurate, easy-to-understand information about what is — and what is not — concerning about Fukushima.

Most people forget that the “E” in “RCE” stands for engineering, so while this podcast topic is a bit outside our normal fare, it is actually within the original charter of the series.


Trust, but verify: good science

March 25, 2011 at 10:27 am PST

A recent exchange on the Open MPI users’ list turned up a minor bug in our code base.  The bug had to do with how Open MPI reported a settings value through our configuration querying tool (“ompi_info”).

The code using the configuration value in question was doing the Right Things, but the tool was effectively reporting the wrong value.  This led to some confusion on the mailing list, resulting in a bug fix being pushed upstream and the user concluding, “Trust, but verify.”

Very true!


More on Memory Affinity

March 18, 2011 at 7:35 am PST

There was a great comment chain on my prior post (“Unexpected Linux Memory Migration”), which brought out a number of good points.  Let me clarify a few things from my post:

  • My comments were definitely about HPC types of applications, which are admittedly a small subset of applications that run on Linux.  It is probably a fair statement to say that the OS’s treatment of memory affinity will be just fine for most (non-HPC) applications.
  • Note, however, that Microsoft Windows and Solaris do retain memory affinity information when pages are swapped out.  When the pages are swapped back in, if they were bound to a specific locality before swapping, they are restored to that same locality.  This is why I was a bit surprised by Linux’s behavior.
  • More specifically, Microsoft Windows and Solaris seem to treat memory locality as a binding decision — Linux treats it as a hint.
  • Many (most?) HPC applications are designed not to cause paging.  However, at least some do.  A side point of this blog is that HPC is becoming commoditized — not everyone is out at the bleeding edge (meaning: some people willingly violate the “do not page” HPC mantra and are willing to give up a little performance in exchange for the other benefits that swapping provides).

To be clear, Open MPI has a few cases where it has very specific memory affinity needs that almost certainly fall outside the realm of just about all OS’s default memory placement schemes.  My point is that other applications may also have similar requirements, particularly as core counts are going up, and therefore communication between threads / processes on different cores will become more common.


Open MPI v1.5.2 released

March 9, 2011 at 12:30 pm PST

We’re very pleased to release Open MPI version 1.5.2 today.  The v1.5 series is our “feature development” series, and this release includes lots of tasty new features; see the full announcement for details.

Here’s an abbreviated list of new features:

  • Now using Hardware Locality for affinity and topology information
  • Added ummunotify support for OpenFabrics-based transports.  See the README for more details.
  • Added the OMPI_Affinity_str() optional user-level API function (i.e., the “affinity” MPI extension).  See the Open MPI README for more details.
  • Added support for ARM architectures.
  • Updated ROMIO from MPICH v1.3.1 (plus one additional patch).
  • Updated the Voltaire FCA component with bug fixes and new functionality, including support for FCA version 2.1.
  • Added new “bfo” PML that provides failover on OpenFabrics networks.
  • Added the MPI_ROOT environment variable in the Open MPI Linux SRPM for customers who use the PBS and LSF batch managers.
  • Added Solaris-specific chip detection and performance improvements.
  • Added more FTB/CIFTS support.
  • Added the btl_tcp_if_seq MCA parameter to select a different Ethernet interface for each MPI process on a node.  This parameter is only useful with virtual Ethernet interfaces on a single network card (e.g., when the virtual interfaces give each process dedicated hardware resources on the NIC).
  • Added new mtl_mx_board and mtl_mx_endpoint MCA parameters.


Unexpected Linux memory migration

March 4, 2011 at 7:20 am PST

I learned something disturbing earlier this week: if you allocate memory in Linux to a particular NUMA location and then that memory is paged out, it will lose that memory binding when it is paged back in.

Yowza!

Core counts are going up, and server memory networks are getting more complex; we’re effectively increasing the NUMA-ness of memory.  The specific placement of your data in memory is becoming (much) more important; it’s all about location, Location, LOCATION!

But unless you are very, very careful, your data may not be in the location that you think it is — even if you thought you had bound it to a specific NUMA node.
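One way to check where your pages actually live is to read the per-process /proc/<pid>/numa_maps file that Linux provides on NUMA-enabled kernels (see the numa(7) man page).  The following is a minimal Python sketch, not a definitive tool: the function names are hypothetical, and it assumes the documented line layout of a start address, a policy string such as "bind:0", and then fields such as "N0=6" giving the number of pages resident on each node.

```python
import re

# Matches per-node page counts such as "N0=7" in a numa_maps line.
_NODE_PAGES = re.compile(r"^N(\d+)=(\d+)$")


def parse_numa_maps_line(line):
    """Split one numa_maps line into (start address, policy, {node: pages})."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least an address and a policy")
    pages_per_node = {}
    for field in fields[2:]:
        match = _NODE_PAGES.match(field)
        if match:
            pages_per_node[int(match.group(1))] = int(match.group(2))
    return int(fields[0], 16), fields[1], pages_per_node


def pages_off_node(line, expected_node):
    """Count resident pages in this mapping that are NOT on expected_node."""
    _, _, pages_per_node = parse_numa_maps_line(line)
    return sum(n for node, n in pages_per_node.items() if node != expected_node)


def report(path="/proc/self/numa_maps"):
    """Print policy and page placement for each mapping of this process."""
    try:
        with open(path) as handle:
            for line in handle:
                if not line.strip():
                    continue
                start, policy, pages = parse_numa_maps_line(line)
                if pages:
                    print(f"{start:#x} policy={policy} pages={pages}")
    except OSError:
        # Kernels built without NUMA support do not provide this file.
        print(f"{path} is not available on this system")


if __name__ == "__main__":
    report()
```

A mapping whose policy says "bind:0" but whose counts include pages on another node is exactly the surprise described above.  Note that the counts cover only pages that are currently resident, so the check is most informative after the memory has been touched again.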
