
Open MPI v1.5 processor affinity options

March 9, 2012 at 5:00 am PST

Today we feature a deep-dive guest post from Ralph Castain, Senior Architect in the Advanced R&D group at Greenplum, an EMC company.

Jeff is lazy this week, so he asked that I provide some notes on the process binding options available in the Open MPI (OMPI) v1.5 release series.

First, though, a caveat. The binding options in the v1.5 series are pretty much the same as in the prior v1.4 series. However, future releases (beginning with the v1.7 series) will have significantly different options providing a broader array of controls. I won’t address those here, but will do so in a later post.
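
To make this concrete, here is a quick C sketch (illustrative only, not taken from Ralph's post) in which each MPI rank prints the Linux CPUs it is allowed to run on; it is a handy way to eyeball the effect of v1.5-era mpirun options such as --bind-to-core, --bind-to-socket, and --report-bindings.

    /* Illustrative sketch: each rank reports its current Linux CPU affinity mask. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank;
        cpu_set_t mask;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Query the set of CPUs this process is currently bound to. */
        if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
            printf("Rank %d bound to CPUs:", rank);
            for (int i = 0; i < CPU_SETSIZE; ++i) {
                if (CPU_ISSET(i, &mask)) {
                    printf(" %d", i);
                }
            }
            printf("\n");
        }

        MPI_Finalize();
        return 0;
    }

Compile with mpicc and launch it with and without a binding option (for example, mpirun --bind-to-core -np 4 ./a.out) to see how the per-rank masks change.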

Read More »


Hardware Locality 1.2.1 and 1.3rc1 released

August 23, 2011 at 4:27 am PST

In the vein of awesome software releases (ahem…), Hardware Locality (hwloc) v1.2.1 has been released.  As the “.1” implies, this is a bug-fix release of a bunch of little things that crept into the 1.2 series.  A full list of the newsworthy items can be found here.

But more awesome than that is the fact that hwloc 1.3rc1 has also been released.  The hwloc 1.3 series brings in some major new features, which are listed below.
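
If you have not played with hwloc before, here is a tiny C sketch (my own illustration, not part of the release notes) showing the flavor of the 1.x API: load the topology of the machine you are running on and count its sockets, cores, and hardware threads (PUs).

    /* Illustrative sketch of the hwloc 1.x C API: enumerate the local topology. */
    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        hwloc_topology_t topology;

        /* Build a topology object describing the machine we are running on. */
        hwloc_topology_init(&topology);
        hwloc_topology_load(topology);

        printf("sockets: %d, cores: %d, PUs: %d\n",
               hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_SOCKET),
               hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE),
               hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU));

        hwloc_topology_destroy(topology);
        return 0;
    }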

Read More »


More on Memory Affinity

March 18, 2011 at 7:35 am PST

There was a great comment chain on my prior post (“Unexpected Linux Memory Migration”), which brought out a number of good points.  Let me clarify a few things from my post:

  • My comments were definitely about HPC types of applications, which are admittedly a small subset of applications that run on Linux.  It is probably a fair statement to say that the OS’s treatment of memory affinity will be just fine for most (non-HPC) applications.
  • Note, however, that Microsoft Windows and Solaris do retain memory affinity information when pages are swapped out.  When the pages are swapped back in, if they were bound to a specific locality before swapping, they are restored to that same locality.  This is why I was a bit surprised by Linux’s behavior.
  • More specifically, Microsoft Windows and Solaris seem to treat memory locality as a binding decision — Linux treats it as a hint.
  • Many (most?) HPC applications are designed not to cause paging.  However, at least some do.  A side point of this blog is that HPC is becoming commoditized — not everyone is out at the bleeding edge (meaning: some people willingly violate the “do not page” HPC mantra and are willing to give up a little performance in exchange for the other benefits that swapping provides).

To be clear, Open MPI has a few cases where it has very specific memory affinity needs that almost certainly fall outside the realm of just about all OSes’ default memory placement schemes.  My point is that other applications may also have similar requirements, particularly as core counts go up and communication between threads / processes on different cores therefore becomes more common.
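
For reference, this is roughly what requesting a specific placement looks like on Linux with libnuma (an illustrative sketch, not Open MPI's actual code; link with -lnuma).  As discussed above, the placement behaves more like a hint than a hard binding once those pages have been swapped out and back in.

    /* Illustrative sketch: allocate a buffer with an explicit NUMA placement. */
    #include <stdio.h>
    #include <string.h>
    #include <numa.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "NUMA is not available on this system\n");
            return 1;
        }

        size_t len = 64 * 1024 * 1024;   /* 64 MB */
        int node = 0;                    /* arbitrary choice: NUMA node 0 */

        /* Ask for the buffer's pages to be allocated on a specific node. */
        void *buf = numa_alloc_onnode(len, node);
        if (buf == NULL) {
            return 1;
        }
        memset(buf, 0, len);             /* touch the pages so they are really allocated */

        /* ... use the buffer ... */

        numa_free(buf, len);
        return 0;
    }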

Read More »


Unexpected Linux memory migration

March 4, 2011 at 7:20 am PST

I learned something disturbing earlier this week: if you allocate memory in Linux to a particular NUMA location and then that memory is paged out, it will lose that memory binding when it is paged back in.

Yowza!

Core counts are going up, and server memory networks are getting more complex; we’re effectively increasing the NUMA-ness of memory.  The specific placement of your data in memory is becoming (much) more important; it’s all about location, Location, LOCATION!

But unless you are very, very careful, your data may not be in the location that you think it is — even if you thought you had bound it to a specific NUMA node.
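
One way to check is to ask the kernel directly.  The following C sketch (illustrative, and assuming a NUMA-aware Linux kernel with libnuma's headers installed; link with -lnuma) uses move_pages() in query mode to report which node a page currently lives on.

    /* Illustrative sketch: query the NUMA node that a page currently resides on. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <numaif.h>

    int main(void)
    {
        long pagesize = sysconf(_SC_PAGESIZE);
        void *page = NULL;

        /* Allocate one page-aligned page and touch it so it is really there. */
        if (posix_memalign(&page, pagesize, pagesize) != 0) {
            return 1;
        }
        *(volatile char *)page = 1;

        void *pages[1]  = { page };
        int   status[1] = { -1 };

        /* With nodes == NULL, move_pages() only reports the node of each page. */
        if (move_pages(0, 1, pages, NULL, status, 0) == 0) {
            printf("page currently resides on NUMA node %d\n", status[0]);
        }

        free(page);
        return 0;
    }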

Read More »


Sockets, cores, and hyperthreads… oh my!

October 15, 2010 at 5:00 am PST

Core counts are going up.  Cisco’s C460 rack-mount server series, for example, can have up to 32 Nehalem EX cores.  As a direct result, we may well be returning to the era of running more than one MPI job per server.  This has long been true in “big iron” parallel resources, but commodity Linux HPC clusters have tended towards the one-MPI-job-per-server model in recent history.

Because of this trend, I have an open-ended question for MPI users and cluster administrators: how do you want to bind MPI processes to processors?  For example: what kinds of binding patterns do you want?  How many hyperthreads / cores / sockets do you want each process to bind to?  How do you want to specify what process binds where?  What level of granularity of control do you want / need?  (…and so on)

We are finding that every user we ask seems to have slightly different answers.  What do you think?  Let me know in the comments below.
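
As one concrete point of reference for the discussion, here is a short C sketch (purely illustrative, and not how any particular MPI implements it) that binds the calling process to a single core via hwloc; the interesting policy questions above are about which core, how many, and who gets to decide.

    /* Illustrative sketch: bind the calling process to the first core via hwloc. */
    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        hwloc_topology_t topology;
        hwloc_topology_init(&topology);
        hwloc_topology_load(topology);

        /* Pick the first core (index 0 is an arbitrary choice) and bind to its CPUs. */
        hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 0);
        if (core != NULL) {
            if (hwloc_set_cpubind(topology, core->cpuset, HWLOC_CPUBIND_PROCESS) != 0) {
                perror("hwloc_set_cpubind");
            }
        }

        hwloc_topology_destroy(topology);
        return 0;
    }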

Read More »
