Cisco Blogs



Open MPI v1.5.2 released

March 9, 2011 at 12:30 pm PST

We’re very pleased to release Open MPI version 1.5.2 today.  The v1.5 series is our “feature development” series, and this release includes lots of tasty new features; see the full announcement here.

Here’s an abbreviated list of new features:

  • Now using Hardware Locality for affinity and topology information
  • Added ummunotify support for OpenFabrics-based transports.  See the README for more details.
  • Added the OMPI_Affinity_str() optional user-level API function (i.e., the “affinity” MPI extension).  See the Open MPI README for more details.
  • Added support for ARM architectures.
  • Updated ROMIO from MPICH v1.3.1 (plus one additional patch).
  • Updated the Voltaire FCA component with bug fixes, new functionality.  Support for FCA version 2.1.
  • Added new “bfo” PML that provides failover on OpenFabrics networks.
  • Added the MPI_ROOT environment variable in the Open MPI Linux SRPM for customers who use the BPS and LSF batch managers.
  • Added Solaris-specific chip detection and performance improvements.
  • Added more FTB/CIFTS support.
  • Added the btl_tcp_if_seq MCA parameter to select a different Ethernet interface for each MPI process on a node.  This parameter is only useful with virtual Ethernet interfaces on a single network card (e.g., when the virtual interfaces give each process dedicated hardware resources on the NIC).
  • Added new mtl_mx_board and mtl_mx_endpoint MCA parameters.
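
As a rough illustration of the btl_tcp_if_seq idea above, here is a small Python model of "a different interface for each process on a node". It assumes the parameter value is a comma-separated list of interface names and that each process picks an entry by its node-local rank, wrapping around when there are more processes than interfaces. The function name pick_interface and the interface names are made up for this example; this is a sketch of the concept, not Open MPI's code.

```python
def pick_interface(if_seq, local_rank):
    """Return the interface name a process would use, given its node-local rank.

    if_seq is a comma-separated list such as "eth0,eth1" (hypothetical names).
    """
    interfaces = [name.strip() for name in if_seq.split(",") if name.strip()]
    if not interfaces:
        raise ValueError("interface list is empty")
    if local_rank < 0:
        raise ValueError("local_rank must be non-negative")
    # Assumption: round-robin selection by node-local rank.
    return interfaces[local_rank % len(interfaces)]


if __name__ == "__main__":
    # Four processes on one node sharing three virtual interfaces.
    for rank in range(4):
        print(rank, pick_interface("eth0,eth1,eth2", rank))
```

With three interfaces and four processes, the fourth process wraps around to the first interface, which is why the parameter only pays off when each interface maps to its own hardware resources.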


Unexpected Linux memory migration

March 4, 2011 at 7:20 am PST

I learned something disturbing earlier this week: if you bind memory in Linux to a particular NUMA location and that memory is then paged out, it loses that binding when it is paged back in.

Yowza!

Core counts are going up, and server memory networks are getting more complex; we’re effectively increasing the NUMA-ness of memory.  The specific placement of your data in memory is becoming (much) more important; it’s all about location, Location, LOCATION!

But unless you are very, very careful, your data may not be in the location that you think it is — even if you thought you had bound it to a specific NUMA node.
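
One way to check where your pages actually live is to look at /proc/<pid>/numa_maps on Linux, which reports per-mapping page counts in fields of the form N<node>=<pages>. The Python sketch below parses one such line and counts the pages that sit on a node other than the one you expected. The helper names pages_per_node and misplaced_pages are invented for this example, and the sample line is illustrative rather than captured from a real machine.

```python
import re

# Fields such as "N0=512" give the number of pages resident on NUMA node 0.
NODE_FIELD = re.compile(r"^N(\d+)=(\d+)$")


def pages_per_node(numa_maps_line):
    """Map NUMA node number -> page count for one numa_maps line."""
    counts = {}
    for field in numa_maps_line.split():
        match = NODE_FIELD.match(field)
        if match:
            counts[int(match.group(1))] = int(match.group(2))
    return counts


def misplaced_pages(numa_maps_line, expected_node):
    """Count pages reported on any node other than expected_node."""
    counts = pages_per_node(numa_maps_line)
    return sum(n for node, n in counts.items() if node != expected_node)


if __name__ == "__main__":
    # Illustrative line: a mapping with a bind policy for node 0 whose pages
    # are nonetheless split across nodes 0 and 1.
    sample = "7f0a1c000000 bind:0 anon=1024 dirty=1024 N0=512 N1=512 kernelpagesize_kB=4"
    print(pages_per_node(sample))
    print(misplaced_pages(sample, expected_node=0))
```

A non-zero result for a mapping you thought was bound to one node is a hint that the data is not where you think it is.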


Community-contributed Perl and Python bindings for hwloc

January 22, 2011 at 7:30 am PST

I love open source communities.

Two hwloc community members have taken it upon themselves to provide high-quality native language bindings for Perl and Python.  There’s active work going on, and discussions occurring between the hwloc core developers and these language providers in order to provide good abstractions, functionality, and experience.

  • The Perl CPAN module is being developed by Bernd Kallies: you can download it here (I linked to the directory rather than a specific tarball because he keeps putting up new versions).
  • The Python bindings are being developed by Guy Streeter (at Red Hat); his git repository is available here.


Hardware Locality (hwloc) v1.1 released

December 16, 2010 at 4:20 pm PST

I’m very pleased to announce that we just released Hardware Locality (hwloc) version 1.1.  Woo hoo!

There’s bunches of new stuff in hwloc 1.1:

  • A memory binding interface is the Big New Feature.  It’s available in both the C API and via command line options to tools such as hwloc-bind.
  • We improved lstopo’s logical vs. physical ID numbering.  Logical numbers are now all prefixed with “L#”; physical numbers are prefixed with “P#”.  That’s that, then.
  • “cpusets” are now “bitmaps”, and now have no maximum size; they’re dynamically allocated (especially for machines with huge core counts).
  • Arbitrary key=value caching is available on all objects.
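
The cpuset-to-bitmap change can be pictured with a tiny Python model. This is not the hwloc API: the class name GrowableBitmap and its methods are hypothetical. It uses a Python integer, which grows as needed, to show why a set of processor indexes does not need a compile-time maximum size.

```python
class GrowableBitmap:
    """A set of non-negative indexes with no fixed upper bound (toy model)."""

    def __init__(self):
        self._bits = 0  # Python ints are arbitrary precision

    def set(self, index):
        if index < 0:
            raise ValueError("index must be non-negative")
        self._bits |= 1 << index

    def is_set(self, index):
        return (self._bits >> index) & 1 == 1

    def indexes(self):
        """Return the set indexes in ascending order."""
        return [i for i in range(self._bits.bit_length()) if (self._bits >> i) & 1]

    def weight(self):
        """Return how many indexes are set."""
        return bin(self._bits).count("1")


if __name__ == "__main__":
    cpus = GrowableBitmap()
    cpus.set(3)
    cpus.set(5000)  # far beyond any small fixed-size mask
    print(cpus.indexes(), cpus.weight())
```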

…more after the break.


hwloc 1.0 released!

May 18, 2010 at 12:00 pm PST

At long last, we have released a stable, production-quality version of Hardware Locality (hwloc).  Yay!

If you’ve missed all my prior discussions about hwloc, hwloc provides command line tools and a C API to obtain the hierarchical map of key computing elements, such as: NUMA memory nodes, shared caches, processor sockets, processor cores, and processing units (logical processors or “threads”). hwloc also gathers various attributes such as cache and memory information, and is portable across a variety of different operating systems and platforms.
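
To make the "hierarchical map" idea concrete, here is a minimal Python sketch of such a tree and a walk over it. It does not use hwloc or its bindings: the TopoObj class, the kind strings, and the example machine (2 NUMA nodes, each with one socket, one shared cache, 2 cores, and 2 processing units per core) are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TopoObj:
    """One node in a toy topology tree (hypothetical model, not hwloc's)."""
    kind: str
    children: List["TopoObj"] = field(default_factory=list)


def count_kind(root, kind):
    """Count objects of the given kind in the subtree rooted at root."""
    total = 1 if root.kind == kind else 0
    for child in root.children:
        total += count_kind(child, kind)
    return total


def build_example():
    """Build the made-up machine described in the text above."""
    def core():
        return TopoObj("Core", [TopoObj("PU"), TopoObj("PU")])

    def numa_node():
        cache = TopoObj("Cache", [core(), core()])
        return TopoObj("NUMANode", [TopoObj("Socket", [cache])])

    return TopoObj("Machine", [numa_node(), numa_node()])


if __name__ == "__main__":
    machine = build_example()
    for kind in ("NUMANode", "Socket", "Cache", "Core", "PU"):
        print(kind, count_kind(machine, kind))
```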

In an increasingly NUMA (and NUNA!) world, hwloc is a valuable tool for high performance.
