Cisco Blogs


Cisco Blog > High Performance Computing Networking

hwloc 1.0 released!

May 18, 2010 at 12:00 pm PST

At long last, we have released a stable, production-quality version of Hardware Locality (hwloc).  Yay!

In case you’ve missed all my prior discussions about it: hwloc provides command line tools and a C API to obtain the hierarchical map of key computing elements, such as NUMA memory nodes, shared caches, processor sockets, processor cores, and processing units (logical processors or “threads”). hwloc also gathers various attributes such as cache and memory information, and is portable across a variety of different operating systems and platforms.
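To make the "hierarchical map" idea concrete, here is a minimal toy sketch in Python. It is not hwloc's API (hwloc itself is a C library): the `TopoObj` class, the `count_by_kind` and `build_example` helpers, and the machine layout (2 NUMA nodes, 1 socket each, 2 cores per socket, 2 processing units per core) are all invented for illustration. It only shows the general shape of a topology tree and one kind of question you might ask of it.

```python
from dataclasses import dataclass, field


@dataclass
class TopoObj:
    """One node in a toy topology tree (hypothetical type, not hwloc's)."""
    kind: str  # e.g. "machine", "numa", "socket", "cache", "core", "pu"
    children: list = field(default_factory=list)


def count_by_kind(root, kind):
    """Count the objects of the given kind at or below root."""
    total = 1 if root.kind == kind else 0
    for child in root.children:
        total += count_by_kind(child, kind)
    return total


def build_example():
    """Build an invented machine: 2 NUMA nodes x 1 socket x 2 cores x 2 PUs."""
    machine = TopoObj("machine")
    for _ in range(2):
        numa = TopoObj("numa")
        socket = TopoObj("socket")
        cache = TopoObj("cache")  # assumed shared by both cores in the socket
        for _ in range(2):
            core = TopoObj("core")
            core.children = [TopoObj("pu"), TopoObj("pu")]
            cache.children.append(core)
        socket.children.append(cache)
        numa.children.append(socket)
        machine.children.append(numa)
    return machine


topo = build_example()
print(count_by_kind(topo, "core"), "cores,", count_by_kind(topo, "pu"), "PUs")
```

A real tool would discover this tree from the operating system rather than hard-code it; the point of the sketch is only that once the hardware is represented as a tree, questions such as "how many cores share this cache?" become simple traversals.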

In an increasingly NUMA (and NUNA!) world, hwloc is a valuable tool for achieving high performance.


hwloc hits 1.0rc1

April 17, 2010 at 12:00 pm PST

Woo hoo!  The portable hardware locality project (hwloc) has finally hit release candidate status.  Much has changed since the v0.9 series, all of it for the better.  There’s an impressive array of features and other goodness contained in the upcoming v1.0 release (if I do say so myself — although the INRIA guys did most of the heavy lifting).  Check out the release announcement, or read below the jump for an abbreviated list of the new stuff.

I don’t normally make a hoopla over release candidates, but we’d actually like people to give this stuff a whirl before it hits v1.0 so that we can iron out any kinks.

And if you’re wondering why a high-performance networking blog cares about a server-side software project that appears to have nothing to do with networking, read some of my prior posts.  Short version: this stuff already somewhat matters for networking performance.  It’s going to matter (much) more as time goes on.


SGE debuts topology-aware scheduling

January 23, 2010 at 12:00 pm PST

I just ran across a great blog entry about SGE debuting topology-aware scheduling.  Dan Templeton does a great job of describing the need for processor topology-aware job scheduling within a server.  Many MPI jobs fit exactly within his description of applications that have “serious resource needs”: they typically require lots of CPU and/or network (or other I/O).  Hence, scheduling an MPI job intelligently, not only across the network between servers but also across the network and other resources inside each server, is pretty darn important.  It’s all about location, location, location!

Particularly as core counts in individual servers are going up.

Particularly as networks get more complicated inside individual servers. 

Particularly if heterogeneous computing inside a single server becomes popular.

Particularly as resources are now pretty much guaranteed to be non-uniform within an individual server.

These are exactly the reasons that, even though I’m a network middleware developer, I spend time with server-specific projects like hwloc — you really have to take a holistic approach in order to maximize performance.
