Traffic. It’s a funny thing. On my daily drive to work, I see (what appear to be) oddities and contradictions frequently. For example, although the lanes on my side of the highway are running fast and clear, the other side is all jammed up. But a half mile later, the other side is running fast and clear, and my lanes have been reduced to half-speed. A short distance further, I’m zipping along again at 55mph (ahem).
Sometimes the reasons behind traffic congestion are obvious. For example, when you drive through a busy interchange, it’s easy to understand how lots of vehicles entering and exiting the roadway can force you to slow down. But sometimes the traffic flow issues are quite subtle; congestion may be caused by a non-obvious confluence of second- and third-order effects.
The parallels from highway traffic to networking are quite obvious, but the analogy can go much deeper when you consider that modern computational clusters span multiple different networks — we’re entering an era of Non-Uniform Network Architectures (NUNAs).
Read More »
Tags: HPC, mpi, NUMA, NUNA, process affinity, traffic
I just ran across a great blog entry about SGE debuting topology-aware scheduling. Dan Templeton does a great job of describing the need for processor topology-aware job scheduling within a server. Many MPI jobs fit exactly within his description of applications that have “serious resource needs” — they typically require lots of CPU and/or network (or other I/O). Hence, scheduling an MPI job intelligently across not only the network, but also across the network and resources inside the server, is pretty darn important. It’s all about location, location, location!
Particularly as core counts in individual server are going up.
Particularly as networks get more complicated inside individual servers.
Particularly if heterogeneous computing inside a single server becomes popular.
Particularly as resources are now pretty much guaranteed to be non-uniform within an individual server.
These are exactly the reasons that, even though I’m a network middleware developer, I spend time with server-specific projects like hwloc — you really have to take a holistic approach in order to maximize performance.
Read More »
Tags: HPC, hwloc, mpi, NUMA, NUNA
Everything old is new again — NUMA is back!
With NUMA going mainstream, high performance software — MPI applications and otherwise — might need to be re-tuned to maintain their current performance levels.
A less-acknowledged aspect of HPC systems is the multiple levels of networks that are traversed to get data from MPI process A to MPI process B. The heterogeneous, multi-level network is going to become more important (again) in your applications’ overall performance, especially as per-compute-server-core-counts increase.
That is, it’s not going to only be about the bandwidth and latency of your “Ethermyriband” network. It’s also going to be about the network (or networks!) inside each compute server.
A Cisco colleague of mine (hi Ted!) previously coined a term that is quite apropos for what HPC applications now need to target: it’s no longer just about NUMA — NUMA effects are only one of the networks involved.
Think bigger: the issue is really about Non-Uniform Network Access (NUNA). Read More »
Tags: HPC, mpi, NUMA, NUNA, process affinity