In part 1 of this series, I discussed various peer-wise technologies and techniques that MPI implementations typically use for communication / computation overlap. MPI-3.0, published in 2012, forced a change in the overlap game. Specifically: most
I’ve mentioned computation / communication overlap before (e.g., here, here, and here). Various types of networks and NICs have long-since had some form of overlap. Some had better quality overlap than others, from an HPC perspective. But with
A few months ago, I posted an entry entitled “HPC in L3“. My only point for that entry was to remove the “HPC in L3? That’s a terrible idea!” knee-jerk reaction that us old-timer HPC types have. I mention this because we
Most people immediately think of short message latency, or perhaps large message bandwidth when thinking about MPI. But have you ever thought about what your MPI implementation has to do before your application even calls MPI_INIT? Hint: it’s
I periodically write about network traffic, and how general / datacenter network traffic analysis is related to MPI / HPC. In my last entry, I mentioned how network traffic has many characteristics in common with distributed computing. Routing
I’ve written about network traffic before (see this post and this post). It’s the subject of endless blog posts, help forums, and instructional guides across the internet. In a High Performance Computing (HPC) context, there are some
Jeff Hammond has recently started developing the BigMPI library. BigMPI is intended to handle all the drudgery of sending and receiving large messages in MPI. In Jeff’s own words: [BigMPI is an] Interface to MPI for large messages, i.e. those
It seems like we’ve gotten a rash of “how do I setup my new cluster for MPI?” questions on the Open MPI mailing list recently. I take this as a Very Good Thing, actually — it means more and more people are tinkering with and