Articles
EuroMPI’13 Cisco slides: Open MPI Process Affinity User Interface
1 min read
The slides below are from my presentation at EuroMPI’13 about Open MPI’s flexible process affinity interface (in OMPI 1.7.2 and later). I described this system in prior blog entries (one, two, three), but many people keep asking me about it. Josh Hursey from U. Wisconsin, LaCrosse, wrote this IMUDI paper about the interface (IMUDI […]
EuroMPI’13 Cisco slides: UCS, Nexus, usNIC
1 min read
A few people asked me to post the slides that I just presented in the Cisco vendor session at EuroMPI’13. In short, I gave a brief overview of our servers and switches, and then some technical details of how we use SR-IOV in our usNIC, etc. Here are the slides:
MPI newbie: Building MPI applications
4 min read
In a previous post, I gave some (very) general requirements for how to set up / install an MPI implementation. This is post #2 in the series: now that you’ve got a shiny new computational cluster, and you’ve got one or more MPI implementations installed, I’ll talk about how to build, compile, and link applications that […]
MPI newbie: Requirements and installation of an MPI
4 min read
I often get questions from those who are just starting with MPI; they want to know common things such as: how to install / set up an MPI implementation; how to compile their MPI applications; how to run their MPI applications; how to learn more about MPI. This will be the first blog entry of several […]
Ultra low latency Ethernet (UCS “usNIC”): questions and answers
4 min read
I have previously written a few details about our upcoming ultra low latency solution for High Performance Computing (HPC). Since my last blog post, a few of you sent me emails asking for more technical details about it. So let’s just put it all out there.
Short message latency and NUMA effects
2 min read
I’ve previously written a bunch about the effects of location, Location, LOCATION! on MPI applications. Here’s another subtle NUMA effect that a well-tuned MPI implementation can hide from you: intelligently distributing traffic between multiple network interfaces. Yeah, yeah, most MPI implementations have had so-called “multi-rail” support for a long time (i.e., using multiple network interfaces […]
How many network links do you have for MPI traffic?
2 min read
If you’re a bargain basement HPC user, you might well scoff at the idea of having more than one network interface for your MPI traffic. “I’ve got (insert your favorite high bandwidth network name here)! That’s plenty to serve all my cores! Why would I need more than that?” I can think of (at least) […]
Open MPI and the MPI-3 MPI_T interface
3 min read
Open MPI recently revamped its entire run-time parameter system (a.k.a., “MCA parameter system”) as part of its implementation effort for the “MPI_T” interface from MPI-3. The MPI_T interface is a standardized interface designed for MPI tools, but can be used by regular MPI application programs, too. Specifically, MPI_T provides programmatic access to two types of […]
Why MPI is Good for You (part 3)
2 min read
I’ve previously posted on “Why MPI is Good for You” (blog tag: why-mpi-is-good-for-you). The short version is that it hides the typical application programmer from lots and lots of underlying network stuff; stuff that they really, really don’t want to be involved in. Here’s another case study… Cisco’s upcoming ultra-low latency MPI transport is implemented […]
The History and Development of the MPI standard
1 min read
Today’s guest posting comes from Jesper Larsson Träff; he’s in the Research Group for Parallel Computing, Institute of Information Systems, Faculty of Informatics, at the Vienna University of Technology (TU Wien). Have you ever wondered why MPI is designed the way that it is? The slides below are from Jesper’s talk about the History and Development of […]
MPI Quiz
1 min read
A fun scenario was proposed in the MPI Forum today. What do you think this code will do?

    MPI_Comm comm, save;
    MPI_Request req;
    MPI_Init(NULL, NULL);
    MPI_Comm_dup(MPI_COMM_WORLD, &comm);
    MPI_Comm_rank(comm, &rank);
    save = comm;
    MPI_Isend(smsg, 4194304, MPI_CHAR, rank, 123, comm, &req);
    MPI_Comm_free(&comm);
    MPI_Recv(rmsg, 4194304, MPI_CHAR, rank, 123, save, MPI_STATUS_IGNORE);
Cisco’s Philosophy on Open Source
1 min read
Last weekend, I was fortunate enough to be able to attend the Midwest Open Source Software Conference (MOSSCon 2013). I met some fascinating people, listened to some great talks, and learned a bunch of new things. All in all, a win. I also presented a talk on two things: The general open source philosophy at […]
Speaking about Open MPI / FOSS at Midwest Open Source Convention this weekend
1 min read
I’ve been a bit remiss about posting recently; it’s conference-paper-writing season, folks. Sorry! But I thought I’d mention that I’ll be speaking at the Midwest Open Source Software Conference (MOSSCon) this weekend. I’ll be talking about my work in Open MPI, Hardware Locality (hwloc), and other open source projects, as well as Cisco’s role […]
New Addition to the Cisco MPI Team
1 min read
I’m very pleased to welcome a new member to the Cisco USNIC/MPI Team: Dave Goodell. Welcome, Dave! (Today was his first day.) Dave joins us from the MPICH team in the Mathematics and Computer Science division at Argonne National Laboratory.
Presenting Open MPI, USNIC, and Cisco open source at MOSSCon’13
1 min read
I was just recently informed that my talk was accepted at the Midwest Open Source Software Conference (MOSSCon). w00t! MOSSCon will be held at the University of Louisville, in Louisville, Kentucky, USA, on May 18-19, 2013. It’s being organized by people from the Kentucky Open Source Society (KYOSS) and other open source / maker-oriented groups […]
Latency Analogies (part 2)
2 min read
In a prior blog post, I talked about latency analogies. I compared levels of latencies to your home, your neighborhood, a far-away neighborhood, and another city. I talked about these localities in terms of communication. Let’s extend that analogy to talk about data locality.
Latency Analogies
1 min read
Multiple readers have told me that it is difficult for them to understand and/or visualize the effects of latency on their HPC applications, particularly in modern NUMA (non-uniform memory access) and NUNA (non-uniform network access) environments. Let’s break down the different levels of latency in typical modern server and network computing environments.
Social media login no longer required for comments
1 min read
A number of you complained when blogs.cisco.com switched to requiring a social media login to leave comments. It turns out that you were not alone. Industry-wide, it seems that many people do not want to associate their personal Facebook/Twitter/etc. logins with work-related social media (i.e., this effect was seen at more than just Cisco). The […]
EuroMPI 2013: papers due soon!
1 min read
Consider this a public service announcement: don’t forget that EuroMPI 2013 papers are due soon! EuroMPI is the place to see where the documented standard of MPI hits reality, both in terms of implementations and applications. Come talk to real implementors, real users, and hear about state-of-the-art techniques and performance optimizations.
MPI for mobile devices (or not)
2 min read
Every once in a while, the idea pops up again: why not use all the world’s cell phones for parallel and/or distributed computations? There’s gazillions of these phones — think of the computing power! After all, an army of ants can defeat a war horse, right? Well… yes… and no.
MPI Forum: What’s Next?
Now that we’re just starting into the MPI-3.0 era, what’s next? The MPI Forum is still having active meetings. What is left to do? Isn’t MPI “done”? Nope. MPI is an ever-changing standard to meet the needs of HPC. And since HPC keeps changing, so does MPI.
Ain’t your father’s TCP
TCP? Who cares about TCP in HPC? More and more people, actually. With the commoditization of HPC, lots of newbie HPC users are intimidated by special, one-off, traditional HPC types of networks and opt for the simplicity and universality of Ethernet. And it turns out that TCP doesn’t suck nearly as much as most (HPC) […]
Modern GPU Integration in MPI
Today’s guest post is from Rolf vandeVaart, a Senior CUDA Engineer with NVIDIA. GPUs are becoming quite popular as accelerators in High Performance Computing clusters. For example, check out Titan, a recent entry into the Top 500 list from Oak Ridge National Laboratory. Titan has 18,688 nodes (299,008 CPU cores) coupled with 18,688 NVIDIA Tesla K20 […]
Process and memory affinity: why do you care?
3 min read
I’ve written about NUMA effects and process affinity on this blog lots of times in the past. It’s a complex topic that has a lot of real-world effects on your MPI and HPC applications. If you’re not using processor and memory affinity, you’re likely experiencing performance degradation without even realizing it. In short: If you’re not booting […]
Message size: big or small?
3 min read
It’s the eternal question: should I send lots and lots of small messages, or should I glump multiple small messages into a single, bigger message? Unfortunately, the answer is: it depends. There are a lot of factors in play.
I CAN HAS MPI
2 min read
The Cisco and Microsoft joint Cross-Animal Technology Project, a well-established player in the field of multi-species collaborative initiatives, is pleased to introduce its next project: a revolution in High Performance Computing (HPC): LOLCODE language bindings for the Message Passing Interface (MPI). CATP believes that cats are natural predatory programmers. Who better to take advantage of all […]
MPI and Java: redux
1 min read
In a prior blog entry, I discussed how we are resurrecting a Java interface for MPI in the upcoming v1.7 release of Open MPI. Some users have already experimented with this interface and found it lacking, in at least two ways: Creating datatypes of multi-dimensional arrays doesn’t work because of how Java handles them internally […]
MPI_REQUEST_FREE is Evil
2 min read
It was pointed out to me that in my last blog post (Don’t leak MPI_Requests), I failed to mention the MPI_REQUEST_FREE function. True enough — I did fail to mention it. But I did so on purpose, because MPI_REQUEST_FREE is evil. Let me explain…
Don’t leak MPI_Requests
1 min read
With the Mayan apocalypse safely behind us, we can now safely discuss MPI again. An MPI application developer came to me the other day with a potential bug in Open MPI: he noticed that Open MPI was consuming vast amounts of memory such that trying to allocate memory from his application failed. Ouch! It turns out, […]
McMPI
3 min read
Today’s guest blog entry comes from Daniel Holmes, an Applications Developer at the EPCC. I met Jeff at EuroMPI in September, and he has invited me to write a few words on my experience of developing an MPI library. My PhD involved building a message passing library using C#; not accessing an existing MPI library […]
EuroMPI 2013: CFP
1 min read
It’s that time of year again: time to start preparing for EuroMPI 2013! Next year, we’ll be heading to Madrid, Spain, September 15-18. Here’s a snippet from the call for papers: Topics of interest include, but are not limited to: MPI implementation issues and improvements; Extensions to and shortcomings of MPI; Tools […]
Cisco ultra low latency support for MPI
1 min read
My team demonstrated our new ultra-low latency Ethernet solution in the Cisco booth at SC this past week (it was so busy that I didn’t get to post this until it was all over!). The short version is that we have implemented operating system bypass and NIC hardware offload via the Linux OpenFabrics verbs API […]
MPICH 3.0 RC released
1 min read
The MPICH folks have released a release candidate for MPICH 3.0: A new preview release of MPICH, 3.0rc1, is now available for download. The primary focus of this release is to provide full support for the MPI-3 standard. Other smaller features including support for ARM v7 native atomics are also included.
MPI-3 standard available in hardcover
1 min read
The MPI-3.0 standard is now available in hardcover (it’s green!). The book is available at cost from Dr. Rolf Rabenseifner at HLRS; no profit is being made by these sales. Here’s an excerpt from Rolf’s original announcement: As a service (at costs) for users of the Message Passing Interface, HLRS has printed the new Standard, […]
Cisco @SC2012
1 min read
Going to Salt Lake City for Supercomputing 2012 next week? So are we! Be sure to drop by and see us in the Cisco booth (#2517). I’ll be there, demonstrating and talking about our latest developments in ultra low latency Ethernet (hint: it includes 250ns port-to-port Ethernet switch latency and our latest MPI/OS-bypass technology on the […]