Cisco Blogs



Modern GPU Integration in MPI

February 8, 2013 at 5:00 am PST

Today’s guest post is from Rolf vandeVaart, a Senior CUDA Engineer with NVIDIA.

GPUs are becoming quite popular as accelerators in High Performance Computing clusters. For example, check out Titan, a recent entry on the Top 500 list from Oak Ridge National Laboratory. Titan has 18,688 nodes (299,008 CPU cores) coupled with 18,688 NVIDIA Tesla K20 GPUs.

To help ease the programming burden of working with GPU memory in MPI applications, several MPI libraries have added support for sending and receiving GPU buffers directly, without the user having to stage them in host memory first. This is sometimes referred to as “CUDA-aware MPI.”



Process and memory affinity: why do you care?

January 31, 2013 at 5:00 am PST

I’ve written about NUMA effects and process affinity on this blog many times in the past.  It’s a complex topic that has a lot of real-world effects on your MPI and HPC applications.  If you’re not using processor and memory affinity, you’re likely experiencing performance degradation without even realizing it.

In short:

  1. If you’re not booting your Linux kernel in NUMA mode, you should be.
  2. If you’re not using processor affinity with your MPI/HPC applications, you should be.
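
To make "processor affinity" concrete, here is a minimal sketch of pinning a process to a single CPU on Linux using Python's standard `os.sched_getaffinity` and `os.sched_setaffinity` calls. The helper name `pin_to_cpu` is hypothetical and only for illustration. In a real MPI job the launcher usually handles binding for you, so treat this as a picture of the underlying mechanism rather than a recommended way to bind MPI processes.

```python
import os


def pin_to_cpu(cpu_index=None):
    """Pin the calling process to one CPU (Linux only).

    pin_to_cpu is a hypothetical helper, not part of any MPI library.
    Returns the affinity set in effect after pinning.
    """
    # CPUs the scheduler is currently allowed to run this process on.
    allowed = sorted(os.sched_getaffinity(0))
    if cpu_index is None:
        cpu_index = allowed[0]
    if cpu_index not in allowed:
        raise ValueError(f"CPU {cpu_index} is not in the allowed set {allowed}")
    # pid 0 means "the calling process".
    os.sched_setaffinity(0, {cpu_index})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    before = os.sched_getaffinity(0)
    after = pin_to_cpu()
    print("allowed CPUs before pinning:", sorted(before))
    print("allowed CPUs after pinning: ", sorted(after))
```

Once pinned, the process no longer migrates between cores, which is what lets it keep working out of memory that is local to its NUMA node.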



Message size: big or small?

January 28, 2013 at 6:15 am PST

It’s the eternal question: should I send lots and lots of small messages, or should I glump multiple small messages into a single, bigger message?

Unfortunately, the answer is: it depends.  There are a lot of factors in play.
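
One of those factors can be sketched with a simple latency-plus-bandwidth cost model, where sending `n` bytes costs a fixed per-message latency plus `n` divided by the bandwidth. The function names and the numbers used below (1 microsecond latency, 1 GB/s bandwidth) are assumptions chosen for illustration, not measurements of any real network.

```python
def transfer_time(num_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order cost of one message: fixed latency plus serialization time."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


def compare_strategies(count, size_bytes, latency_s, bandwidth_bytes_per_s):
    """Return (time for `count` separate messages, time for one combined message)."""
    separate = count * transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s)
    combined = transfer_time(count * size_bytes, latency_s, bandwidth_bytes_per_s)
    return separate, combined


if __name__ == "__main__":
    # Assumed, illustrative network parameters.
    latency = 1e-6      # 1 microsecond per message
    bandwidth = 1e9     # 1 GB/s
    separate, combined = compare_strategies(100, 8, latency, bandwidth)
    print(f"100 separate 8-byte messages: {separate * 1e6:.2f} us")
    print(f"one combined 800-byte message: {combined * 1e6:.2f} us")
```

Under this model, combining always saves `(count - 1)` latencies. The model ignores the cost of packing the data, any protocol changes at larger message sizes, and the chance to overlap communication with computation, which is why the answer is still "it depends."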



MPI and Java: redux

January 18, 2013 at 5:00 am PST

In a prior blog entry, I discussed how we are resurrecting a Java interface for MPI in the upcoming v1.7 release of Open MPI.

Some users have already experimented with this interface and found it lacking, in at least two ways:

  1. Creating datatypes of multi-dimensional arrays doesn’t work because of how Java handles them internally
  2. The interface only supports a subset of MPI-1.1 functions

These are completely valid criticisms.  And I’m incredibly thankful to the Open MPI user community for taking the time to kick the tires on this interface and give us valid feedback.



MPI_REQUEST_FREE is Evil

January 15, 2013 at 11:06 am PST

It was pointed out to me that in my last blog post (Don’t leak MPI_Requests), I failed to mention the MPI_REQUEST_FREE function.

True enough — I did fail to mention it.  But I did so on purpose, because MPI_REQUEST_FREE is evil.

Let me explain…

