MPI newbie: Building MPI applications

In a previous post, I gave some (very) general requirements for how to set up and install MPI.

This is post #2 in the series: now that you’ve got a shiny new computational cluster, and you’ve got one or more MPI implementations installed, I’ll talk about how to build, compile, and link applications that use MPI.

To be clear: MPI implementations are middleware — they do not do anything remarkable by themselves. MPI implementations are generally only useful when you have an application that uses the MPI middleware to do something interesting.

The good news is that there are many MPI applications to choose from. There are oodles of MPI benchmarks (which will be the topic of a future MPI Newbie post) and further oodles of real, number-crunching, simulation-driving, science-generating MPI applications.

But let’s take something simple for this discussion: how about the canonical “Hello world” in MPI example?

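#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[]) {
    int rank, size;

    /* Start up MPI before making any other MPI call */
    MPI_Init(&argc, &argv);
    /* Ask for this process's rank and the total number of processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world, I am %d of %d\n", rank, size);
    /* Shut down MPI before exiting */
    MPI_Finalize();

    return 0;
}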

You can copy-n-paste the above code into your favorite text editor and save it in a file named “hello_c.c”, or you can download it from here.

A developer’s first inclination is to use their favorite compiler to compile this code — such as gcc, or icc. It’s just C code, right?

However, if you look closely at the code, there are two things that should pop out at you:

It includes “mpi.h”. Where will the compiler find this file?
It calls MPI functions. Where will the linker find libraries that provide these symbols?

Before we even talk about those two items, things get even more confusing because:

Everyone has a different opinion on which MPI implementation they prefer
MPI implementations are large, complex software packages that, just like any other large complex software packages, include bugs and “quirky” behaviors
Different compiler suites (e.g., the GNU compilers vs. the Intel compilers) can produce binaries that are incompatible with each other

As a direct result, it is not uncommon to find multiple MPI implementations installed on a single shiny new computational cluster. Indeed, it’s not uncommon to find multiple versions of the same MPI implementation installed on a cluster (!).

And because there might be multiple MPI implementations installed, they can’t all be installed in the default compiler and linker search paths. For example, this is not an uncommon sight:

shell$ cd /opt
shell$ ls -1
openmpi-1.6.5-gcc
openmpi-1.6.5-icc
mpich-3.0.4-gcc
mpich-3.0.4-icc

Notice how both Open MPI version 1.6.5 and MPICH 3.0.4 are installed twice. Why is that?

The reason for this is that different compiler suites can generate binaries that are incompatible with each other (I’m skipping lots of details here; there can be many different, subtle definitions of “incompatible”).

In general, MPI implementations tend to advocate building the MPI implementation with the same compiler suite that you intend to use to build your application. That is, if you intend to build your application with the Intel compiler suite, then build your MPI (and probably all other dependent libraries) with the Intel compiler suite. Likewise, if you want to use the GNU compiler suite, then build everything, including MPI, with the GNU compiler suite.

In the above example, let’s say that we choose to build our application against Open MPI v1.6.5 built with gcc. Specifically, we want to use the mpi.h and MPI libraries from somewhere in the /opt/openmpi-1.6.5-gcc tree.

How do you find the mpi.h and MPI libraries in that tree?

YOU DON’T.

Many MPI implementations — including Open MPI and MPICH — include “wrapper compilers” that add all the relevant compiler and linker command line flags to the invocation command line for you. For example, instead of using “gcc”, you use the equivalent MPI C “wrapper” compiler:

shell$ mpicc hello_c.c -o hello
shell$

That’s it!

“mpicc” magically added all the relevant compiler -I flags and linker -l and -L flags (and any other relevant flags) to both find the “mpi.h” header file and appropriate MPI libraries.
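One practical detail, given the /opt listing earlier: the shell runs whichever mpicc it finds first in PATH, so the install you picked has to come first (or you can call its wrapper by full path). Here is a minimal sketch, assuming each install keeps its wrappers in a bin/ subdirectory, which is a common layout but an assumption here. The name use_mpi is a hypothetical helper, not something any MPI implementation ships:

```shell
# Hypothetical helper: put one MPI install's wrappers first in PATH.
# Assumes the wrappers live in <prefix>/bin, which is typical but not guaranteed.
use_mpi() {
    prefix=$1
    PATH="$prefix/bin:$PATH"
    export PATH
}

use_mpi /opt/openmpi-1.6.5-gcc
# If that install exists, "command -v mpicc" should now report
# /opt/openmpi-1.6.5-gcc/bin/mpicc
```

Many clusters offer their own mechanism for this kind of selection, so check your site’s documentation before rolling your own.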

There’s a wrapper compiler for each of C, C++, and Fortran applications:

C: mpicc
C++: mpic++ or mpicxx
Fortran: mpifort (or mpif77 or mpif90 in older MPI implementations)

The point is that the MPI implementations reserve the right to rename their underlying libraries, move the location of the mpi.h file, and so on.

So just use the wrapper compiler as if it were the real compiler, and the right command line flags will be added for you. The implication here is that you can add any valid compiler/linker flags to the wrapper command line, and they will be passed down to the underlying compiler, just as you would expect. For example:

shell$ mpicc hello_c.c -o hello -DNDEBUG -O3

works just as you would expect it to: all the tokens from “hello_c.c” to “-O3” are passed to the underlying compiler, as well as additional tokens to tell the compiler where to find MPI’s header files, libraries, etc.
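To make the idea concrete, here is a toy stand-in for a wrapper compiler. It is only a sketch: toy_mpicc is a made-up name, and the prefix, the flag order, and the single -lmpi library are illustrative assumptions rather than what any real wrapper is guaranteed to emit. It prints the command line instead of running it:

```shell
# Toy stand-in for a wrapper compiler: it only prints the command it would run.
# The prefix, flag order, and the single -lmpi library are illustrative assumptions.
toy_mpicc() {
    prefix=/opt/openmpi-1.6.5-gcc
    echo gcc "$@" "-I$prefix/include" "-L$prefix/lib" -lmpi
}

toy_mpicc hello_c.c -o hello -DNDEBUG -O3
# prints: gcc hello_c.c -o hello -DNDEBUG -O3 -I/opt/openmpi-1.6.5-gcc/include -L/opt/openmpi-1.6.5-gcc/lib -lmpi
```

Your own tokens pass through untouched, and the MPI-specific flags are appended. Real wrappers do more than this, which is exactly why you should let them do it.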

That being said, sometimes you can’t use the MPI wrapper compilers. In such cases, both Open MPI and MPICH provide two portable mechanisms for you to find out what the underlying command line flags are.

Option 1: “show the magic” options to the wrapper compilers

With Open MPI, you specify the “--showme” option with any of the wrapper compilers. For example:

shell$ mpicc hello_c.c -o hello --showme
gcc hello_c.c -o hello -I/opt/openmpi-1.6.5-gcc/include -L/opt/openmpi-1.6.5-gcc/lib -lmpi

MPICH’s wrapper compilers provide the “-show” option for the same functionality (note the single dash, not a double dash).
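If a build script has to cope with either implementation, one approach is to try both spellings. The following is a rough sketch, not an official recipe: show_wrapper_cmd is a hypothetical helper, and it assumes that a wrapper which does not understand “--showme” exits with a non-zero status so that the “-show” fallback gets a turn.

```shell
# Hypothetical helper: print the command line a wrapper would run.
# Tries Open MPI's --showme first, then MPICH's -show.
# Assumes an unrecognized option makes the wrapper exit non-zero.
show_wrapper_cmd() {
    wrapper=${1:-mpicc}
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper: not found in PATH" >&2
        return 1
    fi
    "$wrapper" --showme 2>/dev/null || "$wrapper" -show 2>/dev/null
}

show_wrapper_cmd mpicc || echo "could not query mpicc"
```

The same function works for the other wrappers; pass mpicxx or mpifort as the argument.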

Option 2: pkg-config

Both Open MPI and MPICH also support pkg-config, which is standard in many Linux distros (and other OSs). For example:

shell$ export PKG_CONFIG_PATH=/opt/openmpi-1.6.5-gcc/lib/pkgconfig
shell$ pkg-config ompi-c --cflags
-I/opt/openmpi-1.6.5-gcc/include

Open MPI provides a different package name for each language: ompi-c, ompi-cxx, ompi-fort. MPICH provides a single package name: mpich.
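With the package names in hand, a full compile-and-link line can be assembled from pkg-config output. Here is a minimal sketch, assuming gcc is the compiler your MPI was built with and that PKG_CONFIG_PATH already points at the right install. The helper name mpi_build_cmd is hypothetical, and it prints the command rather than running it:

```shell
# Hypothetical helper: print a compile-and-link command built from pkg-config.
# The package names ompi-c and mpich are the ones named in the post;
# using gcc as the compiler is an assumption.
mpi_build_cmd() {
    src=$1
    out=$2
    for pkg in ompi-c mpich; do
        if pkg-config --exists "$pkg" 2>/dev/null; then
            echo "gcc $src -o $out $(pkg-config --cflags --libs "$pkg")"
            return 0
        fi
    done
    echo "no MPI package visible to pkg-config; check PKG_CONFIG_PATH" >&2
    return 1
}

mpi_build_cmd hello_c.c hello || true
```

The --cflags output covers the header search path and --libs covers the linker flags, which together answer the two questions from the start of this post.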

This was quite a long blog post, so let’s summarize:

Using the wrapper compilers is almost always what you want to do. No fuss, no muss — everything is taken care of for you. You don’t need to specify the location of MPI’s header files, libraries, etc.
However, in the rare case where you can’t use the wrapper compilers for some reason, use the --showme/-show flags, or use pkg-config(1).

Enjoy!