
Cisco Blog > High Performance Computing Networking

Java Bindings for Open MPI

April 30, 2014 at 12:10 pm PST

Today’s guest blog post is by Oscar Vega-Gisbert and Dr. Jose Roman of the Department of Information Systems and Computing at the Universitat Politècnica de València, Spain.

We provide an overview of how to use the Java bindings included in Open MPI. The aim is to expose MPI functionality to Java programmers with minimal performance penalties.

Our approach is based on the Java Native Interface (JNI), a mechanism that allows the application programmer to call native subroutines and libraries from Java and vice versa. Native code is typically written in C/C++ and runs directly on the operating system where the Java virtual machine is running. The Java MPI bindings consist of a thin interface on top of the Open MPI C native library, which is invoked via JNI. The Java-JNI layer performs only the minimal bookkeeping required for language translation, such as wrapping MPI object handles with Java objects. Hence, our Java bindings are essentially a 1-to-1 mapping to the MPI C bindings; they are not intended as a class library with higher-level semantics and functionality beyond the MPI specification.

JNI communication usually takes place in the Java-C direction, that is, a Java method invokes an MPI primitive via JNI and returns the result back to Java. However, we also make provision for communication in the opposite direction, that is, when the MPI C library needs to invoke a method in the Java application code. This is required for callback-based functionality such as performing a reduction with a user-defined operation.

The following listing illustrates the Java bindings with a simple example.

public static void main(String args[]) throws MPIException
{
    MPI.Init(args);

    int rank = MPI.COMM_WORLD.getRank(),
        size = MPI.COMM_WORLD.getSize(),
        nint = 100; // Intervals.
    double h = 1.0 / (double)nint,
           sum = 0.0;
    for (int i = rank + 1; i <= nint; i += size) {
        double x = h * ((double)i - 0.5);
        sum += (4.0 / (1.0 + x * x));
    }

    double sBuf[] = { h * sum },
           rBuf[] = new double[1];

    MPI.COMM_WORLD.reduce(sBuf, rBuf, 1, MPI.DOUBLE, MPI.SUM, 0);
    if (rank == 0) System.out.println("PI: "+ rBuf[0]);

    MPI.Finalize();
}
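Stripped of the MPI calls, the numerical kernel above is the midpoint rule for the integral of 4/(1+x²) over [0,1], which equals π. A serial sketch (the class and method names are ours, for illustration only) shows the arithmetic that each rank performs on its subset of intervals:

```java
// Serial version of the numerical kernel from the example above:
// midpoint rule for the integral of 4/(1+x^2) over [0,1], which equals pi.
class SerialPi {
    static double midpointPi(int nint) {
        double h = 1.0 / nint, sum = 0.0;
        for (int i = 1; i <= nint; i++) {
            double x = h * (i - 0.5);   // midpoint of interval i
            sum += 4.0 / (1.0 + x * x);
        }
        return h * sum;                 // the value that reduce() combines
    }

    public static void main(String[] args) {
        System.out.println("PI: " + midpointPi(100));
    }
}
```

In the MPI version, rank r handles intervals r+1, r+1+size, r+2·size, …, and the reduce() adds the per-rank partial sums.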


In order to use the Java bindings in Open MPI, they must be enabled when Open MPI is configured. If the JDK can be found in a standard location, the simplest way to do this is:

shell$ ./configure --enable-mpi-java ...


After configuration and compilation, a JAR file, MPI.jar, will be created. It will be copied to the destination directory during make install.

For convenience, an mpijavac wrapper compiler has been provided for compiling Java-based MPI applications. It ensures that all required MPI libraries and class paths are defined. Once the application has been compiled, the user can run it with the standard mpirun command line:

shell$ mpirun <options> java <java-options> <my-app>


In addition, mpirun has been updated to detect the java command and to define the MPI libraries and class paths required for execution. It is therefore not necessary to specify the Java library path to the MPI installation, nor the MPI classpath. Any class path definitions required for the application itself should be specified either on the command line or via the CLASSPATH environment variable.

Overall organization

There is an mpi package that contains all classes of the MPI Java bindings: Comm, Datatype, Request, etc. Most of these classes have a direct correspondence with classes defined by the MPI standard. MPI primitives are just methods included in these classes. The convention used for naming Java methods and classes is the usual camel-case convention, e.g., the equivalent of MPI_File_set_info(fh,info) is fh.setInfo(info), where fh is an object of the class File.

Apart from the classes, the mpi package contains predefined public attributes under a convenience class called MPI. Examples are the predefined communicator MPI.COMM_WORLD or predefined datatypes such as MPI.DOUBLE. Also, MPI initialization and finalization are methods of the MPI class and must be invoked by all MPI Java applications; see the example above.

Point-to-point communication

Point-to-point communication is realized via methods of the Comm class. The standard send() and recv() operations take the usual arguments: the message, consisting of the buffer, number of elements and datatype, and the envelope, consisting of the process rank and the tag (the communicator is implied by the object on which the method is invoked).

The following example shows a simple communication between two processes. It also exemplifies the use of the Status class returned by the recv() primitive. If status is not required, then it can just be omitted from the call, and hence there is no need for an equivalent to MPI_STATUS_IGNORE.

Comm comm = MPI.COMM_WORLD;
int me = comm.getRank();
double data[] = new double[5];

if (me == 0) {
    comm.send(data, 5, MPI.DOUBLE, 1, 1);
} else if (me == 1) {
    Status status = comm.recv(data, 5, MPI.DOUBLE, MPI.ANY_SOURCE, 1);
    int count = status.getCount(MPI.DOUBLE);
    int src = status.getSource();
    System.out.println("Received "+ count +" values from "+ src);
}


Apart from the standard communication primitives, we can also use the primitives for synchronous, buffered and ready modes, as well as the corresponding non-blocking variants. When invoking a non-blocking method, a Request object is returned, which can be subsequently used to manage the completion of the operation.

How to specify buffers

In general, MPI primitives that require a buffer (either send or receive) accept a Java array in the Java API. Since Java arrays can be relocated by the Java runtime environment, the MPI Java bindings need to copy the contents of the array to a temporary buffer, then pass the pointer to this buffer to the underlying C implementation. In practice, this implies an overhead associated with all buffers that are represented by Java arrays. The overhead is small for small buffers but increases for large arrays.

An alternative is to use direct buffers provided by standard classes available in the Java SDK such as ByteBuffer. For convenience, we provide a few static methods new[Type]Buffer() in the MPI class to create direct buffers for a number of basic datatypes. Elements of the direct buffer can be accessed with methods put() and get(), and the number of elements in the buffer can be obtained with the method capacity(). The following example illustrates their use.

final int MAXLEN = 10; // number of elements contributed by each task

int myself = MPI.COMM_WORLD.getRank();
int tasks = MPI.COMM_WORLD.getSize();

IntBuffer in = MPI.newIntBuffer(MAXLEN * tasks),
          out = MPI.newIntBuffer(MAXLEN);

for (int i = 0; i < MAXLEN; i++)
    out.put(i, myself); // fill the buffer with the rank

Request request = MPI.COMM_WORLD.iAllGather(
                  out, MAXLEN, MPI.INT, in, MAXLEN, MPI.INT);
// do other work here
request.waitFor();
request.free();

for (int i = 0; i < tasks; i++) {
    for (int k = 0; k < MAXLEN; k++) {
        if (in.get(k + i * MAXLEN) != i)
            throw new AssertionError("Unexpected value");
    }
}


Direct buffers are available for predefined datatypes: BYTE, CHAR, SHORT, INT, LONG, FLOAT, and DOUBLE.

Direct buffers are not a replacement for arrays, because they have higher allocation and deallocation costs than arrays. In some cases arrays will be a better choice, typically for small buffers. One can easily convert a direct buffer into an array and vice versa. There is also the possibility of a non-direct Buffer, which wraps an array internally.
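The conversion between direct buffers and arrays mentioned above needs nothing MPI-specific; it can be done with the bulk get()/put() methods of java.nio. A plain JDK sketch (the helper class and method names are ours):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

class BufferConversion {
    // Copy a direct buffer's contents into a new array.
    static double[] toArray(DoubleBuffer buf) {
        double[] a = new double[buf.capacity()];
        buf.duplicate().get(a); // duplicate() leaves the original position untouched
        return a;
    }

    // Copy an array into a newly allocated direct buffer.
    static DoubleBuffer toDirectBuffer(double[] a) {
        DoubleBuffer buf = ByteBuffer.allocateDirect(a.length * Double.BYTES)
                                     .order(ByteOrder.nativeOrder())
                                     .asDoubleBuffer();
        buf.put(a).rewind();
        return buf;
    }
}
```

Note that each conversion is a full copy, which is precisely the cost that passing a direct buffer to the bindings avoids.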

Even though in most cases it is possible to choose between arrays and direct buffers, there are some restrictions in certain operations. All non-blocking methods must use direct buffers; only blocking methods can choose between arrays and buffers. When a method in the API declares the buffer argument as an Object, it can be either an array or a direct or non-direct buffer.

Buffer arguments must always be either arrays or buffers. If one wants to send or receive a simple variable such as an int it must be declared as an array: int k[] = { value }, as in the first example above. In any communication operation, if the buffer argument is an array (or a non-direct buffer), the datatype specified in the call must match the type of the array elements, e.g., MPI.INT for int[].

In a C program, it is common to specify an offset in an array with &array[i] or (array+i), for instance to send data starting from a given position in the array. The equivalent form in the Java bindings is to slice() the buffer to start at an offset, as shown below. Making a slice() on a buffer is only necessary when the offset is not zero. This slicing mechanism works for both arrays and direct buffers.

import static mpi.MPI.slice;
...
int numbers[] = new int[SIZE];
...
MPI.COMM_WORLD.send(slice(numbers, offset), count, MPI.INT, 1, 0);
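For direct buffers, the same offset effect can be reproduced with java.nio alone: positioning a buffer and calling its own slice() yields a view whose element 0 is the element at the offset. A plain JDK sketch (independent of the mpi.MPI.slice helper; the class is ours):

```java
import java.nio.IntBuffer;

class SliceDemo {
    // Return a view of 'buf' starting at 'offset', sharing the same storage.
    static IntBuffer sliceAt(IntBuffer buf, int offset) {
        IntBuffer view = buf.duplicate(); // don't disturb the caller's position
        view.position(offset);
        return view.slice();              // element 0 of the view is buf[offset]
    }
}
```

Because the view shares storage with the original buffer, writes through the slice are visible in the original, which is what makes slicing usable as a receive buffer at an offset.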


Derived datatypes

The Java bindings also allow the definition of derived datatypes, such as contiguous, vector or indexed. The example below uses a derived datatype to send the 3×3 leading block of an N×N matrix that has been stored as a one-dimensional array.

final int N = 10; // matrix dimension

double A[] = new double[N * N]; // flattened 2D matrix
Datatype blk = Datatype.createVector(3, 3, N, MPI.DOUBLE);
blk.commit();
int rank = MPI.COMM_WORLD.getRank();
if (rank == 0) {
    MPI.COMM_WORLD.send(A, 1, blk, 1, 0);
} else if (rank == 1) {
    MPI.COMM_WORLD.recv(A, 1, blk, 0, 0);
}
blk.free();
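By the MPI definition of a vector type, count c, blocklength b and stride s select the flat offsets i·s + j for 0 ≤ i < c and 0 ≤ j < b; for the (3, 3, N) type above, those are exactly the elements of the 3×3 leading block of the row-major matrix. A small check of that index arithmetic (plain Java, no MPI needed):

```java
class VectorLayout {
    // Flat offsets covered by a vector type (count, blockLength, stride),
    // per the definition of MPI_Type_vector.
    static int[] offsets(int count, int blockLength, int stride) {
        int[] off = new int[count * blockLength];
        int k = 0;
        for (int i = 0; i < count; i++)          // one block per count
            for (int j = 0; j < blockLength; j++) // contiguous elements in a block
                off[k++] = i * stride + j;
        return off;
    }
}
```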


We also allow for struct-like derived datatypes, which must be implemented by subclassing the Struct class. Although we already have the classes DoubleComplex and FloatComplex, the following example shows how to create a struct in order to represent a complex number.

public class Complex extends Struct
{
    // This section defines the offsets of the fields.
    private final int real = addDouble(),
                      imag = addDouble();

    // This method tells the superclass how to create a data object.
    @Override protected Data newData() { return new Data(); }

    public class Data extends Struct.Data
    {
        // These methods read from the buffer:
        public double getReal() { return getDouble(real); }
        public double getImag() { return getDouble(imag); }

        // These methods write to the buffer:
        public void putReal(double r) { putDouble(real, r); }
        public void putImag(double i) { putDouble(imag, i); }
    } // Data
} // Complex
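Conceptually, the Struct machinery is bookkeeping of byte offsets into a buffer that holds an array of structs. A plain java.nio sketch of the same idea for a complex number (this class is ours, for illustration; it is not part of the bindings):

```java
import java.nio.ByteBuffer;

class RawComplex {
    // Manual version of what the Struct subclass does: each field is a
    // byte offset, and struct i starts at byte i * EXTENT.
    static final int REAL = 0, IMAG = Double.BYTES, EXTENT = 2 * Double.BYTES;

    static void put(ByteBuffer buf, int index, double re, double im) {
        buf.putDouble(index * EXTENT + REAL, re);
        buf.putDouble(index * EXTENT + IMAG, im);
    }

    static double getReal(ByteBuffer buf, int index) {
        return buf.getDouble(index * EXTENT + REAL);
    }

    static double getImag(ByteBuffer buf, int index) {
        return buf.getDouble(index * EXTENT + IMAG);
    }
}
```

The Struct class automates exactly this offset arithmetic, so the Data accessors stay type-safe and readable.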


Once the struct has been defined, it can be used to create an object that corresponds to the new type. Below is an example of how to access the elements of an array of complex numbers stored in a direct byte buffer.

Complex type = new Complex();
ByteBuffer buffer = MPI.newByteBuffer(type.getExtent() * count);

for (int i = 0; i < count; i++) {
    Complex.Data c = type.getData(buffer, i);
    c.putReal(0);
    c.putImag(i * 0.5);
}

MPI.COMM_WORLD.send(buffer, count, type.getType(), 1, 0);


We provide several predefined, composite datatypes such as DOUBLE_INT, which are implemented with the Struct mechanism mentioned above. This allows using the native implementation of the MPI_MAXLOC and MPI_MINLOC reduction operations via JNI.
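The semantics of MPI_MAXLOC on such a (value, index) pair are simple: the larger value wins and, on a tie, the smaller index wins. A plain-Java sketch of that combine step (for illustration only; in the bindings the reduction runs natively via JNI):

```java
class MaxLoc {
    // Combine two (value, index) pairs the way MPI_MAXLOC does:
    // larger value wins; on a tie, the smaller index wins.
    static double[] combine(double[] a, double[] b) {
        if (a[0] > b[0]) return a;
        if (b[0] > a[0]) return b;
        return a[1] <= b[1] ? a : b;
    }
}
```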

Other features

The Open MPI Java bindings also provide the following features:

  • Exception handling.
  • User-defined operators for reduction operations.
  • Collective communication operations, including non-blocking primitives and neighbor communication introduced in MPI-3.
  • Support for inter-communicators, including client-server communication schemes.
  • Parallel I/O with the File class.
  • One-sided communication primitives such as put() and get(), operating on a window object previously created with the Win class constructor.
