Resurrecting MPI and Java
Back in the ’90s, there was a huge bubble of activity about Java in academic circles. It was the new language that was going to take over the world. An immense amount of research was produced mapping classic computer science issues into Java.
Among the projects produced were several that tried to bring MPI to Java. That is, they added a set of Java bindings over existing C-based MPI implementations. However, many in the HPC crowd eschewed Java for compute- or communication-heavy applications because of performance overheads inherent to the Java language and runtime implementations.
Hence, the Java+MPI=HPC efforts didn’t get much traction.
But even though the computer science Java bubble eventually ended, Java has become quite an important language in the enterprise. Java run-time environments, compilers, and programming models have steadily improved over the years. Java is now commonly used for many different types of compute-heavy enterprise applications.
Hadoop, for example, is a Java-based environment used for big data crunching. Not only is the Hadoop run-time itself written in Java, but client applications that implement the “reduce” part of the “map/reduce” model are also usually written in Java.
It is the Hadoop community that has raised the idea of bringing MPI back to Java. Their reasoning is that the “reduce” applications are getting larger and more computationally expensive, so they want to parallelize those computations by spreading them across multiple processor cores.
Once a computation is parallelized, they want efficient inter-process communication (IPC) between the processes running on those cores. MPI is a well-established IPC API, so why re-invent the wheel? Just add some reasonable Java MPI bindings to an underlying C MPI implementation (or even a Java class library with some nice Java-ish abstractions).
We have a side project going on in Open MPI to do just that. We revived one of the old MPI Java bindings projects, brushed off the dust, spruced it up a bit, and have integrated it as a set of bindings on top of the Open MPI core.
There are still some issues to work out, such as some language-specific peculiarities in Java, how to integrate with MPI’s static computation model at run-time, etc. But this might be an interesting start.