Many in the HPC research community are starting to work on “exascale” these days — the ability to do 10^18 floating point operations per second. Exascale is such a difficult problem that it will require new technologies in many different areas before it can become a reality. Case in point: today’s entry at Inside HPC entitled “InfiniBand Charts Course to Exascale”.
It cites The Exascale Report and a blog entry by Lloyd Dickman at the IBTA about InfiniBand’s course going forward. It’s a good read — Lloyd’s a smart, thoughtful guy.
That being said, there’s a key piece missing from the discussion: the (networking) software. More specifically: the current OpenFabrics verbs API abstractions are (probably) unsuitable for exascale, a point that Fab Tillier (Microsoft) and I presented at the OpenFabrics workshop in Sonoma last year (1up, 2up).
There are two notable (if indirect) takeaways from our slides:
- If you look at the last slide, our list of networking API requirements doesn’t look much like the current verbs API. Advancements will need to be made in OpenFabrics networking hardware and the corresponding software API stack.
- “OpenFabrics hardware” is actually a fairly wide class of networking gear these days — it also includes multiple forms of Ethernet. Hence, improvements towards exascale in the OpenFabrics networking APIs will also benefit Ethernet (!).
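To make the scaling concern with today’s verbs abstractions concrete: connection-oriented transports like the verbs RC (reliable connected) model need a queue pair per remote peer, so per-node state grows with the size of the machine. Here’s a rough sketch in Python — the per-QP state size and node count are purely illustrative assumptions, not measured values:

```python
# Back-of-envelope: per-node memory for fully connected RC queue pairs.
# Both numbers below are illustrative assumptions, not measured values.
QP_STATE_BYTES = 256      # assumed state kept per queue pair
NODES = 100_000           # assumed machine size

qps_per_node = NODES - 1                      # one RC QP per remote peer
mem_per_node = qps_per_node * QP_STATE_BYTES  # bytes of QP state per node

print(f"{qps_per_node:,} QPs per node, "
      f"{mem_per_node / 2**20:.1f} MiB of QP state per node")
```

Even with these charitable assumptions, the QP state alone grows linearly with machine size per node (and quadratically machine-wide) — which is part of why the slides argue the API abstractions, not just the wires, need rethinking.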
Taking an intuitive leap from those points: getting to exascale might be about things like increasing network bandwidth and decreasing network latency.
…but it might not. There are still a lot of guesses about how exascale will actually turn out; there’s still a lot that we don’t know. Lloyd’s analysis of the trends is quite good, but my whacky thought is: what if exascale doesn’t follow the trends?
For example, it’s possible that exascale will be realized via a truckload of low-power processors (e.g., the Intel Atom, or something that evolves from it) connected via local networking only (e.g., groups of 8 son-of-Atoms share a networking adapter on an n-dimensional networking grid). This could keep power requirements for the processors and networking nice and low. In this case, 100+Gbps networking might not be necessary.
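Just to put some numbers on that scenario — every figure here is a hypothetical assumption for the sake of argument, not a prediction:

```python
# Back-of-envelope for the low-power exascale scenario above.
# Every number here is a hypothetical assumption, not a prediction.
EXAFLOP = 1e18            # 10^18 FLOPS
FLOPS_PER_PROC = 10e9     # assume 10 GFLOPS per Atom-class processor
PROCS_PER_ADAPTER = 8     # the "groups of 8" sharing one adapter

procs = EXAFLOP / FLOPS_PER_PROC      # processors needed to hit an exaflop
adapters = procs / PROCS_PER_ADAPTER  # shared network adapters needed

print(f"{procs:.2e} processors, {adapters:.2e} adapters")
```

That’s on the order of a hundred million processors — an absurd-sounding count, but the point is that sharing adapters divides the networking hardware (and its power draw) by nearly an order of magnitude.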
And — holy schnikies — that might even work with 1 or 10Gbps Ethernet…!
Is that scenario going to happen? Who knows? (I don’t!) It’s pretty far-fetched, but it’s not outside the realm of possibility. But just for fun, consider the devil’s advocate position: if we have to invent 20 new technologies for exascale, then not having to invent new networking hardware would reduce the complexity a bit, and generally be a Good Thing.
Regardless of what networking hardware is used in exascale, I think the networking software will need some revamping, per the points that Fab and I discussed in our slides. Writing software for multiple cores is hard; writing petascale-quality software is even harder. Writing exascale software (with today’s software technology) is likely darn near impossible. Software — on many different levels — will need to evolve.
I don’t know where the road to exascale will take us. But it sure will be fun to follow it and see!
(NOTE: Fab’s and my slides don’t seem to appear on the Sonoma 2010 workshop web site for some reason, so I cached them here on this blog entry — I’ll ping the Sonoma organizers about it…)
(UPDATE: Added a credit and link to The Exascale Report)