MPI for mobile devices (or not)

Every once in a while, the idea pops up again: why not use all the world’s cell phones for parallel and/or distributed computations? There are gazillions of these phones; think of the computing power!

After all, an army of ants can defeat a war horse, right?

Well… yes… and no.

It is true that the “army of ants” philosophy is an interesting idea, and vendors have capitalized on it in the past. A recent example was SiCortex: they bundled together lots and lots of lower-power, less-capable processors with a nice network. Their machines were actually able to get some pretty nice performance characteristics, all the while saving some power compared to their full-featured server processor brethren.

Cool!

It would therefore seem logical to conclude that cell phones (and/or mobile devices in general) — which actually have quite powerful processors these days, and don’t draw much power — should be able to be bound together into an incredibly powerful distributed machine.

Right?

Unfortunately, probably not.

The Big Issue is power.

Yes, the processors in mobile devices don’t draw very much power. But consider the following:

HPC codes, or most types of code that require enough computational power that you need to distribute them across a large number of (mobile device) processors, are designed to run the CPU at full capacity. Such codes are always either computing or communicating, both of which take significant amounts of power.
Mobile devices have limited battery capacity. Most mobile devices comfortably make it through a full day on a single charge, but this is with the processor running at (far) less than 100% capacity throughout the day.
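
To put rough numbers on the two points above, here is a minimal back-of-the-envelope sketch. The battery capacity and power draw figures are illustrative assumptions, not measurements of any particular phone; plug in your own values.

```python
def hours_on_battery(capacity_wh: float, draw_w: float) -> float:
    """Return how many hours a battery lasts at a constant power draw."""
    if draw_w <= 0:
        raise ValueError("draw_w must be positive")
    return capacity_wh / draw_w


# Illustrative assumptions only (not measured values):
BATTERY_WH = 15.0    # assumed battery capacity, in watt-hours
LIGHT_USE_W = 0.6    # assumed average draw over a normal day of use
FULL_LOAD_W = 5.0    # assumed draw with the CPU and radio kept busy

light = hours_on_battery(BATTERY_WH, LIGHT_USE_W)
full = hours_on_battery(BATTERY_WH, FULL_LOAD_W)
print(f"typical use: {light:.1f} h, sustained full load: {full:.1f} h")
```

Under these assumed figures, a battery that comfortably lasts a day of normal use is drained in a few hours of HPC-style load. The exact numbers will vary by device, but the ratio between the two draws is what matters.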

A second big issue is network usage. Wireless networks are getting better and better, but they still don’t come remotely close to wired networks in terms of both latency and bandwidth (e.g., bisection bandwidth), particularly for high-traffic scenarios.

Meaning: even if you got 10,000 of today’s best cell phones and put them in a machine room and plugged them all into AC power, you’d still have wifi networking issues: they’re all sharing the same wifi frequency spectrum.
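
The shared-spectrum problem is easy to sketch with some arithmetic. Only the 10,000 device count comes from the scenario above; the channel count and per-channel throughput are assumptions picked purely for illustration, and the model ignores contention overhead, which would only make things worse.

```python
def per_device_mbps(channels: int, mbps_per_channel: float, devices: int) -> float:
    """Best-case fair share of a shared wireless spectrum, per device.

    Assumes the aggregate capacity (channels * mbps_per_channel) is split
    evenly with no contention or protocol overhead.
    """
    if devices <= 0:
        raise ValueError("devices must be positive")
    return channels * mbps_per_channel / devices


# Illustrative assumptions only:
CHANNELS = 20             # assumed number of usable non-overlapping channels
MBPS_PER_CHANNEL = 500.0  # assumed usable throughput per channel
DEVICES = 10_000          # the machine room full of phones

share = per_device_mbps(CHANNELS, MBPS_PER_CHANNEL, DEVICES)
print(f"best-case share per phone: {share:.2f} Mbit/s")
```

With these assumed values, each phone gets a best-case share on the order of 1 Mbit/s, whereas a wired switch port gives each node its own dedicated link.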

It’s not outside the realm of possibility to create a new class of distributed processing codes that are specifically designed to run at less than full capacity on the processor, and are sparse in their communication patterns. But then you have to balance such a design with the practical reality: can you find a scenario where it’s more efficient to run in the above-described model vs., for example, running on a single, serial processor?
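
One way to frame that question is a toy cost model. This is a sketch under strong assumptions: the work is perfectly parallel, every phone contributes the same fraction of its CPU, and all communication is lumped into a single overhead term. The function and parameter names are hypothetical, made up for this illustration.

```python
def distributed_time(serial_s: float, devices: int, duty_cycle: float,
                     relative_speed: float, comm_overhead_s: float) -> float:
    """Estimated wall-clock seconds for a job spread across many phones.

    serial_s        -- seconds the job takes on the single serial processor
    devices         -- number of phones
    duty_cycle      -- fraction of each phone's CPU the code may use (0..1]
    relative_speed  -- phone CPU speed relative to the serial processor
    comm_overhead_s -- total seconds lost to communication (assumed lump sum)
    """
    effective_processors = devices * duty_cycle * relative_speed
    if effective_processors <= 0:
        raise ValueError("need a positive amount of effective compute")
    return serial_s / effective_processors + comm_overhead_s


def worth_distributing(serial_s: float, devices: int, duty_cycle: float,
                       relative_speed: float, comm_overhead_s: float) -> bool:
    """True if the distributed estimate beats the serial run."""
    return distributed_time(serial_s, devices, duty_cycle,
                            relative_speed, comm_overhead_s) < serial_s


# Illustrative inputs only: a one-hour serial job, 100 phones at 10% duty
# cycle, each half as fast as the serial processor.
print(worth_distributing(3600, 100, 0.1, 0.5, comm_overhead_s=60))
print(worth_distributing(3600, 100, 0.1, 0.5, comm_overhead_s=3000))
```

In this model the answer flips as soon as the communication overhead approaches the time you saved on compute, which is exactly the balance described above.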

All this doesn’t even begin to address the management of a large number of mobile devices, even in a well-regulated scenario (e.g., the 10,000 cell phones in a data center): ensuring that they all boot up properly, get the right block of networking addresses, have the right versions of software installed, etc.

Don’t get me wrong: the “army of ants” scenarios are quite sound, technically speaking. But they require a level of integration, namely bundling all the processor “ants” into a power and networking package that can be utilized as a single computational resource. That is a very different thing from 10,000 individual devices, each with its own power, networking, and software management issues.