Ain't your father's TCP

TCP? Who cares about TCP in HPC?

More and more people, actually. With the commoditization of HPC, lots of newbie HPC users are intimidated by special, one-off, traditional HPC types of networks and opt for the simplicity and universality of Ethernet.

And it turns out that TCP doesn’t suck nearly as much as most (HPC) people think, particularly with modern servers, Ethernet fabrics, and powerful Ethernet NICs.

I’ll cut to the chase: I surprised myself by being able to get ~10 µs half-round-trip ping-pong MPI latency over TCP (using NetPIPE). The slide deck below discusses how that works.
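For readers who haven't seen the term: "half-round-trip ping-pong latency" means one side sends a small message, the other side echoes it back, and the measured round-trip time is divided by two. Here is a minimal sketch of that measurement over a plain TCP socket. It is not NetPIPE and it is not MPI: it is a toy that runs both ends over loopback in a single Python process, and the function names (`recv_exact`, `pong_server`, `half_round_trip_us`) are made up for illustration. Don't expect its numbers to be comparable to the figure above.

```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock (TCP may deliver partial reads)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def pong_server(listener, msg_size, iterations):
    """Accept one connection and echo each message straight back."""
    conn, _ = listener.accept()
    with conn:
        # Disable Nagle's algorithm so small messages go out immediately.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(iterations):
            conn.sendall(recv_exact(conn, msg_size))


def half_round_trip_us(msg_size=1, iterations=1000, warmup=100):
    """Return the mean half-round-trip latency in microseconds."""
    total = iterations + warmup
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    server = threading.Thread(
        target=pong_server, args=(listener, msg_size, total)
    )
    server.start()

    payload = b"x" * msg_size
    with socket.create_connection(listener.getsockname()) as sock:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for i in range(total):
            if i == warmup:
                # Start timing only after the warmup exchanges.
                start = time.perf_counter()
            sock.sendall(payload)        # ping
            recv_exact(sock, msg_size)   # pong
        elapsed = time.perf_counter() - start

    server.join()
    listener.close()
    # One iteration is a full round trip; halve it for the one-way estimate.
    return elapsed / iterations / 2 * 1e6


if __name__ == "__main__":
    print(f"half-round-trip latency: {half_round_trip_us():.1f} us")
```

The warmup iterations are excluded from the timing so that connection setup and cold caches don't skew the average, and `TCP_NODELAY` is set on both ends because Nagle's algorithm would otherwise delay tiny messages like these.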

A little background: I’ve posted several times about Cisco’s forthcoming ultra-low latency Ethernet product. While working on that code, I was doing some performance testing last week, and wanted to compare to the best performance that TCP could give me. I discovered many things:

Van Jacobson’s “TCP in 30 [assembly] instructions” explanation, which I also discovered dates from 1993.
Modern Ethernet NICs have quite a bit of acceleration built-in (lots of TCP offloading).
Operating system TCP protocol stacks are very highly optimized.
But most Ethernet NICs are tuned for general traffic — HPC has different requirements. With a trivial bit of tuning, such NICs (and therefore TCP) can exhibit some very interesting performance in HPC applications.

Check out these slides explaining what I found:

Just to be complete, here’s the hardware I ran these tests on (all of which is available today):

Cisco UCS C240 M3 servers (Intel “Sandy Bridge” E5-2690 CPUs, with all “high performance” options enabled in the BIOS)
Cisco 1225 VIC
Linux RHEL 6.2 (with various HPC-common performance optimizations enabled)

Are these results definitive? Absolutely not: you can see the tradeoffs listed at the end of the slide deck. As with any HPC application, YMMV. But it certainly is interesting, and it clearly shows how the rest of the world is benefiting from the trickle-down effect of HPC.