Cisco Blogs

Top 5 Reasons the HPC Community Should Care About libfabric

I’ve mentioned libfabric on this blog a few times: it’s a set of next-generation APIs that allow direct access to networking hardware (e.g., high-speed / low latency NICs) from Linux userspace (kernel access is in the works).

To give you a little perspective: the libfabric APIs are aimed at a lower layer than MPI.  libfabric seeks to unify and extend competing networking APIs (sockets, Linux Verbs, PSM, etc.) so that extremely high performance code can also be truly portable.

“…ummm, sure.  Why do I care?” you say.

I’m glad you asked!  Sean Hefty — one of the principal designers of libfabric — and I came up with this handy Top 5 list to tell you exactly why you care.

“Using Advanced MPI” book (i.e., MPI-3 for the rest of us)

I’m stealing this text directly from Torsten Hoefler’s blog, because I think it’s directly relevant to many of this blog’s readers:

Our book on “Using Advanced MPI” will appear in about a month — now it’s the time to pre-order on Amazon at a reduced price. It is released by the prestigious MIT Press, a must read for parallel computing experts.

He’s right.  Go pre-order the book now (I just did!).

Supercomputing is upon us!

It’s that time of year again: we’re about 2.5 weeks away from the Supercomputing conference and trade show. SC’14 is in New Orleans, November 16-21.

Are you going to get some tasty gumbo and supercharged computing power?  

If so, come say hi!  The Cisco booth is 2715.

The “vader” shared memory transport in Open MPI: Now featuring 3 flavors of zero copy!

Today’s blog post is by Nathan Hjelm, a Research Scientist at Los Alamos National Laboratory, and a core developer on the Open MPI project.

The latest version of the “vader” shared memory Byte Transport Layer (BTL), arriving in the upcoming Open MPI v1.8.4 release, brings better small message latency and improved support for “zero-copy” transfers.

NOTE: “zero copy” is the term typically used, even though it really means “single copy” (copy the message directly from the sender to the receiver).  Think of it as “zero extra copies,” i.e., one copy instead of two.
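
To make the “one copy instead of two” point concrete, here is a toy model in Python. It is only a sketch: ordinary Python buffers stand in for process memory, the function names `two_copy_transfer` and `single_copy_transfer` are made up for illustration, and none of this is Open MPI code or real shared memory. It simply counts how many times the message bytes get copied on each path.

```python
# Illustrative model only: plain Python buffers stand in for process memory.
# The function names are hypothetical; this is not Open MPI's implementation.

def two_copy_transfer(send_buf, shared_segment):
    """Copy-in/copy-out through an intermediate shared segment (two copies)."""
    n = len(send_buf)
    if n > len(shared_segment):
        raise ValueError("message does not fit in the shared segment")
    copies = 0
    # Copy 1: the sender writes the message into the shared segment.
    shared_segment[:n] = send_buf
    copies += 1
    # Copy 2: the receiver reads the message out of the shared segment.
    recv_buf = bytearray(n)
    recv_buf[:] = memoryview(shared_segment)[:n]
    copies += 1
    return recv_buf, copies


def single_copy_transfer(send_buf):
    """So-called zero copy: one copy straight from sender to receiver."""
    recv_buf = bytearray(len(send_buf))
    # The only copy: sender's buffer -> receiver's buffer.
    recv_buf[:] = memoryview(send_buf)
    return recv_buf, 1


if __name__ == "__main__":
    message = b"hello vader"
    segment = bytearray(64)
    _, n_two = two_copy_transfer(message, segment)
    _, n_one = single_copy_transfer(message)
    print("copy-in/copy-out copies:", n_two)
    print("single-copy copies:", n_one)
```

Both paths deliver the same bytes; the only difference is the intermediate staging copy, which is the “extra” copy that zero-copy transfers avoid.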

HPC schedulers: What is a “slot”?

Today’s guest post comes from Ralph Castain, a principal engineer at Intel.  The bulk of this post is an email he sent explaining the concept of a “slot” in typical HPC schedulers. This is a slight departure from the normal fare on this blog, but it is still a critical concept to understand for running HPC applications efficiently.  With his permission, I am re-publishing Ralph’s email here because it uses a great analogy to explain the “slot” concept, which is broadly applicable to HPC users.

The question of “what is a [scheduler] slot” when discussing schedulers came up yesterday and was an obvious source of confusion, so let me try to explain the concept using a simple model.

Suppose I own a fleet of cars at several locations around the country. I am in the business of providing rides for people. Each car has 5 seats in it.
