Jeff Squyres

The MPI Guy

UCS Platform Software

Dr. Jeff Squyres is Cisco's representative to the MPI Forum standards body and is Cisco's core software developer in the open source Open MPI project. He has worked in the High Performance Computing (HPC) field since his early graduate-student days in the mid-1990s, and is a chapter author of the MPI-2 and MPI-3 standards.

Jeff received both a BS in Computer Engineering and a BA in English Literature from the University of Notre Dame in 1994; he received an MS in Computer Science and Engineering from Notre Dame two years later, in 1996. After some active duty tours in the military, Jeff received his Ph.D. in Computer Science and Engineering from Notre Dame in 2004. Jeff then worked as a Post-Doctoral research associate at Indiana University until he joined Cisco in 2006.

At Cisco, Jeff is part of the VIC group (Virtual Interface Card, Cisco's virtualized server NIC) in the larger UCS server group. He designs and writes systems-level software for optimized network IO in HPC and other high-performance applications. Jeff also represents Cisco to several open source software communities and the MPI Forum standards body.

Articles

The state of libfabric in Open MPI

1 min read

Yesterday morning, I gave a presentation at the 2015 OpenFabrics Software Developers’ Workshop.  I discussed the status of libfabric support in Open MPI. Here’s a copy of my slides:

Cisco usNIC libfabric provider presentation

1 min read

Earlier this morning, I gave a presentation at the 2015 OpenFabrics Software Developers’ Workshop.  I discussed Cisco’s experiences with writing providers for both the Linux Verbs API and the Libfabric API. Here’s a copy of my slides:

MPI-3.1! …not quite yet

1 min read

The MPI Forum met for our quarterly meeting last week in Portland, Oregon. The main goal of the meeting was to pass the MPI-3.1 standard into law.  MPI-3.1 contains a bunch of errata from MPI-3.0, and a small number of new things.

A Farewell to LAM/MPI

1 min read

With a little sadness, I note that LAM/MPI was officially retired recently. LAM/MPI’s hosting provider, Indiana University, made the decision not to renew the lam-mpi.org domain any more.  As of a few weeks ago, LAM/MPI’s web site is no more, and its domain is in the process of expiring. LAM/MPI was a highly popular implementation […]

Open MPI: behind the scenes

2 min read

Working on an MPI implementation isn’t always sexy.  There’s a lot of grubby, grubby work that needs to happen on a continual basis to produce a production-quality MPI implementation that can be used for real-world HPC applications. Sure, we always need to work on optimizing short message latency. Sure, we need to keep driving MPI’s […]

MPI 3.1: coming soon to an implementation near you

1 min read

The next MPI Forum meeting will be in Portland, OR, USA, in early March. One of the major topics on the agenda will be voting on the MPI 3.1 standard. You might be wondering what’s new in MPI-3.1. I’m glad you asked.

Tree-based launch in Open MPI (part 2)

2 min read

In my prior blog entry, I described the basics of Open MPI’s tree-based launching system over ssh (yes, there are still some valid / good reasons for using ssh over a native job scheduler / resource manager’s parallel launch mechanisms…). That entry got a little long, so I split the rest of the discussion into […]

Tree-based launch in Open MPI

2 min read

I’ve mentioned it before: the run-time systems of MPI implementations are frequently unsung heroes. A lot of blood, sweat, tears, and innovation goes into parallel run-time systems, particularly those that can scale to very large systems.  But they’re not discussed often, mainly because they’re not as sexy as ultra-low latency numbers, or other popular […]

Holiday wishes

1 min read

As usual, in the post-Supercomputing / post-US-Thanksgiving-holiday lull, the work that we have all put off since we started ignoring it to prepare for Supercomputing catches up to us.  Inevitably, it means that my writing here at the blog falls behind in December.  Sorry, folks! To make up for that, here’s a little ditty I […]

Top 5 Reasons the HPC Community Should Care About libfabric

2 min read

I’ve mentioned libfabric on this blog a few times: it’s a set of next-generation APIs that allow direct access to networking hardware (e.g., high-speed / low latency NICs) from Linux userspace (kernel access is in the works). To give you a little perspective: the libfabric APIs are aimed at a lower layer than MPI.  libfabric […]

“Using Advanced MPI” book (i.e., MPI-3 for the rest of us)

1 min read

I’m stealing this text directly from Torsten Hoefler‘s blog, because I think it’s directly relevant to many of this blog’s readers: Our book on “Using Advanced MPI” will appear in about a month — now it’s the time to pre-order on Amazon at a reduced price. It is released by the prestigious MIT Press, a must […]

libfabric support of usNIC in Open MPI

3 min read

I’ve previously written about libfabric.  Here are some highlights: libfabric is a set of next-generation, community-driven, ultra-low latency networking APIs; the APIs are not tied to any particular networking hardware model; and Cisco is actively helping define, design, and develop the libfabric APIs as part of the community. My fellow team member Reese Faucette recently contributed a […]

usNIC support for the Intel MPI Library

1 min read

Cisco is pleased to announce the intention to support the Intel MPI Library™ with usNIC on the UCS server and Nexus switches product lines over the ultra low latency Ethernet and routable IP transports, at both 10GE and 40GE speeds. usNIC will be enabled by a simple library plugin to the uDAPL framework included in […]

Supercomputing is upon us!

1 min read

It’s that time of year again — we’re at about T-2.5 weeks to the Supercomputing conference and trade show;  SC’14 is in New Orleans, November 16-21. Are you going to get some tasty gumbo and supercharged computing power?   If so, come say hi!  The Cisco booth is 2715.

The “vader” shared memory transport in Open MPI: Now featuring 3 flavors of zero copy!

3 min read

Today’s blog post is by Nathan Hjelm, a Research Scientist at Los Alamos National Laboratory, and a core developer on the Open MPI project. The latest version of the “vader” shared memory Byte Transport Layer (BTL) in the upcoming Open MPI v1.8.4 release is bringing better small message latency and improved support for “zero-copy” transfers. NOTE: “zero copy” […]

HPC schedulers: What is a “slot”?

3 min read

Today’s guest post comes from Ralph Castain, a principal engineer at Intel.  The bulk of this post is an email he sent explaining the concept of a “slot” in typical HPC schedulers. This is a little departure from the normal fare on this blog, but is still a critical concept to understand for running HPC […]

usNIC provider contributed to libfabric

1 min read

Today’s guest post is by Reese Faucette, one of my fellow usNIC team members here at Cisco. I’m pleased to announce that this past Friday, Cisco contributed a usNIC-based provider to libfabric, the new API in the works from OpenFabrics Interfaces Working Group. (Editor’s note: I’ve blogged about libfabric before) Yes, the road is littered with […]

MPI-3.1

1 min read

As you probably already know, the MPI-3.0 document was published in September of 2012. We even got a new logo for MPI-3.  Woo hoo! The MPI Forum has been busy working on both errata to MPI-3.0 (which will be collated and published as “MPI-3.1”) and all-new functionality for MPI-4.0. The current plan is to finalize […]

Overlap of communication and computation (part 2)

2 min read

In part 1 of this series, I discussed various peer-wise technologies and techniques that MPI implementations typically use for communication / computation overlap. MPI-3.0, published in 2012, forced a change in the overlap game. Specifically: most prior overlap work had been in the area of individual messages between a pair of peers.  These were very […]

Overlap of communication and computation (part 1)

3 min read

I’ve mentioned computation / communication overlap before (e.g., here, here, and here). Various types of networks and NICs have long since had some form of overlap.  Some had better quality overlap than others, from an HPC perspective. But with MPI-3, we’re really entering a new realm of overlap.  In this first of two blog entries, I’ll […]

HPC over UDP

2 min read

A few months ago, I posted an entry entitled “HPC in L3”.  My only point for that entry was to remove the “HPC in L3? That’s a terrible idea!” knee-jerk reaction that we old-timer HPC types have. I mention this because we released a free software update a few days ago for the Cisco usNIC […]

Unsung heroes: MPI run time environments

3 min read

Most people immediately think of short message latency, or perhaps large message bandwidth when thinking about MPI. But have you ever thought about what your MPI implementation has to do before your application even calls MPI_INIT? Hint: it’s pretty crazy complex, from an engineering perspective. Think of it this way: operating systems natively provide a […]

Traffic in parallel

3 min read

In my last entry, I gave a vehicles-driving-in-a-city analogy for network traffic. Let’s tie that analogy back to HPC and MPI.

Still more traffic

1 min read

I periodically write about network traffic, and how general / datacenter network traffic analysis is related to MPI / HPC. In my last entry, I mentioned how network traffic has many characteristics in common with distributed computing. Routing decisions, for example, are made independently at each network switch. Consider if you were looking down at […]

Traffic (redux)

2 min read

I’ve written about network traffic before (see this post and this post). It’s the subject of endless blog posts, help forums, and instructional guides across the internet. In a High Performance Computing (HPC) context, there are some fascinating aspects of network traffic that are fairly different from other types of network traffic.

BigMPI: You can haz moar counts!

2 min read

Jeff Hammond has recently started developing the BigMPI library. BigMPI is intended to handle all the drudgery of sending and receiving large messages in MPI. In Jeff’s own words: [BigMPI is an] Interface to MPI for large messages, i.e. those where the count argument exceeds INT_MAX but is still less than SIZE_MAX. BigMPI is designed […]

First public tools for the MPI_T interface in MPI-3.0

2 min read

Today’s guest post is written by Tanzima Islam, Post Doctoral Researcher at Lawrence Livermore National Laboratory, and Kathryn Mohror and Martin Schulz, Computer Scientists at Lawrence Livermore National Laboratory. The latest version of the MPI Standard, MPI 3.0, includes a new interface for tools: the MPI Tools Information Interface, or “MPI_T”. MPI_T complements the existing MPI profiling […]