Jeff Squyres

The MPI Guy

UCS Platform Software

Dr. Jeff Squyres is Cisco's representative to the MPI Forum standards body and is Cisco's core software developer in the open source Open MPI project. He has worked in the High Performance Computing (HPC) field since his early graduate-student days in the mid-1990s, and is a chapter author of the MPI-2 and MPI-3 standards.

Jeff received both a BS in Computer Engineering and a BA in English Literature from the University of Notre Dame in 1994; he received an MS in Computer Science and Engineering from Notre Dame two years later, in 1996. After some active-duty tours in the military, Jeff received his Ph.D. in Computer Science and Engineering from Notre Dame in 2004. Jeff then worked as a post-doctoral research associate at Indiana University until he joined Cisco in 2006.

At Cisco, Jeff is part of the VIC group (Virtual Interface Card, Cisco's virtualized server NIC) in the larger UCS server group. He works on designing and writing systems-level software for optimized network IO in HPC and other high-performance applications. Jeff also represents Cisco to several open source software communities and the MPI Forum standards body.

Articles

Can I MPI_SEND (and MPI_RECV) with a count larger than 2 billion?

1 min read

This question is inspired by the fact that the “count” parameter to MPI_SEND and MPI_RECV (and friends) is an “int” in C, which is typically a signed 4-byte integer, meaning that its largest positive value is 2^31 - 1, or about 2 billion. However, this is the wrong question. The right question is: can MPI send and […]
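The excerpt cuts off before the answer, so purely as an illustration (not code from the post), here's a minimal sketch of one common workaround for the int-count limit: describe the buffer with a derived datatype so that the count passed to MPI_SEND stays small. The chunk size, message size, and rank roles below are arbitrary assumptions.

/* Sketch: send ~3 GB of chars with a small count by using a derived
   datatype. Assumes a 64-bit build and at least 2 ranks. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, count;
    MPI_Datatype bigchunk;
    const size_t total = 3ULL * 1024 * 1024 * 1024;  /* ~3 billion chars (illustrative) */
    const int chunk = 1 << 20;                       /* 1 Mi chars per datatype element */
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc(total);                             /* error checking omitted for brevity */

    /* One element of "bigchunk" covers `chunk` chars */
    MPI_Type_contiguous(chunk, MPI_CHAR, &bigchunk);
    MPI_Type_commit(&bigchunk);

    count = (int)(total / chunk);                    /* 3072: comfortably below 2^31 - 1 */
    if (rank == 0) {
        MPI_Send(buf, count, bigchunk, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, count, bigchunk, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Type_free(&bigchunk);
    free(buf);
    MPI_Finalize();
    return 0;
}

Run with at least two ranks: rank 0 sends roughly 3 GB to rank 1 while the count stays at 3072.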

MPI over 40Gb Ethernet

1 min read

Half-round-trip ping-pong latency may be the first metric that everyone looks at with MPI in HPC, but bandwidth is one of the next metrics examined. 40Gbps Ethernet has been available for switch-to-switch links for quite a while, and 40Gbps NICs are starting to make their way down to the host. How does MPI perform with […]

Java Bindings for Open MPI

1 min read

Today’s guest blog post is from Oscar Vega-Gisbert and Dr. Jose Roman from the Department of Information Systems and Computing at the Universitat Politècnica de València, Spain. We provide an overview of how to use the Java bindings included in Open MPI. The aim is to expose MPI functionality to Java programmers with minimal performance […]

April 11, 2014

New Open MPI stable series launched: v1.8

1 min read

The Open MPI project released v1.8 last week.  This is a major release that heralds the beginning of a new production-ready series, full MPI-3.0 support, and a new OpenSHMEM implementation. Open MPI is developed in a tick-tock fashion: odd-numbered series are focused on feature development and expansion; even-numbered series are focused on stability and […]

Networks for MPI

3 min read

It seems like we’ve gotten a rash of “how do I setup my new cluster for MPI?” questions on the Open MPI mailing list recently. I take this as a Very Good Thing, actually — it means more and more people are tinkering with and discovering the power of parallel computing, HPC, and MPI.

Belated April Fool’s blog post

1 min read

I was on vacation last week, and had a nice April Fool’s blog post queued up to be posted at 8am US Eastern time on 1 April 2014. It should have appeared whilst I was relaxing on a beach… but due to a bug in our WordPress installation, it didn’t.  And I didn’t find out […]

EuroMPI/ASIA 2014: Call for Workshop papers

2 min read

Held in conjunction with EuroMPI/ASIA 2014 (see the associated call for papers), September 9-12, 2014.  In-cooperation status with ACM and SIGHPC. This year, EuroMPI/ASIA 2014 will hold two workshops.  Accepted workshop papers will be included in ACM’s ICPS conference proceedings of EuroMPI/ASIA 2014. Workshop information is available on the EuroMPI/ASIA 2014 web site, and […]

Open MPI 1.7.5 released

1 min read

After a metric ton of work by the entire community, Open MPI has released version 1.7.5. Among the zillions of minor updates and new enhancements are two major new features: MPI-3.0 conformance and OpenSHMEM support (Linux only). See this post on the Open MPI announcement list for more details.

HPC in L3

2 min read

As an HPC old-timer, I’m used to thinking of HPC networks as large layer-2 (L2) subnets.  All HPC traffic (e.g., MPI traffic) is therefore designed to stay within a single L2 subnet. The next layer up — L3 — is the “networking” layer in the OSI network model; it adds more abstractions than are available […]

What’s Next for MPI?

1 min read

MPI-3 has been out for over a year and a half.  MPICH supports all of the mandatory MPI-3 behavior and some of its optional semantics.  Open MPI supports all of MPI-3 except the new one-sided semantics.  New functionality is becoming mature in both, and that maturity is trickling down to the implementations that are derived […]

Open MPI 1.7.4 released!

1 min read

It took us longer than we intended, but we finally released Open MPI v1.7.4.   Woo hoo!  (we got nice coverage from El Reg, too) This is a monster release; it represents hundreds (thousands? millions?) of person-hours of work.  Consider this a ginormous “thank you!” to the entire Open MPI community! Special thanks goes to […]

More Network Locality (Netloc) progress

1 min read

We announced the Network Locality project at SC’13, and generated a LOT of interest (far more than I even anticipated!).  As a refresher, here’s a link to a blog entry we wrote about Netloc back in November. There is still much work to be done; we’re actively continuing work in multiple areas:

InsideHPC podcast: MPI collaboration with OpenFabrics

1 min read

In my last blog post, I described a new collaboration between the MPI community and the OpenFabrics verbs community. The collaboration started with the OpenFabrics OpenFrameworks working group asking the MPI community to list its requirements for a lower-layer network API. In that last blog post, I posted an abbreviated […]

A fun thing happened on the way to the OpenFrameworks discussion today…

1 min read

A few months ago, Sean Hefty from Intel started an effort to design a new low-level network API to replace libibverbs. That is, it’s not libibverbs 2.0 — it’s a new API that aims to both expand the scope of what libibverbs did, and also to address many of its much-criticized shortcomings.  Sean and Paul […]

Process affinity: Hop on the bus, Gus!

7 min read

Today’s blog post is written by Joshua Ladd, Open MPI developer and HPC Algorithms Engineer at Mellanox Technologies. At some point in the process of pondering this blog post I noticed that my subconscious had, much to my annoyance, registered a snippet of the chorus to Paul Simon’s timeless classic “50 Ways to Leave Your […]

MPI_FESTIVUS(3)

2 min read

NAME MPI_Festivus – An MPI function for the rest of us

Call for Workshops: EuroMPI/Asia 2014

2 min read

The 21st European MPI Users’ Group Meeting, EuroMPI/ASIA 2014, will be held in Kyoto, Japan, 9th – 12th September, 2014. In addition to the main conference’s technical program, EuroMPI/ASIA 2014 is soliciting proposals for one-day or half-day workshops to be held in conjunction with the main conference.  Those workshops are intended to focus discussion on […]

Open MPI: Binding to core by default

2 min read

After years of discussion, the upcoming release of Open MPI 1.7.4 will change how processes are laid out (“mapped”) and bound by default.  Here are the specifics: if the number of processes is <= 2, processes will be mapped by core; if the number of processes is > 2, processes will be mapped by socket. Processes […]
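Those defaults are easy to see for yourself. Purely as an illustration (not code from the post), each rank can report which host and CPU it is currently running on; sched_getcpu() is glibc/Linux-specific.

#define _GNU_SOURCE
#include <mpi.h>
#include <sched.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &len);

    /* With by-core mapping this prints a stable, distinct CPU per rank;
       with by-socket mapping the CPU can be any core in the rank's socket. */
    printf("Rank %d on %s is running on CPU %d\n", rank, host, sched_getcpu());

    MPI_Finalize();
    return 0;
}

Launching it with different numbers of processes should show the by-core vs. by-socket placement described above (the exact CPU numbers depend on your machine).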

Call for Papers: EuroMPI/Asia 2014

2 min read

The 21st European MPI Users’ Group Meeting, EuroMPI/ASIA 2014, will be held in Kyoto, Japan, 9th – 12th September, 2014. Background and topics: EuroMPI is the preeminent meeting for users, developers and researchers to interact and discuss new developments and applications of message-passing parallel computing, in particular in and related to the Message Passing Interface […]

10 Years of Open MPI

1 min read

Today’s the day. Today marks 10 years since the first commit in the original Open MPI CVS source code repository (which was later converted to Subversion):

$ svn log -r 1 http://svn.open-mpi.org/svn/ompi
------------------------------------------------------------
r1 | jsquyres | 2003-11-22 11:36:58 -0500 (Sat, 22 Nov 2003)

First commit
------------------------------------------------------------

The Network Locality Project (netloc)

3 min read

Today’s guest post comes from Dr. Joshua Hursey, an Assistant Professor in the Computer Science Department at the University of Wisconsin, La Crosse. For a number of years, developers tuning High Performance Computing (HPC) applications and libraries have been harnessing server topology information to significantly optimize performance on servers with increasingly complex memory hierarchies and […]

At SC’13 next week? Come say hi!

1 min read

Are you going to be in Denver at SC’13 next week? Good! You need to stop by the Cisco booth (#2535) and say hi to your friendly neighborhood Open MPI developers: Dave Goodell, Reese Faucette, and Jeff Squyres.

Lawrence Berkeley Labs talk: Cisco Userspace NIC (usNIC)

1 min read

Here are the slides from my second talk, which is a deep technical dive into both how the usNIC technology works and how we use that technology in the BTL plugin that we wrote for Open MPI (which is upstream starting with Open MPI v1.7.3).

Lawrence Berkeley Labs talk: (Open) MPI, Parallel Computing, Life, the Universe, and Everything

1 min read

Many thanks to the crew at LBL for hosting my talks yesterday.  There were many insightful questions and comments throughout both talks. Here are the slides from my first talk, entitled “(Open) MPI, Parallel Computing, Life, the Universe, and Everything.”  This is a general MPI/Open MPI talk, where I discussed the current state of Open MPI, […]

My new favorite Open MPI mpirun feature: tab completion

2 min read

Today’s guest author is Nathan Hjelm, a Scientist 2 at Los Alamos National Laboratory. We recently added scripts to the Open MPI development trunk that support tab completion of mpirun flags and run-time MCA configuration variables. The scripts support both bash and zsh and have a number of useful features (depending on the shell). Can’t […]

Speaking at Lawrence Berkeley National Lab next week

1 min read

Are you in the Northern California Bay Area and want to hear about Open MPI and/or Cisco’s usNIC technology next week? If so, you’re in luck! I’ll be speaking at Lawrence Berkeley Lab (LBL) next Thursday, November 7, 2013, at 2:30pm.  Click through to see the location and directions and whatnot (LBL requests that you […]

Hardware and software queuing

3 min read

I’ve talked before about how getting high performance in MPI is all about offloading to dedicated hardware.  You want to get software out of the way as soon as possible and let the underlying hardware progress the message passing at max speed. But the funny thing about networking hardware: it tends to have limited resources. […]

EuroMPI’13 Cisco slides: Open MPI Process Affinity User Interface

1 min read

The slides below are from my presentation at EuroMPI’13 about Open MPI’s flexible process affinity interface (in OMPI 1.7.2 and later).  I described this system in prior blog entries (one, two, three), but many people keep asking me about it. Josh Hursey from U. Wisconsin, LaCrosse, wrote this IMUDI paper about the interface (IMUDI […]

EuroMPI’13 Cisco slides: UCS, Nexus, usNIC

1 min read

A few people asked me to post the slides that I just presented in the Cisco vendor session at EuroMPI’13.  In short, I gave a brief overview of our servers and switches, and then some technical details of how we use SR-IOV in our usNIC, etc. Here are the slides:

MPI newbie: Building MPI applications

4 min read

In a previous post, I gave some (very) general requirements for how to set up / install an MPI implementation. This is post #2 in the series: now that you’ve got a shiny new computational cluster, and you’ve got one or more MPI implementations installed, I’ll talk about how to build, compile, and link applications that […]
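For reference, here's the kind of minimal program those build steps apply to (a generic MPI hello world, not code from the post):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process reports its rank in MPI_COMM_WORLD */
    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}

Such a program is typically compiled with an implementation's wrapper compiler (e.g., mpicc hello.c -o hello) and launched with mpirun; the posts in this series cover the details.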

MPI newbie: Requirements and installation of an MPI

4 min read

I often get questions from those who are just starting with MPI; they want to know common things such as: how to install / set up an MPI implementation, how to compile their MPI applications, how to run their MPI applications, and how to learn more about MPI. This will be the first blog entry of several […]

Ultra low latency Ethernet (UCS “usNIC”): questions and answers

4 min read

I have previously written a few details about our upcoming ultra low latency solution for High Performance Computing (HPC).  Since my last blog post, a few of you sent me emails asking for more technical details about it. So let’s just put it all out there.

Short message latency and NUMA effects

2 min read

I’ve previously written a bunch about the effects of location, Location, LOCATION! on MPI applications. Here’s another subtle NUMA effect that a well-tuned MPI implementation can hide from you: intelligently distributing traffic between multiple network interfaces. Yeah, yeah, most MPI implementations have had so-called “multi-rail” support for a long time (i.e., using multiple network interfaces […]

How many network links do you have for MPI traffic?

2 min read

If you’re a bargain basement HPC user, you might well scoff at the idea of having more than one network interface for your MPI traffic. “I’ve got (insert your favorite high bandwidth network name here)! That’s plenty to serve all my cores! Why would I need more than that?” I can think of (at least) […]

Open MPI and the MPI-3 MPI_T interface

3 min read

Open MPI recently revamped its entire run-time parameter system (a.k.a., “MCA parameter system”) as part of its implementation effort for the “MPI_T” interface from MPI-3. The MPI_T interface is a standardized interface designed for MPI tools, but it can be used by regular MPI application programs, too. Specifically, MPI_T provides programmatic access to two types of […]
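As a tiny illustration (a hedged sketch, not code from the post), an application can open the MPI_T interface and count the two classes of variables its MPI implementation exposes: control variables (cvars) and performance variables (pvars).

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, ncvar, npvar;

    /* MPI_T can be initialized independently of MPI_Init */
    MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);

    MPI_T_cvar_get_num(&ncvar);
    MPI_T_pvar_get_num(&npvar);
    printf("This MPI exposes %d control variables and %d performance variables\n",
           ncvar, npvar);

    MPI_T_finalize();
    return 0;
}

The counts (and the variables themselves) are implementation-specific, which is exactly the point of the interface.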

Why MPI is Good for You (part 3)

2 min read

I’ve previously posted on “Why MPI is Good for You” (blog tag: why-mpi-is-good-for-you).  The short version is that it hides lots and lots of underlying network stuff from the typical application programmer; stuff that they really, really don’t want to be involved in. Here’s another case study… Cisco’s upcoming ultra-low latency MPI transport is implemented […]

The History and Development of the MPI standard

1 min read

Today’s guest posting comes from Jesper Larsson Träff; he’s in the Research Group for Parallel Computing, Institute of Information Systems, Faculty of Informatics at the Vienna University of Technology (TU Wien). Have you ever wondered why MPI is designed the way that it is?  The slides below are from Jesper’s talk about the History and Development of […]

MPI Quiz

1 min read

A fun scenario was proposed in the MPI Forum today.  What do you think this code will do?

MPI_Comm comm, save;
MPI_Request req;

MPI_Init(NULL, NULL);
MPI_Comm_dup(MPI_COMM_WORLD, &comm);
MPI_Comm_rank(comm, &rank);
save = comm;
MPI_Isend(smsg, 4194304, MPI_CHAR, rank, 123, comm, &req);
MPI_Comm_free(&comm);
MPI_Recv(rmsg, 4194304, MPI_CHAR, rank, 123, save, MPI_STATUS_IGNORE);

May 27, 2013

Cisco’s Philosophy on Open Source

1 min read

Last weekend, I was fortunate enough to be able to attend the Midwest Open Source Software Conference (MOSSCon 2013).  I met some fascinating people, listened to some great talks, and learned a bunch of new things. All in all, a win. I also presented a talk on two things: The general open source philosophy at […]

Speaking about Open MPI / FOSS at Midwest Open Source Convention this weekend

1 min read

I’ve been a bit remiss about posting recently; it’s conference-paper-writing season, folks — sorry. But I thought I’d mention that I’ll be speaking at the Midwest Open Source Software Convention (MOSSCon) this weekend. I’ll be talking about my work in Open MPI, Hardware Locality (hwloc), and other open source projects, as well as Cisco’s role […]

New Addition to the Cisco MPI Team

1 min read

I’m very pleased to welcome a new member to the Cisco usNIC/MPI team: Dave Goodell.  Welcome, Dave!  (today was his first day) Dave joins us from the MPICH team in the Mathematics and Computer Science Division at Argonne National Laboratory.

April 10, 2013

Presenting Open MPI, USNIC, and Cisco open source at MOSSCon’13

1 min read

I was just recently informed that my talk was accepted at the Midwest Open Source Software Conference (MOSSCon).  w00t! MOSSCon will be held at the University of Louisville, in Louisville, Kentucky, USA, on May 18-19, 2013.  It’s being organized by people from the Kentucky Open Source Society (KYOSS) and other open source / maker-oriented groups […]

Latency Analogies (part 2)

2 min read

In a prior blog post, I talked about latency analogies.  I compared levels of latencies to your home, your neighborhood, a far-away neighborhood, and another city.  I talked about these localities in terms of communication. Let’s extend that analogy to talk about data locality.

Latency Analogies

1 min read

Multiple readers have told me that it is difficult for them to understand and/or visualize the effects of latency on their HPC applications, particularly in modern NUMA (non-uniform memory access) and NUNA (non-uniform network access) environments. Let’s break down the different levels of latency in typical modern server and network computing environments.

Social media login no longer required for comments

1 min read

A number of you complained when blogs.cisco.com switched to requiring a social media login to leave comments. It turns out that you were not alone. Industry-wide, it seems that many people do not want to associate their personal Facebook/Twitter/etc. logins with work-related social media (i.e., this effect was seen at more than just Cisco).  The […]

EuroMPI 2013: papers due soon!

1 min read

Consider this a public service announcement: don’t forget that EuroMPI 2013 papers are due soon! EuroMPI is the place to see where the documented standard of MPI hits reality, both in terms of implementations and applications.  Come talk to real implementors, real users, and hear about state-of-the art techniques and performance optimizations.

MPI for mobile devices (or not)

2 min read

Every once in a while, the idea pops up again: why not use all the world’s cell phones for parallel and/or distributed computations? There’s gazillions of these phones — think of the computing power! After all, an army of ants can defeat a war horse, right? Well… yes… and no.

MPI Forum: What’s Next?

Now that we’re just starting into the MPI-3.0 era, what’s next? The MPI Forum is still having active meetings.  What is left to do?  Isn’t MPI “done”? Nope.  MPI is an ever-changing standard to meet the needs of HPC.  And since HPC keeps changing, so does MPI.

Ain’t your father’s TCP

TCP?  Who cares about TCP in HPC? More and more people, actually.  With the commoditization of HPC, lots of newbie HPC users are intimidated by special, one-off, traditional HPC types of networks and opt for the simplicity and universality of Ethernet. And it turns out that TCP doesn’t suck nearly as much as most (HPC) […]

Modern GPU Integration in MPI

Today’s guest post is from Rolf vandeVaart, a Senior CUDA Engineer with NVIDIA. GPUs are becoming quite popular as accelerators in High Performance Computing clusters. For example, check out Titan, a recent entry in the Top 500 list from Oak Ridge National Laboratory. Titan has 18,688 nodes (299,008 CPU cores) coupled with 18,688 NVIDIA Tesla K20 […]

Process and memory affinity: why do you care?

3 min read

I’ve written about NUMA effects and process affinity on this blog lots of times in the past.  It’s a complex topic that has a lot of real-world effects on your MPI and HPC applications.  If you’re not using processor and memory affinity, you’re likely experiencing performance degradation without even realizing it. In short: If you’re not booting […]
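For the curious, here's a hedged sketch (not from the post) of what explicit binding looks like at the hwloc level: bind the calling process to the first core that hwloc finds. In practice your MPI launcher typically does this for you; this just illustrates the mechanism.

/* Compile with -lhwloc */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_obj_t core;

    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* Grab the first core and bind this process to its cpuset */
    core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, 0);
    if (core != NULL &&
        hwloc_set_cpubind(topo, core->cpuset, HWLOC_CPUBIND_PROCESS) == 0) {
        printf("Bound this process to the first core\n");
    }

    hwloc_topology_destroy(topo);
    return 0;
}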

Message size: big or small?

3 min read

It’s the eternal question: should I send lots and lots of small messages, or should I glump multiple small messages into a single, bigger message? Unfortunately, the answer is: it depends.  There are a lot of factors in play.
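As a concrete illustration of the “single, bigger message” option (a minimal sketch, not code from the post; the values, ranks, and tags are made up), MPI_Pack can coalesce several small items into one send:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, pos = 0;
    char buf[64];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int a = 42;
        double b = 3.14;
        /* Pack two small values into one buffer, then send once */
        MPI_Pack(&a, 1, MPI_INT, buf, sizeof(buf), &pos, MPI_COMM_WORLD);
        MPI_Pack(&b, 1, MPI_DOUBLE, buf, sizeof(buf), &pos, MPI_COMM_WORLD);
        MPI_Send(buf, pos, MPI_PACKED, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int a;
        double b;
        MPI_Recv(buf, sizeof(buf), MPI_PACKED, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Unpack(buf, sizeof(buf), &pos, &a, 1, MPI_INT, MPI_COMM_WORLD);
        MPI_Unpack(buf, sizeof(buf), &pos, &b, 1, MPI_DOUBLE, MPI_COMM_WORLD);
        printf("Received %d and %f in a single message\n", a, b);
    }

    MPI_Finalize();
    return 0;
}

Whether this beats two tiny sends depends on the factors the post goes on to discuss (per-message overhead, copy costs, network characteristics, and so on).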

I CAN HAS MPI

2 min read

The Cisco and Microsoft joint Cross-Animal Technology Project, a well-established player in the field of multi-species collaborative initiatives, is pleased to introduce its next project: a revolution in High Performance Computing (HPC): LOLCODE language bindings for the Message Passing Interface (MPI). CATP believes that cats are natural predatory programmers.  Who better to take advantage of all […]

MPI and Java: redux

1 min read

In a prior blog entry, I discussed how we are resurrecting a Java interface for MPI in the upcoming v1.7 release of Open MPI. Some users have already experimented with this interface and found it lacking, in at least two ways: Creating datatypes of multi-dimensional arrays doesn’t work because of how Java handles them internally […]