November 20, 2009

More MPI Forum feedback needed

First the Fortran WG asked for some specific guidance (thank you very much to all who replied!); now the main Forum itself is conducting a community-wide survey to solicit feedback to help shape the MPI-3 standards process.  To protect against spam, the survey requires a password: mpi3.

In this survey, the MPI Forum is asking as many people as possible for feedback on the MPI-3 process—what features to include, what features to not include, etc.

We encourage you to forward this survey on to as many interested and relevant parties as possible.

It will take approximately 10 minutes to complete the questionnaire.

Read More.

Posted by Jeff Squyres at 02:02PM PST

Permalink, Comments (0), Trackbacks (0)

Tags: forum mpi

November 16, 2009

Come see us at SC09!

I have nothing deep to say for this week’s blog entry since I’m sitting here in the Portland convention center feverishly working to finish my SC09 slides.  My partner in Fortran crime, Craig Rasmussen, is sitting next to me, feverishly working on our prototype Fortran 2003 MPI bindings implementation so that we can hand out proof-of-concept tarballs at the MPI Forum BOF on Wednesday evening.

All in all, it’s a normal beginning to Supercomputing.  ;-)

The #SC09 twitter feed is going crazy with about 6 billion tweets.  Just make sure you use the patented SC09 Fist Bump when in Portland.

Also be sure to drop by and see me in the Cisco Booth (#1847—get a Cisco t-shirt!).  I’ll be walking around the floor for the Gala opening, but I have booth duty most mornings this week.  I’ll also be at the Open MPI BOF on Wednesday at 12:15pm and the MPI Forum BOF, also on Wednesday, but at 5:30pm.

Read More.

Posted by Jeff Squyres at 09:55AM PST

Permalink, Comments (0), Trackbacks (0)

Tags: mpi sc09

November 05, 2009

hwloc v0.9.2 released

It took a bunch of testing, but we finally got the first formal public release of hwloc (“Hardware Locality”) out the door.  From the announcement:

“hwloc provides command line tools and a C API to obtain the hierarchical map of key computing elements, such as: NUMA memory nodes, shared caches, processor sockets, processor cores, and processor “threads”.  hwloc also gathers various attributes such as cache and memory information, and is portable across a variety of different operating systems and platforms.”

hwloc was primarily developed with High Performance Computing (HPC) applications in mind, but it is generally applicable to any software that wants or needs to know the physical layout of the machine on which it is running.  This is becoming increasingly important in today’s ever-growing-core-count compute servers.
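To make the idea of a "hierarchical map" concrete, here is a minimal Python sketch. It is not the hwloc API (which is C) and does not query any hardware; the machine description, `build_topology`, and `siblings` are all hypothetical names and data invented for illustration. It only shows the kind of socket / core / thread tree that topology-aware software works with.

```python
from collections import defaultdict

# Hypothetical machine description: (socket id, core id, logical processor id).
# A real program would obtain this from hwloc instead of hard-coding it.
LOGICAL_PROCESSORS = [
    (0, 0, 0), (0, 0, 4),
    (0, 1, 1), (0, 1, 5),
    (1, 0, 2), (1, 0, 6),
    (1, 1, 3), (1, 1, 7),
]


def build_topology(processors):
    """Group logical processors into a {socket: {core: [threads]}} tree."""
    tree = defaultdict(lambda: defaultdict(list))
    for socket, core, thread in processors:
        tree[socket][core].append(thread)
    # Convert to plain dicts with sorted thread lists for stable output.
    return {s: {c: sorted(t) for c, t in cores.items()} for s, cores in tree.items()}


def siblings(tree, thread):
    """Return the other logical processors that share a core with `thread`."""
    for cores in tree.values():
        for threads in cores.values():
            if thread in threads:
                return [t for t in threads if t != thread]
    raise ValueError(f"unknown logical processor {thread}")


if __name__ == "__main__":
    topo = build_topology(LOGICAL_PROCESSORS)
    for socket, cores in sorted(topo.items()):
        print(f"socket {socket}")
        for core, threads in sorted(cores.items()):
            print(f"  core {core}: logical processors {threads}")
```

With a tree like this, an application could, for example, avoid placing two busy threads on logical processors that share a core.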

Read More.

Posted by Jeff Squyres at 08:00AM PST

Permalink, Comments (0), Trackbacks (0)

Tags: hpc hwloc numa topology

October 29, 2009

Other MPI-3 Forum activities

Since there were a goodly number of comments on the MPI-3 Fortran question from the other day (please keep spreading that post around; the more feedback we get, the better!), I thought I’d give a quick synopsis of the other MPI-3 Forum Working Groups.  This is just to let you know that there’s more going on in MPI-3 than just new yummy Fortran goodness!

The links below go to the wiki pages of the various working groups (WGs).  Some wiki pages are more active than others; some wiki pages are fairly dormant, but that doesn’t necessarily mean that the WG itself is dormant.  Some WGs simply choose to communicate more via email and/or regular teleconferences.  For example, the Tools WG has only sporadic emails on its mailing list, but it has a regularly-updated wiki and regular teleconferences, plus meeting times during the bi-monthly MPI Forum meetings.  Hence, each WG may work and communicate differently than its peers.

Read More.

Posted by Jeff Squyres at 06:00AM PST

Permalink, Comments (0), Trackbacks (0)

Tags: mpi mpi-3 standard

October 23, 2009

MPI-3 Fortran Community Feedback Needed!

As many of you know, I’m an active member of the MPI Forum.  We have recently completed MPI-2.2 and have shifted our sights to focus on MPI-3. 

For some inexplicable reason, I’ve become heavily involved in the MPI-3 Fortran working group.  There are some well-known problems with the MPI-2 Fortran 90 interfaces; the short version of the MPI-3 Fortran WG’s mission is to “fix those problems.” 

A great summary of what the Fortran WG is planning for MPI-3 is available on the Forum wiki page; we’d really appreciate feedback from the Fortran MPI developer community on these ideas. 

There is definitely one significant issue on which we need feedback from the community before making a decision.  Craig Rasmussen from Los Alamos National Laboratory asked me to post the following “request for information” to the greater Fortran MPI developer community.  Please send feedback either via comments on this blog entry, email to me directly, or to the MPI-3 Fortran working group mailing list.

Read More.

Posted by Jeff Squyres at 05:43AM PST

Permalink, Comments (14), Trackbacks (0)

Tags: mpi-3 fortran

October 22, 2009

Parallel debugging

Debugging parallel applications is hard.  There’s no way around it: bugs can get infinitely more complex when you have not just one thread of control running, but N processes, each with M threads, all running simultaneously.  Printf-style debugging is simply not sufficient; when a process is running on a remote compute node, even the output from a print statement takes time to be sent across the network and displayed on your screen, and that delay can mask the actual problem because the output shows up significantly later than the problem occurred.

Tools are vital for parallel application development, and there are oodles of good ones out there.  I just wanted to highlight one really cool open source (free!) tool today called “Padb”.  Written by Ashley Pittman, it’s a small but surprisingly useful tool.  One scenario where I find Padb helpful is when an MPI job “hangs”: it just seems to stop making progress, but does not die or abort.  Padb can go find all the individual MPI processes, attach to them, and generate stack traces and display variable and parameter dumps for each process in the MPI job.  This allows a developer to see where the application is hung, which is an important first step in the troubleshooting process.
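The "hung job" scenario can be imitated on a small scale without MPI. The Python sketch below is only an analogy: it is not Padb, makes no claim about how Padb is implemented, and uses threads in a single process rather than MPI processes on remote nodes. Two workers (named `rank-0` and `rank-1` purely for illustration) each wait to receive before sending, so neither makes progress; a helper then collects a stack trace per worker to show where each one is stuck.

```python
import queue
import sys
import threading
import time
import traceback


def worker(inbox, outbox):
    """Receive before sending; with two such workers, neither ever sends."""
    msg = inbox.get()  # blocks forever: the peer is waiting too
    outbox.put(msg)


def dump_stacks(threads):
    """Return {thread name: formatted stack} for each running thread."""
    # sys._current_frames() is a CPython-specific helper, not a portable API.
    frames = sys._current_frames()
    report = {}
    for t in threads:
        frame = frames.get(t.ident)
        if frame is not None:
            report[t.name] = "".join(traceback.format_stack(frame))
    return report


def run_demo():
    a_to_b, b_to_a = queue.Queue(), queue.Queue()
    # Daemon threads so the interpreter can still exit despite the hang.
    threads = [
        threading.Thread(target=worker, args=(b_to_a, a_to_b), name="rank-0", daemon=True),
        threading.Thread(target=worker, args=(a_to_b, b_to_a), name="rank-1", daemon=True),
    ]
    for t in threads:
        t.start()
    time.sleep(0.5)  # give both workers time to reach the blocking get()
    return dump_stacks(threads)


if __name__ == "__main__":
    for name, stack in run_demo().items():
        print(f"--- {name} ---")
        print(stack)
```

Each trace ends inside the blocking receive, which is the kind of "where is it hung?" answer that a per-process stack dump gives for a real parallel job.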

Read More.

Posted by Jeff Squyres at 08:30AM PST

Permalink, Comments (0), Trackbacks (0)

Tags: mpi hpc debugging

October 14, 2009

SC’09 Happenings

Who’s going to SC’09?  I’ll be there!

I’m hosting the Open MPI Community Meeting BOF with George Bosilca from the University of Tennessee, Knoxville.  Be sure to come by to hear about where we are and where we’re going in the Open MPI project.  There’s also an MPI[-3] Forum BOF for anyone who wants to get a glimpse of where we’re going on the standards committee.  I highly recommend attending for anyone who works with MPI.

Additionally, I’ll be hanging out in the Cisco Booth (#1847); stop by and say hello!

(Editor’s note: fixed the link to the Cisco booth—thanks to Edric and others who pointed out that it was wrong!)

Read More.

Posted by Jeff Squyres at 05:00AM PST

Permalink, Comments (2), Trackbacks (0)

Tags: hpc sc09

October 08, 2009

GPU: HPC Friend or Foe?

General purpose computing with GPUs looks like a great concept on paper.  Indeed, SC’08 was dominated by GPUs; it was impossible not to be (technically) impressed with some of the results that were being cited and shown on the exhibit floor.  But despite that, GPGPUs have failed to become a “must have” HPC technology over the past year.  Last week’s announcements from NVIDIA look really great for the HPC crowd (aside from some embarrassing PR blunders); they seem to address many of the shortcomings of prior generation GPU usage in an HPC environment: more memory, more cores, ECC memory, better / cheaper memory management, etc.  Will GPUs become the new hotness in HPC?

The obvious question here is “Why is Jeff discussing GPUs on an MPI blog?”

Read More.

Posted by Jeff Squyres at 08:00AM PST

Permalink, Comments (0), Trackbacks (0)

Tags: gpu hpc

September 30, 2009

Attaining High Performance Communications: A Vertical Approach

It’s finally been published! 

I wrote a chapter on MPI in the book Attaining High Performance Communications: A Vertical Approach, edited by Dr. Ada Gavrilovska from the Georgia Institute of Technology.

The chapter author list reads like a who’s-who in high performance computing: several of my colleagues from the MPI Forum wrote pieces of this book, as well as many bright graduate students and other noted dignitaries in HPC.

Read More.

Posted by Jeff Squyres at 09:30AM PST

Permalink, Comments (0), Trackbacks (0)

Tags: mpi book

September 25, 2009

What is MPI?

As I think most readers of this blog already know, when I say “MPI”, I mean “Message Passing Interface.”

I saw a confusing-and-amusing blog entry today over at insideHPC (and HPCwire): GigaSpaces and MPI Europe partner on financial messaging overseas

“MPI Europe?”, I thought.  “What’s that?  Is that some MPI-based ISV that I’ve never heard of?”

Read More.

Posted by Jeff Squyres at 11:07AM PST

Permalink, Comments (0), Trackbacks (0)

Tags: mpi standard

September 15, 2009

Lies, damn lies, and statistics

I’m a fan of InsideHPC; I read it every day.  I like John’s commentary; he does a great job of rounding up various newsworthy HPC-related articles.  But that doesn’t always mean that I agree with every posted item.  Case in point: I saw this article the other day, purportedly a primer on InfiniBand (referring to this HPCprojects article).  I actually know a bit about IB; I used to work in the IB group at Cisco.  Indeed, I’ve written a lot of OpenFabrics verbs-based code for MPI implementations.

There’s good information in that article, but also some fantastically unfounded and misleading marketing quotes:

  • “With large data transfers, Ethernet consumes as much as 50 per cent of the CPU cycles; the average for InfiniBand is a loss of less than 10 to 20 per cent.”  He’s referring to software TCP overhead, not Ethernet overhead.  There’s an enormous difference: there are plenty of Ethernet-based technologies that are in the 10-20% overhead range.
  • “There are also power savings to be had, and this is critical when HPC facilities are confronting major issues with power supplies, cooling and costs. The same study indicates that InfiniBand cuts power costs considerably to finish the same number of Fluent jobs compared to Gigabit Ethernet; as cluster size increases, more power can be saved.”  Wow.  Other than generating warm fuzzies for customers (“My network products are green!”), what exactly does that paragraph mean?  And how exactly was it quantified?
  • ...I’ll stop with just those two.  :-)

These quotes are classic marketing spin to make IB products look better than the competition.

Read More.

Posted by Jeff Squyres at 05:57AM PST

Permalink, Comments (3), Trackbacks (0)

Tags: mpi networking

September 13, 2009

Announcing hwloc: portable hardware locality open source software

(this blog entry co-written by Brice Goglin and Samuel Thibault from the INRIA Runtime Team)

We’re pleased to announce a new open source software project: Hardware Locality (or “hwloc”, for short).  The hwloc software discovers and maps the NUMA nodes, shared caches, and processor sockets, cores, and threads of Linux/Unix and Windows servers.  The resulting topological information can be displayed graphically or conveyed programmatically through a C language API.  Applications (and middleware) that use this information can optimize their performance in a variety of ways, including tuning computational cores to fit cache sizes and utilizing data locality-aware algorithms.

hwloc actually represents the merger of two prior open source software projects:

  • libtopology, a package for discovering and reporting the internal processor and cache topology in Unix and Windows servers.
  • Portable Linux Processor Affinity (PLPA), a package for solving Linux topological processor binding compatibility issues.

Read More.

Posted by Jeff Squyres at 03:00PM PST

Permalink, Comments (0), Trackbacks (0)

Tags: hardware hwloc open mpi open source topology

September 03, 2009

MPI 2.2: done!

From the home office in Helsinki, Finland: MPI-2.2 is done!  It’s done it’s done it’s done!

Finally!  The MPI-2.2 document has been voted in by the MPI Forum.  The official PDF document will be published on http://www.mpi-forum.org soon.  HLRS is selling (at cost) MPI-2.2 books; contact Rolf Rabenseifner if you’re interested (I’ll be getting one!).

Read More.

Posted by Jeff Squyres at 11:59PM PST

Permalink, Comments (0), Trackbacks (0)

Tags: mpi mpi-2.2 standard

August 27, 2009

Non Uniform Network Access (NUNA)

Everything old is new again—NUMA is back!  With NUMA going mainstream, high performance software—MPI applications and otherwise—might need to be re-tuned to maintain their current performance levels.

A less-acknowledged aspect of HPC systems is the multiple levels of networks that are traversed to get data from MPI process A to MPI process B.  The heterogeneous, multi-level network is going to become more important (again) in your applications’ overall performance, especially as per-compute-server-core-counts increase.  That is, it’s not going to only be about the bandwidth and latency of your “Ethermyriband” network.  It’s also going to be about the network (or networks!) inside each compute server.

A Cisco colleague of mine (hi Ted!) previously coined a term that is quite apropos for what HPC applications now need to target: it’s no longer just about NUMA—NUMA effects are only one of the networks involved.  Think bigger: the issue is really about Non-Uniform Network Access (NUNA).
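One way to picture the NUNA idea is to classify, for any pair of MPI processes, which level of "network" a message between them has to cross. The Python sketch below is a deliberately simplified illustration: the `Location` type, the four level names, and the helper functions are hypothetical, and the ordering of levels is only a rank from nearest to farthest, not a measured latency.

```python
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical, simplified placement of one MPI process."""
    node: int    # which compute server
    socket: int  # which processor socket within that server
    core: int    # which core within that socket


# Levels ordered from nearest to farthest; the index is only a rank,
# not a measured latency or bandwidth.
LEVELS = ["same core", "same socket", "same node, other socket", "other node"]


def path_level(a: Location, b: Location) -> str:
    """Classify which level of "network" a message from a to b crosses."""
    if a.node != b.node:
        return "other node"               # external network between servers
    if a.socket != b.socket:
        return "same node, other socket"  # inter-socket link (NUMA effects)
    if a.core != b.core:
        return "same socket"              # stays within one socket
    return "same core"


def farthest_peer(me: Location, peers):
    """Return the peer whose path from `me` crosses the farthest level."""
    return max(peers, key=lambda p: LEVELS.index(path_level(me, p)))


if __name__ == "__main__":
    me = Location(node=0, socket=0, core=0)
    peers = [Location(0, 0, 1), Location(0, 1, 0), Location(1, 0, 0)]
    for peer in peers:
        print(peer, "->", path_level(me, peer))
```

A real system has more levels than this (shared caches, multiple switch hops, and so on), but even this coarse classification shows that "the network" is not one uniform thing.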

Read More.

Posted by Jeff Squyres at 05:00AM PST

Permalink, Comments (1), Trackbacks (0)

Tags: congestion connectivity mpi network type numa nuna

August 24, 2009

Platform Acquires HP-MPI

In a move that will surely cause some head-scratching, Platform has acquired the intellectual property of the-MPI-previously-known-as-HP-MPI.

The head scratching part is that Platform already owns Scali MPI.  It’s no secret that they recently moved all Scali development to an engineering team based in China.

Read More.

Posted by Jeff Squyres at 07:18AM PST

Permalink, Comments (0), Trackbacks (0)

Tags: hp-mpi mpi platform scali