Cisco Blog: High Performance Computing Networking

Shared Receive Queues

October 25, 2011 at 5:00 am PST

In my last post, I talked about the so-called eager RDMA optimization and the trade-off it makes between resource consumption and latency.

Let’s talk about another optimization: shared receive queues.

Shared receive queues are not a new idea, and certainly not exclusive to MPI implementations.  They’re a way for multiple senders to send to a single receiver while only consuming resources from a common pool.
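
To make the idea concrete, here is a minimal sketch of creating a shared receive queue with the OpenFabrics verbs API, a common substrate underneath MPI implementations.  This is illustrative only: the pool sizes are made-up numbers, the protection domain pd is assumed to already exist, and error handling and the rest of the connection setup are omitted.

#include <infiniband/verbs.h>
#include <string.h>

/* Illustrative sketch: create one SRQ that many connections will share. */
struct ibv_srq *create_shared_rq(struct ibv_pd *pd)
{
    struct ibv_srq_init_attr srq_attr;

    memset(&srq_attr, 0, sizeof(srq_attr));
    srq_attr.attr.max_wr  = 512;   /* receive buffers in the common pool */
    srq_attr.attr.max_sge = 1;     /* scatter/gather entries per buffer  */

    return ibv_create_srq(pd, &srq_attr);
}

Each queue pair created with its srq field pointing at this SRQ then draws receive buffers from the common pool, instead of pre-posting its own per-connection receives.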


MPI tradeoffs: space vs. time

October 22, 2011 at 7:42 am PST

@brockpalen asked me a question on Twitter:

@jsquyres [can you discuss] common #MPI implementation assumptions made for performance and/or resource constraints?

Good question.  MPI implementations are full of trade-offs between performance and resource consumption.  Let’s discuss a few easy ones.


More MPI-3 newness: const

October 16, 2011 at 5:00 am PST

Way back in the MPI-2.2 timeframe, a proposal was introduced to add the C keyword “const” to all relevant MPI API parameters.  The proposal was discussed at great length.  The main idea was twofold:

  • Provide a stronger semantic statement about which parameter contents MPI could change, and which it should not.  This mainly applies to user choice buffers (e.g., the choice buffer argument in MPI_SEND).
  • Be more friendly to languages that use const(-like constructs) more than C.  The original proposal was actually from Microsoft, whose goal was to provide higher quality C# MPI bindings.

Additionally, the (not yet deprecated at the time) official MPI C++ bindings have had const since the mid-1990s, so why not include the same qualifiers in the C bindings?
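
For a concrete C-level view, here is what the change means for the MPI_Send prototype:

/* Without const: nothing in the prototype says buf is read-only */
int MPI_Send(void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm);

/* With const: the qualifier promises that MPI will not modify buf */
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm);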


New things in MPI-3: MPI_Count

October 8, 2011 at 5:00 am PST

The count parameter exists in many MPI API functions: MPI_SEND, MPI_RECV, MPI_TYPE_CREATE_STRUCT, etc.  In conjunction with the datatype parameter, the count parameter is often used to effectively represent the size of a message.  As a concrete example, the language-neutral prototype for MPI_SEND is:

MPI_SEND(buf, count, datatype, dest, tag, comm)

The buf parameter specifies where the message is in the sender’s memory, and the count and datatype arguments indicate its layout (and therefore size).
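
As a hypothetical C example (ranks, tag, and buffer size chosen arbitrarily), count and datatype together tell MPI that the message is 1024 doubles starting at buf:

#include <mpi.h>

int main(int argc, char **argv)
{
    double buf[1024];
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        for (int i = 0; i < 1024; ++i)
            buf[i] = i;
        /* count = 1024, datatype = MPI_DOUBLE: the message is
           1024 doubles' worth of bytes starting at buf */
        MPI_Send(buf, 1024, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}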

Since MPI-1, the count parameter has been an integer (int in C, INTEGER in Fortran).  This meant that the largest count you could express in a single function call was 2^31, or about 2 billion.  Since MPI-1 was introduced in 1994, machines — particularly commodity machines used in parallel computing environments — have grown.  2 billion began to seem like a fairly arbitrary, and sometimes distasteful, limitation.

The MPI Forum just recently passed ticket #265, formally introducing the MPI_Count datatype to alleviate the 2B limitation.
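
MPI_Count surfaces in MPI-3 query routines such as MPI_Type_size_x, and the classic workaround for more than 2 billion elements is to hide the size inside a derived datatype.  Here is a hypothetical sketch (the helper name and chunk size are made up for illustration):

#include <mpi.h>

/* Hypothetical helper: send 8 billion chars, which would overflow an int
   count if passed directly.  Bundling 1 billion chars into a derived
   datatype keeps the top-level count small; MPI_Type_size_x reports the
   datatype's size as an MPI_Count instead of an int. */
void send_big(const char *buf, int dest, MPI_Comm comm)
{
    MPI_Datatype chunk;
    MPI_Count chunk_bytes;

    MPI_Type_contiguous(1000000000, MPI_CHAR, &chunk);
    MPI_Type_commit(&chunk);

    MPI_Type_size_x(chunk, &chunk_bytes);   /* 1,000,000,000, as an MPI_Count */

    MPI_Send(buf, 8, chunk, dest, 0, comm); /* 8 chunks = 8 billion chars */

    MPI_Type_free(&chunk);
}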


Shared memory as an MPI transport (part 2)

October 4, 2011 at 5:00 am PST

In my last post, I discussed the rationale for using shared memory as a transport between MPI processes on the same server as opposed to using traditional network API loopback mechanisms.

The two big reasons are:

  1. Short message latency is lower with shared memory.
  2. Using shared memory frees OS and network hardware resources (including the PCI bus) for off-node communication.

Let’s discuss common ways in which MPI implementations use shared memory as a transport.
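
As a taste of the building blocks involved, here is a minimal sketch of how two local processes can map one common region using POSIX shared memory.  This is not any particular MPI implementation's code: the segment name and size handling are made up, and real implementations layer queues, ring buffers, and careful cleanup on top of such a region.

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map (optionally creating) a named shared segment that multiple processes
   on the same server can attach to.  Returns NULL on failure. */
void *map_shared_segment(const char *name, size_t len, int create)
{
    int   flags = create ? (O_CREAT | O_RDWR) : O_RDWR;
    int   fd    = shm_open(name, flags, 0600);
    void *base;

    if (fd < 0)
        return NULL;
    if (create && ftruncate(fd, (off_t) len) != 0) {
        close(fd);
        return NULL;
    }
    base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping remains valid after the fd is closed */

    return (base == MAP_FAILED) ? NULL : base;
}

Each local process that maps the same name ends up with a pointer into the same physical pages; same-server MPI messages then become copies into and out of structures built in that region.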
