High Performance Computing Networking

A few people have made remarks to me about the pair of CCI guest blog entries from Scott Atchley of Oak Ridge (entry 1, entry 2) indicating that they didn’t quite “get it”.  So let me try to put Scott’s words in concrete terms…

CCI is an API that unifies low-level “native” network APIs.  Specifically: many network vendors are doing essentially the same things in their low-level “native” network APIs.  In the HPC world, MPI hides all these different low-level APIs.  But there are real-world non-HPC apps out there that need extreme network performance, and therefore write their own unification layers for verbs, Portals, MX, etc.  Ick!

So why don’t we unify all these low-level native network APIs?

NOTE: This is quite similar to how we unified the high-level network APIs into MPI.

Two other key facts are important to realize here:

  1. A CCI open source reference implementation that supports multiple different network types is available for download.
  2. At least one vendor has firmware implementing CCI; it’s not just a(nother) software abstraction layer.

There are some important consequences of these facts.  One obvious question:

If CCI uses plugins just like libibverbs, why did we bother?  I.e., why didn’t we just use libibverbs?

Don’t forget that one of the tenets of CCI is simplicity (the others are portability, performance, scalability, and robustness).  In short: CCI is far simpler than the libibverbs API.
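
To give a feel for what “simpler” means in practice, here’s a rough sketch of a CCI client based on the draft specification.  The call names (cci_init(), cci_create_endpoint(), cci_connect(), cci_send(), cci_get_event(), cci_return_event()) follow the spec, but treat the exact prototypes, event fields, and the server URI below as illustrative; error handling is omitted.

    /* Sketch of a minimal CCI client, per the draft CCI spec.
       Prototypes and constants are approximate; see the spec for details. */
    #include "cci.h"

    int main(void)
    {
        uint32_t          caps = 0;
        cci_endpoint_t   *endpoint = NULL;
        cci_os_handle_t   fd;        /* OS handle usable with select()/poll() */
        cci_connection_t *conn = NULL;
        int               done = 0;

        cci_init(CCI_ABI_VERSION, 0, &caps);
        cci_create_endpoint(NULL, 0, &endpoint, &fd);

        /* Request a reliable/ordered connection to a (hypothetical) server URI */
        cci_connect(endpoint, "ip://10.0.0.1:5555", NULL, 0,
                    CCI_CONN_ATTR_RO, NULL, 0, NULL);

        /* Everything is asynchronous: poll the endpoint for events */
        while (!done) {
            cci_event_t *event = NULL;

            if (cci_get_event(endpoint, &event) != CCI_SUCCESS)
                continue;
            switch (event->type) {
            case CCI_EVENT_CONNECT:                /* connection is ready */
                conn = event->connect.connection;
                cci_send(conn, "hello", 5, NULL, 0);
                break;
            case CCI_EVENT_SEND:                   /* our send completed  */
                done = 1;
                break;
            default:
                break;
            }
            cci_return_event(event);
        }
        return 0;
    }

Compare that with the amount of setup a typical verbs program needs (device, protection domain, completion queue, queue pair, memory registration) before it can send a single message.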

The CCI implementation is currently open to a limited number of early beta users.  While the group works on making the software ready for 1.0, you can read the full CCI API specification on the Oak Ridge CCI site; post comments below if you’re interested in more detail.


6 Comments.


  1. I’m one of the developers of the Charm++ runtime system for HPC applications, among which NAMD is the most widely-used. For the last 15 years, we’ve maintained our own native machine layers (Elan, Myrinet, SHMEM, Infiniband, Blue Gene DCMF/PAMI, LAPI, uGNI) because of a huge impedance mismatch between our execution model and what MPI provides. We’ve been hoping that various past proposals and projects would get some traction (e.g. GASnet), but none seems to have taken off. Could we get access to the CCI beta to do a port, see how it performs, and possibly contribute our expertise to this shared foundation?

    • Sure; I’ve got your email address because you posted a comment, so I’ll follow up with you in email.

  2. I’m not sure that a single abstraction layer makes sense for portable (cross O/S) code, especially if your application does things outside of networking (say file I/O). For example, the I/O model for highest performance is going to be different between Linux and Windows.

    I also have doubts about whether an API that requires the underlying library to have threads (in order to signal/deliver events) is a good approach, as it requires surfacing a bunch of ‘knobs’ for the application to control the threading policy (number of threads, affinity, priority, etc).

    Point being that simplicity, portability, and performance often conflict with one another.

    -Fab

    • Scott Atchley

      Fab, currently one major API works across all O/Ses and that is Sockets. No one would argue that Sockets exposes the capabilities of today’s networking hardware (no zero-copy, no OS bypass). If an application will only ever use Sockets and only run on standard Ethernet NICs, then there is little to gain by using CCI. If the application could run on more capable hardware (whether over Ethernet or another fabric), then CCI might make sense.

      Verbs works on most O/Ses and provides access to modern networking features, yet no one would argue that it is a simple API. We believe there is a middle ground.

      Forgive me for not knowing the optimal I/O model for Windows, perhaps you could give me an example. CCI is inherently an asynchronous API (similar to MPI). You initiate a send (small message or RMA) and poll for completion. If the app prefers to block via a native O/S method (e.g. select(), poll(), epoll(), kqueue(), WSA*(), etc.), CCI can provide a native OS handle to the application. If CCI does not allow for a high performance implementation in Windows, we would be very interested in what changes CCI would need in order to provide one.

      CCI gives the choice to the app. If the app does not want additional threads, it must poll for completions; polling also ensures progress if the underlying hardware does not provide it. If the app does not want to burn the cycles to poll, it can block, which requires someone to check for completions and signal the blocker. Whether that someone is a progress thread or the kernel is an implementation detail.
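
      For example, a rough sketch of the blocking mode, assuming the OS handle returned by cci_create_endpoint() is pollable on the platform (names per the draft spec; error handling omitted):

          /* Sleep until the endpoint has events, then drain them. */
          #include <poll.h>
          #include "cci.h"

          static void wait_and_drain(cci_endpoint_t *endpoint, cci_os_handle_t fd)
          {
              struct pollfd pfd = { .fd = fd, .events = POLLIN };
              cci_event_t *event = NULL;

              poll(&pfd, 1, -1);     /* block; no cycles burned while idle */

              while (cci_get_event(endpoint, &event) == CCI_SUCCESS) {
                  /* ...dispatch on event->type (SEND, RECV, CONNECT, ...)... */
                  cci_return_event(event);
                  event = NULL;
              }
          }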

      There are trade-offs for simplicity, portability, and performance and we hope that CCI provides the ability for the app to choose the best combination given its needs.

      • Hi Scott,

        > currently one major API works across all O/Ses and that is Sockets

        Just to be clear, this is not just Sockets, but synchronous Sockets (blocking or non-blocking, but not aio/overlapped). There is a lot to be gained by moving away from synchronous I/O, though the learning curve for async I/O can be steep. Throw in the concept of scatter/gather and it gets even steeper.

        High performance asynchronous I/O applications in Windows can benefit from using I/O completion ports, rather than using per-I/O event objects and WaitForMultipleObjects (which has a limit to how many objects can be waited on concurrently). I/O completion ports allow aggregating completions from multiple files or sockets, and the application can poll the completion port, or block on it waiting for any I/O completion to be added. Multiple threads can poll events from an I/O completion port. MSMPI today uses I/O completion ports internally, so that we can block for completions if we exceed our polling limit. MSMPI supports blocking for completions for all of our communication channels: SHM, NetworkDirect, and Sockets.
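
        Roughly, the consumer side of a completion port looks like the sketch below (error handling elided; the associated handle and handle_completion() are placeholders):

            /* Sketch: one completion port aggregating completions from many
               handles (sockets, files, and ideally CCI endpoints). */
            #include <windows.h>

            typedef struct { OVERLAPPED ov; /* per-operation state */ } io_op_t;
            void handle_completion(ULONG_PTR key, io_op_t *op, DWORD bytes);

            void completion_loop(HANDLE some_handle)
            {
                /* Create a standalone port, then associate a handle with it;
                   the key identifies that handle when completions arrive. */
                HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
                CreateIoCompletionPort(some_handle, iocp, (ULONG_PTR)1, 0);

                for (;;) {
                    DWORD       bytes;
                    ULONG_PTR   key;
                    OVERLAPPED *ov;

                    /* Blocks until any associated I/O completes; several
                       threads may call this on the same port. */
                    if (!GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE))
                        break;

                    handle_completion(key, (io_op_t *)ov, bytes);
                }
            }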

        It would be great to allow CCI endpoints to be associated with a user’s I/O completion port, allowing users to get completions for CCI events side by side with their file completions, all through one function (GetQueuedCompletionStatus). There are design issues that you’ll need to work out, though (who provides the OVERLAPPED structure that identifies the I/O operation and is returned by GetQueuedCompletionStatus, for example).

        Cheers,
        -Fab

  3. Just to follow up: a CCI beta 1 tarball has been publicly posted on the CCI Forum web page: http://cci-forum.com/ (under the “Getting Started” page).
