Cisco Blogs



Can we count on MPI to handle large datasets?

April 22, 2011 at 2:25 pm PST

(today’s entry is guest-written by Fab Tillier, Microsoft MPI engineer extraordinaire)

When you send data in MPI, you specify how many items of a particular datatype you want to send in your call to an MPI send routine.  Likewise, when you read data from a file, you specify how many datatype elements to read.

This “how many” value is referred to in MPI as a count parameter, and all of MPI’s functions define count parameters as integers: int in C, INTEGER in Fortran.  This definition often limits users to 2^31 - 1 elements (i.e., roughly two billion elements) because int and INTEGER default to 32 bits on many of today’s platforms.
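To make the limit concrete, here is a minimal sketch in C of the arithmetic involved. It does not call MPI at all; the comment simply quotes the standard C binding of MPI_Send to show where the int count appears. The helper names (fits_in_int_count, calls_needed, chunk_count) are hypothetical and exist only for illustration, and splitting a transfer into several calls is just one possible workaround, not necessarily the one this post goes on to recommend.

```c
#include <limits.h>
#include <stdint.h>

/*
 * For reference, the C binding of MPI_Send takes its count as a plain int:
 *
 *   int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
 *                int dest, int tag, MPI_Comm comm);
 *
 * The helpers below are hypothetical illustrations, not part of MPI.
 */

/* Returns 1 if n elements can be described by a single int count, else 0. */
int fits_in_int_count(uint64_t n) {
    return n <= (uint64_t)INT_MAX;
}

/* Number of calls needed to move n elements when each call
 * can carry at most INT_MAX elements. */
uint64_t calls_needed(uint64_t n) {
    const uint64_t max_count = (uint64_t)INT_MAX;
    return n / max_count + (n % max_count != 0 ? 1 : 0);
}

/* Count to pass on the i-th call (0-based); 0 once all elements are covered. */
int chunk_count(uint64_t n, uint64_t i) {
    const uint64_t max_count = (uint64_t)INT_MAX;
    if (i >= calls_needed(n)) {
        return 0;
    }
    uint64_t remaining = n - i * max_count;
    return remaining > max_count ? INT_MAX : (int)remaining;
}
```

Assuming a 32-bit int, a buffer of exactly 2^31 bytes sent as single bytes already needs two calls: one carrying INT_MAX elements and a second carrying the one leftover byte.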

That may sound pretty big, but consider that a 2^31-byte file is not really that large by today’s standards, especially in HPC, where datasets can sometimes be terabytes in size.  Reading a ~2 gigabyte file can take (far) less than a second.
