Process-to-process copy in Linux
More exciting news on the Linux kernel front (thanks for the heads-up, Brice!): our friends at Big Blue have contributed a patch and started a good conversation on the LKML about process-to-process copying. We still don’t have a good solution for being notified when registered memory is freed (my last post on this topic mentioned that the ummunotify patch had hit the -mm tree, but it eventually didn’t make it up to Linus’ tree). But hey, this is progress too (albeit in a slightly different direction), so I’ll take it!
“Why do I care?” you say.
I’m glad you asked. Let me explain…
Most MPI implementations use some flavor of shared memory to deliver messages between two processes on the same node. That is, the source process copies the message into a shared memory block (that was previously set up, perhaps during MPI_INIT) and the destination process copies the message out of the shared memory block.
Most schemes like this pipeline the source and destination copies so that the receiver doesn’t have to wait for the entire message to arrive in shared memory before it can start copying it out. The performance is actually fairly reasonable.
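To make the copy-in/copy-out scheme concrete, here is a minimal Python sketch (Linux/Unix only, since it relies on os.fork). It is not how any particular MPI implementation does it. The names send_via_shared_memory, CHUNK, and SLOTS are made up for illustration, and the "slot is full" / "slot is free" tokens travel over pipes purely to keep the sketch short, where a real implementation would typically poll flags or queues inside the shared segment itself.

```python
import mmap
import os
import struct

CHUNK = 4096  # bytes per slot (hypothetical tuning value)
SLOTS = 4     # slots in the shared ring (hypothetical tuning value)


def _read_exact(fd, n):
    """Read exactly n bytes from fd; return b'' if EOF arrives first."""
    buf = b""
    while len(buf) < n:
        part = os.read(fd, n - len(buf))
        if not part:
            return b""
        buf += part
    return buf


def send_via_shared_memory(message):
    """Move `message` from a forked child to the parent through a shared ring.

    The child plays the source process and the parent the destination.
    Returns the bytes that the destination assembled.
    """
    shm = mmap.mmap(-1, CHUNK * SLOTS)  # anonymous shared mapping, inherited by fork
    full_r, full_w = os.pipe()          # source -> destination: "a slot is full"
    free_r, free_w = os.pipe()          # destination -> source: "a slot is free"

    pid = os.fork()
    if pid == 0:  # child: the source process
        try:
            for i, off in enumerate(range(0, len(message), CHUNK)):
                if i >= SLOTS:
                    _read_exact(free_r, 1)  # wait until the slot we reuse is drained
                chunk = message[off:off + CHUNK]
                base = (i % SLOTS) * CHUNK
                shm[base:base + len(chunk)] = chunk  # copy 1: into shared memory
                os.write(full_w, struct.pack("<I", len(chunk)))
        finally:
            os._exit(0)  # exiting closes full_w, which signals end-of-message

    # parent: the destination process
    os.close(full_w)  # so EOF arrives once the child exits
    out = bytearray()
    i = 0
    while True:
        header = _read_exact(full_r, 4)
        if not header:
            break
        (n,) = struct.unpack("<I", header)
        base = (i % SLOTS) * CHUNK
        out += shm[base:base + n]  # copy 2: out of shared memory
        os.write(free_w, b"x")     # hand the slot back to the source
        i += 1

    os.waitpid(pid, 0)
    for fd in (full_r, free_r, free_w):
        os.close(fd)
    shm.close()
    return bytes(out)
```

Calling send_via_shared_memory(data) with a payload larger than CHUNK * SLOTS exercises the pipelining: the destination starts draining slots while the source is still filling later ones. Note that every byte is still copied twice, once on each side of the ring.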
That being said, a direct process-to-process copy has two benefits:
- The performance can be greater (see Chris Yeoh’s post to the LKML, linked at the beginning of this entry)
- Only one memcpy occurs instead of two, which means less traffic on the internal networks/buses; that is significantly friendlier to other processes running on the same system
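As a back-of-the-envelope illustration of the second point, the sketch below counts bytes moved under a deliberately simplified model: each memcpy reads every byte once and writes it once, and caches are ignored entirely. The function name memory_traffic_bytes and the 1 MiB message size are assumptions for the sake of the example, not measurements.

```python
def memory_traffic_bytes(message_size, num_copies):
    """Bytes read plus bytes written by `num_copies` full-message memcpy calls."""
    return 2 * message_size * num_copies


MESSAGE_SIZE = 1 << 20  # 1 MiB message (hypothetical)

# Shared-memory scheme: copy in by the source, copy out by the destination.
shared_memory_traffic = memory_traffic_bytes(MESSAGE_SIZE, num_copies=2)

# Direct process-to-process copy: a single memcpy.
direct_copy_traffic = memory_traffic_bytes(MESSAGE_SIZE, num_copies=1)
```

Under this model the direct copy generates half the memory traffic of the two-copy scheme for the same message; real systems will differ depending on cache behavior and message size.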
I’ll be following this thread; I hope to see Chris’ work make it into the mainline kernel!