<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: MCAPI and MPI</title>
	<atom:link href="http://blogs.cisco.com/performance/mcapi-and-mpi/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.cisco.com/performance/mcapi-and-mpi/</link>
	<description></description>
	<lastBuildDate>Wed, 19 Jun 2013 20:01:54 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
	<item>
		<title>By: Jeff Squyres</title>
		<link>http://blogs.cisco.com/performance/mcapi-and-mpi/#comment-478892</link>
		<dc:creator>Jeff Squyres</dc:creator>
		<pubDate>Mon, 12 Dec 2011 19:22:28 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.cisco.com/?p=54071#comment-478892</guid>
		<description><![CDATA[I just had a phone chat with one of the authors of the article, Sven Brehmer at Polycore Software.  I&#039;ll blog about the call in a little bit.]]></description>
		<content:encoded><![CDATA[<p>I just had a phone chat with one of the authors of the article, Sven Brehmer at Polycore Software.  I&#8217;ll blog about the call in a little bit.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fab Tillier</title>
		<link>http://blogs.cisco.com/performance/mcapi-and-mpi/#comment-478806</link>
		<dc:creator>Fab Tillier</dc:creator>
		<pubDate>Mon, 12 Dec 2011 18:28:17 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.cisco.com/?p=54071#comment-478806</guid>
		<description><![CDATA[I find it interesting that the whole premise of the article seems to be that MPI applications fundamentally use the dynamic process model.  I&#039;m not sure a lot of MPI applications could handle a &quot;user&#039;s inadvertent unplugging of a physical network cable&quot;.  In my experience, MPI apps expect the cluster they run on to be effectively static for the duration of the job, else they fail.

The article seems to boil down to &quot;because MPI can run across the WAN, it is slow&quot;, without acknowledging that the MPI libraries know and take advantage of the fastest interconnect available between any given processes.]]></description>
		<content:encoded><![CDATA[<p>I find it interesting that the whole premise of the article seems to be that MPI applications fundamentally use the dynamic process model.  I&#8217;m not sure a lot of MPI applications could handle a &#8220;user&#8217;s inadvertent unplugging of a physical network cable&#8221;.  In my experience, MPI apps expect the cluster they run on to be effectively static for the duration of the job, else they fail.</p>
<p>The article seems to boil down to &#8220;because MPI can run across the WAN, it is slow&#8221;, without acknowledging that the MPI libraries know and take advantage of the fastest interconnect available between any given processes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fab Tillier</title>
		<link>http://blogs.cisco.com/performance/mcapi-and-mpi/#comment-478782</link>
		<dc:creator>Fab Tillier</dc:creator>
		<pubDate>Mon, 12 Dec 2011 18:15:37 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.cisco.com/?p=54071#comment-478782</guid>
		<description><![CDATA[Microsoft Windows has had support for user-space to user-space copy for over a decade (since Windows 2000), allowing processes to move data in either direction via the ReadProcessMemory and WriteProcessMemory APIs.  

Microsoft MPI takes advantage of this, though I don&#039;t know if any other MPI libraries on Windows do (they really should).]]></description>
		<content:encoded><![CDATA[<p>Microsoft Windows has had support for user-space to user-space copy for over a decade (since Windows 2000), allowing processes to move data in either direction via the ReadProcessMemory and WriteProcessMemory APIs.  </p>
<p>Microsoft MPI takes advantage of this, though I don&#8217;t know if any other MPI libraries on Windows do (they really should).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeff Squyres</title>
		<link>http://blogs.cisco.com/performance/mcapi-and-mpi/#comment-478022</link>
		<dc:creator>Jeff Squyres</dc:creator>
		<pubDate>Mon, 12 Dec 2011 11:01:57 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.cisco.com/?p=54071#comment-478022</guid>
		<description><![CDATA[Yes, there were a few sweeping generalizations in that article that I found to be amusing.]]></description>
		<content:encoded><![CDATA[<p>Yes, there were a few sweeping generalizations in that article that I found to be amusing.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeff Squyres</title>
		<link>http://blogs.cisco.com/performance/mcapi-and-mpi/#comment-478017</link>
		<dc:creator>Jeff Squyres</dc:creator>
		<pubDate>Mon, 12 Dec 2011 10:58:34 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.cisco.com/?p=54071#comment-478017</guid>
		<description><![CDATA[Correct.  See also Jeff Hammond&#039;s comments (and to confirm: XPMEM support is being developed in Open MPI as well).]]></description>
		<content:encoded><![CDATA[<p>Correct.  See also Jeff Hammond&#8217;s comments (and to confirm: XPMEM support is being developed in Open MPI as well).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeff Hammond</title>
		<link>http://blogs.cisco.com/performance/mcapi-and-mpi/#comment-476717</link>
		<dc:creator>Jeff Hammond</dc:creator>
		<pubDate>Sun, 11 Dec 2011 18:29:49 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.cisco.com/?p=54071#comment-476717</guid>
		<description><![CDATA[Ashley,

You might look at http://www.ipdps.org/ipdps2010/ipdps2010-slides/CAC/slides_cac_Mor10OptMPICom.pdf and related work on Nemesis in MPICH2 by INRIA and Argonne.

See also XPMEM (http://code.google.com/p/xpmem/), which is developed by some folks associated with OpenMPI.

On Blue Gene/P, MPI can exploit the static TLB map to directly access memory in other processes with no overhead, but this exists because of the unique properties of CNK, e.g. the bijective mapping of virtual and physical addresses.]]></description>
		<content:encoded><![CDATA[<p>Ashley,</p>
<p>You might look at <a href="http://www.ipdps.org/ipdps2010/ipdps2010-slides/CAC/slides_cac_Mor10OptMPICom.pdf" rel="nofollow">http://www.ipdps.org/ipdps2010/ipdps2010-slides/CAC/slides_cac_Mor10OptMPICom.pdf</a> and related work on Nemesis in MPICH2 by INRIA and Argonne.</p>
<p>See also XPMEM (<a href="http://code.google.com/p/xpmem/" rel="nofollow">http://code.google.com/p/xpmem/</a>), which is developed by some folks associated with OpenMPI.</p>
<p>On Blue Gene/P, MPI can exploit the static TLB map to directly access memory in other processes with no overhead, but this exists because of the unique properties of CNK, e.g. the bijective mapping of virtual and physical addresses.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeff Hammond</title>
		<link>http://blogs.cisco.com/performance/mcapi-and-mpi/#comment-476714</link>
		<dc:creator>Jeff Hammond</dc:creator>
		<pubDate>Sun, 11 Dec 2011 18:26:09 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.cisco.com/?p=54071#comment-476714</guid>
		<description><![CDATA[&quot;Milliseconds matter. MCAPI can therefore be quick and responsive in a way that MPI can&#039;t be.&quot;

I find the insinuation that MPI has worse-than-millisecond latency to be rather hilarious.]]></description>
		<content:encoded><![CDATA[<p>&#8220;Milliseconds matter. MCAPI can therefore be quick and responsive in a way that MPI can&#8217;t be.&#8221;</p>
<p>I find the insinuation that MPI has worse-than-millisecond latency to be rather hilarious.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ashley Pittman</title>
		<link>http://blogs.cisco.com/performance/mcapi-and-mpi/#comment-476023</link>
		<dc:creator>Ashley Pittman</dc:creator>
		<pubDate>Sun, 11 Dec 2011 11:27:30 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.cisco.com/?p=54071#comment-476023</guid>
		<description><![CDATA[Actually, I take that back: it looks like both MPICH2 and OpenMPI now do this via knem. Perhaps Jeff could confirm?

http://runtime.bordeaux.inria.fr/knem/]]></description>
		<content:encoded><![CDATA[<p>Actually, I take that back: it looks like both MPICH2 and OpenMPI now do this via knem. Perhaps Jeff could confirm?</p>
<p><a href="http://runtime.bordeaux.inria.fr/knem/" rel="nofollow">http://runtime.bordeaux.inria.fr/knem/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ashley Pittman</title>
		<link>http://blogs.cisco.com/performance/mcapi-and-mpi/#comment-473996</link>
		<dc:creator>Ashley Pittman</dc:creator>
		<pubDate>Sat, 10 Dec 2011 13:49:26 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.cisco.com/?p=54071#comment-473996</guid>
		<description><![CDATA[Nobody has in-kernel userspace-to-userspace memory copy working again yet, do they?  Without this you have to use shared-memory copy-in/copy-out buffers, which halves the bandwidth.

At Quadrics we had two features here.  First, we could remap the whole of the BSS and heap allocators to shared memory, so you could just memcpy() to and from the remote address space.  Second, we had a modified kernel ptrace API that you could use to get the kernel to do a direct userspace-to-userspace copy into a remote process&#039;s address space.]]></description>
		<content:encoded><![CDATA[<p>Nobody has in-kernel userspace-to-userspace memory copy working again yet, do they?  Without this you have to use shared-memory copy-in/copy-out buffers, which halves the bandwidth.</p>
<p>At Quadrics we had two features here.  First, we could remap the whole of the BSS and heap allocators to shared memory, so you could just memcpy() to and from the remote address space.  Second, we had a modified kernel ptrace API that you could use to get the kernel to do a direct userspace-to-userspace copy into a remote process&#8217;s address space.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
