Let’s revisit my stats from a prior blog post about who uses the MPI C++ bindings. Jed Brown was kind enough to school me in how terrible my prior statistical analysis was.
I’ve actually removed the offending stats from that entry and am re-doing them here, hopefully in a more meaningful way. I won’t even describe how bad / wrong my prior analysis was; let’s just go through the numbers again with a little something I like to call The Right Way…
There are eight possible demographics into which respondents could have categorized themselves. While it is probably dangerous to make assumptions based on the entire population of respondents (for example, because of the way the survey was advertised, it’s probably skewed mostly towards the developer, educator, and researcher communities), it is probably reasonable to assume that at least within each demographic, we have a decent random sample of that particular demographic’s overall population.
I have recently learned (ahem) that it doesn’t make statistical sense to combine the demographic categories, so let’s look at the breakdown of how each answered the C++ bindings question (there’s a little rounding in the percentages):
| Demographic | Yes | No | I don’t know |
|---|---|---|---|
| MPI application developer | 43 (17%) | 196 (77%) | 14 (5%) |
| Library / middleware developer (that uses MPI) | 13 (16%) | 63 (77%) | 6 (7%) |
| Project / program / general management | 4 (17%) | 15 (65%) | 4 (17%) |
| Academic educator / researcher | 151 (66%) | 52 (23%) | 26 (11%) |
| Student | 16 (24%) | 34 (50%) | 18 (26%) |
| MPI implementer | 9 (22%) | 29 (72%) | 2 (5%) |
| User of MPI applications | 55 (50%) | 25 (23%) | 30 (27%) |
| Other | 6 (37%) | 3 (19%) | 7 (44%) |
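For anyone who wants to check my arithmetic, here is a minimal Python sketch of how the percentages can be recomputed: each count is divided by the total for its own demographic, never by the total across all respondents. The names `COUNTS`, `row_percentages`, and `summarize` are made up for this illustration; only the counts themselves come from the table. The unrounded values can differ by a point from the rounded figures shown above.

```python
# Counts copied from the table: (yes, no, don't know) for each demographic.
COUNTS = {
    "MPI application developer": (43, 196, 14),
    "Library / middleware developer (that uses MPI)": (13, 63, 6),
    "Project / program / general management": (4, 15, 4),
    "Academic educator / researcher": (151, 52, 26),
    "Student": (16, 34, 18),
    "MPI implementer": (9, 29, 2),
    "User of MPI applications": (55, 25, 30),
    "Other": (6, 3, 7),
}


def row_percentages(counts):
    """Return each count as a percentage of its own row total."""
    total = sum(counts)
    return tuple(100.0 * c / total for c in counts)


def summarize(table):
    """Map each demographic to its (yes, no, don't know) percentages."""
    return {name: row_percentages(counts) for name, counts in table.items()}


if __name__ == "__main__":
    # Print one line per demographic, keeping the categories separate.
    for name, (yes, no, unsure) in summarize(COUNTS).items():
        print(f"{name:48s} {yes:5.1f}% {no:5.1f}% {unsure:5.1f}%")
```

Keeping the denominators separate per demographic is the whole point: the categories are never pooled into one population.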
Per my prior blog entry, I still think that the top three categories are the most important, from a defining-the-standard perspective: these are the people who are either writing code that uses MPI, or they manage people who do. Among these, the highest percentage of those who use the C++ bindings is 17%.
The fact that this number is higher than the Forum anticipated led to people wondering why respondents answered “yes”. Here are some speculations:
- It is possible that respondents didn’t fully understand the question — perhaps they thought that MPI_Send() is a C++ binding if it is used in a C++ application.
- Perhaps the wording of the question was not specific enough: we asked if they had any C++ applications that used the C++ bindings. This conceivably includes toy and test applications.
- Or perhaps the respondents all write large C++ applications using the MPI C++ bindings — meaning that the 17% number is pretty accurate.
The next group of three categories is a little more nebulous, so let’s talk about each one:
- Academic educator / researcher: unfortunately, this category isn’t specific enough. We lumped together those who assign MPI C++ programs for homework and those who write C++ codes to perform research. I suspect (based on anecdotal evidence) that the 66% number is artificially high because a lot of educators use the C++ bindings in their curricula. But the data doesn’t indicate one way or the other.
- Student: I think the only thing I have to say about this category is: Egads! 26% of you don’t know if you’re using the C++ bindings or not? /me weeps.
- MPI implementer: My interpretation is that if a respondent marked their primary demographic as “MPI implementer” and said “yes, I use the C++ bindings”, it is because they’re writing test codes for the MPI C++ bindings.
The last grouping of demographics (users of MPI applications and “Other”), in my not-so-humble opinion, is not very relevant for this survey question. For example, users may or may not know how the application they are running was written.