Fosdem 2016, part 4: what netflow tells us
Now in part 4 we can combine the IP-to-MAC address tables with the user agents captured by NBAR2 and exported using netflow. The result of all this logging is a list of MAC addresses, the IPs a particular MAC address was using at a certain time, and the user agents we saw for each IP during that time period. This data was loaded into a sqlite database for easy manipulation and querying.
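The actual tables are not shown in this post, so the following is only a minimal sketch of how such a database could be laid out and queried. The table names (`mac_ip`, `user_agents`), the column names and the sample rows are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: the real tables and column names are not shown in the post.
SCHEMA = """
CREATE TABLE mac_ip (
    mac        TEXT NOT NULL,     -- MAC address from the router adjacency tables
    ip         TEXT NOT NULL,     -- IPv4 or IPv6 address bound to that MAC
    first_seen INTEGER NOT NULL,  -- Unix timestamps bounding the binding
    last_seen  INTEGER NOT NULL
);
CREATE TABLE user_agents (
    ip         TEXT NOT NULL,     -- source address of the netflow record
    seen_at    INTEGER NOT NULL,  -- Unix timestamp of the flow
    user_agent TEXT NOT NULL      -- HTTP user agent string
);
"""


def user_agents_per_mac(conn):
    """Return {mac: set of user agents}, joining on IP and time window."""
    rows = conn.execute(
        """
        SELECT m.mac, u.user_agent
        FROM mac_ip AS m
        JOIN user_agents AS u
          ON u.ip = m.ip
         AND u.seen_at BETWEEN m.first_seen AND m.last_seen
        """
    )
    result = {}
    for mac, agent in rows:
        result.setdefault(mac, set()).add(agent)
    return result


def demo():
    """Build a tiny in-memory database with made-up rows and query it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO mac_ip VALUES ('aabbccddeeff', '2001:db8::1', 100, 200)"
    )
    conn.execute(
        "INSERT INTO user_agents VALUES "
        "('2001:db8::1', 150, 'Mozilla/5.0 (Linux; Android 5.1)')"
    )
    # This flow falls outside the binding's time window, so it is not attributed.
    conn.execute(
        "INSERT INTO user_agents VALUES ('2001:db8::1', 300, 'curl/7.47.0')"
    )
    return user_agents_per_mac(conn)
```

The time window in the join matters because an IP address can be reused by a different MAC address later in the weekend.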
What can we learn from this?
At Fosdem 2016 we saw 9711 unique MAC addresses. Using the user agent, we can classify 7623 of these, or 78%. The others either did not send HTTP traffic or used non-informative user-agent strings like “Apache-HttpClient/UNAVAILABLE (java 1.4)” or my favourite: “(null)/(null) ((null))”. When we talk about ‘devices’ below, we in fact mean ‘MAC addresses’, as we assume that every device has only one MAC address, which does not change. However, the data suggests that this is not true, as we will discuss later.
Of these 9711 MAC addresses, 4451 (45%) were only active on the IPv6-only wifi, 1948 (20%) were only active on the legacy dual-stack wifi, which provided both IPv4 and IPv6 connectivity, and 3152 (32%) were active on both networks. This seems to indicate that almost half of the people were happy to be on the IPv6-only network and remain there.
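The split into three groups is plain set arithmetic on the MAC addresses seen on each network. A minimal sketch, assuming the two lists of MAC addresses have already been pulled from the database (the function and key names are made up for this example):

```python
def classify_by_network(v6only_macs, dualstack_macs):
    """Split MAC addresses by the wifi network(s) they were seen on."""
    v6only = set(v6only_macs)
    dual = set(dualstack_macs)
    return {
        "ipv6_only": v6only - dual,       # never seen on the dual-stack wifi
        "dualstack_only": dual - v6only,  # never seen on the IPv6-only wifi
        "both": v6only & dual,            # seen on both networks
    }


def group_sizes(groups):
    """Return the number of MAC addresses in each group."""
    return {name: len(macs) for name, macs in groups.items()}
```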
Of these 9711 MAC addresses we can attribute 9568 to known manufacturers. The rest seem fake, especially MAC addresses like d2deadbeef02 or deadbabecaff. Some people told me that they generate a new MAC address for their device every time it boots, and it seems that a few people are doing this.
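The manufacturer lookup is what was used here, but there is a complementary check that needs no vendor database: self-assigned MAC addresses normally have the ‘locally administered’ bit set, which is the second least significant bit of the first octet. The sketch below is an illustration of that check, not the method used for the numbers above. Both example addresses have the bit set.

```python
def is_locally_administered(mac: str) -> bool:
    """Return True if the MAC address has the locally administered bit set."""
    # Accept plain hex as well as the usual separators (':', '-', '.').
    digits = mac.replace(":", "").replace("-", "").replace(".", "").lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    first_octet = int(digits[:2], 16)
    # Bit 0x02 of the first octet distinguishes locally administered
    # addresses from manufacturer-assigned (universal) ones.
    return bool(first_octet & 0x02)
```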
First the big reveal: from the data we can see 4686 unique smartphones on the Fosdem network over the whole weekend. These are all the devices that identify themselves as iOS iPhones, Firefox OS, Symbian, Blackberry, MeeGo, Jolla, Tizen, Windows Phone or Android non-tablets.
One thing to note about the analysis of the HTTP user agents is that the data is very noisy. A large proportion of Android devices, for example, also claim to be running iOS, and one MAC address even sends traffic claiming to be 19 different operating systems. I resorted to manually correcting the data to improve its quality. For example, I took the drastic step of deciding that no device not made by Apple (as determined by the MAC address) can run Darwin/iOS or Mac OS X.
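The Apple rule can be written down in a few lines. This is only a sketch of the idea: the operating system labels and the manufacturer string are assumptions, and the real clean-up involved more manual steps than this.

```python
# Operating systems that, under this rule, only Apple hardware may claim.
APPLE_ONLY_OSES = {"iOS", "Darwin", "Mac OS X"}


def clean_os_claims(claimed_oses, manufacturer):
    """Drop Apple-only OS claims for devices whose MAC is not from Apple."""
    claims = set(claimed_oses)
    if manufacturer != "Apple":
        claims -= APPLE_ONLY_OSES
    return claims
```

For an Android phone that also claims to be iOS, this leaves only the Android claim. A device whose only claim was an Apple operating system ends up with an empty set and would fall into ‘unknown’.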
So this analysis is less of a science and more educated guessing.
Besides some devices having an identity crisis, we also noticed that some people used a lot of IP addresses. One particular Android phone used 77 IPv4 addresses, and a few BlackBerries used up to a thousand IPv6 addresses over the course of the two-day event.
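Counting addresses per device is easy to get wrong when the same IPv6 address shows up in different textual forms. A minimal sketch, assuming the bindings are available as (MAC, IP string) pairs, that uses Python's `ipaddress` module to normalise addresses before counting:

```python
import ipaddress
from collections import defaultdict


def address_counts(bindings):
    """Return {mac: (distinct IPv4 count, distinct IPv6 count)}."""
    seen = defaultdict(lambda: {4: set(), 6: set()})
    for mac, ip in bindings:
        # Parsing normalises the text form, so '2001:db8::1' and
        # '2001:0db8:0:0:0:0:0:1' count as the same address.
        addr = ipaddress.ip_address(ip)
        seen[mac][addr.version].add(addr)
    return {mac: (len(v[4]), len(v[6])) for mac, v in seen.items()}
```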
If we look at the distribution of detected operating systems:
We can see that Android clearly dominates, but notice that the second largest group is ‘unknown’.
Let’s group the operating systems together in ‘families’ and ignore the unknowns:
Fosdem is clearly an Android/Apple/Linux world.
An interesting question was: can Android work on an IPv6-only network? With the information we collected, we can now check which Android versions were used on the IPv6-only network segment. First we can check which versions of Android we detected across the board:
We seem to have a representation of almost all possible versions… If we then look at the IPv6-only statistics, we notice that almost all versions below 5 are missing or only a few examples remain:
This information is based on the adjacencies on the router for the IPv6-only network, even if we identified the nature of that MAC address later on the dual-stack network. So no successful connection was needed, just getting IPv6 working enough to get an adjacency on the router.
The data seems to support the statement that Android 5.0 and higher supports IPv6-only networks.
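The version comparison depends on pulling an Android version out of each user agent string. A rough sketch of that step, assuming the common `Android <version>` token found in browser and Dalvik user agents (the regular expression is an assumption and will miss unusual strings):

```python
import re

# Matches e.g. 'Android 5.1.1' or 'Android/4.4' and captures the version.
ANDROID_VERSION = re.compile(r"Android[ /](\d+(?:\.\d+)*)")


def android_major_version(user_agent):
    """Return the Android major version as an int, or None if not found."""
    match = ANDROID_VERSION.search(user_agent)
    if match is None:
        return None
    return int(match.group(1).split(".")[0])


def supports_ipv6_only(user_agent):
    """Apply the threshold suggested by the data: Android 5.0 and higher."""
    major = android_major_version(user_agent)
    return major is not None and major >= 5
```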
People who did not connect to the IPv6-only network at all were primarily running the older versions:
But still a lot of people with modern versions went to both the legacy and the IPv6-only network:
For reference, the iOS version distribution shows that most iOS devices seem to be running the more recent releases.
With this in mind we might want to enable the firewall on the router next year, as quite a number of people are using older operating systems that might not be secure on an open wifi network directly connected to the internet.
Given that we now know what operating system was running on a particular MAC address, and we know when we saw this MAC address, we can plot the operating system family distribution over time:
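The data behind such a plot is a count of distinct MAC addresses per operating system family per time bucket. A minimal sketch, assuming sightings are (MAC, Unix timestamp) pairs and a ten-minute bucket (both are assumptions; the real sampling interval is not stated here):

```python
from collections import Counter, defaultdict


def family_counts_over_time(sightings, os_family, bucket_seconds=600):
    """Return {bucket start: Counter of OS families} for distinct MACs."""
    macs_per_bucket = defaultdict(set)
    for mac, timestamp in sightings:
        # Round the timestamp down to the start of its bucket.
        bucket = timestamp - timestamp % bucket_seconds
        macs_per_bucket[bucket].add(mac)
    series = {}
    for bucket in sorted(macs_per_bucket):
        # Each MAC counts once per bucket; unclassified MACs are 'unknown'.
        series[bucket] = Counter(
            os_family.get(mac, "unknown") for mac in macs_per_bucket[bucket]
        )
    return series
```

The resulting dictionary can be fed to any plotting tool as a stacked area chart, one layer per family.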
The sudden decline on Sunday was me clearing the adjacency table as it seemed ‘too full’. Turns out it was normal and recovered quickly.
We can then look at the mobile Linux family section in more detail:
There were some people claiming that the FreeBSD people would come on Sunday, but the data does not show this:
Finally we can check if people fled to the legacy network after hitting the IPv6 network first with Android:
But it seems that the people who could be detected remained on the IPv6-only network: the green section declines gradually at the end of the day, like in all the other graphs.
In fact, graphing the number of IPv4 versus IPv6 addresses in use is boringly similar to the previous graphs.
For completeness this is the count of operating systems found:
|Mac OS X||535|