Sometimes it is interesting to take a look at darknet data and see what you come across. If you are not familiar with the term “darknet,” I am using the definition common in parts of the service provider community: a darknet is a block of address space that contains no real hosts. That means no client workstations to initiate conversations with servers on the Internet. It also means no services advertised from those ranges, such as a web server, a DNS server, or a database server. There is really no reason to see any traffic destined for addresses within those ranges. From a network point of view, it should be as desolate and deserted as the town of Pripyat in Ukraine, inside the evacuation zone created after the Chernobyl disaster in 1986. In practice, however, you do see traffic to those address ranges, which is what makes it interesting. Traffic destined to those ranges could be the result of malware attempting to locate machines to infect, part of a research project, or something as simple as a misconfiguration or a typographical error. One example of traffic resulting from a typo is someone attempting to ping a host and typing the wrong address. However, it would be hard to believe that all of the traffic seen in a darknet is the result of a mistake.
Setting up a darknet does not have to be hard. If your organization has address space that is not being used, all you need to do is advertise a route for those addresses and leave them unused. In our case, we have advertised several ranges, and we collect NetFlow data for the traffic destined to them from a nearby Cisco router. That NetFlow data is exported to a collector, such as nfcapd, where it is aggregated for further analysis.
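As a rough illustration of the first analysis step, here is a minimal Python sketch that keeps only the flows whose destination falls inside a darknet range. The prefixes are placeholders taken from the reserved documentation ranges, not our real darknets, and the flow format (a dict with a "dst_ip" key) is an assumption made for the example rather than the format any particular collector produces.

```python
import ipaddress

# Hypothetical darknet prefixes (reserved documentation ranges, for illustration only).
DARKNET_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_darknet(dst_ip):
    """Return True if dst_ip falls inside any configured darknet prefix."""
    addr = ipaddress.ip_address(dst_ip)
    return any(addr in net for net in DARKNET_PREFIXES)


def darknet_flows(flows):
    """Keep only flows destined to the darknet.

    Each flow is assumed to be a dict with at least a "dst_ip" key.
    """
    return [f for f in flows if is_darknet(f["dst_ip"])]
```

Since nothing legitimate lives in those ranges, every flow that survives this filter is worth a look.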
Examining the first 3 weeks of June 2011 across a few of our darknets turned up a few interesting observations.
Take a look at the following table of network flows sorted by flow volume:
Of the traffic flows destined for the ranges I examined, 64% were probing for hosts on TCP port 445, the port commonly used for SMB file sharing over TCP. The next highest share was for port 1433, the default port used for Microsoft SQL Server, which accounted for only 3.09% of the traffic flows.
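Percentages like these are simple to derive once the flows are in hand. The following is a minimal sketch, assuming each flow is a dict with "proto" and "dst_port" keys; those field names are made up for the example, and this is not the tooling that produced the figures quoted here.

```python
from collections import Counter


def port_share(flows):
    """Return (proto, dst_port, percent_of_flows) tuples, largest share first.

    Each flow is assumed to be a dict with "proto" and "dst_port" keys.
    """
    counts = Counter((f["proto"], f["dst_port"]) for f in flows)
    total = sum(counts.values())
    return [
        (proto, port, 100.0 * n / total)
        for (proto, port), n in counts.most_common()
    ]
```

Counting flows rather than bytes or packets suits darknet data, because a probe that never gets an answer carries almost no payload.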
Based solely on the volume of traffic into the darknet, the data suggests that securing traffic to port 445 may be a very good idea. But what about the other traffic? When examining the port 1433 traffic, some things were immediately obvious, such as sequential scans for tcp/1433 by the same host. In a sequential scan, the scanner typically walks through IP addresses in order: it tries x.x.x.1, then x.x.x.2, then x.x.x.3, and so on. I have obfuscated the source and destination IP addresses on purpose, but note how this same source IP scans the ranges and then re-scans them later on different dates:
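One way to spot this kind of behavior programmatically is to look, per source, for runs of destination addresses that each increase by exactly one. This is only a sketch of the idea: the flow fields ("ts", "src_ip", "dst_ip", "dst_port") are assumed names, and a real scanner may skip addresses or interleave ranges, which this simple check would not catch.

```python
import ipaddress
from collections import defaultdict


def longest_sequential_run(flows, dst_port=1433):
    """Map each source IP to its longest run of consecutive destination IPs.

    Flows are assumed to be dicts with "ts", "src_ip", "dst_ip" and
    "dst_port" keys. Destinations are compared in timestamp order.
    """
    by_src = defaultdict(list)
    for f in sorted(flows, key=lambda f: f["ts"]):
        if f["dst_port"] == dst_port:
            # Convert the address to an integer so "next address" is just +1.
            by_src[f["src_ip"]].append(int(ipaddress.ip_address(f["dst_ip"])))

    result = {}
    for src, dsts in by_src.items():
        best = run = 1
        for prev, cur in zip(dsts, dsts[1:]):
            run = run + 1 if cur == prev + 1 else 1
            best = max(best, run)
        result[src] = best
    return result
```

A source whose longest run covers most of a range is a strong candidate for a sequential scanner, while a value of 1 suggests isolated or randomized probes.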
Worth noting as well, the same behavior was blogged about at SANS back in January 2010. See http://isc.sans.org/diary.html?storyid=7924 . Time may pass, and new vulnerabilities and malware may hit the news, but the old traffic patterns caused by scanners or malware in the wild do not necessarily go away. Of the flows destined to TCP/1433, 94.9% had a source port of 6000. How many hosts were responsible for those flows from TCP/6000 over the 3-week period? Just 57. Out of the 1013 unique IPs our darknet ranges saw scanning for TCP/1433, 57 shared the repeated source port TCP/6000. A pattern like this in your data typically means either that it is expected behavior or that a common tool is being used. Since sourcing all traffic from port 6000 is not really expected behavior, you can surmise that a common tool, or a common piece or family of malware, may be running on those 57 hosts.
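The two numbers in that comparison, the share of flows using a given source port and the count of distinct hosts behind them, can be computed along these lines. This is a hedged sketch with assumed field names ("src_ip", "src_port", "dst_port"), not the exact queries used for the figures above.

```python
from collections import defaultdict


def hosts_per_source_port(flows, dst_port=1433):
    """Map each source port to the set of source IPs that used it toward dst_port."""
    hosts = defaultdict(set)
    for f in flows:
        if f["dst_port"] == dst_port:
            hosts[f["src_port"]].add(f["src_ip"])
    return dict(hosts)


def flow_share_for_source_port(flows, src_port, dst_port=1433):
    """Percent of flows toward dst_port that were sent from src_port."""
    targeted = [f for f in flows if f["dst_port"] == dst_port]
    if not targeted:
        return 0.0
    matching = sum(1 for f in targeted if f["src_port"] == src_port)
    return 100.0 * matching / len(targeted)
```

A large flow share produced by a small set of hosts is the signature to look for: a handful of very busy scanners that all behave the same way.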
Another interesting port to take a look at is UDP/5060, which is commonly used for the Session Initiation Protocol (SIP). Of the port 5060 traffic in the table above, only 0.041% was to TCP/5060, and the rest was to UDP/5060. In total, our darknets saw 673 unique IP addresses scanning for UDP/5060 during the 3-week period. The SIP scans also showed some similarities to the TCP/1433 scans. There are sequential scans where one IP address attempts to locate servers listening for SIP on every IP address in the range. There are scans that share source ports. Instead of the port 6000 observed in the TCP/1433 scans, the SIP scans used other source ports, such as UDP 5060, 5061, 5062 and 36209.
But there is another interesting data point to take serious note of…
Care to guess how many of those 673 IPs were also scanning for TCP/1433? A quarter of them? A third of them? Half of them? None of the above. Only three hosts were scanning for both TCP/1433 and UDP/5060. That was a curious finding. But does it hold true for other ports? The next one I chose to examine more closely was TCP/22, the port commonly used for SSH sessions. Our flows to TCP/22 came from 130 unique IPs. Of those 130 addresses, four had also sent flows to TCP/1433 and six to UDP/5060. How many were common to all three ports? The same three IP addresses from the previous results.
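The overlap check itself is just set intersection over the unique source addresses seen per service. Here is a minimal sketch, again assuming flows are dicts with hypothetical "proto", "dst_port" and "src_ip" keys:

```python
from collections import defaultdict


def scanners_by_service(flows):
    """Map (proto, dst_port) to the set of unique source IPs seen probing it."""
    scanners = defaultdict(set)
    for f in flows:
        scanners[(f["proto"], f["dst_port"])].add(f["src_ip"])
    return dict(scanners)


def overlap(scanners, *services):
    """Return the source IPs that probed every one of the given services."""
    sets = [scanners.get(s, set()) for s in services]
    return set.intersection(*sets) if sets else set()
```

For example, overlap(scanners, ("tcp", 1433), ("udp", 5060), ("tcp", 22)) returns the sources seen probing all three services, and passing only two services gives the pairwise overlaps.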
Let’s think about this for a minute. We saw 1013 different IP addresses scanning for SQL Server, 673 addresses scanning for SIP, and 130 addresses scanning for SSH. Out of all of those addresses, only three were scanning for all three services. The rest were targeting specific services with their scans. This means that if you were watching for rogue SQL Server connection attempts and actively blocking those hosts based on source IP address, it would provide little to no protection for other services like SSH or SIP.
Each service that is exposed to the Internet needs separate consideration of the mechanisms required to protect it adequately. A blacklist built for one service, such as SSH, may have almost no relevance to, and provide almost no protection for, another service such as SIP or SQL Server.