In the spirit of National Cyber Security Awareness Month (NCSAM) I offer up a recent tale of intrigue and mystery from an ongoing Cisco Security Research project...
One of Cisco Security Research and Operations' ongoing projects is to oversee a massive infrastructure of several high-volume Internet POPs that send large amounts of network traffic into one of our research labs. We are collecting NetFlow and packet dumps from a geographically distributed sensor network. These pcap files each contain several million packets, but a configuration error in the packet capture process introduced some amount of packet duplication. This short blog article will talk about why the duplication happened, how we prevented it from recurring, and the solution we employed to remove the duplicate packets from all of the affected pcap files.
A Love Story: The Hub, The Switch, Packet Sniffing, and Cisco SPAN
Before there were network switches, there were network hubs. The hub was a wonderful little creature that dutifully forwarded every packet it saw to every other port on its body. This was a wonderful situation for the packet sniffer who wanted nothing more than to siphon up every packet on the network. He simply had to be plugged in to the hub on any port and immediately all the beautiful packets became available for consumption. Packet sniffer loves hub.
Enter the brutally efficient switch. After the switch boots, it begins to build a very selfish layer 2 forwarding table (the CAM table) that maps MAC addresses to switch ports. After it learns what comes whence and what goes where, it then forwards or "switches" traffic only to the corresponding port. What a disaster for the packet sniffer! Now, the packet sniffer could only see traffic destined for the MAC address that is registered to its port.
Next comes Cisco SPAN (Switched Port Analyzer). SPAN is a feature on Cisco switches that provides a capability for packet sniffers to see some or all packets on a switch or VLAN (a Virtual Local Area Network, where a physical network is partitioned into multiple smaller networks). To be clear, the switch is not simply forwarding each packet so that the sniffer can see it; in fact, the switch is making a copy of each packet and forwarding the copy to the packet sniffer. Essentially, by configuring SPAN on a Cisco switch, the packet sniffer can now happily gobble up all of the packets on a network.
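For readers who prefer code to romance, here is a minimal sketch of the hub versus switch behavior described above. It is a toy model, not how any real device is implemented: the `Hub` and `LearningSwitch` classes, the port numbers, and the single-letter MAC addresses are all made up for illustration.

```python
class Hub:
    """Toy hub: repeats every frame out of every port except the arrival port."""

    def __init__(self, ports):
        self.ports = list(ports)

    def forward(self, in_port, src_mac, dst_mac):
        return [p for p in self.ports if p != in_port]


class LearningSwitch:
    """Toy layer 2 switch with a MAC-to-port table (the CAM table)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.cam = {}

    def forward(self, in_port, src_mac, dst_mac):
        # Learn (or refresh) which port the sender lives on.
        self.cam[src_mac] = in_port
        out_port = self.cam.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood like a hub until it is learned.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the arrival port: nothing to forward.
            return []
        return [out_port]


ports = [1, 2, 3]  # hypothetical layout: hosts on ports 1 and 2, sniffer on port 3
hub = Hub(ports)
switch = LearningSwitch(ports)

print(hub.forward(1, "A", "B"))     # [2, 3]: the sniffer sees the frame
print(switch.forward(1, "A", "B"))  # [2, 3]: B not learned yet, so flooded
print(switch.forward(2, "B", "A"))  # [1]: A is known, sniffer sees nothing
print(switch.forward(1, "A", "B"))  # [2]: B is known, sniffer sees nothing
```

Once both hosts have spoken, the sniffer on port 3 is starved, which is exactly the gap that SPAN fills.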
Given the above Switch/Packet Sniffer/SPAN love triangle, a common side effect of packet capturing on SPAN ports is duplicate packets. Duplicate packets most often occur when the packet capture source that is specified is a VLAN or port channel (a bundle of interfaces used to provide increased bandwidth and redundancy). With a VLAN source monitored in both directions, for example, a packet switched between two ports in that VLAN can be copied twice: once as it enters the switch and once as it leaves. The following configuration uses a VLAN as the source:
Cisco4948#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
Cisco4948(config)#monitor session 1 source interface vlan 5
!--- This configures vlan 5 as the source port.
Cisco4948(config)#monitor session 1 destination interface fastethernet 0/3
!--- This configures interface Fast Ethernet 0/3 as the destination port.
Cisco4948#show monitor session 1
Session 1
---------
Type : Local Session
Source Ports :
    Both : vlan 5
Destination Ports : Fa0/3
Cisco4948#
Moreover it becomes increasingly convoluted when capturing bi-directional traffic. The simplest and most proper solution is to alter the source of the SPAN configuration and make the source a specified interface, for example “interface FastEthernet 0/0”:
Cisco4948#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
Cisco4948(config)#monitor session 1 source interface fastethernet 0/0
!--- This configures interface Fast Ethernet 0/0 as the source port.
Cisco4948(config)#monitor session 1 destination interface fastethernet 0/3
!--- This configures interface Fast Ethernet 0/3 as the destination port.
Cisco4948#show monitor session 1
Session 1
---------
Type : Local Session
Source Ports :
    Both : Fa0/0
Destination Ports : Fa0/3
Cisco4948#
We were fortunate in our scenario to have done exactly as seen above; we changed the source of our SPAN from a VLAN to an interface. So let’s say you encounter a situation where it is not possible to change the source of your SPAN. Perhaps based on the environment and architecture your only option that allows you to see the necessary data flow is to have the source SPAN on a port channel or VLAN. If this is the case, duplicate packets are going to show up due to the SPAN source.
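To make the difference between the two SPAN sources concrete, here is a toy model of how many copies of one switched frame reach the sniffer. It assumes the simplest possible rule: a frame is mirrored once for each monitored (port, direction) pair it crosses. The `span_copies` function and the extra port names Fa0/1 and Fa0/2 are hypothetical; real switches have more nuance than this.

```python
def span_copies(ingress_port, egress_port, monitored):
    """Count the copies of one switched frame sent to the SPAN destination.

    monitored is a set of (port, direction) pairs, where direction is
    "rx" (frame entering the switch) or "tx" (frame leaving the switch).
    """
    copies = 0
    if (ingress_port, "rx") in monitored:
        copies += 1
    if (egress_port, "tx") in monitored:
        copies += 1
    return copies


# A VLAN source in "both" mode effectively monitors rx and tx on every member port.
vlan_ports = ["Fa0/0", "Fa0/1", "Fa0/2"]
vlan_source = {(port, direction) for port in vlan_ports for direction in ("rx", "tx")}

# An interface source in "both" mode monitors rx and tx on that one port only.
interface_source = {("Fa0/0", "rx"), ("Fa0/0", "tx")}

# One frame switched from Fa0/1 to Fa0/0 inside the VLAN:
print(span_copies("Fa0/1", "Fa0/0", vlan_source))       # 2 (a duplicate)
print(span_copies("Fa0/1", "Fa0/0", interface_source))  # 1
# The reply, switched from Fa0/0 back to Fa0/1:
print(span_copies("Fa0/0", "Fa0/1", interface_source))  # 1
```

With the interface source, both directions of the conversation are still captured, but each frame is mirrored only once.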
While we were indeed fortunate enough to resolve our SPAN issue in the most effective manner possible, we were still left with thousands of pcap files containing billions of packets, with some level of duplication. See for yourself!
[snarkbox:~/Projects] mike% capinfos -csd sample-02.pcap.gz
File name: sample-02.pcap.gz
Number of packets: 2928239
File size: 109090270 bytes
Data size: 1953148233 bytes
Needless to say, we needed a tool to remedy the situation.
We didn't need something production quality; we just needed a stopgap tool that worked to remove the duplicate packets. Python, as it happens, is perfect for such rapid prototyping. A quick game plan was drawn up with the following requirements:
- Open a specified input pcap file (optionally gzip compressed)
- Open output file (optionally user specified)
- Descend into event loop:
  - Read a packet from the input file
  - Determine if the packet has already been seen within some defined interval and:
    - If it has been seen: discard packet (increment duplicate counter)
    - If it has not been seen: write the packet, including pcap header
- When the end of the input pcap file is reached, close the output file and report the results to the user
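The first requirement, accepting an input file that may or may not be gzip compressed, can be handled by peeking at the first two bytes of the file. The sketch below is not taken from pdd.py; `open_maybe_gzip` is a hypothetical helper that shows one way to do it with only the standard library.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzip(path):
    """Open path for binary reading, decompressing on the fly if it is gzip'd."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    return gzip.open(path, "rb") if is_gzip else open(path, "rb")


# Demo with throwaway files; the payload is the little-endian pcap magic number.
payload = b"\xd4\xc3\xb2\xa1"
tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "plain.pcap")
gzip_path = os.path.join(tmpdir, "packed.pcap.gz")
with open(plain_path, "wb") as f:
    f.write(payload)
with gzip.open(gzip_path, "wb") as f:
    f.write(payload)

for path in (plain_path, gzip_path):
    with open_maybe_gzip(path) as f:
        # Both files yield the same bytes, compressed or not.
        print(os.path.basename(path), f.read() == payload)
```

Checking the magic bytes rather than the file extension means a misnamed file is still handled correctly.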
pdd.py: Pcap De-duplicator
The fruit of that labor was a 101-line Python program named pdd.py (that's right, pdd = pcap de-duplicator). Note that the program is only 80 lines without docstrings. It has the following calling conventions:
[snarkbox:~/Projects] mike% ./pdd/pdd.py -h
usage: pdd.py [-h] -f INFILE_NAME [-w WINDOW_SIZE] [-v] [-o OUTFILE_NAME] [-z]

parse a pcap file and remove duplicate packets, accepts gzip'd pcap files

optional arguments:
  -h, --help            show this help message and exit
  -f INFILE_NAME, --file INFILE_NAME
                        pcap file to sift through
  -w WINDOW_SIZE, --window_size WINDOW_SIZE
                        size of the sliding packet window, a larger window may
                        find more duplicate packets but will increase run-
                        time, default is 12
  -v, --verbose         be more verbose when reporting, -vv be even more
                        verbose
  -o OUTFILE_NAME, --outfile OUTFILE_NAME
                        output filename
  -z, --gzip            gzip the output file
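A command line like this maps naturally onto Python's argparse module. The following is a reconstruction from the help text above, not the actual pdd.py source; the `build_parser` name and the exact `dest` names are assumptions inferred from the metavars shown in the usage line.

```python
import argparse


def build_parser():
    """Build a parser whose options mirror the pdd.py help output."""
    parser = argparse.ArgumentParser(
        prog="pdd.py",
        description="parse a pcap file and remove duplicate packets, "
                    "accepts gzip'd pcap files")
    parser.add_argument("-f", "--file", dest="infile_name", required=True,
                        help="pcap file to sift through")
    parser.add_argument("-w", "--window_size", dest="window_size",
                        type=int, default=12,
                        help="size of the sliding packet window, a larger "
                             "window may find more duplicate packets but "
                             "will increase run-time, default is 12")
    # action="count" turns -v into 1 and -vv into 2.
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="be more verbose when reporting, -vv be even "
                             "more verbose")
    parser.add_argument("-o", "--outfile", dest="outfile_name",
                        help="output filename")
    parser.add_argument("-z", "--gzip", action="store_true",
                        help="gzip the output file")
    return parser


args = build_parser().parse_args(
    ["-f", "sample-01.pcap.gz", "-o", "sample-01.pdd.pcap", "-vv"])
print(args.infile_name, args.outfile_name, args.window_size, args.verbose, args.gzip)
# sample-01.pcap.gz sample-01.pdd.pcap 12 2 False
```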
A sample invocation shown against a smaller pcap file:
[snarkbox:~/Projects] mike% ./pdd.py -f sample-01.pcap.gz -o sample-01.pdd.pcap -vv
Using a window of 12, writing non-duplicates to sample-01.pdd.pcap
dup: 60 byte packet at 2010-04-30 16:43:41.859558 and 2010-04-30 16:43:41.859554:
Ethernet(src='\x00\x1aK\x00\x02\x1a', dst='\x00\x1d\xa1\xea\xec\x1b', data=IP(src='redacted', off=16384, dst='redacted', sum=44674, len=40, p=6, data=TCP(seq=3825337896, win=0, sum=45839, flags=4, dport=443, sport=33949)))
dup: 60 byte packet at 2010-04-30 16:45:42.830688 and 2010-04-30 16:45:42.830685:
Ethernet(src='\x00\x1aK\x00\x02\x1a', dst='\x00\x1d\xa1\xea\xec\x1b', data=IP(src='redacted', off=16384, dst='redacted', sum=44674, len=40, p=6, data=TCP(seq=1440424076, win=0, sum=3534, flags=4, dport=443, sport=40354)))
dup: 60 byte packet at 2010-04-30 16:47:43.831652 and 2010-04-30 16:47:43.831559:
Ethernet(src='\x00\x1aK\x00\x02\x1a', dst='\x00\x1d\xa1\xea\xec\x1b', data=IP(src='redacted', off=16384, dst='redacted', sum=44674, len=40, p=6, data=TCP(seq=3325172995, win=0, sum=41214, flags=4, dport=443, sport=40355)))
dup: 60 byte packet at 2010-04-30 16:48:55.308183 and 2010-04-30 16:48:55.308180:
Ethernet(src='\x00\x1d\xa1\xea\xec\x1b', dst='\x00PV\x8e?@', data=IP(src='redacted', off=16384, dst='redacted', sum=18987, len=40, p=6, ttl=43, data=TCP(seq=1751296284, win=0, sum=36867, flags=4, dport=2153, sport=80)))
dup: 60 byte packet at 2010-04-30 16:48:55.332592 and 2010-04-30 16:48:55.332588:
Ethernet(src='\x00\x1d\xa1\xea\xec\x1b', dst='\x00PV\x8e?@', data=IP(src='redacted', off=16384, dst='redacted', sum=18991, len=40, p=6, ttl=43, data=TCP(seq=275966928, win=0, sum=42310, flags=4, dport=2150, sport=80)))
dup: 60 byte packet at 2010-04-30 16:49:44.832697 and 2010-04-30 16:49:44.832693:
Ethernet(src='\x00\x1aK\x00\x02\x1a', dst='\x00\x1d\xa1\xea\xec\x1b', data=IP(src='redacted', off=16384, dst='redacted', sum=44674, len=40, p=6, data=TCP(seq=924029966, win=0, sum=47376, flags=4, dport=443, sport=40357)))
dup: 60 byte packet at 2010-04-30 16:51:45.833485 and 2010-04-30 16:51:45.833481:
Ethernet(src='\x00\x1aK\x00\x02\x1a', dst='\x00\x1d\xa1\xea\xec\x1b', data=IP(src='redacted', off=16384, dst='redacted', sum=44674, len=40, p=6, data=TCP(seq=2836595558, win=0, sum=58154, flags=4, dport=443, sport=37427)))
dup: 60 byte packet at 2010-04-30 16:53:46.834433 and 2010-04-30 16:53:46.834430:
Ethernet(src='\x00\x1aK\x00\x02\x1a', dst='\x00\x1d\xa1\xea\xec\x1b', data=IP(src='redacted', off=16384, dst='redacted', sum=44674, len=40, p=6, data=TCP(seq=434232427, win=0, sum=39254, flags=4, dport=443, sport=37428)))
Of 163831 total packets, I wrote 163823 and found 8 duplicates
Another invocation, this time against a large file:
[snarkbox:~/Projects] mike% ./pdd.py -f sample-02.pcap.gz
Using a window of 12, writing non-duplicates to sample-02.pcap.gz.pdd.22213
Of 2928239 total packets, I wrote 1467450 and found 1460789 duplicates
The Special Sauce
The real work of detecting duplicate packets inside pdd.py is accomplished by implementing a simple sliding window across the input stream of packets. Many readers will be familiar with the sliding window protocol used with TCP. As pdd.py runs, it checks each packet against the entries already in its window. If the packet does not already exist in the window (it is not a duplicate), it is written to the output file and appended to the left side of the window. If the window is already full, the oldest packet is first popped off the right side and thrown away. For pdd.py, the sliding window was implemented using Python's collections.deque object. This object was chosen since it supports efficient push and pop operations against both ends of the queue. The built-in Python list object does support pushes and pops from both ends; however, it suffers a serious performance penalty when making changes to the head of the list, because every remaining element has to be shifted. This whole process is depicted below in Figure 1.
The Python function employing the deque is shown below:
import sys
from collections import deque

def deduplicate_pcap(infile, outfile, pcap, window_size, verbosity):
    """Uses a sliding window of recently seen packets to remove duplicates.

    infile: original pcap file
    outfile: newly created output file obeying the dpkt.pcap interface
    pcap: iterable of (timestamp, packet) pairs read from infile
    window_size: size of the sliding window of packets that get compared
    verbosity: level of verbosity as specified by the user
    """
    sliding_window = deque()
    tot_count = 0
    pkt_count = 0
    dup_count = 0
    for ts, pkt in pcap:
        tot_count += 1
        for stored_pkt, stored_ts in sliding_window:
            if pkt == stored_pkt:
                dup_count += 1
                # report the duplicate (found_dup is defined elsewhere in pdd.py)
                found_dup(pkt, ts, stored_ts, verbosity)
                break
        else:
            # the inner loop finished without a break: not a duplicate
            outfile.writepkt(pkt, ts)
            pkt_count += 1
            if len(sliding_window) >= window_size:
                # once deque is full pop off the rightmost (oldest) item
                sliding_window.pop()
            # add a new entry to the left side of the packet deque
            sliding_window.appendleft((pkt, ts))
    sys.stderr.write("Of %d total packets, I wrote %d and found %d duplicates\n" %
                     (tot_count, pkt_count, dup_count))
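If you want to play with the idea without a pcap file or dpkt installed, the same technique fits in a few lines. This is a simplified variant rather than the pdd.py code: the `dedupe` generator is a hypothetical name, the "packets" are plain byte strings, and it leans on the `maxlen` argument of `collections.deque` so that the oldest entry is discarded automatically when the window is full.

```python
from collections import deque


def dedupe(packets, window_size=12):
    """Yield the (ts, pkt) pairs whose packet is not in the recent window."""
    # With maxlen set, appending to a full deque silently drops the oldest item.
    window = deque(maxlen=window_size)
    for ts, pkt in packets:
        if pkt in window:
            # Seen recently: treat as a duplicate and skip it.
            continue
        window.append(pkt)
        yield ts, pkt


# Made-up stream: each of the first two packets arrives twice, microseconds apart.
stream = [
    (0.000001, b"syn"),
    (0.000004, b"syn"),
    (0.000010, b"ack"),
    (0.000013, b"ack"),
    (0.000020, b"rst"),
]
unique = list(dedupe(stream))
print(len(stream), len(unique))   # 5 3
print([pkt for _, pkt in unique])  # [b'syn', b'ack', b'rst']
```

As in pdd.py, only packets that were actually written go into the window, so a duplicate never pushes a useful entry out.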
What Size Window?
Certainly there is some magic in choosing a window size. The larger the window, the longer pdd.py takes to run. Searching the deque is linear in the window size: for every packet that is not a duplicate, pdd.py iterates over the entire window before giving up. Therefore, the user should choose the smallest effective window size. However, if too small a size is chosen, duplicate packets could be missed. An optimal value should probably hinge upon the number of packets captured and the network topology as it relates to the expected number of duplicates. We're not really concerned with the packet rate or speed of the network since we're using an absolute packet window and not a time-based window. If the cause of the duplication is a layer 2 configuration issue, the duplicates are likely to be laid out very tightly (in some cases sequentially), and a smaller window will suffice. If something is delaying the duplicate packets before the sniffer sees them, they might be scattered more sparsely throughout the pcap file, and a larger window is needed. As a default, we chose 12. It seemed to be a good compromise between speed and efficacy.
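The effect of the window size is easy to demonstrate on synthetic data. The sketch below also swaps in a different lookup strategy from the one pdd.py uses: it keeps a set of SHA-1 digests alongside the deque so that each membership test is constant time instead of a scan of the window. The `count_duplicates` name and the test streams are invented for illustration.

```python
from collections import deque
import hashlib


def count_duplicates(packets, window_size):
    """Count packets matching one of the last window_size unique packets."""
    order = deque()   # digests of windowed packets, oldest on the left
    members = set()   # the same digests, for constant-time lookups
    dups = 0
    for pkt in packets:
        digest = hashlib.sha1(pkt).digest()
        if digest in members:
            dups += 1
            continue
        order.append(digest)
        members.add(digest)
        if len(order) > window_size:
            # Window overflowed: forget the oldest packet.
            members.discard(order.popleft())
    return dups


# Tightly packed duplicates: every packet is immediately followed by its copy.
tight = [b"a", b"a", b"b", b"b", b"c", b"c"]
print(count_duplicates(tight, 1))     # 3

# Delayed duplicates: ten packets, then the same ten again.
delayed = [("p%d" % i).encode() for i in range(10)] * 2
print(count_duplicates(delayed, 4))   # 0 (window too small, all missed)
print(count_duplicates(delayed, 12))  # 10
```

The trade-off with the digest approach is extra hashing work per packet and a vanishingly small chance of a hash collision, in exchange for a run time that no longer grows with the window size.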
Another Option: Wireshark's editcap
At the time of the project, unbeknownst to the SR&O team, there existed prior art. As it so often happens in the byzantine world of computer security, someone else had encountered and solved our problem before we did. Inside the Wireshark suite exists a cache of command line tools including one called editcap. This handy tool allows a user to edit or translate the contents of a pcap file, including the ability to remove duplicate packets. As validation that we were indeed on the right track, editcap also employs a user-defined sliding window (and also offers the option to use a time-based window).
Other tools and options no doubt exist to resolve packet duplication issues, but we hope we have provided you with not only a few options and solutions from our own experience, but also a clear understanding of the root cause.
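For completeness, here is a small Python wrapper sketch for driving editcap's duplicate removal. It relies on three editcap options: `-d` (remove duplicates using editcap's default packet window), `-D` (packet window of a given size), and `-w` (time-based window in seconds). The helper names and the output file name are hypothetical, and you should check `editcap -h` on your own installation, since option behavior can differ between Wireshark versions.

```python
import subprocess


def editcap_dedup_cmd(infile, outfile, packet_window=None, time_window=None):
    """Build an editcap command line that removes duplicate packets."""
    cmd = ["editcap"]
    if time_window is not None:
        # Time-based window, in seconds.
        cmd += ["-w", str(time_window)]
    elif packet_window is not None:
        # Packet-count window of the requested size.
        cmd += ["-D", str(packet_window)]
    else:
        # Let editcap use its default duplicate window.
        cmd.append("-d")
    cmd += [infile, outfile]
    return cmd


def run_editcap(cmd):
    """Run the command; requires editcap to be installed and on the PATH."""
    subprocess.run(cmd, check=True)


# Build (but do not run) a command equivalent to pdd.py's default window of 12.
cmd = editcap_dedup_cmd("sample-02.pcap.gz", "sample-02.dedup.pcap", packet_window=12)
print(" ".join(cmd))
# editcap -D 12 sample-02.pcap.gz sample-02.dedup.pcap
```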
There are several key takeaways here:
- Know and understand the nature of your switching infrastructure – this helps with understanding where to place packet sniffers and ultimately the source and destination of the SPANs
- Know and understand the devices that are connected to these infrastructures – while we may not have discussed it, it is also important to know if servers have multiple/redundant connections to switches, how the keepalives and heartbeats are propagated, and so on, in order to understand the nature of packet flow and whether you will see multiple/potentially duplicate packets on the same switch
- Use SPAN source interfaces vs port channels or VLANs where possible
- There is usually more than one way to skin a cat – sometimes it makes sense to find an existing solution while sometimes solving a problem on your own is a fun and rewarding exercise
Andrae Middleton actually co-wrote this blog with me. He wrote all of the switch configurations. Additionally, I could never have gotten anywhere in life without being under the tutelage of esteemed programming leviathans William McVey and Nathan Ramella. From them, I learned the wonderful Python programming language.