The Effectiveness of Antivirus on New Malware Samples
During the course of security research we often acquire new malware samples. The first step is usually to determine what we have acquired: whether it is a new or otherwise unknown sample, or a mutation of something we have already seen. There are several ways to test a sample, but the simplest is to compare its MD5 checksum against known checksums. Several services let you look up the hash of a sample, such as Malware Hash Registry by Team Cymru, VirusTotal, and MalwareHash. These services work by analyzing samples with antivirus products from several vendors (often thirty or forty different products). If the sample has previously been analyzed, the results will often show what percentage of antivirus products detect it. Most of the time this method is sufficient for samples that are more than a few days old. For recent samples (perhaps discovered within the last twenty-four hours), however, its effectiveness is marginal, which illustrates the highly reactive nature of the industry.
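As a rough illustration of the checksum step, the sketch below computes the MD5 digest of a sample with Python's standard hashlib module. The function names md5_of_stream and md5_of_file are hypothetical helpers, not part of any lookup service, and the in-memory bytes stand in for a real sample file. No lookup request is shown, since each service has its own interface.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=65536):
    """Return the hex MD5 digest of a binary file-like object.

    Reading in chunks keeps memory use flat even for large samples.
    """
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Hypothetical convenience wrapper: hash a sample stored on disk."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


# In-memory stand-in for a sample file (assumption for illustration only).
demo_sample = io.BytesIO(b"abc")
print(md5_of_stream(demo_sample))  # 900150983cd24fb0d6963f7d28e17f72
```

The resulting hex string is what would be submitted to a hash lookup service. Because any single changed byte produces a different digest, a mutated sample will not match a previously known hash.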
Since antivirus products are often used as a cure for poor user discretion, I decided to track their effectiveness on the new malware samples we received, and to retest some of the samples a week later to see how the coverage improved. The results suggest that new malware samples have a window of opportunity during which end users are particularly vulnerable to the new strains.
Every day, new samples were automatically tested against publicly available sources. Most of the samples were very recent. If a sample did not match a known MD5 hash, it was manually submitted to the services. The best success rate for each sample was tracked (meaning how many different antivirus products detected the malware out of the number of products that were tested). The results showed that in many cases the coverage for a particular sample is poor at the beginning, creating a window when the new malware strains can be particularly effective. Later, the samples that tested poorly were tested again to see how the detection rate had improved after a week. The threshold that I arbitrarily set for poor detection was a 30% effectiveness rate, meaning less than 30% of the antivirus products detected the sample at that time (and more than 70% failed to detect it). Similarly, I tracked success percentages in the 30-49% and 50-69% ranges, and treated a detection rate of 70% or greater as relatively effective.
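The bucketing described above can be sketched in a few lines. This is a minimal illustration, not the tooling actually used for the study: the function names and the example result pairs are hypothetical, and only the thresholds (30%, 50%, 70%) come from the text.

```python
def detection_rate(detected, tested):
    """Percentage of antivirus products that flagged the sample."""
    if tested <= 0:
        raise ValueError("tested must be a positive number of products")
    return 100.0 * detected / tested


def best_rate(results):
    """Best rate across services; results is a list of (detected, tested) pairs."""
    return max(detection_rate(detected, tested) for detected, tested in results)


def bucket(rate):
    """Map a detection rate (in percent) to the ranges used in this article."""
    if rate < 30:
        return "less than 30% (poor)"
    if rate < 50:
        return "30-49%"
    if rate < 70:
        return "50-69%"
    return "70% or greater (relatively effective)"


# Hypothetical sample checked by two services: 9 of 40 and 7 of 35 products.
rate = best_rate([(9, 40), (7, 35)])
print(f"{rate:.1f}% -> {bucket(rate)}")  # 22.5% -> less than 30% (poor)
```

Using strict upper bounds (rate < 30, rate < 50, rate < 70) keeps the ranges from overlapping, so a sample at exactly 30% falls in the 30-49% range.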
Day of Sample Results
Over the course of a few days we collected 152 malware samples that were likely to be relatively new. Below is a breakdown of what was collected:
- Total Unique Samples: 152
- Detection Rate less than 30%: 43 samples (28%)
- Detection Rate 30-49%: 47 samples (31%)
- Detection Rate 50-69%: 34 samples (22%)
- Detection Rate 70% or greater: 28 samples (18%)
Of the relatively new malware specimens, only about 41% (62 of 152 samples) were detected by at least half of the antivirus products, while about 59% (90 samples) were detected by fewer than half. A little over one quarter of the samples (28%) were detected by less than 30% of the antivirus products, which is an alarming statistic. Although we know that malware is continually evolving to avoid detection and that detection is continually improving to adapt to these changes, there is still a very distinct window of opportunity where detection is poor. End users should be on guard during this window: for the majority of these new samples, any given antivirus product was more likely to miss the sample than to save the user from a poor decision. Malware authors make these daily changes to foil antivirus products because it is an effective method.
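The percentages in this breakdown follow directly from the reported counts. The short check below uses only the figures given above; the dictionary labels are shorthand chosen for this sketch.

```python
# Sample counts per detection-rate range, as reported in the breakdown.
counts = {"<30%": 43, "30-49%": 47, "50-69%": 34, ">=70%": 28}

total = sum(counts.values())  # 152 unique samples

# Share of all samples in each range, in percent (one decimal place).
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Samples detected by at least half of the products, and the remainder.
at_least_half = counts["50-69%"] + counts[">=70%"]
under_half = total - at_least_half

for label, n in counts.items():
    print(f"{label:>7}: {n:3d} samples ({shares[label]}%)")
print(f"at least half: {at_least_half} ({100 * at_least_half / total:.1f}%)")
print(f"under half:    {under_half} ({100 * under_half / total:.1f}%)")
```

The four ranges account for all 152 samples, and the two upper ranges together give the 62 samples that at least half of the products detected.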
Results One Week Later
To illustrate this reactive approach to malware, I followed the 43 samples that were poorly detected (meaning a detection rate of less than 30%) and re-evaluated them one week later. After one week, 22 of these samples had detection rates above 70%, and 35 of the samples were detected by more than 50% of the antivirus products. The overall detection rate for these samples was 18.6% near the day of acquisition, and it improved to 62.9% one week later.
Figure: Antivirus detection improvement over time for samples with a detection rate of less than 30% on the day of sample acquisition
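To put the one-week retest in proportion, the sketch below works through the arithmetic using only the reported figures. It assumes that the 35 samples above 50% include the 22 samples above 70%, which the text implies but does not state outright.

```python
poorly_detected = 43   # samples under 30% detection on the day of acquisition
above_70 = 22          # of those, above 70% detection one week later
above_50 = 35          # of those, above 50% one week later (assumed to include above_70)

initial_rate = 18.6    # overall detection rate near acquisition, in percent
week_later_rate = 62.9 # overall detection rate one week later, in percent

share_above_70 = round(100 * above_70 / poorly_detected, 1)  # 51.2
share_above_50 = round(100 * above_50 / poorly_detected, 1)  # 81.4
still_at_or_below_50 = poorly_detected - above_50            # 8 samples
gain_points = round(week_later_rate - initial_rate, 1)       # 44.3 percentage points

print(f"{share_above_70}% of the poorly detected samples rose above 70%")
print(f"{share_above_50}% rose above 50%; {still_at_or_below_50} samples did not")
print(f"overall detection improved by {gain_points} percentage points")
```

Even after a week, 8 of the 43 samples were still missed by half or more of the products.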
Though the improvement over one week was significant, this example underscores the fact that antivirus products are reactive and are likely to be only modestly effective when dealing with new samples. This leaves a window of opportunity for miscreants, during which end users are particularly vulnerable even if they have antivirus products deployed and updated. Antivirus is not a replacement for end-user discretion and defense in depth.