How We Apply Machine Learning in Cisco Advanced Threat Solutions
There is a lot of talk lately about machine learning when it comes to cyber security. It seems like you can’t have a conversation about one without the other. Many of the organizations I’ve spoken to in the last couple of months are interested in learning more, but often end up more confused after they begin researching. Take a look at Demystifying Machine Learning in Endpoint Security for a good primer.
At Cisco, we have been using ML for decades, so the topic isn’t new. In security alone we have numerous teams and more than 20 Ph.D.s in machine learning. Our teams use machine learning as a method for detecting and analyzing threats. It’s a method, not an outcome. That’s an important distinction in security. In the last few years we have seen many companies tout their machine learning, but they’re rarely willing to explain what that really means.
How is Cisco different?
In 2013 we acquired Cognitive Security, a company completely dedicated to machine learning. We quickly integrated their technology, now called Cognitive Intelligence, with our web security solutions to increase detections (see blogs). This was a passive approach to detection: logs are sent from the proxy to Cognitive Intelligence for analysis. We analyze attributes of the logs, never needing to look at the payload, to discover anomalous activity against a baseline of normal behavior. The outcome is simple: Cognitive Intelligence only alerts on hosts that it can definitively say are compromised. Since it only alerts you to confirmed infections, analysts don’t waste time on triage and can move straight to remediation and cleanup.
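As a rough illustration of what analyzing log attributes rather than payloads can look like, here is a minimal Python sketch that flags unusual values against a baseline of normal activity. It is not how Cognitive Intelligence works internally; the bytes-out attribute, the sample values, and the z-score threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def zscore(value, history):
    """How many standard deviations `value` sits from the mean of `history`.

    `history` needs at least two data points for a sample standard deviation.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any different value is maximally unusual.
        return 0.0 if value == mu else float("inf")
    return (value - mu) / sigma

def flag_anomalies(history, new_values, threshold=3.0):
    """Return the new values that deviate from the baseline by more than `threshold`."""
    return [v for v in new_values if abs(zscore(v, history)) > threshold]

# Hypothetical per-request "bytes sent" values for one host, taken from
# proxy log metadata only. No payload is inspected.
baseline_bytes_out = [1000, 1100, 900, 1050, 950, 1020, 980]
observed = [1010, 50000]

suspicious = flag_anomalies(baseline_bytes_out, observed)
print(suspicious)
```

A single statistical outlier like this is only an anomaly, not proof of compromise. A real system would combine many attributes and further classification before raising an alert.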
This was just the beginning of embedding machine learning into our security portfolio. We quickly realized the value of this technology and began to extend its strong analytics capabilities into other parts of the security stack. We incorporated the algorithms to correlate massive amounts of data and provide intelligence beyond what could be seen from a single vector. For example, a compromised host that has admin privileges and is making lateral movements would be impossible to detect using a single technology. However, if you correlate network traffic data with outbound proxy communication, you can connect the pieces and identify it. And that’s where we realized the true value of what we had.
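The idea of connecting multiple pieces can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the correlation logic Cisco uses: the source names, host names, and the "flagged by at least N sources" rule are all hypothetical.

```python
def correlate(signals_by_source, min_sources=2):
    """Return hosts flagged by at least `min_sources` independent sources.

    `signals_by_source` maps a source name to the hosts that source flagged.
    The result maps each qualifying host to the sorted list of sources.
    """
    sources_per_host = {}
    for source, hosts in signals_by_source.items():
        for host in set(hosts):  # ignore duplicate flags from one source
            sources_per_host.setdefault(host, set()).add(source)
    return {
        host: sorted(sources)
        for host, sources in sources_per_host.items()
        if len(sources) >= min_sources
    }

# Hypothetical weak signals, none of which is conclusive on its own.
signals = {
    "proxy": ["host-a", "host-b"],       # unusual outbound communication
    "network": ["host-b", "host-c"],     # lateral movement in flow data
    "identity": ["host-b"],              # account with admin privileges
}

print(correlate(signals, min_sources=3))
```

Each source alone flags several hosts, but only one host appears in all three, which is the kind of overlap a single technology cannot see.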
Machine Learning applied to network telemetry
Cisco is obviously known for pioneering switches and routers. Basically, we built the backbone of the internet and most organizations’ infrastructure. This existing network infrastructure is a rich source of data. For example, Stealthwatch collects and analyzes network telemetry in order to pinpoint threats that might be lurking within. It also integrates with the Cognitive Intelligence machine learning engine, which correlates threat behaviors seen locally within the enterprise with those seen globally. It can detect anomalies, and is also smart enough to then classify actual individual pieces of “threat activity” (because what is anomalous might not necessarily be malicious), which leads to high-fidelity and critical alerts. It’s also the core technology behind Encrypted Traffic Analytics (ETA), which can detect malware in encrypted traffic without decryption, an industry first!
Machine Learning in Endpoint Security
When it comes to discussing endpoint security, it’s commonly accepted that signature-based detection (such as file hashes) is a part of the solution, not THE solution. Changing file hash values or IP address ranges is trivial, which means adversaries can generate new SHA256 hashes for each infection. While a hash value may be sufficient to identify a single malicious file, it doesn’t help identify other related infections of polymorphic malware that may be associated with the same exploit or even the same attacker. The same hash will simply never be seen twice.
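To see why hashes alone are brittle, consider this small Python sketch using the standard hashlib module. The byte strings are made-up stand-ins for a file and a trivially modified copy, not real malware samples.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of `data` as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical file contents and a variant with one extra padding byte.
original = b"MZ" + b"\x00" * 64 + b"example-payload"
variant = original + b"\x00"

original_hash = sha256_hex(original)
variant_hash = sha256_hex(variant)

print(original_hash)
print(variant_hash)
print("hashes match:", original_hash == variant_hash)
```

Appending a single byte produces a completely different digest, so a blocklist entry for the first hash says nothing about the second file.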
When we apply machine learning to these files, we are able to dissect each one that we analyze into pieces. It’s akin to looking at the individual pieces that make up the car versus the whole car. Yes, cars have tires, an engine, a windshield, windows, a frame, and so on. But obviously not all cars are created equal. Same goes for malware. We can break down each individual threat into excruciating detail (over 400 different attributes). These attributes are used as discrete classifiers in the machine learning model. The increased level of detail results in a smarter, better-trained algorithm, as well as higher-fidelity results. This means our machine learning is better at picking up those new and reengineered threats. Threat actors often repackage their exploits in different formats, such as the Flash vulnerability from CVE-2018-4878 that was used in multiple exploits, including ROKRAT and its follow-up campaign. Machine learning is one of 14 different techniques AMP for Endpoints uses to detect and protect against threats.
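One simple way to picture attribute-level comparison is set similarity. The Python sketch below uses Jaccard similarity, which is a stand-in technique chosen for illustration and not the model AMP for Endpoints uses. The attribute names are hypothetical, and a real system would use hundreds of attributes and a trained model rather than a single score.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty attribute sets are treated as identical
    return len(a & b) / len(a | b)

# Hypothetical attributes extracted from three files.
known_threat = {
    "imports_crypto_api",
    "writes_startup_key",
    "packed_section",
    "contacts_cloud_storage",
}
repackaged_variant = {
    "imports_crypto_api",
    "writes_startup_key",
    "contacts_cloud_storage",
    "embedded_flash_object",
}
benign_file = {"signed_binary", "imports_ui_library"}

print(jaccard(known_threat, repackaged_variant))
print(jaccard(known_threat, benign_file))
```

The repackaged variant would have a different hash from the known threat, yet it still shares most of its attributes, while the benign file shares none.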
Stitching the pieces together
One of the ways we’re stepping up the game at Cisco is by defining threat actor models using the machine learning and analytics engine, Cognitive Intelligence. By correlating telemetry from web proxy logs (Cisco and third-party), network telemetry (from Stealthwatch), and SHA256 values and file behavior from AMP, it identifies how attackers operate, what they do, and even who they are. When we feed this amount of data into our machine learning algorithms, we get an unparalleled level of detection and, more importantly, we block more threats before they become a problem. We’ll explore various classifiers in depth in future blogs.
You can test AMP for Endpoints yourself with a free trial here: www.cisco.com/go/tryamp.
For a more in-depth look at how we apply machine learning inside Cisco security solutions, check out this technical video.