Threat Detection: A Big Data Approach to Security

Cisco recently announced the availability of Managed Threat Defense (MTD), a managed security solution that applies real-time, predictive analytics to detect attacks and protect against advanced malware across extended networks. MTD helps our customers address the ever-changing threats to their most important asset: data. MTD is delivered through a cost-effective business model that allows our customers to leverage Cisco’s investment in security technology, global threat intelligence knowledge base, talent, and global reach.

While developing this solution, the MTD development team talked to dozens of customers around the world. As a result of these discussions, two dominant themes emerged:

The sheer quantity of security telemetry generated in most large enterprises makes security monitoring a Big Data problem
Customers need rapid application of both threat intelligence and security analytics in order to narrow the window between compromise and detection/mitigation

If security monitoring is a Big Data problem, then we needed to develop a Big Data solution. This realization led us to the most popular open-source Big Data framework: Hadoop. But the bigger problem was how to address the second theme using Hadoop.

Historically, Hadoop consisted of two major components: a distributed file system (the Hadoop Distributed File System, or HDFS) and a distributed computing paradigm (MapReduce). Together, these components allow customers to perform complex analytics on large amounts of data. But the original use case for Hadoop and MapReduce focused on batch analytics, and batch analytics are by definition not real-time. Depending on the amount of data, the number of nodes in the cluster, the technical specifications of each node, and the complexity of the analytics, a given MapReduce job can take anywhere from minutes to hours to run. Given how quickly advanced attackers can exfiltrate data after a successful compromise, customers can’t wait hours (or, in most cases, even minutes) to know whether malicious activity is occurring on their networks. They need to know in real time or near real time, and MapReduce wasn’t designed with this use case in mind.
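To make the batch-versus-streaming distinction concrete, here is a minimal Python sketch. It is not Hadoop, MapReduce, or MTD code: the event format, the THRESHOLD value, and both function names are assumptions made purely for illustration. The batch function can only report once it has seen the whole dataset, while the streaming function flags a host the moment its count crosses the threshold.

```python
from collections import Counter

# Hypothetical threshold: flag a host after this many failed logins.
THRESHOLD = 3


def batch_detect(events):
    """Batch style: process the complete dataset, then report."""
    # "Map" each failed event to its host, then "reduce" by counting per host.
    counts = Counter(host for host, outcome in events if outcome == "fail")
    return {host for host, n in counts.items() if n >= THRESHOLD}


def stream_detect(events):
    """Streaming style: yield (index, host) as soon as a host hits the threshold."""
    counts = Counter()
    for index, (host, outcome) in enumerate(events):
        if outcome != "fail":
            continue
        counts[host] += 1
        if counts[host] == THRESHOLD:
            yield index, host


# Illustrative telemetry: (source host, login outcome).
EVENTS = [
    ("10.0.0.5", "fail"),
    ("10.0.0.7", "ok"),
    ("10.0.0.5", "fail"),
    ("10.0.0.5", "fail"),
    ("10.0.0.9", "fail"),
    ("10.0.0.7", "ok"),
]
```

Both functions flag the same host, but stream_detect does so at the fourth event rather than after the last one. On a real cluster that gap is the difference between learning of a compromise in seconds and learning of it when the batch job finishes.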

Distributed, real-time streaming analytics solutions do exist. Apache Storm is probably the best known. A myriad of other projects, of various degrees of maturity, are also trying to address the need for real-time analytics, but none of these solutions ‘played nice’ with Hadoop. Hadoop had its own resource management paradigm that supported the requirements of MapReduce, but it wasn’t easy to run other applications side-by-side with MapReduce on nodes in a Hadoop cluster.

The MTD team is using Apache Storm because of its ability to process Big Data, at scale, in real time, so we needed a way to run it alongside batch analytics on Hadoop. Enter Apache YARN (Yet Another Resource Negotiator). Apache YARN is a sub-project of Apache Hadoop. YARN separates Hadoop cluster resource management from MapReduce data processing so that other applications, apart from MapReduce, can co-exist on the same Hadoop node and share the same resources. Using YARN, organizations can run streaming analytics, such as Storm, and batch analytics on the same cluster.

The MTD development team decided to embrace YARN from its earliest production releases in fall 2013. We also worked extensively with the Apache Storm project team to ensure that Storm worked seamlessly with YARN. Using YARN-enabled Storm we can apply all sorts of interesting analytics and enrichments to security telemetry in near real-time, including:

comparison of incoming telemetry against threat intelligence indicators of compromise (IOC)
enrichment of telemetry with geo-location, DNS, and host posture information
anomaly detection via scoring against stored models.
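The kinds of per-event checks listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MTD's implementation and not Storm code: the Event fields, the IOC_IPS set, the GEO_BY_PREFIX table, the MODEL of per-host means and standard deviations, and the ANOMALY_Z cut-off are all hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass, field

# Hypothetical threat-intelligence indicators (documentation-range IPs).
IOC_IPS = {"203.0.113.10", "198.51.100.23"}

# Hypothetical geo-location table keyed by IP prefix.
GEO_BY_PREFIX = {"203.0.113.": "Country A", "198.51.100.": "Country B"}

# Hypothetical stored model: per-host (mean, std dev) of outbound bytes.
MODEL = {"host-a": (1000.0, 100.0)}

# Assumed cut-off: flag events more than 3 standard deviations from the mean.
ANOMALY_Z = 3.0


@dataclass
class Event:
    host: str
    dest_ip: str
    bytes_out: int
    tags: dict = field(default_factory=dict)


def geo_lookup(ip):
    """Return the location for the first matching prefix, else 'unknown'."""
    for prefix, country in GEO_BY_PREFIX.items():
        if ip.startswith(prefix):
            return country
    return "unknown"


def process(event):
    """Tag a single event with IOC match, geo-location, and anomaly score."""
    # 1. Compare against threat-intelligence indicators of compromise.
    event.tags["ioc_match"] = event.dest_ip in IOC_IPS
    # 2. Enrich with geo-location.
    event.tags["geo"] = geo_lookup(event.dest_ip)
    # 3. Score against the stored model using a simple z-score.
    mean, std = MODEL.get(event.host, (None, None))
    if mean is None or not std:
        event.tags["anomalous"] = False  # no usable model for this host
    else:
        z = (event.bytes_out - mean) / std
        event.tags["z_score"] = z
        event.tags["anomalous"] = abs(z) > ANOMALY_Z
    return event
```

For example, process(Event("host-a", "203.0.113.10", 1500)) tags the event as an IOC match, locates it in "Country A", and marks it anomalous with a z-score of 5.0. In a streaming system each step would typically run as its own stage so that the stages can scale independently; the z-score here merely stands in for whatever stored models are actually used.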

Hadoop was a natural fit for MTD. Building on it and a number of other elements of the Hadoop ecosystem, we’ve developed a powerful, solid, and scalable platform for security analytics, incorporating such functionality as full-packet capture, stream processing, batch processing, real-time search, and telemetry aggregation.

If you’re planning to attend Cisco Live-US the week of May 19, the Services development team will be demonstrating MTD in the Security booth, in the World of Solutions. MTD works with other Cisco security solutions, so when you visit us in the booth, you’ll also get to learn about other security solutions in Cisco’s comprehensive portfolio and how they can help you. In addition, we’ll be delivering a talk on MTD’s analytics framework in the Solution Theatre on Wednesday, May 21 at 10:30am. We hope you’ll stop by, say hello, and take a look at MTD.