When Fox-IT published their report regarding malvertisements coming from Yahoo, they estimated the attack began on December 30, 2013, while also noting that other reports indicated the attack may have begun earlier. Meanwhile, Yahoo intimated a different timeframe for the attack, claiming “From December 31 to January 3 on our European sites, we served some advertisements that did not meet our editorial guidelines – specifically, they spread malware.”
With so much uncertainty regarding this attack, Cisco TRAC decided to review what data we had to see if we could sort out some of the competing claims. Cisco Security Intelligence Operations data regarding the Yahoo incident supports the conclusion that the attack against Yahoo began on December 31. However, the malicious advertisements were just one attack in a long series of other attacks waged by the same group.
Read More »
Tags: malware, TRAC
There are many advantages in outsourcing functions to specialist providers that can supply services at lower cost and with more functionality than could be supplied in-house. However, companies should be aware that when buying services, you may also be buying risk. Organisations that have successfully implemented strategies to reduce the probability of experiencing a breach, and to decrease the time required to discover and remediate breaches, may still encounter embarrassing public breaches via third parties. Within the past two weeks, we have seen two examples of companies having their websites defaced apparently due to security lapses in service providers.
Read More »
Tags: dns, TRAC
In the last chapter of our five part Big Data in Security series, expert Data Scientists Brennan Evans and Mahdi Namazifar join me to discuss their work on a cloud anti-phishing solution.
Phishing is a well-known historical threat. Essentially, it’s social engineering via email and it continues to be effective and potent. What is TRAC currently doing in this space to protect Cisco customers?
Brennan: One of the ways that we have traditionally confronted this threat is through third-party intelligence in the form of data feeds. The problem is that these social engineering attacks have a high time dependency. If we solely rely on feeds, we risk delivering data to our customers that may be stale so that solution isn’t terribly attractive. This complicates another issue with common approaches with a lot of the data sources out there: many attempt to enumerate the solution by listing compromised hosts and in practice each vendor seems to see just a small slice of the problem space, and as I just said, oftentimes it’s too late.
We have invested a lot of time in looking at how to avoid the problem of essentially being an intelligence redistributor and instead look at the problem firsthand using our own rich data sources -- both external and internal - and really develop a system that is more flexible, timely, and robust in the types of attacks it can address.
Mahdi: In principle, we have designed and built prototypes around Cisco’s next generation phishing detection solution. To address the requirements for both an effective and efficient phishing detection solution, our design is based on Big Data and machine learning. The Big Data technology allows us to dig into a tremendous amount of data that we have for this problem and extract predictive signals for the phishing problem. Machine learning algorithms, on the other hand, provide the means for using the predictive signals, captured from historical data, to build mathematical models for predicting the probability of a URL or other content being phishing.
Read More »
Tags: analytics, Big Data, Cisco, cloud, database, email, innovation, Intelligence, operations, phishing, security, TRAC, TRAC Big Data Analysis
Following part three of our Big Data in Security series on graph analytics, I’m joined by expert data scientists Dazhuo Li and Jisheng Wang to talk about their work in developing an intelligent anti-spam solution using modern machine learning approaches on Hadoop.
What is ARS and what problem is it trying to solve?
Dazhuo: From a high-level view, Auto Rule Scoring (ARS) is the machine learning system for our anti-spam system. The system receives a lot of email and classifies whether it’s spam or not spam. From a more detailed view, the system has hundreds of millions of sample email messages and each one is tagged with a label. ARS extracts features or rules from these messages, builds a classification model, and predicts whether new messages are spam or not spam. The more variety of spam and ham (non-spam) that we receive the better our system works.
Jisheng: ARS is also a more general large-scale supervised learning use case. Assume you have tens (or hundreds) of thousands of features and hundreds of millions (or even billions) of labeled samples, and you need them to train a classification model which can be used to classify new data in real time.
Read More »
Tags: analytics, ARS, auto rule scoring, Big Data, Cisco, database, email, Hadoop, ham, innovation, Intelligence, offline learning, online learning, operations, security, spam, TRAC
Following part two of our Big Data in Security series on University of California, Berkeley’s AMPLab stack, I caught up with talented data scientists Michael Howe and Preetham Raghunanda to discuss their exciting graph analytics work.
Where did graph databases originate and what problems are they trying to solve?
Michael: Disparate data types have a lot of connections between them and not just the types of connections that have been well represented in relational databases. The actual graph database technology is fairly nascent, really becoming prominent in the last decade. It’s been driven by the cheaper costs of storage and computational capacity and especially the rise of Big Data.
There have been a number of players driving development in this market, specifically research communities and businesses like Google, Facebook, and Twitter. These organizations are looking at large volumes of data with lots of inter-related attributes from multiple sources. They need to be able to view their data in a much cleaner fashion so that the people analyzing it don’t need to have in-depth knowledge of the storage technology or every particular aspect of the data. There are a number of open source and proprietary graph database solutions to address these growing needs and the field continues to grow.
Read More »
Tags: analytics, Big Data, Cisco, database, Gremlin, InfiniteGraph, innovation, Intelligence, NoSQL, operations, security, Titan, TRAC, TRAC Big Data Analysis