In the last chapter of our five-part Big Data in Security series, expert data scientists Brennan Evans and Mahdi Namazifar join me to discuss their work on a cloud anti-phishing solution.
Phishing is a well-known historical threat. Essentially, it’s social engineering via email and it continues to be effective and potent. What is TRAC currently doing in this space to protect Cisco customers?
Brennan: One of the ways that we have traditionally confronted this threat is through third-party intelligence in the form of data feeds. The problem is that these social engineering attacks are highly time-dependent. If we rely solely on feeds, we risk delivering stale data to our customers, so that solution isn’t terribly attractive. Staleness compounds another issue with many of the data sources out there: they attempt to solve the problem by enumerating compromised hosts, yet in practice each vendor seems to see just a small slice of the problem space, and, as I just said, oftentimes the listing comes too late.
We have invested a lot of time in finding ways to avoid being, essentially, an intelligence redistributor. Instead, we look at the problem firsthand using our own rich data sources, both external and internal, and develop a system that is more flexible, timely, and robust in the types of attacks it can address.
Mahdi: In principle, we have designed and built prototypes around Cisco’s next-generation phishing detection solution. To deliver phishing detection that is both effective and efficient, we base our design on Big Data and machine learning. Big Data technology allows us to dig into the tremendous amount of data that we have for this problem and extract predictive signals from it. Machine learning algorithms, in turn, use those predictive signals, captured from historical data, to build mathematical models that predict the probability that a URL or other content is phishing.
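To make the idea of turning predictive signals into a probability a little more concrete, here is a minimal sketch in Python. It is not the TRAC system: the feature list, the word list, the toy training URLs, and every function name are assumptions made up for illustration, and logistic regression is simply one common way to map signals to a probability.

```python
import math
from urllib.parse import urlparse

# Hypothetical signal: words that often appear in phishing URLs (assumed list).
SUSPICIOUS_WORDS = ("login", "verify", "account", "secure", "update")


def extract_features(url):
    """Turn a URL into a small numeric feature vector (all signals are illustrative)."""
    host = urlparse(url).hostname or ""
    text = url.lower()
    return [
        1.0,                                               # bias term
        min(len(url), 200) / 200.0,                        # normalized URL length
        min(host.count("."), 5) / 5.0,                     # depth of dotted host name
        1.0 if host.replace(".", "").isdigit() else 0.0,   # host is a raw IPv4 address
        1.0 if "@" in url else 0.0,                        # user-info trick in the URL
        sum(word in text for word in SUSPICIOUS_WORDS) / len(SUSPICIOUS_WORDS),
    ]


def predict(weights, features):
    """Logistic model: squash the weighted sum of signals into a probability."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=500, learning_rate=0.5):
    """Fit the weights with plain stochastic gradient descent on labeled URLs."""
    weights = [0.0] * len(extract_features(samples[0][0]))
    for _ in range(epochs):
        for url, label in samples:
            features = extract_features(url)
            error = predict(weights, features) - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    return weights


# Toy, made-up historical data: 1 = phishing, 0 = benign.
TRAINING = [
    ("http://192.168.10.5/secure/login/verify-account", 1),
    ("http://paypal.com.account-update.example.net/login", 1),
    ("http://user@bank.example.org.verify.example.net/update", 1),
    ("https://www.example.com/", 0),
    ("https://docs.example.org/guide/install", 0),
    ("https://news.example.com/articles/2013/security", 0),
]

weights = train(TRAINING)


def phishing_probability(url):
    """Score a URL with the trained toy model."""
    return predict(weights, extract_features(url))


if __name__ == "__main__":
    for url, _ in TRAINING:
        print(f"{phishing_probability(url):.3f}  {url}")
```

With only six training URLs the numbers mean very little; the point is the shape of the pipeline: extract signals, learn weights from labeled history, then score new content.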
Tags: analytics, Big Data, Cisco, cloud, database, email, innovation, Intelligence, operations, phishing, security, TRAC, TRAC Big Data Analysis
Following part three of our Big Data in Security series on graph analytics, I’m joined by expert data scientists Dazhuo Li and Jisheng Wang to talk about their work in developing an intelligent anti-spam solution using modern machine learning approaches on Hadoop.
What is ARS and what problem is it trying to solve?
Dazhuo: From a high-level view, Auto Rule Scoring (ARS) is the machine learning system for our anti-spam system. The system receives a lot of email and classifies each message as spam or not spam. From a more detailed view, the system has hundreds of millions of sample email messages, and each one is tagged with a label. ARS extracts features or rules from these messages, builds a classification model, and predicts whether new messages are spam or not spam. The greater the variety of spam and ham (non-spam) we receive, the better our system works.
Jisheng: ARS is also a more general large-scale supervised learning use case. Assume you have tens (or hundreds) of thousands of features and hundreds of millions (or even billions) of labeled samples, and you need to use them to train a classification model that can classify new data in real time.
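The extract-rules, score-them, classify loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ARS implementation: the five regular-expression rules, the `OnlineRuleScorer` class, and the sample messages are all hypothetical, and the per-rule scores are learned with a simple online logistic update, one labeled message at a time.

```python
import math
import re
from collections import defaultdict

# Hypothetical rules: each named pattern either fires on a message or does not.
RULES = {
    "mentions_free": re.compile(r"\bfree\b", re.I),
    "mentions_winner": re.compile(r"\bwinner\b", re.I),
    "many_exclamations": re.compile(r"!{2,}"),
    "has_url": re.compile(r"https?://", re.I),
    "mentions_meeting": re.compile(r"\bmeeting\b", re.I),
}


def fired_rules(message):
    """Return the names of all rules that match the message."""
    return [name for name, pattern in RULES.items() if pattern.search(message)]


class OnlineRuleScorer:
    """Learns one score per rule; positive scores push toward spam."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.bias = 0.0
        self.scores = defaultdict(float)

    def spam_probability(self, message):
        z = self.bias + sum(self.scores[rule] for rule in fired_rules(message))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, message, is_spam):
        """One online step: nudge the scores of the fired rules toward the label."""
        error = self.spam_probability(message) - (1.0 if is_spam else 0.0)
        self.bias -= self.learning_rate * error
        for rule in fired_rules(message):
            self.scores[rule] -= self.learning_rate * error


# Toy, made-up labeled messages (True = spam, False = ham).
SAMPLES = [
    ("You are a WINNER!!! Claim your free prize at http://example.test/prize", True),
    ("Free pills!! Order now http://example.test/pills", True),
    ("Agenda for the Monday meeting is attached", False),
    ("Can we move the meeting to 3pm? Notes at https://wiki.example.test/notes", False),
    ("Lunch tomorrow?", False),
]

scorer = OnlineRuleScorer()
for _ in range(200):                      # repeated passes stand in for a long stream
    for message, is_spam in SAMPLES:
        scorer.update(message, is_spam)

if __name__ == "__main__":
    for rule, score in sorted(scorer.scores.items()):
        print(f"{rule:18s} {score:+.2f}")
```

Because each update touches only the rules that fired, the cost per message depends on how many rules match rather than on the total number of rules, which is part of what makes this style of model attractive at the scale Jisheng describes.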
Tags: analytics, ARS, auto rule scoring, Big Data, Cisco, database, email, Hadoop, ham, innovation, Intelligence, offline learning, online learning, operations, security, spam, TRAC
Following part two of our Big Data in Security series on University of California, Berkeley’s AMPLab stack, I caught up with talented data scientists Michael Howe and Preetham Raghunanda to discuss their exciting graph analytics work.
Where did graph databases originate and what problems are they trying to solve?
Michael: Disparate data types have a lot of connections between them, and not just the types of connections that have been well represented in relational databases. The actual graph database technology is fairly nascent, really becoming prominent in the last decade. It’s been driven by the lower costs of storage and computational capacity and especially the rise of Big Data.
There have been a number of players driving development in this market, specifically research communities and businesses like Google, Facebook, and Twitter. These organizations are looking at large volumes of data with lots of inter-related attributes from multiple sources. They need to be able to view their data in a much cleaner fashion so that the people analyzing it don’t need to have in-depth knowledge of the storage technology or every particular aspect of the data. There are a number of open source and proprietary graph database solutions to address these growing needs and the field continues to grow.
Tags: analytics, Big Data, Cisco, database, Gremlin, InfiniteGraph, innovation, Intelligence, NoSQL, operations, security, Titan, TRAC, TRAC Big Data Analysis
Recently I had an opportunity to sit down with the talented data scientists from Cisco’s Threat Research, Analysis, and Communications (TRAC) team to discuss Big Data security challenges, tools, and methodologies. The following is part one of five in this series, in which Jisheng Wang, John Conley, and Preetham Raghunanda share how TRAC is tackling Big Data.
Given the hype surrounding “Big Data,” what does that term actually mean?
John: First of all, because of overuse, the “Big Data” term has become almost meaningless. For us and for SIO (Security Intelligence and Operations) it means a combination of infrastructure, tools, and data sources all coming together to make it possible to have unified repositories of data that can address problems that we never thought we could solve before. It really means taking advantage of new technologies, tools, and new ways of thinking about problems.
Tags: analytics, API, Big Data, Cisco, database, Hadoop, HDFS, innovation, Intelligence, java, mapreduce, NoSQL, operations, security, Shark, Spark, SQL, telemetry, TRAC, TRAC Big Data Analysis
Hello, and welcome to my inaugural blog! I am happy to be here sharing my thoughts and experiences with you, because I have to tell you: I have the coolest job in the world.
I’ve spent my entire life in retail, starting as a part-time worker while in school and moving up through merchandising and operations to regional vice president at Shopko Stores, Inc., overseeing the work of 12,000 employees. Over more than 20 years, I fell in love with the whole process of retail. When I was invited to work in the retail technology sector, it seemed a natural extension of the work I was already doing. Relatively few tech companies build their solutions around store needs – too often, they tend to focus on technology for technology’s sake. In fact, sometimes retailers do the same thing! I saw an opportunity to impact how vendors – and retailers – think about technologies that truly add value to the business.
Today’s trend toward mobility, or BYOD, is a great example. I’m sorry if this shocks you, but mobility without a strategy has no value at all for the retailer! I have seen stores invest in Wi-Fi networks while continuing to build cell-based apps – this despite Wi-Fi’s higher speeds, more flexible capabilities, and ability to improve the shopping and selling experience. They don’t want employees surfing the Internet, so they block employee access to the network and information that could help improve sales. They understand that shoppers are “showrooming” – sharing opinions and comparison shopping online from the store – but do not leverage the same behavior to promote products and analyze customer trends.
Mobility is a vehicle for improving the business, an extension of overall strategy. (You might like to check out this Lippis Report on “Monetizing Public Wi-Fi in Business to Consumer Relationships.”) I work with companies to help determine how to use such vehicles to define the customer experience, collect and manage large masses of data, and make store operations more efficient. I also help design the Cisco solutions that solve these retail business problems.
Join me on a journey to learn how stores are approaching, managing, and dealing with today’s innovations and how they are meeting customer needs. We’ll talk about how stores are using today’s systems, the most recent trends, the latest research, and how retailers are dealing with this very rapidly changing industry. Please get back to me with your own stories and questions in the comments section.
One more word: I love retail trivia! Comment if you know the answer to this question: Which retailer in the country has the highest sales per square foot across its stores?
Tags: byod, Cisco, mobility, operations, retail, Rose Depoe, Shopko, store, trivia, wi-fi