AI in Cisco IT Operations: Finding Golden Needles in Ever Larger Haystacks
Customers often ask us whether artificial intelligence (AI) will be the kind of game changer that analysts are predicting. From what I’m seeing, the answer is an unconditional yes.
The trick is figuring out the right use cases. While any computer can calculate pi to a million places faster than I can sneeze, it takes an AI compute space to sort through billions of pieces of data to answer a single question. One drawback: AI doesn’t, by itself, know what questions to ask and what to do with the answers. But if you can frame the question the right way, AI can devour planet loads of information and find significant patterns. Think of it this way: AI can winnow through huge haystacks to find a needle—but first a human needs to define a needle.
Cisco IT began our journey to AI through many disconnected teams. Through the grapevine I’d heard about more than 40 projects, each funded by the group that’s using it (e.g., marketing, InfoSec, contact center). Different teams in Cisco – in IT, in Engineering, in Marketing – have already centralized their AI efforts to increase their scope.
Here are some of the ways we’re putting AI to work today.
Detecting Day Zero malware
Malware detection was our first foray into AI. We acquired Stealthwatch technology in 2015. It sifts through billions of data points about how traffic moves through our network to detect anomalous behavior that could indicate Day Zero malware. The hard part is teaching the AI engine what’s normal and what’s not. Here’s an analogy: if you have a dinner party with millions of guests (it’s a large house), how can you spot the potential thieves? Most security defenses look for signatures – in this case, mug shots of known criminals as they come through the door. But to catch the thieves who are still unknown, you have to look at their behavior. It’s normal for a party guest to wander through your house, chatting with guests and hovering around the bar; but it’s not normal for them to go immediately to the locked room containing the safe and start looking behind the pictures. When the equivalent of that happens on our network, Stealthwatch raises an alarm or takes action to isolate the threat. We regularly find Day Zero attacks in this way. To do this, we have to ask Stealthwatch to sift through over 28 billion NetFlow records every day, and continue to update it on what’s normal, acceptable behavior and what behavior is characteristic of malware attacks. But it enables us to see things that no other tool can.
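To make the "what's normal" idea concrete, here is a minimal sketch of baseline-and-deviation scoring. It is not how Stealthwatch works internally; the function names, the "distinct destinations per day" metric, the sample addresses, and the threshold of 3 standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def build_baseline(history):
    """history maps host -> list of daily distinct-destination counts (hypothetical metric)."""
    return {host: (mean(counts), stdev(counts))
            for host, counts in history.items() if len(counts) >= 2}

def anomalous_hosts(baseline, today, threshold=3.0):
    """Return (host, z_score) for hosts far above their own normal behavior."""
    flagged = []
    for host, count in today.items():
        if host not in baseline:
            continue  # no history yet, so nothing to compare against
        mu, sigma = baseline[host]
        sigma = max(sigma, 1.0)  # floor avoids dividing by zero on flat baselines
        z = (count - mu) / sigma
        if z > threshold:
            flagged.append((host, round(z, 1)))
    return flagged

# Illustrative data only: one host behaves as usual, one suddenly fans out.
history = {
    "10.0.0.5": [12, 15, 11, 14, 13],
    "10.0.0.9": [40, 38, 42, 41, 39],
}
today = {"10.0.0.5": 13, "10.0.0.9": 400}
flagged = anomalous_hosts(build_baseline(history), today)
```

Scoring each host against its own history, rather than a global rule, is the point of the dinner party analogy: the guest is judged by behavior, not by a mug shot.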
Discovering malware in encrypted traffic
More than 50% of network traffic is now encrypted, and malware hidden in encrypted traffic can sneak through traditional defenses. But encrypted traffic is so tough to crack (pun intended) that you also need machine learning: telling the AI program to find malware without telling it how. Using an AI program called Encrypted Traffic Analytics (ETA), a new Stealthwatch upgrade, we’ve learned some of the clues, such as packet lengths, arrival times, and initial handshake data packets, that signify malware even when the stream remains encrypted. ETA has found malware in encrypted streams that would have slipped right by signature analysis or even AI-based behavior analysis.
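The clues mentioned above are all visible without decrypting anything. As a rough sketch (not ETA's actual feature set or model), here is how packet lengths and arrival times for one flow might be summarized into features that a classifier could learn from. The function name, the feature names, and the sample flow are assumptions.

```python
from statistics import mean, pstdev

def flow_features(packets):
    """packets: non-empty, time-ordered list of (timestamp_seconds, length_bytes) for one flow."""
    lengths = [length for _, length in packets]
    times = [ts for ts, _ in packets]
    # Inter-arrival gaps between consecutive packets
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packet_count": len(packets),
        "mean_length": mean(lengths),
        "length_spread": pstdev(lengths),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "gap_spread": pstdev(gaps) if gaps else 0.0,
    }

# Illustrative flow: a handshake-sized packet, two full-size packets, one small packet.
flow = [(0.00, 517), (0.05, 1400), (0.10, 1400), (0.15, 120)]
features = flow_features(flow)
```

A real system would feed many such feature vectors, labeled as benign or malicious, into a trained model; this sketch stops at the feature step.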
Routing traffic over the best circuit based on predicted performance
Our midsize offices get two circuits: MPLS and VPN-over-Internet. Instead of leaving the secondary circuit idle most of the time, last year we started using Cisco Software-Defined WAN (SD-WAN) to intelligently provision secure WAN links and route application-specific traffic to the circuit that’s best for the job. The decision depends on the type of traffic (voice, video, email, etc.) and current network conditions. (For more, check out this blog by my colleague Carol Goh.) Now we’re making the decision even better by using AI to predict future circuit behavior. Say we’re about to broadcast a live 60-minute webinar. If the MPLS circuit is performing great right now but signs indicate it might degrade in 30 seconds (or 17 minutes), it’s smarter to have the SD-WAN Manager route traffic to the backup circuit.
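Here is a minimal sketch of choosing a circuit on predicted rather than current performance. It uses a plain least-squares trend line as a stand-in for whatever prediction model the product uses; the function names, circuit labels, and latency samples are assumptions for illustration.

```python
def forecast(samples, steps_ahead):
    """Fit a least-squares line to equally spaced samples (at least two) and extrapolate."""
    n = len(samples)
    x_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    denom = sum((x - x_mean) ** 2 for x in range(n))
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(samples)) / denom
    return y_mean + slope * ((n - 1 + steps_ahead) - x_mean)

def pick_circuit(latency_ms, steps_ahead):
    """Return the circuit with the lowest predicted latency, plus all predictions."""
    predicted = {name: forecast(s, steps_ahead) for name, s in latency_ms.items()}
    return min(predicted, key=predicted.get), predicted

# Illustrative samples: MPLS is faster right now but trending worse.
latency_ms = {
    "mpls": [20, 22, 25, 29, 34],
    "internet_vpn": [40, 41, 40, 39, 40],
}
choice, predicted = pick_circuit(latency_ms, steps_ahead=10)
```

With these samples the most recent MPLS reading (34 ms) still beats the Internet VPN (40 ms), yet the trend line puts MPLS well above it ten intervals out, so the sketch picks the backup circuit.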
Identifying problems and recommending solutions before they’re noticed
Cisco Software-Defined Access (SD-A) includes an AI-driven data collection and analysis platform. Cisco IT has already deployed three main Cisco DNA-Center (DNA-C) clusters, one in each of the three global regions (Americas, Europe, Asia). These AI clusters are collecting large amounts of information regarding switch traffic and performance, tracking traffic from each application and user. (This has required us to migrate several thousand switches to Catalyst 9000 models, which act as sensors to stream telemetry data to the Cisco DNA-C for analysis.) Like any AI tool, Cisco DNA-C benchmarks normal behavior and performance. It identifies when performance is degrading and compares the symptoms against a catalog of over a hundred common Cisco IT network issues. If it finds a matching pattern, it will alert a network engineer, point out exactly where in the network path there is a problem, and recommend a solution based on that pattern. The central Cisco DNA-Controller can then automatically make recommended changes to all relevant network devices.
We’ve found the Wireless Assurance part of Cisco DNA-C to be extremely helpful. It can stitch together the path of a person walking across the building floor, connecting from one Access Point to the next, and see exactly where and when their voice or video session starts to run into problems, as well as identifying where in the client device, access point or wired network the problem root cause might be. If it matches one of the hundred-plus common issues, it will recommend a fix and walk the network engineer through that fix.
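The "match a known issue, then recommend a fix" step can be sketched as a lookup against a symptom catalog. This is only an illustration of the idea, not Cisco DNA-C's logic: the catalog entries, symptom labels, and fixes below are hypothetical.

```python
# Hypothetical catalog; the real one holds over a hundred issues.
KNOWN_ISSUES = [
    {"name": "AP channel interference",
     "symptoms": {"high_retries", "low_snr"},
     "fix": "Move the access point to a less crowded channel"},
    {"name": "Uplink congestion",
     "symptoms": {"high_latency", "packet_loss"},
     "fix": "Check utilization on the wired uplink"},
]

def match_issue(observed):
    """Return the most specific catalog entry whose symptoms are all observed, else None."""
    candidates = [issue for issue in KNOWN_ISSUES if issue["symptoms"] <= observed]
    return max(candidates, key=lambda issue: len(issue["symptoms"]), default=None)

match = match_issue({"high_retries", "low_snr", "roaming"})
no_match = match_issue({"high_latency"})
```

Returning None when nothing matches mirrors the behavior described above: a recommendation appears only when the symptoms fit a known pattern.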
Data Center Management
Identifying change management problems before they happen, and recommending solutions
AI tools similar to the networking tools described for WAN and LAN are also at work in the far more complex environment of the data center. With thousands of different sets of application performance and security policies in place, enforced by the virtual overlay fabrics across ACI, it’s not easy to deploy new application policies without issues. Cisco Network Assurance Engine (CNAE), a new AI tool, automatically checks new policies for conflicts among the millions of potential connections to see where issues might arise, and recommends alternative policies to achieve the desired outcome. This keeps application security and performance at a maximum, with minimal provisioning delay due to misconfiguration anywhere in the data center. Cisco IT is running CNAE in the largest of our 3 ACI-fabric data centers today.
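As a toy illustration of what a policy conflict is (far simpler than what CNAE models), the sketch below flags any source, destination, and port combination that one policy permits and another denies. The policy fields and endpoint group names are assumptions.

```python
from collections import defaultdict

def find_conflicts(policies):
    """Return (src, dst, port) keys that carry more than one distinct action."""
    actions = defaultdict(set)
    for policy in policies:
        key = (policy["src"], policy["dst"], policy["port"])
        actions[key].add(policy["action"])
    return sorted(key for key, acts in actions.items() if len(acts) > 1)

# Illustrative policies: the new "deny" contradicts an existing "permit".
policies = [
    {"src": "web", "dst": "app", "port": 8443, "action": "permit"},
    {"src": "app", "dst": "db", "port": 5432, "action": "permit"},
    {"src": "web", "dst": "app", "port": 8443, "action": "deny"},
]
conflicts = find_conflicts(policies)
```

Running a check like this before a change is deployed, rather than after, is what keeps misconfiguration from delaying provisioning.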
Identifying the “next best action” for customers who visit our website
Our small and medium business customers generally do their product research on cisco.com. Over the years we’ve experimented with various ways to follow up with web visitors. Email or phone follow up isn’t particularly effective and can seem like spam.
Now we’re using AI to discover the next best action based on the customer’s business need and previous interactions. Working with our Marketing Analytics team, we built a platform that collects and analyzes information from cisco.com and Salesforce to find out how customers were contacted, what content they were given, and whether the action successfully moved the customer up the sales chain (for example, inspiring them to reach out to us, watch a product video, or place an order). As a result, we now know which customers are likely to respond to certain types of contacts, and at what point in the purchase decision. Preliminary results from 25 pilots in 7 countries are very strong: 4 times better customer response rate, 7-10 times fewer outbound communications that don’t result in a response, and lower costs. Even better, we’ve seen that the longer we tune the data, the better the response rates over time.
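In its simplest form, "next best action" means picking the contact type with the best track record for a given kind of customer. The sketch below does exactly that from past outcomes; it is a stand-in for our platform's model, and the segment labels, action names, and history records are made up for illustration.

```python
from collections import defaultdict

def best_actions(history):
    """history: iterable of (segment, action, responded) tuples.
    Returns segment -> (action with the highest response rate, that rate)."""
    tallies = defaultdict(lambda: [0, 0])  # (segment, action) -> [responses, attempts]
    for segment, action, responded in history:
        tally = tallies[(segment, action)]
        tally[0] += int(responded)
        tally[1] += 1
    best = {}
    for (segment, action), (responses, attempts) in tallies.items():
        rate = responses / attempts
        if segment not in best or rate > best[segment][1]:
            best[segment] = (action, rate)
    return best

# Illustrative outcomes only.
history = [
    ("smb", "email", False), ("smb", "email", False),
    ("smb", "email", True), ("smb", "email", False),
    ("smb", "video", True), ("smb", "video", True),
    ("smb", "video", False), ("smb", "video", True),
]
best = best_actions(history)
```

The more outcomes get recorded, the more reliable these rates become, which fits what we have seen as the data is tuned over time.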
Improving the customer journey
Our contact center is one of the most prolific users of AI. A few examples:
- Self-service for callers and agents. Cisco IT worked with the contact center team to build Cisco Answers, an AI-driven knowledge tool.
- Intelligent routing: When customers contact us via voice, email, or chat, we use AI to predict what they need, and then connect them to the best available agent with the right expertise. First-call resolution and customer satisfaction have both improved.
- Business insights from recorded customer calls. Like a lot of companies, we record contact center interactions for agent training. These recordings are also a gold mine of information for marketing, product development, and more. With close to 100,000 calls per day, a human couldn’t keep up, but AI can. We’ve started using Verint speech analytics to discover trends in our recorded contact center interactions. For example, if we see a spike in the phrase “software defined” in communications with people in a certain region, we might step up our marketing programs for Software Defined Networking in that region.
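The phrase-spike example in the last bullet can be sketched as a comparison between today's mention count and a trailing average. This is not the Verint product's method; the function name, the 7-day window, the doubling factor, and the counts are assumptions.

```python
def phrase_spike(daily_counts, window=7, factor=2.0):
    """True if the latest day's count exceeds `factor` times the trailing average."""
    if len(daily_counts) <= window:
        return False  # not enough history to form a baseline
    baseline = sum(daily_counts[-window - 1:-1]) / window
    # Floor the baseline at 1 so rare phrases do not trigger on tiny numbers
    return daily_counts[-1] > factor * max(baseline, 1.0)

# Illustrative daily mentions of "software defined" in one region.
mentions = [10, 12, 11, 9, 10, 11, 12, 40]
spiking = phrase_spike(mentions)
```

Here the trailing average is roughly 11 mentions a day, so a day with 40 counts as a spike and could prompt a closer look at that region.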
Optimizing inventory stores by predicting demand
Other grassroots AI projects are cropping up all over Cisco, in sales, marketing, supply chain, and elsewhere. Take supply chain. The Cisco server you order today probably isn’t built yet because we use just-in-time (JIT) manufacturing. To make it work, we need the right components on hand, just when we need them. The more accurately we can predict demand a month or two out, the less risk that we’ll under-order something like chips, delaying shipment, or over-order, tying up capital and creating the risk of loss or damage. It’s looking like AI will help us reduce inventory requirements for UCS server memory chips by a factor of ten.
AIn’t it great? Next steps
The use cases above are just a sampling of the different ways we’re leveraging the power of AI to automate extremely complex decision-making processes. We have the data. We have the compute power. What we need more of is people who understand big data, AI, and machine learning, and we’re currently re-skilling and hiring to close the IT talent gap.
How are you currently using AI in IT operations, and what are your hopes—practical and grand? I invite you to share your thoughts below.