
Never miss a crisis to drive meaningful change, and this summer will be one of those crisis-driven opportunities. The entire industry has been operating on the principle that infrastructure should be carefully tested, validated, rolled out and then changed as little as possible. In a pre-Mythos world, this infrastructure might have only a dozen or so critical vulnerabilities over the course of a year, and it would take months before attackers started to exploit them. So, an annual or semi-annual infrastructure update plan was reasonable and commonplace, especially in the data center.

Two things have fundamentally changed that are now driving this new moment. The first is that with the discovery of Salt Typhoon and Volt Typhoon, the industry has been put on notice that infrastructure – switches, routers, firewalls, load balancers, etc. – is the new attack surface. It used to take a very sophisticated attacker to find these infrastructure vulnerabilities and exploit them. But these new threats, aimed directly at critical infrastructure, have changed this paradigm.

Secondly, a new generation of frontier AI models has taken a major step forward in its ability to find vulnerabilities in any software system. AI coding tools have been around for years. Previous generations of these tools proved to be a game changer for new greenfield projects, and a small team could achieve huge leaps in productivity because the models were able to understand the full context of new, smaller code bases. But the code base for a Cisco switch, router, or firewall is large, extremely complex and has been developed over many years, making it hard for these models to comprehend in its entirety.

Until now. New frontier models have a trillion parameters, have very large context windows and can understand these complex software products in their entirety. This allows them to find obscure interdependencies that lead to vulnerabilities in the code. Here’s an unofficial milestone: our best human minds have been looking at these code bases for years and have not been able to find these vulnerabilities. AI tools have always been faster than humans and able to ingest more data than humans, but in my personal experience this is the first time an AI model has created a meaningful insight that a human could not. And since these models run at machine speed, they are finding vulnerabilities and corresponding exploits at a much, much faster rate. Data from Cisco Talos confirms this shift in velocity. We are seeing a marked compression in the time between vulnerability disclosure and the first signs of active exploitation due to attackers’ adoption of AI automation at scale.

Take these two factors together – that infrastructure is the new attack surface and that AI tools are much better able to find and exploit vulnerabilities in that infrastructure – and you can see that we have a crisis on our hands. And that brings me back to my original point: Never miss a crisis to drive meaningful change.

A new operating model for infrastructure: Live Protect

The industry – including vendors, their customers and partners – MUST adapt to this new world. Gone are the days when one could “sweat the asset” and run old systems outside of their support window. Gone are the days of the annual infrastructure upgrade. We finally need to adopt the cloud operating model where we make a lot of little changes on a near-continuous basis. It is like painting the Golden Gate Bridge: we don’t want to shut the bridge down and paint it all at once. Instead, it’s much less disruptive if we paint it in small increments while the bridge is in use. Then, once we finish and get to the other side, we simply start over. Taking a lot of little steps instead of one big one enables the job to get done without inconveniencing anyone.

This is the strategic intent behind Cisco Live Protect – a runtime capability embedded directly into NX-OS that allows administrators to apply Cisco-validated compensating controls to infrastructure between regular maintenance upgrades.

To be clear: Live Protect is not a patch. It does not replace the need for core lifecycle discipline or permanent software updates. Instead, it serves as a temporary, targeted shield that mitigates the risk of a specific vulnerability with a few clicks. It is intended to be a “finger in the dike”, an emergency control that is applied to a running system without disrupting that system between more frequent maintenance windows.

We achieve this by using eBPF (Extended Berkeley Packet Filter) to run sandboxed programs within the operating system kernel. This gives us the deep visibility and surgical control to intercept and block exploit attempts at the source without changing the kernel’s source code or requiring a system reboot.

The unique capabilities of eBPF allow us to build very fine-grained, pinpoint controls that shield a known vulnerability from being exploited. The shield can be as specific as “do not allow this particular process to access this particular file.” Because these shields are so fine-grained and specific, they are designed to have ultra-low false positive rates. In simple terms, that process should NEVER access that particular file. So, we don’t allow it. The other advantage is that the shield has a negligible performance impact.
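To make the “this process, this file” idea concrete, here is a minimal Python sketch of the decision logic only. It is not eBPF code and not the Live Protect implementation; the `Shield` class, the `is_allowed` function, and the process and file names are all hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shield:
    """Hypothetical deny rule: one process must never open one file."""
    process: str  # executable name (illustrative)
    path: str     # file path (illustrative)


def is_allowed(shields, process, path):
    """Return False only when a shield names this exact process/path pair."""
    return Shield(process, path) not in shields


# Hypothetical names, used only to show how narrow the rule is.
shields = {Shield("webui_helper", "/etc/shadow")}

print(is_allowed(shields, "webui_helper", "/etc/shadow"))        # False
print(is_allowed(shields, "webui_helper", "/var/log/messages"))  # True
print(is_allowed(shields, "sshd", "/etc/shadow"))                # True
```

Because the rule matches a single process and file pair, every other access falls through untouched, which is the intuition behind the low false positive rate described above.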

At Cisco Live, we are running a demo that shows how applying a production shield onto a running switch does not impact the health of the switch. CPU, memory, latency – it’s all preserved as if the shield was never there. This is not like trying to jam a heavy EDR solution into critical systems. This is a laser-focused control that will stop exploitation of critical vulnerabilities between updates.

Importantly, Cisco Live Protect is not a “Do It Yourself” runtime rule. When a vulnerability is disclosed, Cisco Talos analyzes the exploit path and creates a shield. Cisco then validates the shield through our internal engineering processes to help verify it is targeted, effective, and performant. Once deployed, the shield acts as a stopgap, designed to protect the system while the team prepares a permanent patch. Once the patch is applied, the shield is auto-retired.
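The auto-retirement step can be pictured with a short Python sketch. This is an assumption-laden illustration, not how Live Protect is built: the `ShieldRecord` fields, the `retire_patched` function, the advisory identifiers, and the simplified dotted version strings are all hypothetical (real NX-OS release names use a different format).

```python
from dataclasses import dataclass


def parse_version(text):
    """Turn a dotted numeric version such as '10.4.3' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class ShieldRecord:
    vuln_id: str        # hypothetical advisory identifier
    fixed_in: str       # first release assumed to contain the permanent fix
    active: bool = True


def retire_patched(shields, running_version):
    """Deactivate each active shield whose fix is already in the running release."""
    running = parse_version(running_version)
    retired = []
    for shield in shields:
        if shield.active and running >= parse_version(shield.fixed_in):
            shield.active = False
            retired.append(shield.vuln_id)
    return retired


shields = [ShieldRecord("EXAMPLE-0001", "10.4.3"),
           ShieldRecord("EXAMPLE-0002", "10.5.1")]
print(retire_patched(shields, "10.4.3"))  # ['EXAMPLE-0001']
```

In this sketch the second shield stays active after the upgrade because its fix ships in a later release, so protection for that vulnerability continues until the matching patch is applied.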

 

Live Protect changes the PSIRT workflow to be faster and more trustworthy.

To ensure comprehensive validation and bolster resilience, we are also excited to announce a collaboration with leading AI red teaming company Armadin. We will share threat research around vulnerabilities with Armadin and work with their teams to test the Live Protect shields we develop and validate their effectiveness in blocking exploits.

The validation process looks for feasibility and exploit-relevant targeting; supported platforms, releases, modes and limits; and the auto-retirement path of the shield after the software is patched. This is designed to give operators Cisco and third-party validated protection with little additional overhead for custom shield design, development, support and retirement.

It’s important that network, security, and infrastructure teams trust that these shields will protect supported infrastructure until a permanent patch can be safely deployed. Working with Talos and trusted Cisco collaborators like Armadin helps ensure this trust.

Operationalizing security

Live Protect changes the operating model from an all-or-nothing response to a staged, controlled workflow, providing three distinct modes of operation:

  • Monitor Mode: Logs exploit attempts and policy hits without enforcement, allowing teams to verify the threat before taking action.
  • Enforce Mode: Actively blocks or mitigates threat activity in real time.
  • Disable Mode: Temporarily disables a policy without uninstalling it.
Live Protect shields are observable and reversible. Protections are not a black box. Operational safeguards: modes, evidence, rollback and retirement.

This phased approach gives infrastructure teams the ability to shrink the exposure window significantly, reducing the pressure to rush through emergency patches that haven’t been fully vetted while allowing for a more orderly, predictable maintenance schedule.
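The three modes described above can be sketched as a small decision function. This is a hedged illustration in Python, not Cisco code: the `Mode` enum, the `handle_hit` function, and the choice to record evidence in Enforce mode as well as Monitor mode are assumptions made for the example.

```python
from enum import Enum


class Mode(Enum):
    MONITOR = "monitor"
    ENFORCE = "enforce"
    DISABLE = "disable"


def handle_hit(mode, event, log):
    """Decide what happens when an event matches a shield's rule.

    Returns True if the event is blocked, False if it is allowed to proceed.
    """
    if mode is Mode.DISABLE:
        return False             # policy stays installed but takes no action
    log.append(event)            # assumption: Monitor and Enforce both keep evidence
    return mode is Mode.ENFORCE  # only Enforce actually blocks


log = []
print(handle_hit(Mode.MONITOR, "hit-1", log))  # False
print(handle_hit(Mode.ENFORCE, "hit-2", log))  # True
print(handle_hit(Mode.DISABLE, "hit-3", log))  # False
print(log)                                     # ['hit-1', 'hit-2']
```

Running a shield in Monitor mode first lets a team review the logged hits before switching the same policy to Enforce, and Disable provides a quick rollback without uninstalling anything.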

In a way, Live Protect is like addressing an engine failure on a racecar going 200 km/hour without stopping. Instead of having to unexpectedly pit in the middle of the race, imagine if intelligence in the vehicle is able to detect that an engine component is vulnerable to error, issue a temporary stopgap and continue to operate normally until the next scheduled pit.

This would be huge for the racing world, and it is huge for cybersecurity. Cisco-validated runtime protection provides an intermediate control for vulnerable Cisco infrastructure while giving security teams the time to deploy a permanent patch – without having to shut down or reboot equipment in the meantime. This keeps operations up and running without putting the organization at increased risk.

Where we go from here

Our recent Cisco AI Readiness Index underscores a critical tension in the market. While organizations are prioritizing AI deployment, they consistently rank infrastructure and security readiness as their primary hurdles. They know that an AI-native data center requires a level of performance and security that our legacy models cannot support. Live Protect is our answer to this specific challenge. By embedding security directly into the fabric of the network, we provide the resilience needed to support AI workloads without forcing organizations to choose between innovation and security.

We are currently shipping Live Protect on Cisco data center N9000 switches with plans to expand this capability across campus and branch Cisco Smart Switches, Catalyst wireless controllers, Secure Routers, the SD-WAN Manager, and other infrastructure platforms over the coming months.

Live Protect is an important step in our goal to embed resilience directly into the infrastructure fabric. As AI-powered tooling continues to evolve, the ability to apply Cisco-validated compensating controls will become a baseline requirement for any enterprise environment. We cannot stop the influx of vulnerability disclosures, but we can change how we respond to them. By moving toward a more dynamic, programmable operating model, we can maintain the uptime businesses require while ensuring our infrastructure remains secure.

I will be discussing this in more detail at Cisco Live where I will outline why Live Protect is such a transformative capability and puts us on a path to building infrastructure that is able to defend itself. The future of security is here, and Cisco is leading the way.

Some products or features described may be in various stages of development and offered on a when-and-if available basis. Cisco reserves the right to change delivery timelines and will have no liability for any delays or failures to deliver.

Authors

Tom Gillis

Senior Vice President and General Manager

Infrastructure & Security Group