End-to-end AI networking for the age of inferencing

Talking to customers over the last few months, I’ve noticed a major shift in the artificial intelligence (AI) conversation. We are no longer in the early days. The “What’s the value of AI?” phase is behind us. Today, we’re seeing maturity, initial wins, and a collective move into the “age of inferencing.”

But even with this progress, some big questions keep surfacing: How do we scale these AI workloads without the network becoming a bottleneck? How do we keep them secure as threats evolve, with frontier AI models such as Mythos causing us to rethink how we approach AI security and data integrity at a foundational level? And perhaps most importantly: How do we manage the new realities of AI without a total forklift upgrade of our existing environments?

Today, I’m excited to announce a series of innovations designed to deliver true “end-to-end AI networking,” making your AI infrastructure more scalable, more secure, and significantly simpler to manage.

The “black box” problem: Bridging K8s and the fabric

Today, about two-thirds (66%) of organizations are hosting generative AI models using Kubernetes for some or all inference workloads.¹ But Kubernetes often acts like a “black box.” It’s intentionally designed to abstract the infrastructure, which is great for agility until something goes wrong.

When a performance issue hits, the app team says, “Everything looks fine here,” and the network team says, “Everything looks fine here.” You’re left on a bridge call for hours, staring at a handful of IP addresses, trying to figure out if the failure is in the physical fabric, the virtual NIC, or the TCP stack. It’s not just a configuration problem; it’s an architectural blind spot.

With our acquisition of Isovalent—the industry standard for eBPF-powered networking—we are changing that. By integrating Isovalent’s technology into Cisco Nexus One, we are extending a consistent, standards-based network from the fabric directly into the container environment.

This isn’t just stitching tools together; it’s a seamless, continuous model. You get real-time, workload-to-workload visibility across the full path—from pod to fabric to external service. The network is no longer blind, and the app is no longer isolated. And because we can now synchronize security policies across both the network fabric and the Kubernetes environment, your compliance team finally has one place to verify, enforce, and prove security. When the workload moves, the policy moves with it.
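One way to picture “when the workload moves, the policy moves with it” is a policy keyed on workload labels rather than IP addresses. The sketch below is purely illustrative: the `Workload` and `Rule` types, the label names, the port, and the addresses are all hypothetical, and none of this is the Isovalent or Nexus One API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Workload:
    name: str
    labels: frozenset  # hypothetical label scheme, e.g. {"app=inference"}
    ip: str            # changes whenever the pod is rescheduled
    node: str


@dataclass(frozen=True)
class Rule:
    src_label: str
    dst_label: str
    port: int


def is_allowed(rules, src, dst, port):
    """Allow traffic only if some rule matches the labels of both ends."""
    return any(
        r.src_label in src.labels and r.dst_label in dst.labels and r.port == port
        for r in rules
    )


# One rule, written against identity (labels), never against an IP address.
rules = [Rule("app=inference", "app=vector-db", 6333)]

frontend = Workload("infer-0", frozenset({"app=inference"}), "10.0.1.12", "node-a")
store = Workload("vdb-0", frozenset({"app=vector-db"}), "10.0.2.7", "node-b")

# Reschedule the pod: new node and new IP, same labels.
moved = replace(frontend, ip="10.0.9.41", node="node-c")

before = is_allowed(rules, frontend, store, 6333)
after = is_allowed(rules, moved, store, 6333)
denied = is_allowed(rules, store, frontend, 6333)  # reverse direction has no rule
```

Because the rule never mentions an address, the decision is the same before and after the move, which is the property a compliance team would want to verify in one place.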

The Isovalent integration is part of a broader Cisco strategy: establishing standards-based EVPN/VXLAN as the common networking framework for every hypervisor and container platform in the data center, providing operational consistency through Nexus One, and delivering unified visibility through an integrated platform. Cisco has driven standards-based EVPN/VXLAN integration across multiple workload platforms so that, regardless of the virtualization or container technology an enterprise chooses, the Nexus One fabric serves as the common networking substrate with consistent operations.

Accelerated troubleshooting: AgenticOps with Cisco AI Canvas for data center networking

Even with better visibility, troubleshooting can still be a manual, high-friction process. That’s why we’re extending our capabilities for AgenticOps with Cisco Cloud Control for multiple domains, including data center infrastructure.

Take that troubleshooting scenario from earlier as an example. With Cisco AI Canvas in Cisco Cloud Control, the NetOps and platform teams can join a shared space where they see the same view of the issue and the same AI-powered remediation recommendations. When a ticket comes in about an application running slowly, instead of the usual finger-pointing, AI Canvas will pull data from the original ServiceNow ticket, analyze the HTTP and TCP data from the Kubernetes layer, and identify the root cause. It might say, “You need to reconfigure the QoS policy here to improve traffic distribution,” offering a resolution that the teams can agree on.

It’s not just an alert; it’s an agent working on your team, providing the context needed to remediate issues in minutes rather than hours, while still keeping a human in the loop.

Efficiency at scale: Automating multi-tenancy and AI job segmentation

Today, multi-tenancy is often achieved through manual, disconnected systems. If you have 10 tenants, it’s manageable. If you have 500, the system breaks. Most operators end up dedicating servers to specific tenants, which leads to lower monetization of XPU infrastructure.

We’re changing that through our partnership with Rafay. When a tenant is created in the Rafay AI infrastructure orchestrator, Nexus One automatically provisions the network constructs—VRFs, VLANs, and subnets—aligned to where those GPUs sit. The fabric understands the tenant’s intent, and it enforces that isolation at line rate.
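The provisioning step can be pictured as an idempotent allocator that reacts to a tenant-created event. This is a minimal sketch under stated assumptions, not the Rafay or Nexus One interface: `FabricAllocator`, the address pool, the VLAN range, and the `vrf-<tenant>` naming are all hypothetical, and GPU placement is not modeled.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantNetwork:
    tenant: str
    vrf: str
    vlan: int
    subnet: ipaddress.IPv4Network


class FabricAllocator:
    """Hands out one VRF, VLAN, and subnet per tenant from fixed pools."""

    def __init__(self, supernet="10.128.0.0/16", prefix_len=24, first_vlan=100):
        self._subnets = ipaddress.ip_network(supernet).subnets(new_prefix=prefix_len)
        self._next_vlan = first_vlan
        self._by_tenant = {}

    def provision(self, tenant):
        # Idempotent: a repeated create event returns the existing constructs.
        if tenant in self._by_tenant:
            return self._by_tenant[tenant]
        if self._next_vlan > 4094:  # highest usable 802.1Q VLAN ID
            raise RuntimeError("VLAN pool exhausted")
        try:
            subnet = next(self._subnets)
        except StopIteration:
            raise RuntimeError("subnet pool exhausted") from None
        net = TenantNetwork(tenant, f"vrf-{tenant}", self._next_vlan, subnet)
        self._next_vlan += 1
        self._by_tenant[tenant] = net
        return net


alloc = FabricAllocator()
bank = alloc.provision("bank")
pharma = alloc.provision("pharma")
```

Each tenant gets its own VRF name, VLAN ID, and a non-overlapping subnet, and a duplicate event for the same tenant changes nothing. That is the kind of bookkeeping that becomes error-prone when done by hand for hundreds of tenants.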

And now we’re going a step further. We know that one tenant doesn’t mean one job—a bank might be running training and inference simultaneously; a pharmaceutical company might have regulated R&D workloads next to non-regulated ones.

We are introducing Cisco AI Job Segmentation using our patent-pending VXLAN Endpoint Security Group (ESG). We can now map the workload identity (the specific Job ID) into the VXLAN header. As a result, the fabric understands not just which tenant a packet belongs to, but which job it belongs to. We can enforce fine-grained security, like making sure only “Job A” can access “Store C,” even if both jobs are in the same tenant.
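To make the idea of carrying a job identity in the VXLAN header concrete, here is a rough sketch. The post does not describe the bit layout Cisco uses, so this borrows the publicly documented VXLAN Group Policy extension layout (a 16-bit group ID in the reserved bits of the 8-byte RFC 7348 header) purely as an assumption. The group numbers, the VNI, and the allow-list are hypothetical.

```python
import struct

FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348
FLAG_GROUP_SET = 0x80  # "G" flag from the VXLAN Group Policy draft (assumed layout)


def pack_header(vni, group_id):
    """Build an 8-byte VXLAN header carrying a 24-bit VNI and a 16-bit group ID."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= group_id < 2**16:
        raise ValueError("group ID must fit in 16 bits")
    flags = FLAG_VNI_VALID | FLAG_GROUP_SET
    # Byte 0: flags, byte 1: reserved, bytes 2-3: group ID,
    # bytes 4-6: VNI, byte 7: reserved.
    return struct.pack("!BBHI", flags, 0, group_id, vni << 8)


def unpack_header(header):
    """Return (vni, group_id); group_id is None when the G flag is not set."""
    flags, _, group_id, vni_word = struct.unpack("!BBHI", header)
    group = group_id if flags & FLAG_GROUP_SET else None
    return vni_word >> 8, group


# Hypothetical mapping of jobs and stores to group IDs inside one tenant VNI.
GROUPS = {"job-a": 101, "job-b": 102, "store-c": 201}
ALLOWED = {(GROUPS["job-a"], GROUPS["store-c"])}


def permits(header, dst_group):
    """Allow-list check on (source group, destination group)."""
    _, src_group = unpack_header(header)
    return (src_group, dst_group) in ALLOWED


TENANT_VNI = 5001
hdr_a = pack_header(TENANT_VNI, GROUPS["job-a"])
hdr_b = pack_header(TENANT_VNI, GROUPS["job-b"])

job_a_ok = permits(hdr_a, GROUPS["store-c"])
job_b_ok = permits(hdr_b, GROUPS["store-c"])
```

Both headers carry the same tenant VNI, yet only the one tagged for “Job A” passes the check for “Store C.” That is the distinction between tenant-level and job-level segmentation.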

Security fused into the network: Live Protect and post-quantum cryptography in the age of frontier AI models

Finally, let’s talk about security and AI. For decades, we’ve had the same process: find a vulnerability, develop a patch, work to deploy the patch, and hope to finish before an exploit appears.

That’s no longer going to cut it in a world with frontier AI models like Mythos.

These frontier AI models fundamentally change the threat landscape by operationalizing AI to identify and weaponize vulnerabilities at machine speed. Waiting for a standard change window to patch is an invitation for trouble. That’s why we introduced Cisco Live Protect last year. It is a software feature embedded in our network operating system and control plane that allows us to apply compensating controls to a switch or router without a reboot. It provides an immediate response to new vulnerabilities, allowing you to patch in an orderly fashion when it fits your schedule.

We are also tackling the looming reality of “harvest now, decrypt later.” Attackers are currently vacuuming up encrypted data, betting they can crack it once quantum computing matures—a threat that frontier AI models only accelerate. To counter this, we are rolling out a phased, three-level post-quantum cryptography (PQC) roadmap for our Nexus One and Cisco N9000 Series Switches. We are embedding quantum-safe standards—from MACsec and SSH to hardware-level identity through tamper-proof Secure Unique Device Identifiers (SUDIs)—directly into the fabric. By following this roadmap, security teams can future-proof their investments, secure sensitive data streams today, and systematically neutralize the risk of future decryption, all without the need for a forklift upgrade.

The road ahead: Building the networking foundation for future AI innovation

By bridging the gap between the network and the container, we aren’t just solving today’s bottlenecks—we’re building the foundation for the next generation of AI innovation. I couldn’t be more excited about where we’re headed.

Power your AI innovation with Cisco AI Networking

Additional resources:

¹Kubernetes Established as the De Facto ‘Operating System’ for AI as Production Use Hits 82% in 2025 CNCF Annual Cloud Native Survey. Cloud Native Computing Foundation, January 20, 2026.

Authors

Murali Gandluru

Vice President of Product Management

Data Center Networking
