Every few years, a new class of workload arrives that breaks the assumptions of the previous generation and forces us to rethink not just the architecture, but the underlying physics of how bits move. Generative AI is one such moment.
What makes this inflection point different is that we’re not just asking networks to carry more traffic. We’re asking them to sustain a tightly synchronized, latency-sensitive compute environment that spans multiple data centers separated by tens or hundreds of kilometers. That’s a fundamentally new problem, and solving it requires co-innovation across silicon, systems, and optics in ways the industry hasn’t attempted before.
More GPUs, more intelligence
There is a simple and profound truth at the heart of the AI era: more GPUs unlock more intelligence. Every significant leap in AI capability over the past six years—from language models that could complete a sentence to systems that can reason, code, and create—has been driven by training on larger clusters of GPUs. Scaling from hundreds of GPUs to hundreds of thousands has been instrumental in getting us here.
But that trajectory has run into a hard physical wall: power. A single high-performance GPU draws 700 watts (W) or more at full load. A rack of GPU servers draws 80 to 150 kilowatts (kW). A training cluster large enough to develop a frontier AI model can consume 10 to 15 megawatts (MW), roughly equivalent to a small town’s electricity demand. And the most advanced models being trained today require clusters that approach or exceed 100 MW at a single site, representing 60,000 to 70,000 GPUs or more.
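As a rough sanity check on these figures, the sketch below divides the 100 MW site budget by the quoted GPU counts to get the implied all-in power per GPU. The function name is hypothetical, and reading the gap above the 700 W GPU draw as server, networking, and cooling overhead is an assumption made for illustration, not a breakdown given in the text.

```python
# Back-of-envelope check of the power figures quoted above.
# Names are hypothetical, chosen for illustration only.

GPU_DRAW_W = 700               # per-GPU draw at full load, from the text
SITE_POWER_W = 100e6           # 100 MW single-site cluster, from the text
GPU_COUNTS = (60_000, 70_000)  # GPU range quoted for a 100 MW site


def site_watts_per_gpu(site_power_w: float, gpu_count: int) -> float:
    """All-in site power divided evenly across the GPUs."""
    return site_power_w / gpu_count


for count in GPU_COUNTS:
    per_gpu = site_watts_per_gpu(SITE_POWER_W, count)
    # Anything above 1.0x is power not drawn by the GPU itself
    # (assumed here to be CPUs, memory, networking, and cooling).
    print(f"{count:,} GPUs -> {per_gpu:,.0f} W per GPU "
          f"({per_gpu / GPU_DRAW_W:.1f}x the 700 W GPU draw)")
```

Under these assumptions the site budget works out to roughly 1.4 to 1.7 kW per GPU, about twice the draw of the GPU alone.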
At this scale, power has become the binding constraint. Power availability and cost in densely populated areas, combined with the sheer magnitude of electricity required, mean that the largest AI training clusters have outgrown what any single facility can support. Data centers are migrating to less-populated regions with cheaper energy, making interconnection of GPUs across data centers a prerequisite. When the GPUs needed to train the next generation of AI are spread across two or more sites, the network connecting them must perform as if they were in the same room. This is why scale-across exists.
The bandwidth hierarchy: From DCI to scale-across
To understand this new challenge, it helps to trace how data center bandwidth requirements have evolved. Each new tier has raised the bar, from a few times the bandwidth of traditional interconnect to several hundred times.
Traditional data center interconnect (DCI) set the baseline. DCI connects data centers to other data centers and end users over wide-area networks. It was built for redundancy, geographic reach, and enterprise workload distribution.
Front-end networks emerged next to handle traffic between users, applications, and cloud services—video streaming, social media, cloud-native applications—at roughly 7x the bandwidth of DCI.
The real step change, scale-up networks, emerged with AI. As data centers pivoted from general-purpose compute to AI powerhouses, standard servers gave way to GPUs and specialized accelerators. Within a rack, these devices are interconnected in scale-up domains at roughly 504x the bandwidth of DCI. They are connected by high-speed copper at 100 to 200 Gbps per lane across distances of up to 3 meters (m), and they appear to the software stack as a single logical compute unit.
Scale-out networks then extended the AI fabric across an entire data center, connecting racks of GPUs at approximately 56x DCI bandwidth through high-speed Ethernet and InfiniBand switching fabrics. Once distances grow beyond a few meters—spanning rack rows and data center floors at reaches of 100 meters to 2 kilometers (km)—copper can no longer maintain signal quality at these speeds, and pluggable optics become important. As a result, technologies like co-packaged optics and linear pluggable optics emerged to address the power and density consequences of deploying optics at this scale.
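To make the hierarchy easier to compare, the short sketch below lists the quoted multipliers in order of increasing bandwidth (not the order in which the tiers are introduced above) and computes the step between neighboring tiers. The data structure and function name are illustrative assumptions; only the multipliers come from the text.

```python
# Relative bandwidth of each network tier, normalized to DCI = 1,
# using the approximate multipliers quoted in the text.
TIERS = [
    ("DCI", 1),
    ("front-end", 7),
    ("scale-out", 56),
    ("scale-up", 504),
]


def step_ratios(tiers):
    """Ratio of each tier's bandwidth to the next-smaller tier."""
    return [
        (tiers[i][0], tiers[i][1] / tiers[i - 1][1])
        for i in range(1, len(tiers))
    ]


for name, ratio in step_ratios(TIERS):
    print(f"{name}: {ratio:g}x the previous tier")
```

Read this way, each tier carries roughly 7 to 9 times the bandwidth of the tier below it, which compounds to the 504x gap between DCI and scale-up.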
And now we arrive at scale-across, the frontier where the physics get genuinely hard.
Scale-across networking: The promise—and the challenge
Scale-across is the answer to the geographically distributed GPU problem, and it’s not simply DCI with higher bandwidth. Traditional DCI connects CPUs across data centers and to end users, handling many low-bandwidth, loss-tolerant, asynchronous flows that grow linearly. Scale-across connects GPUs and scale-out networks, carrying a small number of extremely high-bandwidth, loss-intolerant, synchronous, long-lived flows that cannot tolerate dropped packets or timing mismatches without forcing a full restart of the AI job. And those flows are growing exponentially.
The scale difference alone is striking. Scale-across networks require somewhere between 12,000 and 32,000 ports, and the context makes clear why. A 100 MW data center houses roughly 60,000 to 70,000 GPUs, each generating up to 800 Gbps of back-end traffic. Connecting those GPUs within a facility already demands thousands of high-speed ports. Extending that cluster to a second site, while preserving the deterministic-latency, lossless performance of a live AI training job, requires thousands more coherent optical ports at the scale-across layer. By comparison, traditional DCI typically uses 1,000 to 2,000 ports to handle the same two facilities’ enterprise traffic. Both use coherent optics over distances exceeding 10 km, and both require robust security. But the scale, traffic characteristics, and performance tolerances are in an entirely different category.
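The figures above can be combined into a quick estimate of the scale involved. The sketch below computes the peak aggregate back-end traffic of a 100 MW site and the ratio of scale-across to DCI port counts. The function names are hypothetical, and the aggregate is an upper bound on traffic inside one site: the text does not say what fraction of it crosses between sites, so this is not a sizing formula.

```python
# Rough scale comparison using the numbers quoted in the text.
# Function names are hypothetical; this is not a port-sizing method.

GBPS_PER_PBPS = 1_000_000


def aggregate_backend_pbps(gpu_count: int, gbps_per_gpu: float = 800) -> float:
    """Peak back-end traffic if every GPU sends at its full rate."""
    return gpu_count * gbps_per_gpu / GBPS_PER_PBPS


def port_ratio_range(scale_across_ports, dci_ports):
    """(min, max) ratio of scale-across to DCI port counts."""
    lowest = min(scale_across_ports) / max(dci_ports)
    highest = max(scale_across_ports) / min(dci_ports)
    return lowest, highest


for gpus in (60_000, 70_000):
    print(f"{gpus:,} GPUs -> {aggregate_backend_pbps(gpus):g} Pbps peak")

low, high = port_ratio_range((12_000, 32_000), (1_000, 2_000))
print(f"scale-across needs {low:g}x to {high:g}x the ports of DCI")
```

With the quoted ranges, a single site can generate on the order of 48 to 56 Pbps of back-end traffic at peak, and scale-across uses between 6 and 32 times as many ports as traditional DCI.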
Traditional lossless networks rely on reactive congestion control, which struggles over long fiber distances. Because of the speed of light, roughly 100 MB of data is in flight on a 100 km link before flow control can even respond, consuming nearly half a modern switch’s buffer for a single port and priority. That is why deep-buffered routers, not switches, are the right tool here.
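The 100 MB figure can be reproduced as a bandwidth-delay product. The sketch below is one set of assumptions under which it comes out: an 800 Gbps port (the per-GPU rate quoted earlier), a propagation delay of about 5 microseconds per kilometer of fiber, and a flow-control response that takes one full round trip. The port speed and the round-trip assumption are not stated in the text, and the function name is hypothetical.

```python
# Bandwidth-delay product: bytes already on the wire before a
# flow-control signal can take effect. Assumptions are labeled.

FIBER_US_PER_KM = 5.0  # assumed: light in fiber travels ~200,000 km/s


def in_flight_bytes(link_km: float, rate_gbps: float,
                    round_trip: bool = True) -> float:
    """Data in flight on a link during one (round-trip) delay."""
    one_way_s = link_km * FIBER_US_PER_KM * 1e-6
    delay_s = 2 * one_way_s if round_trip else one_way_s
    bits = rate_gbps * 1e9 * delay_s
    return bits / 8


# Assumed 800 Gbps port over the 100 km link from the text.
megabytes = in_flight_bytes(link_km=100, rate_gbps=800) / 1e6
print(f"{megabytes:.0f} MB in flight per round trip")
```

Because the result scales linearly with both distance and port speed, doubling either one doubles the buffer needed per port and priority.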
AI workloads, however, offer an important advantage: they are predictable enough to make proactive congestion control possible, orchestrating traffic to avoid congestion before it occurs. But link failures at scale are unpredictable, and when they happen, you also need reactive control with deep buffers to absorb the disruption without forcing the entire job to roll back to a checkpoint and incurring additional expense.
This is where silicon and coherent optics converge around a single imperative: reliability. At the scale of AI training, link failures are inevitable. A single security breach or episode of packet loss can erase thousands of GPU-hours of work. End-to-end hardware-based security, deep buffering for failure recovery, and proactive congestion control are now table stakes. Reliability is fundamental to Cisco converged AI infrastructure, embedded at every layer.
Power as the defining constraint and opportunity
Power has become the lens through which every architectural decision in AI networking must be evaluated.
At the silicon level, power efficiency is the deciding factor between a router that is viable for high-density AI scale-across and one that falls short.
At the optics level, the same logic applies, but the power challenge compounds as the network grows. Pluggable coherent optics reduce power consumption by eliminating transponders and associated client optics and allowing direct router-to-router connectivity. Freed-up power can be redirected to GPUs delivering compute performance. But coherent pluggables solve only part of the problem. As scale-across deployments grow from thousands of coherent ports to tens of thousands, the fiber infrastructure connecting those data centers must scale in parallel. More ports mean more fiber connections, and more fiber connections mean more optical amplification capacity along those routes. Each of those amplification sites consumes power of its own. The result is a two-sided power challenge: efficiency gains inside the data center at the router-optics interface must be matched by efficiency gains along the fiber plant that connects them. Finding the right balance between performance and power at every point in the network is now a first-order engineering problem.
The implication is clear: scale-across cannot be designed by optimizing silicon and optics independently. They must be co-designed from the ground up.
How Cisco is converging silicon and optics in scale-across solutions
At Cisco, we have been building toward this convergence for years. The combination of the Cisco Silicon One–powered routing systems and coherent optics portfolio offers an integrated approach designed specifically for what scale-across demands:
Cisco Silicon One: Cisco Silicon One P200 powers the Cisco 8223 and Cisco N9000 systems to deliver an industry-leading 51.2 Tbps capacity, tailored for distributed AI workloads. As of fiscal Q3 2026, Cisco has also announced two new hyperscaler system design wins powered by Silicon One P200 for scale-across use cases. These wins are foundational to the forecast growth of Cisco AI orders in fiscal Q4 2026 to more than $3.6 billion. These systems converge routing and switching with impressive power efficiency, programmability, and security, enabling hyperscalers, neoclouds, and sovereign clouds to confidently architect geographically distributed AI environments.

Figure 1. Cisco Silicon One P200 with Cisco systems
Coherent modules: Cisco is the coherent market-share leader and pioneer in coherent pluggables. 400G ZR/ZR+ and 800G ZR/ZR+ coherent pluggables are already being deployed in scale-across networks, with over 750,000 400G DSP ports shipped and over 40,000 800G DSP ports shipped. The broad Cisco coherent pluggable portfolio supports the mature standards defined by OIF, OpenROADM, and OpenZR+ that have enabled the mass adoption of router-based optics.

Figure 2. Cisco QSFP-DD 800G ZR/ZR+ coherent pluggable

Figure 3. Cisco OSFP 800G ZR/ZR+ coherent pluggable
Open line systems: Cisco offers two options for customers depending on the use case:
- The new Cisco Open Transport 3000 Series open line system offers a multi-rail architecture that allows multiple fiber pairs to operate in parallel so it can handle multi-petabit traffic over long distances. It also supports both C-band and L-band wavelengths, optimizing power, space, and scalability for scale-across networks.
- The Cisco NCS 1014 metro open line system offers enhanced optical visibility and control that enables coherent pluggable deployments at scale in metro scale-across use cases. This includes integrated coherent probe, dynamic gain equalization, OTDR, and spectral power monitoring that simplify deploying and operating coherent optics that are disaggregated from line systems.
Together, these capabilities form a scale-across portfolio purpose-built for the reliability, power efficiency, and scalability that AI infrastructure operators require.
What’s next for scale-across
The scale-across era is still early. Networks that will power the next generation of AI must be co-designed, from the coherent DSP and photonic integration at the optical layer, through the silicon and its buffer architecture, to the system-level thermal and power envelope that determines what is actually deployable at hyperscale.
At Cisco, that’s exactly how we are approaching scale-across. The Cisco Silicon One adaptive systems and coherent optics portfolio are designed in close collaboration internally and with our customers to meet the specific demands of scale-across. As AI continues its exponential trajectory, these technologies will be the key to unlocking new levels of intelligence and enabling the next generation of AI infrastructure.