As data centers scale up, scale out, and scale across to meet the demands of artificial intelligence (AI) and high-performance computing (HPC) workloads, networks face growing challenges. Increasing network failures, fabric congestion, and uneven load balancing are becoming critical pain points, threatening both performance and reliability. These issues drive up tail latency and create bottlenecks, undermining the efficiency of large-scale distributed environments.

To tackle these challenges, the Ultra Ethernet Consortium (UEC) was formed in 2023, spearheading a new, high-performance Ethernet stack designed for these demanding environments. At its core is a scalable congestion control model optimized for microsecond-level latency and the complex, high-volume traffic of AI and HPC. As a UEC steering member, Cisco plays a pivotal role in shaping the foundational technologies driving next-generation Ethernet.
Boosting reliability and efficiency at every layer
This blog explores some of the latest and emerging UEC innovations across the Ultra Ethernet (UE) network stack—from link layer retry (LLR) and credit-based flow control (CBFC) at the link layer to packet trimming at the IP layer and packet spraying and advanced telemetry features at the transport layer.

Reliability of link layer retry
LLR operates at the link layer and is designed to enhance reliability on sensitive network links. These links are often vulnerable to minor disruptions, such as intermittent faults or link failures, which can degrade performance and increase tail latency. LLR provides a hop-by-hop retransmission mechanism where packets are buffered at the sender until acknowledged by the receiver. Lost or corrupted packets are selectively retransmitted at the link layer, avoiding higher-level protocol involvement and reducing tail latency.
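The hop-by-hop behavior above can be sketched as a small replay buffer on the sending side of a link. This is a minimal illustration, not the UE wire format; the class and method names are hypothetical.

```python
# Minimal sketch of link layer retry (LLR): frames are buffered at the
# sender until acknowledged, and lost or corrupted frames are selectively
# retransmitted at the link layer. Names are illustrative, not from the spec.

class LlrSender:
    """Buffers frames at the link layer until the peer acknowledges them."""

    def __init__(self):
        self.next_seq = 0
        self.replay_buffer = {}  # seq -> frame, held until ACKed

    def send(self, frame):
        seq = self.next_seq
        self.next_seq += 1
        self.replay_buffer[seq] = frame  # keep a copy for possible retry
        return seq, frame                # would go on the wire

    def on_ack(self, seq):
        # Receiver confirmed delivery; free the buffered copy.
        self.replay_buffer.pop(seq, None)

    def on_nack(self, seq):
        # Frame lost or corrupted on the link: retransmit locally,
        # without involving any higher-layer protocol.
        return self.replay_buffer.get(seq)
```

Because the retry happens one hop away from the loss, recovery completes in a link round trip rather than an end-to-end one, which is what keeps tail latency low.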

Advanced flow control
Priority flow control (PFC) enables lossless Layer 2 transmission by pausing traffic when buffers fill, but it requires large headroom, reacts slowly, and adds configuration overhead.
CBFC addresses these shortcomings with a proactive credit system: senders only transmit when receivers confirm available buffer space. Credits are efficiently tracked with cyclic counters and exchanged via lightweight updates, ensuring data is only sent when it can be received. This prevents drops, reduces buffer requirements, and maintains a lossless fabric with better efficiency and simpler configuration, making it ideal for AI networking.
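The credit exchange can be sketched as a simple counter on the sender, decremented per transmission and replenished by receiver updates. This is an illustrative model (one credit per packet, hypothetical names), not the CBFC protocol encoding.

```python
# Minimal sketch of credit-based flow control (CBFC): the sender may only
# transmit while it holds credits, and the receiver grants credits back as
# it frees buffer space. The one-credit-per-packet unit is illustrative.

class CreditSender:
    def __init__(self, initial_credits):
        self.credits = initial_credits  # advertised receiver buffer space

    def try_send(self, packet):
        if self.credits == 0:
            return False    # no confirmed buffer at receiver: hold, don't drop
        self.credits -= 1   # consume one credit per packet sent
        return True

    def on_credit_update(self, granted):
        # Lightweight update from the receiver as it drains its buffer.
        self.credits += granted
```

Contrast this with PFC: instead of sending until the receiver shouts "pause" (and provisioning headroom for everything in flight), the sender never transmits into buffer space it hasn't been granted.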
Smarter congestion recovery
Packet trimming operates at the IP layer and enables smarter congestion recovery by retaining packet headers while discarding the payload. When switches detect congestion, they trim and either return the header to the sender (back-to-sender [BTS]) or forward it to the destination (forward-to-destination [FTD]). This mechanism reduces unnecessary retransmissions of entire packets, easing congestion and improving tail latency.

- FTD mode allows the destination to immediately detect incomplete packets and initiate targeted recovery, such as requesting only missing data. The trimmed packet is typically just a few dozen bytes and contains essential control information to inform the receiver of the loss. This enables faster convergence and low-latency retransmissions.
- BTS mode sends a trimmed notification back to the source, allowing it to detect congestion on that specific transmission and proactively retransmit without waiting for a timeout.
Both techniques enable graceful recovery without timeouts or loss by using retransmit scheduling that paces retries and, if needed, shifts them to alternate equal-cost multi-paths (ECMPs).
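A congested switch's trimming decision can be sketched as follows. The field names, queue threshold, and mode flag are hypothetical; they illustrate the FTD and BTS behaviors described above rather than the actual packet format.

```python
# Illustrative sketch of packet trimming at a congested switch: keep the
# header, discard the payload, and deliver the stub either back to the
# sender (BTS) or on to the destination (FTD). Field names are hypothetical.

def handle_packet(packet, queue_depth, threshold, mode="FTD"):
    """Return an (action, packet) egress decision for one packet."""
    if queue_depth < threshold:
        return "forward", packet          # no congestion: pass through intact
    trimmed = {
        "header": packet["header"],       # just a few dozen bytes survive
        "payload": None,                  # payload discarded to ease congestion
        "trimmed": True,                  # marks the loss for the endpoint
    }
    if mode == "BTS":
        return "to_sender", trimmed       # source learns of congestion directly
    return "to_destination", trimmed      # receiver requests targeted recovery
```

Either way, the endpoint learns about the loss from the trimmed header itself, in a fraction of an RTT, instead of inferring it from a retransmission timeout.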
Flexible load balancing
Traditional ECMP load balancing assigns each flow to a fixed path using hash-based port selection, but it gives endpoints no path control and can cause persistent collisions. UE introduces an entropy value (EV) field that gives endpoints per-packet control over path selection.
By varying the EV, packet spraying dynamically distributes packets across ECMPs, preventing persistent collisions and ensuring optimal bandwidth utilization. This reduces traffic polarization, improves load balancing, and fully utilizes network bandwidth over time. UE allows in-order delivery when needed by fixing the EV, while still supporting adaptive spraying for other flows.
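The idea can be sketched in a few lines: the path hash covers the EV, so varying the EV per packet sprays a flow across ECMP members, while fixing it pins the flow to one path for in-order delivery. The CRC-based hash and path count are illustrative stand-ins for a switch's actual port-selection hash.

```python
# Sketch of entropy-value (EV) packet spraying over ECMP paths. A real
# switch hashes header fields (including the EV) to pick an egress port;
# crc32 is an illustrative stand-in for that hash.
import zlib

def pick_path(flow_id, ev, num_paths):
    # Hash the flow identity plus the entropy value to choose an ECMP member.
    key = f"{flow_id}:{ev}".encode()
    return zlib.crc32(key) % num_paths

def spray(flow_id, num_packets, num_paths, fixed_ev=None):
    """Vary the EV per packet to spread load, or fix it for in-order delivery."""
    return [
        pick_path(flow_id, fixed_ev if fixed_ev is not None else seq, num_paths)
        for seq in range(num_packets)
    ]
```

The key design point is that the endpoint, not the switch, chooses the EV, so the same fabric can carry order-sensitive flows (fixed EV) and sprayed bulk flows (per-packet EV) side by side.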
Real-time congestion management
Congestion management in the UE transport layer combines advanced congestion control with fine-grained telemetry and fast reaction mechanisms. Unlike traditional Ethernet, which relies on reactive signals such as explicit congestion notification (ECN) or packet drops that provide limited visibility into the location and severity of congestion, UEC adds embedded real-time in-band metrics directly into packet headers through congestion signaling (CSIG).
CSIG implements a compare-and-replace model, allowing each device along the path to update the packet with more severe congestion information without increasing the header size. The receiving network interface card (NIC) then reflects this information back to the sender, allowing end hosts to perform adaptive rate control, path selection, and load balancing earlier and with greater accuracy.

UE fabric supports CSIG-tagged packets for congestion management. As the packets traverse the network, each switch updates the CSIG tag if it detects worsening congestion—tracking available bandwidth, utilization, and per-hop delay. Heavily utilized links are immediately encoded in the tag, and the receiver reflects this congestion map back to the sender. Within a single round-trip time (RTT), the sender knows which links are congested and by how much, enabling proactive rate adjustment and alternate path selection.
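The compare-and-replace model can be sketched as each hop overwriting a fixed-size tag only when its own congestion is more severe, so the packet arrives carrying the bottleneck's signal without the header ever growing. The single "severity" field is a simplification of the metrics named above, and all names here are hypothetical.

```python
# Sketch of CSIG's compare-and-replace model: each hop keeps the worse of
# the in-packet tag and its own congestion state, so the tag that reaches
# the receiver describes the path's bottleneck. Field names are illustrative.

def update_csig(tag, hop_metrics):
    """Compare the in-packet tag with this hop's congestion; keep the worse."""
    if hop_metrics["severity"] > tag["severity"]:
        return dict(hop_metrics)  # replace: this hop is the new bottleneck
    return tag                    # keep the existing, more severe signal

def traverse(path_metrics):
    tag = {"severity": 0, "hop": None}  # sending NIC starts with an empty tag
    for hop in path_metrics:
        tag = update_csig(tag, hop)     # header size never grows
    return tag                          # receiving NIC reflects this back
```

Because the tag is rewritten in place rather than appended to, the signal scales to any path length at constant header cost, which is what makes per-packet, in-band telemetry practical.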
Cisco’s leadership in the future of Ultra Ethernet
Cisco is leading the evolution of UE standards, driving critical innovations for AI and machine learning (ML) networking as AI workload demands skyrocket. As UE specifications advance, Cisco remains at the forefront and ensures customers can adopt UE capabilities such as congestion control, intelligent load balancing, and next-generation transport features.
Future-ready networking with Cisco Nexus 9000 Series Switches
Cisco Nexus 9000 Series Switches are engineered to deliver advanced Ethernet capabilities for the next-generation AI infrastructure. They streamline Day-0 deployments and optimize operations from Day 1 with seamless integration and upgradability. With Nexus 9000 switches, organizations can unlock the full potential of high-performance, flexible, and future-proof AI networking.

Enabling scalable AI infrastructure
As AI and HPC workloads redefine data center networking, the UEC’s innovations—powered by Cisco’s leadership—enable data centers to scale with confidence; meet tomorrow’s challenges; and deliver reliable, high-performance infrastructure for the AI era.
Start your journey to scalable and reliable AI networking
Additional Resources: