Cisco Leap Frogs H.264 Video Collaboration with Real-Time AV1 Codec

June 26, 2019 - 4 Comments

Today, at the Big Apple Video conference in New York, Cisco unveiled its real-time, high-quality AV1 encoder optimized for video collaboration. It reduces bandwidth, enables next-generation content, and avoids the patent issues that have plagued the deployment of HEVC (aka H.265).

A lot has changed in the last 16 years. We have self-driving cars, smartphones, social media and virtual reality, but we still rely on H.264 for video compression – a technology that was introduced back in 2003 and is now showing its age. Unfortunately, as Jonathan Rosenberg detailed (here and here), addressing these issues with HEVC (aka H.265) comes with unacceptable patent cost, risk and uncertainty.

AV1 is a product of the Alliance for Open Media (AOM), which was founded in 2015 by Google, Mozilla, Cisco, Microsoft, Netflix, Amazon and Intel and today includes a large number of leading technology companies. During AV1 development, as well as contributing coding technologies, Cisco worked hard with our AOM partners to make sure that AV1 supports the features needed for future collaboration applications, especially low delay, real-time operation, and error resilience.


Next-Gen Video Compression Technology

AV1 is not merely an HEVC replacement. It also addresses a growing problem in collaboration: the demand for higher quality and new services is growing as never before, and ever greater loads will be placed upon networks. Adopting new codecs to replace the aging H.264 is increasingly vital to manage this demand. AV1 has a very extensive toolset which delivers state-of-the-art compression performance. However, the formidable complexity of the AOM software led to worries that AV1 would be too slow to be practical. Speeds for AV1 implementations have since increased greatly, but none have been close to real-time. Is real-time encoding with AV1 even possible?

Today, our talk at the Big Apple Video conference demonstrated that it is. My colleague Xiaolin Shen and I demonstrated live, real-time AV1 encoding and transmission in a Webex video meeting, with HD video at 720p30 and high-frame-rate desktop share at 1080p30. A world first! You can view the talk here. The implementation included a full cloud media stack, with AV1-enabled switching servers deployed on the internet, complete end-to-end call signaling, and resilient media transmission.

Running an encoder at these kinds of speeds inevitably results in some loss in performance. Real-time encoding is a matter of trade-offs: compression power relative to complexity, at realistic levels of CPU usage and speed. So another question is whether Cisco AV1 produces significant compression gains when going so fast.

Again, the answer is yes. In our demo, we not only encoded live 720p30 camera video at half the bandwidth of H.264, but we also encoded high-frame-rate share at 1080p30 using around 2/3 of the bitrate that H.264 needs for 720p30, all on a commodity laptop.
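To put those two ratios on a common footing, the short sketch below converts them into relative bits per pixel. It is only a back-of-the-envelope illustration: it assumes the usual frame sizes of 1280x720 for 720p and 1920x1080 for 1080p, treats "around 2/3" as exactly 2/3, and ignores the fact that camera video and screen share are very different kinds of content. No absolute bitrates are used, because none were quoted.

```python
from fractions import Fraction

def pixels_per_second(width, height, fps):
    """Raw pixel throughput of a video format."""
    return width * height * fps

# Assumed frame sizes (not stated in the post): 720p = 1280x720, 1080p = 1920x1080.
P720_30 = pixels_per_second(1280, 720, 30)
P1080_30 = pixels_per_second(1920, 1080, 30)

# Bitrates relative to H.264 at 720p30, taken from the figures quoted above.
H264_720P30 = Fraction(1)
AV1_720P30 = Fraction(1, 2)    # "half the bandwidth of H.264"
AV1_1080P30 = Fraction(2, 3)   # "around 2/3", treated as exact for illustration

def relative_bits_per_pixel(rate, pixels):
    """Bits per pixel relative to the H.264 720p30 reference."""
    return (rate / pixels) / (H264_720P30 / P720_30)

pixel_ratio = Fraction(P1080_30, P720_30)
camera_bpp = relative_bits_per_pixel(AV1_720P30, P720_30)
share_bpp = relative_bits_per_pixel(AV1_1080P30, P1080_30)

print(f"1080p30 carries {float(pixel_ratio):.2f}x the pixels of 720p30")
print(f"AV1 camera video: {float(camera_bpp):.2f}x the bits per pixel of H.264")
print(f"AV1 1080p30 share: {float(share_bpp):.2f}x the bits per pixel of H.264 at 720p30")
```

Under these assumptions the 1080p30 share carries 2.25 times as many pixels as 720p30 while spending roughly 0.3 times as many bits on each one.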

This means that we can substantially raise quality, while saving bits, all with a very usable CPU footprint. We have found that the real-world compression/speed trade-offs for AV1 are in fact excellent, and better than HEVC.

One valuable feature of AV1 is that each of its small number of profiles supports all the tools, including scalability, screen content coding, and tools for AR and VR. In HEVC and H.264 such tools were confined to poorly supported specialist profiles, and adopting a new profile is very nearly as complex as adopting a whole new codec. At the same time, having these tools available can greatly aid performance. AV1's simplified profiles will enable more advanced capabilities to be employed more broadly, leading to higher levels of interoperability and consistency across different vendors' implementations.

In our presentation we also touched on the challenges of deploying a new codec in collaboration systems. This will take time and effort: legacy systems will be around for a long time, and meetings infrastructure needs to apply complex strategies to support a mixed-codec world in the meantime. Cisco is well positioned to introduce AV1 into our portfolio, leveraging a combination of multistream and transcoding to provide backward compatibility. As AV1 permeates collaboration, however, it will begin to enable richer and better experiences, even in the most difficult network conditions.
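As a rough illustration of what a mixed-codec strategy might look like, the sketch below decides, for each receiver in a meeting, whether a server could simply forward one of the streams the sender is already uploading or would need to transcode. Everything here is hypothetical: the `Endpoint` class, the `plan_delivery` function, the codec labels and the preference order are assumptions made for the example, not a description of how Webex or any Cisco product works.

```python
from dataclasses import dataclass

# Hypothetical preference order: newest codec first.
PREFERENCE = ("AV1", "H264")

@dataclass(frozen=True)
class Endpoint:
    name: str
    decodes: frozenset  # codec labels this endpoint can decode

def plan_delivery(sender_streams, receivers):
    """Pick a delivery action per receiver.

    sender_streams: codecs the sender uploads simultaneously (multistream).
    Returns {receiver name: (action, codec)}.
    """
    plan = {}
    for receiver in receivers:
        for codec in PREFERENCE:
            if codec in sender_streams and codec in receiver.decodes:
                # An uploaded stream is directly usable: forward it untouched.
                plan[receiver.name] = ("forward", codec)
                break
        else:
            # No uploaded stream fits: fall back to transcoding into the
            # best codec the receiver can decode, if there is one.
            target = next((c for c in PREFERENCE if c in receiver.decodes), None)
            plan[receiver.name] = ("transcode", target) if target else ("unsupported", None)
    return plan

new_client = Endpoint("new_client", frozenset({"AV1", "H264"}))
legacy_room = Endpoint("legacy_room", frozenset({"H264"}))

# Sender uploads both codecs: every receiver gets a forwarded stream.
multistream_plan = plan_delivery({"AV1", "H264"}, [new_client, legacy_room])

# Sender uploads AV1 only: the legacy receiver needs a transcode.
av1_only_plan = plan_delivery({"AV1"}, [new_client, legacy_room])

print(multistream_plan)
print(av1_only_plan)
```

The trade-off the sketch exposes is the one implied above: multistream spends extra uplink bandwidth at the sender to avoid server-side work, while transcoding spends server compute so that the sender uploads only one stream.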


  1. This is massive. Great work team! Note, I temporarily carried the Cisco AV1 torch at IBC2019, and I'm super-proud to see Cisco go all-in on it. The #FutureOfTV? WebEx. #CiscoTakeMeBack

  2. EposVox

    I was referring to 1080p30 as high frame rate *share* (HFRS) – 30 fps is higher than most screen share refresh settings. You're right that 30 fps is not high for video share. "HFRS" is collaboration jargon, I'm afraid!

    We demonstrated 720p30 camera encode too. Generally speaking, camera video in video conferencing is not that hard (head and shoulders and so on). Screen share can be very easy, but it can also be extremely hard, much harder than camera video, because people are playing games or showing general-purpose video. If you look at the demo video, we showed full-motion, full-screen YouTube content playing through the screen share at 1080p30 with a low-latency encode.

    The laptops we used for the demo were current generation MacBook Pros, so quite powerful, but we did not use all the cores by any means.

    Thanks for reading – I hope that helps!

  3. AV1: small name, big future. This is the development video needs to truly take the next step in innovation.

  4. Confused question about this article – why is 1080p30 being referred to as "high frame rate" when it's not considered that by any standard and is the exact same frame rate as 720p30 (you know… 30), which isn't referred to as "high frame rate"? Was "high resolution" meant? Or is it because, for a screen share, that's considered high frame rate compared to normal WebEx screen shares? Because inherently a webcam/camera input, even at 720p instead of 1080p, would be much tougher to encode than a screen share.
    Also what were the specs of the laptop?