WebRTC – Bringing Real Time Communications to the Web Natively
Through this blog, I attempt to take you on a journey into the latest disruptive Web standard, called WebRTC. My goal in writing this blog is to provide readers with some background information and to dive a bit deeper into what WebRTC has to offer from both the standards and the application developer perspectives.
Before I jump in, let me introduce Cisco’s WebRTC crew –
Cullen Jennings, Ethan Hugg, Enda Mannion, Suhas Nandakumar (that’s me :)).
The Web is evolving at a pace faster than ever before. The last few years have seen tremendous innovation in Web technologies, applications, infrastructure and services. The advent of HTML5 has redefined the way Web applications work by bringing the capabilities and richness of native applications to the Web platform.
HTML5 technologies such as Web Workers, Browser-Native Media, Web Sockets and the like are redefining the roles and capabilities of the browser and the Web, and creating experiences that rival native applications.
Building along similar lines, the WebRTC/RTCWeb standards have been introduced into the HTML5 standards basket. They are concerned with bringing rich, real-time, interactive communications natively to the browser.
Real-time communications applications such as softphones and conferencing applications are not new to the Web. Applications such as WebEx, JabberWeb and Skype already provide ways for people to communicate and collaborate on the Web today.
These applications do come with limitations, however:
- A plugin must be installed to get things working.
- With plugins come the challenges of compatibility with the host platform.
- With plugins come security issues, since the application no longer runs in the sandboxed environment of the browser.
- With plugins comes the issue of maintenance. Newer versions of the browser or of the standards might break existing installations.
- Applications based on plugins lack the rich flexibility of native browser resources due to privilege restrictions. This in turn limits the innovation possible with these applications.
- Proprietary solutions bring interoperability issues.
WebRTC – What is it?
The RTCWeb (owned by the IETF) and WebRTC (owned by the W3C) standards are an evolving proposal for bringing “Rich Interactive Secure Peer to Peer Communications” to the Web in a “Plugin-less Fashion”. Together, these standards bodies are responsible for defining the following aspects of enabling real-time communications as an inherent part of the Web infrastructure:
- APIs and access rules for end-user devices such as microphones, cameras etc.
- End-to-end security architecture and protocol.
- NAT traversal techniques for peer connectivity.
- Signaling mechanisms for setting up, updating and tearing down the sessions.
- Support for different media types.
- Media transport requirements.
- Quality of Service, congestion control and reliability requirements for the session over the Best-Effort Internet.
- Identity architecture and mechanisms for peer identification.
- Codecs for audio and video compression.
With such a detailed charter, WebRTC/RTCWeb has the potential to change the way people communicate on the Web. Given the tremendous increase in browser usage and the always-available nature of the Web, the combination of “Browser and the Web” could revolutionize real-time communications and pose a challenge to today’s legacy solutions.
The picture below captures the various outcomes from the IETF and W3C standards bodies:
- The GetUserMedia API specification defines the requirements for a Web application to access an end user’s media sources, such as the camera and microphone.
- The PeerConnection API specifies SDP-based session description APIs and the state machine for session setup, update and tear-down between the peers.
- The Data Channel API will enable peer-to-peer exchange of arbitrary data, with low latency and high throughput.
- Under the hood, the browser is responsible for:
  - Ensuring end-to-end security for media and data sessions via DTLS.
  - Performing NAT traversal procedures for connection setup based on Interactive Connectivity Establishment (ICE).
  - Establishing media transport based on RTP and UDP.
  - Setting up data-channel transport based on SCTP and UDP.
  - Enabling feedback reports for the session based on RTCP.
  - Encoding and decoding audio and video streams.
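To give a feel for how the three APIs fit together, here is a minimal TypeScript sketch. It assumes the unprefixed API names as they were later standardized (`navigator.mediaDevices.getUserMedia`, `RTCPeerConnection`, `createDataChannel`); the builds discussed in this blog may expose prefixed or differently shaped variants. The helper names `buildConstraints` and `startCall` and the `"chat"` channel label are made up for illustration.

```typescript
// Minimal shape of the constraints object passed to getUserMedia.
interface CaptureConstraints {
  audio: boolean;
  video: boolean;
}

// Pure helper (hypothetical name): decide which local sources to request.
function buildConstraints(wantAudio: boolean, wantVideo: boolean): CaptureConstraints {
  if (!wantAudio && !wantVideo) {
    throw new Error("request at least one of audio or video");
  }
  return { audio: wantAudio, video: wantVideo };
}

// Browser-only part (hypothetical name). Resolves to the SDP offer, which the
// application must deliver to the remote peer over its own signaling path.
async function startCall(constraints: CaptureConstraints): Promise<string> {
  const g = globalThis as any; // keeps the sketch independent of DOM typings
  if (!g.navigator?.mediaDevices || !g.RTCPeerConnection) {
    throw new Error("WebRTC APIs are not available in this environment");
  }

  // 1. GetUserMedia: ask the user for camera/microphone access.
  const stream = await g.navigator.mediaDevices.getUserMedia(constraints);

  // 2. PeerConnection: attach the local tracks. ICE, DTLS and RTP are
  //    handled by the browser under the hood.
  const pc = new g.RTCPeerConnection();
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }

  // 3. Data Channel: arbitrary application data alongside the media.
  const channel = pc.createDataChannel("chat");
  channel.onopen = () => channel.send("hello");

  // 4. Create the SDP offer and install it as the local description.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  return offer.sdp;
}
```

In a page, one would call something like `startCall(buildConstraints(true, true))` and send the resulting SDP to the other peer. How that SDP travels is left to the application.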
These requirements may evolve over time until all aspects of the standards are frozen.
Use-Cases and Architecture Preview
Shifting gears, let me introduce a few sample use cases that are quite easily achievable with WebRTC:
1. Seamless Conferencing:
2. Personal Shopper/Instant Customer Care:
This use case captures a consumer-to-business scenario in which a Web application provider like Amazon might offer a “Click to Call” service to its customers with a few WebRTC APIs. Such a service would turn a mundane search into a rich two-way audio and video interaction with a customer care representative, which could lead to higher transaction conversion ratios.
3. Multimedia based Rich 3D Games:
This scenario brings audio, video, and data streams into gaming environments using the WebRTC, HTML5 and WebGL APIs. Such a combination opens up innovative ways of combining real-time media with the WebGL canvas.
Architecturally, a WebRTC-based system falls into one of the following broad categories:
- Browser <-> Browser
- Browser <-> VoIP Endpoint
The web server can be any application server that provides, at a minimum, the required identity and authentication procedures for the end users.
In either case, secure media flows directly between the peers. In the VoIP scenario, an intermediate gateway is required to handle signaling and any translations needed because of capability limitations of the VoIP endpoint. These limitations might include things like “unable to perform ICE checks”, “no support for secure RTP”, and so on. A detailed analysis of the architectural solutions and the potential differences between these systems is out of the scope of this blog.
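To make the split between signaling and media concrete, here is a minimal in-memory sketch in TypeScript. The web server only relays session descriptions between registered users, while the media itself never passes through it. The `SignalingRelay` class, the message shape and the user names are all hypothetical; a real deployment would carry these messages over something like WebSockets and would authenticate users first.

```typescript
// A signaling message carries a session description from one user to another.
interface SignalingMessage {
  from: string;
  to: string;
  kind: "offer" | "answer" | "bye";
  sdp?: string; // present for offer/answer
}

type Inbox = (msg: SignalingMessage) => void;

// Hypothetical stand-in for the web server's signaling role.
class SignalingRelay {
  private inboxes = new Map<string, Inbox>();

  // A user "logs in" by registering a callback for incoming messages.
  register(user: string, inbox: Inbox): void {
    this.inboxes.set(user, inbox);
  }

  // Forward a message to its recipient; returns false if the peer is unknown.
  send(msg: SignalingMessage): boolean {
    const inbox = this.inboxes.get(msg.to);
    if (!inbox) return false;
    inbox(msg);
    return true;
  }
}

// Usage: Alice offers, Bob answers. Only SDP text crosses the relay.
const relay = new SignalingRelay();
const log: string[] = [];

relay.register("alice", (m) => {
  log.push(`alice got ${m.kind} from ${m.from}`);
});
relay.register("bob", (m) => {
  log.push(`bob got ${m.kind} from ${m.from}`);
  if (m.kind === "offer") {
    relay.send({ from: "bob", to: m.from, kind: "answer", sdp: "v=0 (answer)" });
  }
});

relay.send({ from: "alice", to: "bob", kind: "offer", sdp: "v=0 (offer)" });
console.log(log.join("\n"));
```

For the Browser <-> VoIP case, the gateway would sit where Bob is in this sketch, translating the offer into whatever the VoIP endpoint understands.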
Cisco’s Involvement with RTCWeb
Cisco has been actively participating in both standards development and implementation.
With respect to standards participation, Cisco has taken leadership roles in helping shape the requirements from both the IETF and W3C perspectives.
At the IETF, a working group (WG) called RTCWeb is responsible for driving the “on-the-wire” standards. Cullen Jennings from Cisco is co-chairing this WG. He is also a co-author of the W3C specification that defines the browser API requirements. Aside from these roles, Cisco has contributed a lot of thought leadership across several areas of the standards, such as QoS, codecs, API development, and signaling.
On the implementation front, Cisco open-sourced the VoIP code base from its softphones, with the following components:
1. RFC3261 compliant SIP stack.
2. RFC4566 compliant SDP engine.
3. Call control application logic for the softphone application.
The open source project can be found as a GitHub project under the name Ikran.
The Cisco team is working with Mozilla on a joint implementation of the WebRTC standards in Firefox. For this purpose, components (2) and (3) from Ikran are being reused to implement the session control and session description aspects of the PeerConnection object.
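For readers who have not seen SDP before: a session description is plain text in the RFC 4566 format, in which each media stream is introduced by an `m=` line of the form `m=<media> <port> <proto> <fmt> ...`. The following TypeScript sketch extracts those lines. It is not taken from Ikran; the function name `parseMediaLines` and the sample SDP are invented for illustration, and a real SDP engine handles far more than this.

```typescript
interface MediaLine {
  media: string;     // e.g. "audio", "video", "application"
  port: number;
  proto: string;     // transport protocol, e.g. "RTP/SAVPF"
  formats: string[]; // payload types or other format identifiers
}

// Hypothetical helper: collect the "m=" lines of an SDP blob.
function parseMediaLines(sdp: string): MediaLine[] {
  const result: MediaLine[] = [];
  for (const raw of sdp.split(/\r?\n/)) {
    const line = raw.trim();
    if (!line.startsWith("m=")) continue;
    const [media, port, proto, ...formats] = line.slice(2).split(/\s+/);
    // The port field may be "<port>/<count>"; parseInt keeps only the port.
    result.push({ media, port: parseInt(port, 10), proto, formats });
  }
  return result;
}

// Made-up session description with one audio and one video stream.
const sampleSdp = [
  "v=0",
  "o=- 20518 0 IN IP4 192.0.2.1",
  "s=-",
  "c=IN IP4 192.0.2.1",
  "t=0 0",
  "m=audio 49170 RTP/SAVPF 111 0",
  "m=video 51372 RTP/SAVPF 96",
].join("\r\n");

for (const m of parseMediaLines(sampleSdp)) {
  console.log(`${m.media} on port ${m.port} via ${m.proto}: ${m.formats.join(",")}`);
}
```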
WebRTC in Action – Getting Hands Dirty
It’s time to get our hands dirty and try a few demos in action. WebRTC for desktop is now available in Firefox Nightly and also in Firefox Aurora releases. The difference is that Nightly builds have the very latest fixes, while Aurora, being a pre-beta build, is slightly older but more stable.
For the purpose of this blog, let us use the Firefox Aurora build; the setup instructions below apply to Firefox Nightly as well.
The demo page lets you try out the following:
- GetUserMedia based audio capture, video capture and picture snapshot.
- PeerConnection based 2-way audio/video call.
- DataChannel based session.
Let’s get started.
Step 1: Getting Firefox Aurora
Step 2: Configure Aurora
Currently the code is behind a preference setting. To enable the WebRTC code, browse to “about:config” and do the following:
2a. Set media.navigator.enabled to true to enable calls to GetUserMedia only.
2b. Set media.navigator.permission.disabled to true to automatically gain permission to access the camera/microphone.
2c. Set media.peerconnection.enabled to true to enable PeerConnection functionality.
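If you would rather not click through “about:config”, Firefox also reads preferences from a `user.js` file in the profile directory, written as `user_pref(name, value);` lines. The sketch below is a small TypeScript helper (the name `renderUserJs` is made up) that prints such lines for the three preferences from steps 2a to 2c. The preference names are the ones listed in this blog and may well change in later builds.

```typescript
// Preferences from steps 2a-2c, all set to true.
const webrtcPrefs: Record<string, boolean> = {
  "media.navigator.enabled": true,
  "media.navigator.permission.disabled": true,
  "media.peerconnection.enabled": true,
};

// Hypothetical helper: render preferences in user.js syntax, one per line.
function renderUserJs(prefs: Record<string, boolean>): string {
  return Object.keys(prefs)
    .map((name) => `user_pref(${JSON.stringify(name)}, ${prefs[name]});`)
    .join("\n");
}

console.log(renderUserJs(webrtcPrefs));
```

Since step 2b skips the camera/microphone permission prompt, it is best used only on a profile set aside for testing.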
Step 3: Running the demos
On your Aurora build, browse to the WebRTC Demo Page and try out the demos listed above.
Interested in Learning More?
1. Cullen Jennings has provided a detailed explanation about everything here.
2. Justin Uberti from Google explains the WebRTC implementation in Chrome here.
3. IETF Standards Page
4. W3C Standards Page
5. Mozilla Wiki and Blog Pages
If interested, I am more than happy to discuss further with anyone who wants to hear the gory details.
Thanks for reading and enjoy the WebRTC revolution.