Investigating OpenStack's Multicast Capabilities (Part 1 of 3)

This is my first blog post within the Data Center and Cloud technology area. I recently joined the OpenStack@Cisco team under Lew Tucker as a Cloud Architect, focusing on advanced OpenStack systems research. As part of this role I performed a gap analysis of multicast functionality (or the lack thereof) within an OpenStack-based private cloud. Coming from Advanced Services, I have seen multicast as a critical component of many data centers, providing group-based access to data (streaming content, video conferencing, etc.). Within a cloud environment this requirement is almost as critical as it is for enterprise data centers, if not more so.

This blog will be the first in a series highlighting the current state of multicast capabilities within OpenStack. Here, I focused the analysis on OpenStack Icehouse running on top of Red Hat 7 with OVS and a VLAN-based network environment. I would like to thank the OpenStack Systems Engineering team for their great work in laying the foundation for this effort (preliminary tests on Ubuntu and Havana).

I used a virtual traffic generator called TeraVM to generate multicast-based video traffic, which allows a Mean Opinion Score to be calculated. The Mean Opinion Score, or MOS, is a calculated value showing the quality of video traffic based on latency, jitter, out-of-order packets and other network statistics. Historically, the MOS value was based on human perception of the quality of voice calls, hence the word opinion. Since then it has developed into an industry-standardized way of measuring the quality of video and audio in networks. It is therefore a good way to objectively measure the performance of multicast on an OpenStack-based cloud. The MOS value ranges from 1 (very poor) to 5 (excellent). Anything above ~4.2 is typically acceptable for Service Provider grade video transmission.

I performed the multicast testing on a basic controller/compute node OpenStack environment, with Neutron handling network traffic. In this blog, I focus my analysis solely on the open source components of OpenStack; Cisco products (CSR and N1K) will be discussed in a follow-up blog. The tenant/provider networks are separated using VLANs. A Nexus 3064-X is used as the top-of-rack switch providing physical connectivity between the compute nodes. The nodes are based on UCS-C servers.

Multicast functionality can be split into two parts. The first defines how multicast traffic is handled within a Layer-2 domain and relates to the IGMP snooping capabilities of a switch. The second part defines the Layer-3 multicast tree for any-source multicast (ASM) or source-specific multicast (SSM). This is mainly determined by the switch's Protocol Independent Multicast (PIM) functionality. Based on those two main multicast protocols I defined the following test scenarios:

1. Multicast source and receiver within the same tenant network and on the same compute node. The physical switch is not involved. This scenario tests the IGMP features of the L3 agent and Open vSwitch respectively.
2. Multicast source and receiver within the same tenant network but distributed across multiple compute nodes. Similar to (1), but additionally tests IGMP on the physical switch.
3. Provider network, whereby the source and the receiver are on different networks in the OpenStack environment. Here, I only focus on having the source and the receiver on different compute nodes. This scenario shows how OpenStack handles multicast routing and snooping.
4. The last scenario looks into having the source outside the OpenStack environment with the receivers distributed across the compute nodes.
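
In all of these scenarios, a receiver signals its interest in a stream by joining a multicast group, which is what makes the host send the IGMP membership report that snooping switches and multicast routers act on. The following is a minimal Python sketch of such a receiver, not the TeraVM implementation. The group address 239.1.1.1, the port 5004 and the helper names `build_mreq` and `open_receiver` are assumptions chosen for illustration only.

```python
import socket
import struct


def build_mreq(group: str, interface_ip: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure: group address followed by local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface_ip))


def open_receiver(group: str, port: int, interface_ip: str = "0.0.0.0") -> socket.socket:
    """Open a UDP socket and join the given multicast group.

    Joining the group (IP_ADD_MEMBERSHIP) causes the kernel to send an
    IGMP membership report on the chosen interface.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, interface_ip))
    return sock


# Hypothetical usage inside a receiver VM (group and port are assumptions):
#   sock = open_receiver("239.1.1.1", 5004)
#   data, sender = sock.recvfrom(2048)
```

Running a receiver like this in one VM while capturing traffic on the other VMs of the same tenant network is a simple way to observe whether the stream is forwarded selectively or flooded.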

My small test environment revealed the following results:

1. The L3 agent does not currently provide any multicast routing functionality. Hence, having the source inside an OpenStack environment requires a rendezvous point (RP) outside of the virtual cloud. Here, I set up the Nexus 3548 as the RP, which helped solve this problem for testing purposes. Having said that, to fully leverage OpenStack in a multicast-enabled setup, multicast routing on the L3 agent is required.
2. Open vSwitch has no IGMP snooping capabilities (none of the IGMP versions is implemented yet). This causes multicast traffic to be flooded within a tenant network as soon as one receiver joins the stream. This is the worst-case scenario, as it means multicast traffic is flooded throughout the OpenStack environment.
3. IGMP snooping on the physical switch only solves this problem partially. As soon as a port receives a join request for a certain multicast group, IGMP snooping identifies this port as a receiving port and begins forwarding traffic. However, as this port is connected to a compute node, every VM on that compute node also receives the multicast stream. This means that IGMP snooping only helps to prevent sending multicast traffic to a whole compute node. As soon as one VM joins the multicast group, (3) is rendered ineffective and we hit (2) again.
4. Besides the obvious limitations of missing IGMP snooping and PIM support, the MOS score reveals that OpenStack is capable of providing an average MOS value of 4.6 for an HD 8 Mbit video file (see the screen capture of TeraVM statistics; normal multicast traffic, no scale testing).
5. The Nexus 3548 has a limitation in that it cannot handle a source and a receiver being directly connected on the same VLAN. This limitation is known and can be worked around by statically creating the outgoing interface for the (S,G) entry.
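
The interaction between the second and third findings can be illustrated with a small simulation. This is a deliberately simplified model, not a description of how Open vSwitch or the Nexus forwards traffic internally: a switch with snooping forwards a stream only towards subscribers, a switch without snooping floods it to every port. The function `deliver`, the node names and the VM names are hypothetical.

```python
def deliver(source_vm, subscribers, nodes, physical_snooping, ovs_snooping):
    """Return the set of VMs that receive the stream sent by source_vm.

    nodes maps a compute node name to the list of VMs running on it.
    subscribers is the set of VMs that joined the multicast group.
    """
    source_node = next(n for n, vms in nodes.items() if source_vm in vms)
    received = set()
    for node, vms in nodes.items():
        has_subscriber = any(vm in subscribers for vm in vms)
        # Does the stream reach this compute node at all?
        if node == source_node:
            reaches_node = True  # traffic originates on this node's virtual switch
        elif physical_snooping:
            reaches_node = has_subscriber  # physical switch prunes uninterested ports
        else:
            reaches_node = True  # physical switch floods every port
        if not reaches_node:
            continue
        # What does the virtual switch on the node do with the stream?
        for vm in vms:
            if vm == source_vm:
                continue
            if not ovs_snooping or vm in subscribers:
                received.add(vm)
    return received


# Hypothetical three-node layout with a single subscriber on compute2.
nodes = {"compute1": ["src", "vm1"], "compute2": ["vm2", "vm3"], "compute3": ["vm4"]}
tested_setup = deliver("src", {"vm2"}, nodes, physical_snooping=True, ovs_snooping=False)
print(sorted(tested_setup))
```

With snooping only on the physical switch, the model spares compute3 but still delivers the stream to every VM on compute1 and compute2, which mirrors the behavior described above. Only with snooping on both layers does the stream reach vm2 alone.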

In summary, I found that OpenStack using the L3 agent and Open vSwitch does not provide common multicast features. To enable a basic video streaming environment, IGMP snooping is required. At the moment, OpenStack is not suitable for multicast-specific applications.

As a little sneak preview: in part 2 of this series I will show how to use Cisco products such as the Nexus 1000V or the Cloud Services Router (CSR) 1000V to introduce multicast functionality. Part 3 will look into an implementation of IGMP within OVS.