More VXLAN Q&A
The recent VXLAN announcement we made with some of our partners (including VMware, Citrix, and Red Hat) certainly generated a lot of questions, so let me cover some of the most common ones I have heard.
Why do we need a new protocol? We already have VLANs, MPLS, OTV, LISP and a whole host of other acronyms to contend with!
Let’s be clear—VXLAN is not some magical protocol that will eliminate the need for networking folks to do actual network design and traffic engineering. That being said, it does address a couple of issues introduced by large-scale virtualization and the desire for fluid virtual machine (VM) mobility:
As vCloud Director replicates a vApp, it also replicates the VMs’ MAC and IP addresses, which is less than ideal, so we need a way to isolate vApp traffic. vCDNI has been VMware’s answer to date, but its MAC-in-MAC scheme creates its own challenges around visibility for management and troubleshooting, efficient link utilization, and extension across L3 boundaries. Another approach, VLAN-based isolation, has a scalability issue, since 802.1Q only supports around 4K VLANs—which might be reasonable now, but probably not down the road.
VXLAN uses a different encapsulation scheme, which gives us far more segments for isolation (a 24-bit segment ID, so up to 16M), preserves the transparency we have traditionally had for management, compliance, etc., and allows us to extend segments across L3 boundaries. The implementation of VXLAN is simple and elegant, using existing, well-understood technologies: tunneling and multicast.
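To make the scale numbers concrete, here is a minimal Python sketch of the VXLAN encapsulation header as laid out in the VXLAN draft: an 8-byte header carrying a 24-bit VXLAN Network Identifier (VNI), compared with the 12-bit VLAN ID of 802.1Q. The `vxlan_header` helper is purely illustrative, not a real library API.

```python
import struct

VLAN_ID_BITS = 12    # 802.1Q VLAN ID field
VXLAN_VNI_BITS = 24  # VXLAN Network Identifier

assert 2 ** VLAN_ID_BITS == 4096           # ~4K VLAN segments
assert 2 ** VXLAN_VNI_BITS == 16_777_216   # ~16M VXLAN segments

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header per the draft layout.

    Flags byte 0x08 sets the I bit (VNI is valid), followed by
    24 reserved bits, the 24-bit VNI, and a final reserved byte.
    """
    if not 0 <= vni < 2 ** VXLAN_VNI_BITS:
        raise ValueError("VNI out of 24-bit range")
    flags = 0x08 << 24          # I flag in the top byte, rest reserved
    return struct.pack("!II", flags, vni << 8)

hdr = vxlan_header(5000)
print(len(hdr))  # 8 — prepended (with outer UDP/IP) to the inner Ethernet frame
```

In the actual encapsulation this header sits inside an outer UDP/IP packet, which is what lets a segment cross L3 boundaries that a plain 802.1Q tag cannot.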
So, I can get rid of all those other technologies?
Probably not: it’s like saying “if I start implementing IPv6, why do I need IPv4?”. As I said earlier, VXLAN is not magical—it works in concert with the rest of the networking infrastructure. While the server and network teams can work in a more loosely coupled manner, that does not mean the network folks no longer have to do any kind of design or engineering to ensure things run properly.
There are a couple of things to bear in mind. vApps need VLANs for access to the outside world—otherwise why are you running them? ☺ Also, most environments are going to remain a mix of physical (bare metal) and virtual, so any practical approach is going to have to effectively support this dichotomy. Most customers are going to need to run some combination of VXLAN, OTV, and LISP in their environments, depending on their specific requirements. Network functionality like that provided by LISP and OTV is required to complement VXLAN in two areas: enabling VXLAN to co-exist with VLANs, and supporting non-virtualized hosts and network services. Non-virtualized hosts and services are not VXLAN-enabled at this early stage, but they can fully operate in a cloud environment by leveraging the benefits of LISP, OTV, and all the network intelligence we have developed over the years.
We just talked about what VXLAN brings to the party, so let’s look at the other two:
At the end of the day, VXLAN provides two functions: LAN extension and scalable network isolation. With regard to the former, as many of us have learned the hard way, LAN extension needs to be deployed with care lest L2 problems in one location suddenly become L2 problems in all of your locations—think broadcast storms or a worm outbreak—so the first thing OTV does is provide fault isolation among your locations. As noted earlier, one of the underlying mechanisms of VXLAN is multicast. One of the benefits of OTV is that it can provide a multicast transport over a network core that may not support multicast pervasively, or may not support multicast at all. Since we are talking about VMs in motion, another nice thing about OTV is that it adapts its multicast distribution trees automatically: as VMs move around the network, OTV dynamically reconfigures itself to guarantee optimal distribution of multicast traffic. Finally, since few locations are 100% virtualized, OTV can act as a bridge between VXLAN segments and devices—say, physical hosts—that are not VXLAN-conversant.
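To illustrate how a deployment ties segments to multicast, here is a hedged sketch: flooded traffic (broadcast, unknown unicast, multicast) for a VXLAN segment rides on an IP multicast group, and each VNI needs to map to a group. The mapping below—base address, pool size, simple modulo—is an assumed administrative choice for illustration, not something the VXLAN spec dictates.

```python
import ipaddress

def vni_to_mcast_group(vni: int, base: str = "239.1.0.0",
                       pool_size: int = 2 ** 16) -> str:
    """Map a 24-bit VNI onto an IP multicast group for flooded traffic.

    The base address and pool size are illustrative: many deployments
    fold the VNI into an administratively scoped (239.0.0.0/8) range,
    so multiple VNIs may share a group when the pool is smaller than 16M.
    """
    base_addr = int(ipaddress.IPv4Address(base))
    return str(ipaddress.IPv4Address(base_addr + (vni % pool_size)))

print(vni_to_mcast_group(5000))  # 239.1.19.136
```

This is where OTV’s multicast transport matters: the groups computed above still have to be carried across the core, whether or not the core itself routes multicast natively.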
LISP is the key to true workload mobility. We have talked about how VXLAN lets you extend a vApp’s segment across L3 boundaries. The fly in the ointment is that the IP gateway for the vApp’s VMs never changes from the initial configuration. This can lead to sub-optimal routing at best, or instability and service interruption at worst. Say you have a vApp running in San Francisco and you vMotion some of the VMs to LA. Traffic between clients on the Internet and the servers now in LA will still need to go through San Francisco to enter and exit the data center. At the very least, this will burn bandwidth and add application latency, but if your San Francisco location happens to be going off-line because of brownouts (which is why you were moving workloads to LA in the first place), you have a problem. LISP gives you the mobility you were looking for, enabling optimal reachability to servers even as they move within an extended LAN such as that provided by VXLAN and OTV. LISP delivers this granular IP mobility while maintaining the desirable characteristics of L3: optimal paths, minimal failure propagation, and scale. For details on LISP and workload mobility, check out this white paper.
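The mobility idea can be sketched in a few lines: LISP separates a host’s identity (the EID, the VM’s stable IP) from its location (the RLOC, the routable address of the site it currently lives in), so when a VM moves, only the mapping is updated. The `MapServer` class and addresses below are hypothetical—a toy model of the mapping system, not a real LISP implementation.

```python
class MapServer:
    """Toy stand-in for a LISP map server: an EID-to-RLOC table."""

    def __init__(self):
        self._db = {}  # EID -> RLOC

    def register(self, eid, rloc):
        # Called by a site's tunnel router when a host appears locally.
        self._db[eid] = rloc

    def lookup(self, eid):
        # Ingress routers query this to learn where to tunnel packets.
        return self._db.get(eid)


ms = MapServer()
ms.register("10.1.1.10", "198.51.100.1")   # VM serving out of San Francisco
# ... vMotion moves the VM to LA; the LA site re-registers the same EID ...
ms.register("10.1.1.10", "203.0.113.1")
print(ms.lookup("10.1.1.10"))              # traffic now steers to the LA locator
```

The point of the sketch is that the VM’s own address (the EID) never changes across the move—only the locator does—which is exactly why clients reach the LA site directly instead of tromboning through San Francisco.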
It’s worth noting that both LISP and OTV work silently in the background—they don’t undermine the N1KV + vCD integration (with VXLAN) that customers told us they appreciate.
How do I play!?
Well, we have a beta kicking off this month, so ping your Cisco rep to get more details. The Nexus 1000V with VXLAN works with vCD 1.5. It also works with vSphere 4.1 and 5.