To L2 or not to L2, that is the Question for VMotion:
This is a favorite question of mine (and a very contemporary topic), albeit usually asked in a different way: why do you need Layer 2 for VMotion? Those bearing the VCP email signature will be quick to show off their shiny new badge and tell you it is because of VMotion itself, a port group label, or some other VMware-specific component. In fact, there is a more fundamental reason, and it comes down to the ARP cache, IP address, default gateway, and DNS.
Remember how hosts find each other. With IP, a host looks to the IP address of the destination host, but remember we are speaking about L2 or, to make it simple for this explanation, Ethernet. Delivery is accomplished by using the Address Resolution Protocol (ARP) to bind a destination IP address to a destination MAC address (because Ethernet is Ethernet and NOT IP; pet peeve alert when someone refers to Ethernet as an IP network, but I digress). Of course, the source host first compares the destination IP with its own IP and mask. That comparison determines whether it ARPs for the destination directly on the local subnet, or sends the frame to the default gateway, in which case the MAC address it resolves and uses is the gateway's, not the destination host's. To avoid constantly broadcasting ARP requests, hosts cache this information for a period of time on the assumption that little or nothing will change.
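The on-subnet versus off-subnet decision described above can be sketched in a few lines of Python. This is only an illustration of the comparison a host makes, not any real stack's implementation; the function name `arp_target` and all of the addresses are made up for the example.

```python
import ipaddress

def arp_target(src_ip, prefix_len, dst_ip, default_gateway):
    """Return the IP address the source host would ARP for.

    Hypothetical helper for illustration only: if the destination falls
    inside the source's own subnet, the host resolves the destination's
    MAC directly; otherwise it resolves the default gateway's MAC.
    """
    local_net = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    dst = ipaddress.ip_address(dst_ip)
    if dst in local_net:
        return dst                                   # same subnet: ARP for the host
    return ipaddress.ip_address(default_gateway)     # off subnet: ARP for the gateway

# Example addresses are assumptions, not taken from any real network.
print(arp_target("10.1.1.10", 24, "10.1.1.20", "10.1.1.1"))
print(arp_target("10.1.1.10", 24, "10.2.2.20", "10.1.1.1"))
```

The point for VMotion is that either answer ends up as a MAC binding in a cache somewhere, which is why a moved server needs to stay reachable at the same MAC and IP.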
VMotion is a VMware technology for servers. Servers are accessed in many ways depending on the type of server, but for now let’s just call the source the initiator and the destination the target. There are a few things you want to accomplish when looking at these technologies as they relate to traffic, and architects need to view that traffic in terms of egress and ingress.
- Maintain connectivity – assuming live migration (hence the MAC / IP address issue)
- Optimize exit from the local subnet towards the client
- Avoid trombone effect on traffic
- Optimize ingress from the client towards the target
Number 1 is obvious, but the others may not be. Optimizing exit and avoiding the trombone effect both concern how the target responds to the client via its default gateway. Since reaching the gateway is a MAC function, and that binding lives in each device's ARP cache, a device that moves to another data center over Layer 2 keeps sending to its original gateway, so traffic can trombone back across the interconnect between the target and the default gateway. Enter Cisco’s Overlay Transport Virtualization (OTV). This is a mechanism that provides MAC routing and ARP traffic controls, and that allows the first-hop redundancy protocol (FHRP) to operate as an active/active default gateway, so the moved target will egress via the local default gateway. There are many other benefits to this technology; see www.cisco.com for more information.
Number 4 is about how the moved target is visible from outside the subnet. This is a function of routing. Routers forward to the longest matching prefix in the route table. Summarization is generally done at class boundaries, or at other mask lengths, to reduce the size of the routing table. If the network does not see a longer prefix for the subnet or host in the routing table after the move, you could have interesting routing patterns for that target, with ingress traffic still arriving at the original site. This can be addressed with Route Health Injection (RHI) or other mechanisms that advertise a /32 for that host into the route table.
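To make the longest-prefix point concrete, here is a minimal Python sketch of a route lookup before and after a /32 is injected for the moved host. The route table, the next-hop names (`dc1-core`, `dc1-edge`, `dc2-edge`) and the function `longest_prefix_match` are all hypothetical; a real router does this in hardware with very different data structures.

```python
import ipaddress

def longest_prefix_match(routes, dst_ip):
    """Return the next hop of the most specific route covering dst_ip.

    `routes` maps prefix strings to next-hop labels. Returns None when
    no route matches. Illustrative linear scan, not a real FIB.
    """
    dst = ipaddress.ip_address(dst_ip)
    best_net, best_hop = None, None
    for prefix, next_hop in routes.items():
        net = ipaddress.ip_network(prefix)
        if dst in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop

# Hypothetical table: a summary plus the subnet advertised from the first site.
routes = {"10.0.0.0/8": "dc1-core", "10.1.1.0/24": "dc1-edge"}
print(longest_prefix_match(routes, "10.1.1.20"))   # follows the /24

# After the move, a host route is advertised from the second site.
routes["10.1.1.20/32"] = "dc2-edge"
print(longest_prefix_match(routes, "10.1.1.20"))   # now follows the /32
print(longest_prefix_match(routes, "10.1.1.21"))   # neighbors still follow the /24
```

Only the moved host's traffic is redirected; everything else in the subnet keeps following the shorter prefix.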
LISP (Locator/ID Separation Protocol) is another protocol that can work in this environment, and it adds a few more functions. LISP has benefits for the global routing space as well as the local enterprise, with the basic function of separating the routing locator (RLOC) from the endpoint identifier (EID). The key here is that a device can keep its IP address and move anywhere in the infrastructure without having to readdress or worry about L2 or L3 boundaries. This is done by routing to the locator rather than the endpoint identifier, and then advertising a longer prefix for the EID to address optimal traffic patterns.
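The locator and identifier split can be pictured with a toy mapping table in Python. This is a conceptual sketch only: the class `MappingSystem`, the function `encapsulate`, and the addresses are invented for illustration and do not model real LISP messages, map servers, or packet formats.

```python
class MappingSystem:
    """Toy identifier-to-locator database (hypothetical, not the LISP wire protocol)."""

    def __init__(self):
        self._eid_to_rloc = {}

    def register(self, eid, rloc):
        # Called when an endpoint appears at, or moves to, a site.
        self._eid_to_rloc[eid] = rloc

    def lookup(self, eid):
        return self._eid_to_rloc.get(eid)


def encapsulate(mapping, src_rloc, dst_eid, payload):
    """Wrap a packet for dst_eid in an outer header addressed to its current locator."""
    dst_rloc = mapping.lookup(dst_eid)
    if dst_rloc is None:
        raise LookupError(f"no locator registered for {dst_eid}")
    return {
        "outer_src": src_rloc,
        "outer_dst": dst_rloc,   # the core routes on this
        "inner_dst": dst_eid,    # the server's address never changes
        "payload": payload,
    }


mapping = MappingSystem()
mapping.register("10.1.1.20", "192.0.2.1")            # server at the first site
before = encapsulate(mapping, "198.51.100.1", "10.1.1.20", b"hello")

mapping.register("10.1.1.20", "203.0.113.1")          # server moves to the second site
after = encapsulate(mapping, "198.51.100.1", "10.1.1.20", b"hello")

print(before["outer_dst"], after["outer_dst"], after["inner_dst"])
```

The inner destination stays the same across the move while the outer destination follows the server, which is the property that lets the workload keep its IP address.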
All that being said, you may or may not have to worry about L2 for VMotion depending upon your implementation.
And if you share my vintage, and know the name Radia Perlman and her famous poem, it is appropriate to provide an update, so here goes:
I am fortunate to be around long enough to see
A multi-path more beneficial than a tree
A path whose crucial property
Is all active bandwidth via ECMP
A path which must be sure to enable
Loop free paths and small MAC tables
First the architect needs to see
That MAC routing can happen at L2 and L3
When VMs dance from cloud to cloud
Traffic storms and table overload cannot be allowed
If vendor innovations give you a shrill
Let me introduce you to the working groups addressing LISP and TRILL