Since I am apparently feeling a bit nostalgic about VMworld and all the frenetic activity we had about this time last year while getting ready for the announcement of the Cisco Nexus 1000V, I caught up with some of the original players who brought our first softswitch to market.
Saravan is a Director of Engineering within the Server Access and Virtualization Business Unit at Cisco and has been leading the Nexus 1000V engineering organization and product strategy from its inception. In addition to Nexus 1000V, Saravan is currently focused on Cisco’s Data Center, Virtualization and Cloud networking solutions.
Michael is a Distinguished Engineer within the Server Access and Virtualization Business Unit at Cisco and was one of the inventors of the original Nexus 1000V concept. His current focus is on Cisco’s efforts related to data center, server virtualization, and cloud computing.
Paul Fazzone is a Senior Manager, Product Management in Cisco’s Server Access and Virtualization Business Unit and one of the original developers of the Nexus 1000V concept. Paul currently manages all of Cisco’s data center access layer software strategy across the Nexus portfolio.
The interview provides some interesting insight into how we moved from customer “asks” to a shipping product:
OS: What was the initial driver behind the Nexus 1000V?
SR: We noticed that the edge of the network was moving from a traditional access layer switch to blade switches with the introduction of blade servers and, with the introduction of virtualization, to the virtual switches in the virtualized servers. To provide rich end-to-end networking solutions, we wanted to develop a presence in the new “edge” of the network and hence started working on Swordfish (later renamed Nexus 1000V).
MS: We originally ran into this problem when discussing security solutions with customers. With current virtualization solutions, traffic can flow between virtual machines without ever touching the physical network. With the network access layer blending into the server, we realized we wouldn’t be able to offer a truly pervasive security solution without having a presence within the hypervisor.
PF: We noticed in 2005/2006 that customers were starting to embrace server virtualization in small pockets for non-production applications. The server teams complained about having to get the network team to trunk VLANs to the ESX hosts. The network teams complained about the lack of visibility and management needed to troubleshoot when a VM couldn’t be accessed. The security teams were raising red flags because the virtual network infrastructure couldn’t be secured like the physical one. We saw these three items really impacting customers’ ability to virtualize large portions of their server workloads, and we thought a more intelligent and feature-rich software switch implementation could address the problem.
OS: How long did we spend developing the switch?
SR: The original concept was conceived around March 2006 and we started moving from PowerPoint decks to code around September 2006. We were initially planning to support this in ESX 3.5 and had a version for ESX 3.5. Due to a variety of business reasons, the product was finally launched with vSphere (aka ESX 4.0).
OS: Does the shipping product pretty much reflect the original design?
SR: During the course of the last three years, the product evolved in three phases:
- The first phase was a prototype on ESX 3.0 with a control plane (VSM) on every single host and the VEM module implemented as a NIC driver; we could validate most of our functionality with this model. However, it still required VMware vSwitches to be present and configured in a certain way (we were plugging the N1K functionality in underneath the vSwitches, at the same layer where a NIC driver would go) and was “complicated” from a deployment perspective, so we decided to do “Sailfish”.
- Sailfish provided an IOS CLI front end (VSM) and managed the VMware vSwitches through the vCenter APIs. Even though it provided some amount of visibility into and manageability of the VMware vSwitches, we decided that this architectural model would not scale to support additional features like ERSPAN, NetFlow v9 and some of the end-to-end Cisco innovations (like CTS). So, we decided against productizing this model.
- The third phase of the project involved developing the final product that you see today, with a centralized control plane (VSM) managing a number of VEM modules, each VEM module being our own switch running in the hypervisor.
MS: When we proposed Sailfish, it was mainly a testing vehicle for the management paradigm, which has since been widely accepted in the virtualization industry. However, the visibility provided by Sailfish was limited to that of the VMware virtual switch. Sailfish also lacked a hypervisor presence, which meant that it didn’t give the network administrator the ability to truly enforce policies between the virtual machines. Sailfish was a successful prototype, but it took considerable effort on the part of the engineering team to move from the Sailfish prototype to an enterprise-class solution.
PF: The Sailfish prototype, which we developed two years ago, was a CLI front end for the VMW vSwitch. It had a Cisco NX-OS look and feel, but only supported the limited features of the VMW vSwitch. While this prototype allowed us to validate parts of our management strategy for the Nexus 1000V product, it did not give us a vehicle to test our data plane features like ERSPAN, NetFlow, ACLs, QoS, Virtual Service Domains, DHCP Snooping, etc. Also, the Sailfish model couldn’t provide and maintain a boundary between network and server admins, which was preventing customers from virtualizing DMZs and applications with compliance/security requirements. Developing our own data plane (or a Cisco vSwitch) that replaced the VMW vSwitch and added significantly to the available network features was key to the success of our solution.
OS: Was it really worth the additional time, money, and engineers to build our own data plane?
SR: Absolutely. Having our own data plane gave us the ability to support additional visibility (ERSPAN, NetFlow v9, etc.) and security (Port Security, DHCP Snooping/DAI/IP Source Guard) features, as well as a foundation for us to deliver more innovative solutions targeted at virtualization and cloud computing environments.
PF: Yes, developing our own data plane (or Cisco vSwitch), as opposed to just providing a management front end, was necessary not only to support the features customers require, but also to secure the network management boundary between the server and network admins. With this additional capability, our customers are able to virtualize more server workloads than ever before. Over the past three months that the product has been shipping, hundreds of VMware and Cisco customers have added the Nexus 1000V to their data center infrastructure in support of existing virtual machines as well as in support of planned virtualization of DMZ environments and VDI deployments. We will be highlighting both DMZ and VDI solutions with the Nexus 1000V at VMworld next week in the Cisco booth. Stop by and check them out.
OS: The N1KV seems pretty feature complete--on par with our hardware switches--and we seem to have gotten networking to the point where it has caught up with the needs of the VM environment. So what’s next, where do we go from here?
SR: We see N1KV having much tighter integration with the upstream physical network and services infrastructure.
MS: Virtualization gives the enterprise a great deal of flexibility in how it deploys its infrastructure. This flexibility requires the network and network services to adapt on demand and not to impede the deployment of virtualization, but accelerate it. I think you will see the Nexus 1000V adding some innovative features in this space.
PF: As customers look to virtualize more and more of their Tier 1 and 2 applications, the Nexus 1000V will evolve to offer more advanced distributed services to enhance visibility, security, and storage services, just to name a few. With the Nexus 1000V model, because the hardware resources (aka server CPU) are so distributed, you can build massively scalable, centrally managed embedded services to support all sorts of applications. And customers will benefit tremendously, because they can grow their virtual infrastructure and add additional services through simple, non-disruptive software upgrades. You can also expect to see a rapidly evolving ecosystem of development and integration partners to ensure the virtual network infrastructure is delivering what customers need. As a matter of fact, we already have many major partners who have evolved and developed their solutions to integrate with the Nexus 1000V, and many of those will be on display at VMworld 2009. And since the Nexus 1000V shares a common code base with the rest of the Nexus portfolio, you can expect to see the virtualization-specific features showing up in other products in the Nexus portfolio in the near future. This can provide customers with centralized management across a large collection of virtual and physical switches, making data center infrastructures much easier to operate.
OS: Last question--is it true that the business case for the N1KV was done on the back of a napkin?
SR: Yes. It was a brown paper napkin (made of recycled materials) typically found in a Cisco Café.
PF: Yes, it was a napkin. And the codename, Swordfish, came from an “across the parking lot” conversation between two of the initial team members: “Are you going to the softswitch meeting?” “Swordfish meeting, what’s Swordfish?” “No, the softswitch meeting.”
So, delivering ground-breaking products like the Nexus 1000V is a mammoth, multi-person effort. Beyond Saravan, Michael and Paul, the N1KV has a killer team behind it; in fact, the team just won the Pioneer Award, which is the premier engineering achievement award at Cisco. Here is a pic of the team, enjoying some time in the sun: