Ready to savor tapas, Gaudi and the most vibrant community of IT professionals in the industry? You must be headed to Barcelona for Microsoft TechEd Europe, 28-31, October. Cisco will be there as well. We’ll be showcasing integrated solutions from Cisco and Microsoft for Windows Server 2003 migrations, cloud and SQL Server.
Cisco and Microsoft have worked closely to integrate Cisco UCS with Windows Server 2012 R2, Hyper-V and System Center 2012 R2, to provide the optimal platform for your Microsoft clouds and applications. Listen to what Microsoft Corporate Vice President Brad Anderson has to say about the Cisco and Microsoft relationship.
Make sure to stop by stand #207 to speak with a Cisco solution expert and take in a demo on: Read More »
This is the final part on the High Performance Data Center Design. We will look at how high performance, high availability and flexibility allows customers to scale up or scale out over time without any disruption to the existing infrastructure. MDS 9710 capabilities are field proved with the wide adoption and steep ramp within first year of the introduction. Some of the customer use cases regarding MDS 9710 are detailed here. Furthermore Cisco has not only established itself as a strong player in the SAN space with so many industry’s first innovations like VSAN, IVR, FCoE, Unified Ports that we introduced in last 12 years, but also has the leading market share in SAN.
Before we look at some architecture examples lets start with basic tenants any director class switch should support when it coms to scalability and supporting future customer needs
Design should be flexible to Scale Up (increase performance) or Scale Out (add more port)
The process should not be disruptive to the current installation for cabling, performance impact or downtime
The design principals like oversubscription ratio, latency, throughput predictability (as an example from host edge to core) shouldn’t be compromised at port level and fabric level
Lets take a scale out example, where customer wants to increase 16G ports down the road. For this example I have used a core edge design with 4 Edge MDS 9710 and 2 Core MDS 9710. There are 768 hosts at 8Gbps and 640 hosts running at 16Gbps connected to 4 edge MDS 9710 with total of 16 Tbps connectivity. With 8:1 oversubscription ratio from edge to core design requires 2 Tbps edge to core connectivity. The 2 core systems are connected to edge and targets using 128 target ports running at 16Gbps in each direction. The picture below shows the connectivity.
Down the road data center requires 188 more ports running at 16G. These 188 ports are added to the new edge director (or open slots in the existing directors) which is then connected to the core switches with 24 additional edge to core connections. This is repeated with 24 additional 16G targets ports. The fact that this scale up is not disruptive to existing infrastructure is extremely important. In any of the scale out or scale up cases there is minimal impact, if any, on existing chassis layout, data path, cabling, throughput, latency. As an example if customer doesn’t want to string additional cables between the core and edge directors then they can upgrade to higher speed cards (32G FC or 40G FCoE with BiDi ) and get double the bandwidth on the on the existing cable plant.
Lets look at another example where customer wants to scale up (i.e. increase the performance of the connections). Lets use a edge core edge design for this example. There are 6144 hosts running at 8Gbps distributed over 10 edge MDS 9710s resulting in a total of 49 Tbps edge bandwidth. Lets assume that this data center is using a oversubscription ratio of 16:1 from edge into the core. To satisfy that requirement administrator designed DC with 2 core switches 192 ports each running at 3Tbps. Lets assume at initial design customer connected 768 Storage Ports running at 8G.
Few years down the road customer may wants to add additional 6,144 8G ports and keep the same oversubscription ratios. This has to be implemented in non disruptive manner, without any performance degradation on the existing infrastructure (either in throughput or in latency) and without any constraints regarding protocol, optics and connectivity. In this scenario the host edge connectivity doubles and the edge to core bandwidth increases to 98G. Data Center admin have multiple options for addressing the increase core bandwidth to 6 Tbps. Data Center admin can choose to add more 16G ports (192 more ports to be precise) or preserve the cabling and use 32G connectivity for host edge to core and core to target edge connectivity on the same chassis. Data Center admin can as easily use the 40G FCoE at that time to meet the bandwidth needs in the core of the network without any forklift.
Or on the other hand customer may wants to upgrade to 16G connectivity on hosts and follow the same oversubscription ratios. . For 16G connectivity the host edge bandwidth increases to 98G and data center administrator has the same flexibility regarding protocol, cabling and speeds.
For either option the disruption is minimal. In real life there will be mix of requirements on the same fabric some scale out and some scale up. In those circumstances data center admins have the same flexibility and options. With chassis life of more than a decade it allows customers to upgrade to higher speeds when they need to without disruption and with maximum flexibility. The figure below shows how easily customers can Scale UP or Scale Out.
As these examples show Cisco MDS solution provides ability for customers to Scale Up or Scale out in flexible, non disruptive way.
“Good design doesn’t date. Bad design does.” Paul Rand
It almost feels like this blog entry should start with: Once upon a time…. Because it captures a journey of a young emerging technology and the powerful infrastructure tool it has become. The Cisco UCS journey starts with the tale of Unified Fabric and the Converged Network Adapter (CNA).
Most people think of Unified Fabric as the ability to put both Fiber Channel and Ethernet on the same wire between the server and the Fabric Interconnect or upstream FCoE switchs. That is part of the story, but that part is as simple as putting a Fiber Channel frame inside of an Ethernet frame. What is the magic that makes this happen at the server level? Doesn’t FCoE imply that the Operating System itself would have to know how to present a Fiber Channel device in software and then encapsulate and send the frame across the Ethernet port? Possibly, but that would require OS FCoE software support which would also require CPU overhead and require end users to qualify these new software drivers and compare the performance of software against existing hardware FC HBAs.
For UCS the key to the success of converged infrastructure was due greatly to the very first Converged Network Adapters that were released. These adapters presented existing PCIe Fiber Channel and Ethernet endpoints to the operating system. This required no new drivers or new qualification from the perspective of the operating system and users. However at the heart of this adapter was a Cisco ASIC that provided two key functions:
1.) Present the physical functions for existing PCIe devices to the operating system without the penalty of PCIe switching.
2.) Encapsulate Fiber Channel frames into an Ethernet frame as they are sent to the northbound switch.
Converged Network Adapter
It is the second function that we often focus on because that’s the cool networking portion that many of us at Cisco like to talk about. But how exactly do we convince the operating system that it is communicating with an Intel Dual port Ethernet NIC and a Dual port 4GB Qlogic Fiber Channel HBA? I mean these are the exact same drivers that we use for the actual Intel and Qlogic card, there’s got to be some magic there right?
Well, yes and no. Lets start with the no. Presenting different physical functions (PCIe endpoints) on a physical PCIe card is nothing new. It’s as simple as putting a PCIe switch between the bus and the endpoints. But like all switching technologies a PCIe switch incurs latency and it cannot encapsulate a FC frame into an Ethernet frame. So that’s where the magic comes into play. The original Converged Network Adapater contained a Cisco ASIC that sits on the PCIe bus between the Intel and Qlogic physical functions. From the operating system perspective the ASIC “looks” like a PCIe switch providing direct access to the the Ethernet and Fiber Channel endpoints, but in reality it has the ability to move I/O in and out of the physical functions without incurring the latency of a switch. The ASIC also provides a mechanism for encapsulating the FC Frames into a specific Ethernet frame type to provide FCoE connectivity upstream.
The pure beauty of this ASIC is that we have evolved it from the CNA to the Virtual Interface Card (VIC). These traditional CNAs have a limited number of Ethernet and FC ports available to they system (2 each) based on the chipsets installed on the card. The Cisco VIC provides a variety of vNICs and vHBAs to be created on the card. The VIC not only virtualizes the PCIe switch, it virtualizes the I/O endpoint.
Cisco Virtual Interface Card
So in essence what we have created with the Cisco ASIC, that drives the VIC, is a device that can provide a standard PCIe mechanism to present an end device directly to the operating system. This ASIC also provides a hardware mechanism designed to receive native I/O from the operating system and encapsulate and translate where necessary without the need for OS stack dependencies, for example native Fiber Channel encapsulated into Ethernet.
At the heart of the UCS M-Series servers is the System Link Technology. It is this specific component that provides access to the shared I/O resources in the chassis to the compute nodes. System Link Technology is the 3rd Generation technology behind the VIC and the 4th Generation technology for Unified Fabric within the construct of Unified Computing. The key function of the System Link Technology is the creation of a new PCIe physical function called the SCSI NIC (sNIC) that presents a virtual storage controller to the operating system and maps drive resources to a specific service profile within Cisco UCS.
System Link Technology
It is this innovative technology that provides a mechanism for each compute node within UCS M-Series to have it’s own specific virtual drive carved out of the available physical drives within the chassis. This is accomplished using standard PCIe and not MR-IOV. Therefore it does not require any special knowledge of a change in the PCIe frame format by the operating system.
For a more detailed look at System Link Technology in the M-Series check out the following white paper.
The important thing to remember is that hardware infrastructure is only part of the overall architectural design for UCS M-Series. The other component that is key to UCS is the ability to manage the virtual instantiations of the system components. In the next segment on UCS M-Series Mahesh will discuss how UCS Manager rounds out the architectural design.
This week is exciting, had opportunity to sit on round table with Cisco’s largest customers on an open ended architecture discussion and their take on past, present and future. More on that some other time let’s pick up last critical aspect of High Performance Data Center design namely flexibility. Customers need flexibility to adapt to changing requirements over time as well as to support diverse requirements of their users. Flexibility is not just about protocol, although protocol is very important aspect, but it is also about making sure customers have choice to design, grow and adapt their DC according to their needs. As an example if customers want to utilize the time to market advantage and ubiquity of Ethernet they can by adopt FCoE.
Moreover flexibility has to be complemented by seamless integration where customers can not only mix and match the architectures/protocols/speeds but also evolve from one to other over time with minimal disruption and without forklift upgrades. Investment protection of more than a decade on Cisco director switches allows customer to move to higher speeds, or adopt new protocols using the existing chassis and fabric cards. Finally any solution should allow scalability over time with minimal disruptions and common management model. As an example on MDS 9710 or MDS 9706 customers can choose to use 2/4/8 G FC, 4/8/16G FC, 10G FC or 10G FCoE at each hop.
Let’s review each aspect of flexibility at a time.
Cisco SAN product family is designed to support Architecture flexibility. From smallest to the largest customers and everything in-between. Customers can grow from 12 16G ports to 48 ports on a single 9148S. They can grow from 48 16G Line Rate Ports to 192 16G Line Rate with MDS 9710 and upto 384 ports on MDS 9710. Finally having seamless FC and FCoE capability allows customers to use these directors as edge or core switches . With the industry leading scalability numbers, customers can scale up or scale out as per their needs. Two examples show how customers can use Director class switches (9513, 9506, 9710 or 9706) based Architecture for End of Row designs. Similarly customers can orchestrate Top of Rack designs using Nexus fixed family or MDS 9148S.
If they want to continue with FC for foreseeable future or have sizable FC infrastructure that they want to leverage (and have option to go to FCOE) then MDS serves their needs. Similarly they can support edge core designs, and edge core edge designs or even collapsed cores if so desired.
If customers need converged switch then Nexus 2K, 5K and 6K provides the flexibility, ability to collapse two networks, simplify management as shown in the picture below.
Customers can mix and match the FC speeds 2G/4G/8G, 4G/8G/16G on the latest MDS 9148S, and MDS 9700 product family. With all the major optics supported, customers can pick and choose optics for the smallest distance to long distance CWDM and DWDM solutions in addition to SW, LW and ER optics choices. In addition MDS 9700 supports 10GE optics running 10G FC traffic for ease of implementing 10G DWDM solutions based on ubiquitous 10GE circuits.
FC is a dominant protocol with DC but at the same time a lot of customers are adopting FCoE to improve ROI, simplify the network or simply to have higher speeds and agility. Irrespective of the needs and timeline MDS solution allows customer to adopt FCoE today or down the road without forklift upgrades on the existing MDS 9700 platforms while leveraging the existing FC install base.
The diagram above shows how customers can collapse LAN and SAN networks on the edge into one network. The advantage of FEX include reduced TCO, simplified operations (Parent switch provides a single point of management and policy enforcement and Plug-and-play management includes auto-configuration).
Another example to allow non transition less disruptive for customers Cisco has supported the BiDi optics on the Nexus product family. This allows customers to use the the same same OM2, OM3 and OM4 fabrics for 40G FCoE connectivity and still don;t have to rip and replace cabling plant.
For customer who are not ready to converge networks but want to achieve faster time to market, higher performance, Ethernet scale economies can use separate LAN and SAN network and use FCoE for that dedicated SAN .
Coupled with broad Cisco product portfolio means that customers have the maximum flexibility to tune the architecture precisely to their needs. Cisco product portfolio is tightly integrated, all the SAN switches use same NxOS and DCNM provides seamless manageability across LAN, SAN, Converged infrastructure to Fabric Interconnects on UCS.
From the last 3 blogs lets quickly capture what are the unique characteristics of MDS 9700 that allows for High Performance Scalable Data Center Design.
Cisco UCS M-Series servers have been purpose built to fit specific need in the data center. The core design principles are around sizing the compute node to meet the needs of cloud scale applications.
When I was growing up I used to watch a program on PBS called 3-2-1 Contact, most afternoons, when I came home from school (Yes, I’ve pretty much always been a nerd). There was an episode about size and efficiency, that for some reason I have always remembered. This episode included a short film to demonstrate the relationship between size and efficiency.
The plot goes something like this. Kid #1 says that his uncle’s economy car, that gets a whopping 15 miles to the gallon (this was the 1980s), is more efficient than a school bus that gets 6 miles to the gallon. Kid #2 disagrees and challenges Kid #1 to a contest. But here’s the rub, the challenge is to transport 24 children from the bus stop to school, about 3 miles a way, on a single gallon of fuel. Long story short, the school bus completes the task with one trip, but the car has to make 8 trips and runs out of fuel before it completes the task. So kid #2 proves the school bus is more efficient.
The only problem with this logic is that we know that the school bus is not more efficient in all cases.
For transporting 50 people a bus is very efficient, but if you need to transport 2 people 100 miles to a concert the bus would be a bad choice. Efficiency depends on the task at hand. In the compute world, a task equates to the workload. Using a 1RU 2-socket E5 server for the distributed cloud scale workloads that Arnab Basu has been describing would be equivalent to using a school bus to transport a single student. This is not cost effective.
Thanks to hypervisors, we can have multiple workloads on a single server so that we achieve the economies of scale. However there is a penalty to building that type of infrastructure. You add licensing costs, administrative overhead, and performance penalties.
Customers deploying cloud scale applications are looking for ways to increase the compute capacity without increasing the cost and complexity. They need all terrain vehicles, not school buses. Small, cost effective, and easy to maintain resources that serve a specific purpose.
Many vendors entering this space are just making the servers smaller. Per the analogy above smaller helps. But one thing we have learned from server virtualization is that there is real value in the ability to share the infrastructure. With a physical server the challenge becomes how do you share components in compute infrastructure without a hypervisor? Power and cooling are easy, but what about network, storage and management. This is where M-Series expands on the core foundations of unified compute to provide a compute platform that meets the needs of these applications.
There are 2 key design principles in Unified Compute: