Have a bit of free time this Wednesday morning? If so please feel free to sit in on a Cisco keynote delivered by Mark Balch, Director of Cisco UCS Product Management, as he outlines the challenges faced and the discoveries made with the UCS family and how it has driven revolutionary change and business benefits for today’s modern datacenter.
The Cisco keynote starts WindowsITPro’s “virtual trade show” on Optimizing Your Virtual Infrastructure”. The event brings top industry Microsoft experts together in an online forum affording attendees the opportunity to learn about key datacenter optimization topics and trends.
Our UCS family has been a leader in Data Center optimization since it’s initial release to market five years ago. Having been designed for virtualization from the beginning, UCS is an integrated system that is configured through unified, model-based management to simplify deployment of enterprise-class applications and services running in bare-metal, virtualized, and cloud-computing environments.
Sorry .. I did not mean to steal the title of Hillary Clinton’s book. It so happened that we had to deal with “hard choices” of our own, when we had to decide on the management approach to our new M-Series platform. In the first blog of the UCS M-Series Modular Servers journey series, Arnab briefly alluded to the value our customers placed on UCS Manager.As we started to have more customer conversations, we recognized a clear demarcation when it came to infrastructure management. There was a group of customers who just would not take any offering from us that is not managed by UCS Manager. On the other hand, a few customers who had built their own management framework were more enamored by the disaggregated server offering that we intended to build. For the second set of customers, there was a strong perception that UCS Manager did not add much value to their operations. We were faced with a very difficult choice of whether to release the platform with UCS Manager or provide standalone management. After multiple rounds of discussions, we made a conscious decision to launch M-Series as a UCS Manager managed platform only. Ironically enough, it was one such customer discussion that vindicated our decision. This happened to be a customer deploying large cloud scale applications and did not care much UCS Manager. During the conversation, they talked about some BIOS issues in their super large web farm that surfaced couple of years back. After almost 2 years, they were still rolling out the BIOS updates !
UCS Manager is the industry’s first tool to elegantly break down the operational silos in the datacenter by introducing a policy-based management of disparate infrastructure elements in the datacenter. This was made possible by the concept of Service Profiles, which made it easy for the rapid adoption of converged infrastructure. Service Profiles allowed the abstraction of all elements associated with a server’s identity and rendering the underlying servers pretty much stateless. This enabled rapid server re-purposing and workload mobility as well as made it easy for enforcing operational policies like firmware updates. And, the whole offering has been built on the foundation of XML APIs, which makes it extremely easy to integrate with other datacenter management, automation and orchestration tools. You can learn more about UCS Manager by clicking here.
UCS M-Series Modular Servers are the latest addition to the infrastructure that can be managed by UCS Manager. M-Series is targeted at cloud-scale applications, which will be deployed in 1000s, if not 10s of 1000s of nodes. Automation of policy enforcement is more paramount than the traditional datacenter deployments. Managing groups of compute elements as a single entity, fault aggregation, BIOS updates and firmware upgrades are a few key features of UCS Manager that kept surfacing repeatedly during multiple customer conversations. That was one of the primary drivers in our decision to release this platform with UCS Manager.
In the cloud-scale space, the need to almost instantaneously deploy lots of severs at a time is a critical requirement. Also, all of the nodes are pretty much deployed as identical compute elements. Standardization of configurations across all of the servers is very much needed. UCS Manager makes it extremely easy to create the service profile templates ahead of time (making use of the UCS Manager emulator) and create any number of service profile clones literally at the push of a button. Associating the service profiles with the underlying infrastructure is also done with a couple of clicks. Net-Net: you rack, stack, and cable once; re-provision and re-deploy to meet your workload needs without having to make any physical changes to your infrastructure.
Storage Profiles is the most notable enhancement to UCS Manager in order to support M-series. This feature allows our customers to slice and dice the SSDs in the M-Series chassis into smaller virtual disks. Each of these virtual disks is then served up as if they are local PCIe devices to the server nodes within the compute cartridges plugged into the chassis. Steve has explained that concept elaborately in the previous blog. In the next edition, we will go into more details about Storage Profiles and other pertinent UCS Manager features for the M-Series.
It almost feels like this blog entry should start with: Once upon a time…. Because it captures a journey of a young emerging technology and the powerful infrastructure tool it has become. The Cisco UCS journey starts with the tale of Unified Fabric and the Converged Network Adapter (CNA).
Most people think of Unified Fabric as the ability to put both Fiber Channel and Ethernet on the same wire between the server and the Fabric Interconnect or upstream FCoE switchs. That is part of the story, but that part is as simple as putting a Fiber Channel frame inside of an Ethernet frame. What is the magic that makes this happen at the server level? Doesn’t FCoE imply that the Operating System itself would have to know how to present a Fiber Channel device in software and then encapsulate and send the frame across the Ethernet port? Possibly, but that would require OS FCoE software support which would also require CPU overhead and require end users to qualify these new software drivers and compare the performance of software against existing hardware FC HBAs.
For UCS the key to the success of converged infrastructure was due greatly to the very first Converged Network Adapters that were released. These adapters presented existing PCIe Fiber Channel and Ethernet endpoints to the operating system. This required no new drivers or new qualification from the perspective of the operating system and users. However at the heart of this adapter was a Cisco ASIC that provided two key functions:
1.) Present the physical functions for existing PCIe devices to the operating system without the penalty of PCIe switching.
2.) Encapsulate Fiber Channel frames into an Ethernet frame as they are sent to the northbound switch.
Converged Network Adapter
It is the second function that we often focus on because that’s the cool networking portion that many of us at Cisco like to talk about. But how exactly do we convince the operating system that it is communicating with an Intel Dual port Ethernet NIC and a Dual port 4GB Qlogic Fiber Channel HBA? I mean these are the exact same drivers that we use for the actual Intel and Qlogic card, there’s got to be some magic there right?
Well, yes and no. Lets start with the no. Presenting different physical functions (PCIe endpoints) on a physical PCIe card is nothing new. It’s as simple as putting a PCIe switch between the bus and the endpoints. But like all switching technologies a PCIe switch incurs latency and it cannot encapsulate a FC frame into an Ethernet frame. So that’s where the magic comes into play. The original Converged Network Adapater contained a Cisco ASIC that sits on the PCIe bus between the Intel and Qlogic physical functions. From the operating system perspective the ASIC “looks” like a PCIe switch providing direct access to the the Ethernet and Fiber Channel endpoints, but in reality it has the ability to move I/O in and out of the physical functions without incurring the latency of a switch. The ASIC also provides a mechanism for encapsulating the FC Frames into a specific Ethernet frame type to provide FCoE connectivity upstream.
The pure beauty of this ASIC is that we have evolved it from the CNA to the Virtual Interface Card (VIC). These traditional CNAs have a limited number of Ethernet and FC ports available to they system (2 each) based on the chipsets installed on the card. The Cisco VIC provides a variety of vNICs and vHBAs to be created on the card. The VIC not only virtualizes the PCIe switch, it virtualizes the I/O endpoint.
Cisco Virtual Interface Card
So in essence what we have created with the Cisco ASIC, that drives the VIC, is a device that can provide a standard PCIe mechanism to present an end device directly to the operating system. This ASIC also provides a hardware mechanism designed to receive native I/O from the operating system and encapsulate and translate where necessary without the need for OS stack dependencies, for example native Fiber Channel encapsulated into Ethernet.
At the heart of the UCS M-Series servers is the System Link Technology. It is this specific component that provides access to the shared I/O resources in the chassis to the compute nodes. System Link Technology is the 3rd Generation technology behind the VIC and the 4th Generation technology for Unified Fabric within the construct of Unified Computing. The key function of the System Link Technology is the creation of a new PCIe physical function called the SCSI NIC (sNIC) that presents a virtual storage controller to the operating system and maps drive resources to a specific service profile within Cisco UCS.
System Link Technology
It is this innovative technology that provides a mechanism for each compute node within UCS M-Series to have it’s own specific virtual drive carved out of the available physical drives within the chassis. This is accomplished using standard PCIe and not MR-IOV. Therefore it does not require any special knowledge of a change in the PCIe frame format by the operating system.
For a more detailed look at System Link Technology in the M-Series check out the following white paper.
The important thing to remember is that hardware infrastructure is only part of the overall architectural design for UCS M-Series. The other component that is key to UCS is the ability to manage the virtual instantiations of the system components. In the next segment on UCS M-Series Mahesh will discuss how UCS Manager rounds out the architectural design.
It’s an exciting time in to be in our industry, especially as we witness how technology continues to reshape how we connect and communicate through a myriad of applications and devices not only within our own companies, but also with our customers and partners.
At the epicenter of this technological transformation, we continue to find that the network is what ultimately enables these applications and their users to connect. We also quickly find that if this same network is not ready to deal with the ever increasing influx of devices, new applications with varying traffic patterns, and 24 x 7 access from pretty much anywhere, it can quickly turn into an IT departments nightmare.
It is exactly to deal with these new types of requirements that the award-winning Nexus 9000 Series (made up of both the Nexus 9500 and Nexus 9300 portfolios) was introduced into the market almost 11 months ago. Now, over 600 customers have purchased this new switching family and are experiencing the positive impact that having a high performing, scalable, programmable, and resilient data center network has on application performance and overall user quality of experience in both traditional and Application Centric Infrastructure (ACI) architectures.
Today we are happy to announce the addition of three new switches into the Nexus 9300 Series as well as a 6-port 40Gbps module to deliver more flexibility and form factor options to meet different architectural needs. The new products are:
Cisco Nexus 9372TX: 1-rack-unit switch supporting 1.44 Tbps of bandwidth across 48 fixed 1/10-Gbps BASE-T ports and 6 fixed 40-Gbps QSFP+ ports
Cisco Nexus 9372PX: 1-rack-unit switch supporting 1.44 Tbps of bandwidth across 48 fixed 1/10-Gbps SFP+ ports and 6 fixed 40-Gbps QSFP+ ports
Cisco Nexus 9332PQ: 1-rack-unit switch supporting 2.56 Tbps of bandwidth across 32 x 40Gbps QSFP+ ports
6-port 40 Gigabit Ethernet Module for the Nexus 93128TX, 9396TX , and 9396PX for connectivity options to meet your needs
These new switches deliver high performance, additional buffers, as well as support for VXLAN routing in a compact form factor. In addition to this, support for the Cisco Nexus 2000 Fabric Extenders has also been added to the Nexus 9300 portfolio. So if you already had Fabric Extenders in your data center or are looking for a scalable and operationally simplified architecture – you can now have the best of both worlds.
But it doesn’t end there – in case you missed it, Cisco recently announced the availability of the Application Policy Infrastructure Controller (APIC) making the creation of a more simplified, robust, application-centric infrastructure a reality with the Nexus 9000 Series as the network foundation. You can read more about it here – in Craig Huitema’s blog, which outlines not only new products on the nexus 9000 series including 100Gbps on the Nexus 9500, but also how we have simplified the introduction of the Nexus 9000 and ACI into data centers through different ACI starter kits and bundles. In addition, for those of you that want to deploy the Nexus 7000 in combination with the Nexus 9300s, new bundles that bring together the Nexus 7000 and Nexus 9300 are also available.
As you can see, we continue to deliver the products and architectural options that will allow data centers of all sizes to address increasing and changing application requirements. Between the Nexus 9300 and Nexus 9500 portfolios and their ability to be deployed into 3-tier, spine/leaf, or ACI architectures, customers can benefit from more connectivity options and a diverse set of form factors to meet varying data center needs. I invite you to learn more about the Nexus 9000 Series at www.cisco.com/go/nexus9000.
Cisco UCS M-Series servers have been purpose built to fit specific need in the data center. The core design principles are around sizing the compute node to meet the needs of cloud scale applications.
When I was growing up I used to watch a program on PBS called 3-2-1 Contact, most afternoons, when I came home from school (Yes, I’ve pretty much always been a nerd). There was an episode about size and efficiency, that for some reason I have always remembered. This episode included a short film to demonstrate the relationship between size and efficiency.
The plot goes something like this. Kid #1 says that his uncle’s economy car, that gets a whopping 15 miles to the gallon (this was the 1980s), is more efficient than a school bus that gets 6 miles to the gallon. Kid #2 disagrees and challenges Kid #1 to a contest. But here’s the rub, the challenge is to transport 24 children from the bus stop to school, about 3 miles a way, on a single gallon of fuel. Long story short, the school bus completes the task with one trip, but the car has to make 8 trips and runs out of fuel before it completes the task. So kid #2 proves the school bus is more efficient.
The only problem with this logic is that we know that the school bus is not more efficient in all cases.
For transporting 50 people a bus is very efficient, but if you need to transport 2 people 100 miles to a concert the bus would be a bad choice. Efficiency depends on the task at hand. In the compute world, a task equates to the workload. Using a 1RU 2-socket E5 server for the distributed cloud scale workloads that Arnab Basu has been describing would be equivalent to using a school bus to transport a single student. This is not cost effective.
Thanks to hypervisors, we can have multiple workloads on a single server so that we achieve the economies of scale. However there is a penalty to building that type of infrastructure. You add licensing costs, administrative overhead, and performance penalties.
Customers deploying cloud scale applications are looking for ways to increase the compute capacity without increasing the cost and complexity. They need all terrain vehicles, not school buses. Small, cost effective, and easy to maintain resources that serve a specific purpose.
Many vendors entering this space are just making the servers smaller. Per the analogy above smaller helps. But one thing we have learned from server virtualization is that there is real value in the ability to share the infrastructure. With a physical server the challenge becomes how do you share components in compute infrastructure without a hypervisor? Power and cooling are easy, but what about network, storage and management. This is where M-Series expands on the core foundations of unified compute to provide a compute platform that meets the needs of these applications.
There are 2 key design principles in Unified Compute: