Sorry .. I did not mean to steal the title of Hillary Clinton’s book. It so happened that we had to deal with “hard choices” of our own, when we had to decide on the management approach to our new M-Series platform. In the first blog of the UCS M-Series Modular Servers journey series, Arnab briefly alluded to the value our customers placed on UCS Manager.As we started to have more customer conversations, we recognized a clear demarcation when it came to infrastructure management. There was a group of customers who just would not take any offering from us that is not managed by UCS Manager. On the other hand, a few customers who had built their own management framework were more enamored by the disaggregated server offering that we intended to build. For the second set of customers, there was a strong perception that UCS Manager did not add much value to their operations. We were faced with a very difficult choice of whether to release the platform with UCS Manager or provide standalone management. After multiple rounds of discussions, we made a conscious decision to launch M-Series as a UCS Manager managed platform only. Ironically enough, it was one such customer discussion that vindicated our decision. This happened to be a customer deploying large cloud scale applications and did not care much UCS Manager. During the conversation, they talked about some BIOS issues in their super large web farm that surfaced couple of years back. After almost 2 years, they were still rolling out the BIOS updates !
UCS Manager is the industry’s first tool to elegantly break down the operational silos in the datacenter by introducing a policy-based management of disparate infrastructure elements in the datacenter. This was made possible by the concept of Service Profiles, which made it easy for the rapid adoption of converged infrastructure. Service Profiles allowed the abstraction of all elements associated with a server’s identity and rendering the underlying servers pretty much stateless. This enabled rapid server re-purposing and workload mobility as well as made it easy for enforcing operational policies like firmware updates. And, the whole offering has been built on the foundation of XML APIs, which makes it extremely easy to integrate with other datacenter management, automation and orchestration tools. You can learn more about UCS Manager by clicking here.
UCS M-Series Modular Servers are the latest addition to the infrastructure that can be managed by UCS Manager. M-Series is targeted at cloud-scale applications, which will be deployed in 1000s, if not 10s of 1000s of nodes. Automation of policy enforcement is more paramount than the traditional datacenter deployments. Managing groups of compute elements as a single entity, fault aggregation, BIOS updates and firmware upgrades are a few key features of UCS Manager that kept surfacing repeatedly during multiple customer conversations. That was one of the primary drivers in our decision to release this platform with UCS Manager.
In the cloud-scale space, the need to almost instantaneously deploy lots of severs at a time is a critical requirement. Also, all of the nodes are pretty much deployed as identical compute elements. Standardization of configurations across all of the servers is very much needed. UCS Manager makes it extremely easy to create the service profile templates ahead of time (making use of the UCS Manager emulator) and create any number of service profile clones literally at the push of a button. Associating the service profiles with the underlying infrastructure is also done with a couple of clicks. Net-Net: you rack, stack, and cable once; re-provision and re-deploy to meet your workload needs without having to make any physical changes to your infrastructure.
Storage Profiles is the most notable enhancement to UCS Manager in order to support M-series. This feature allows our customers to slice and dice the SSDs in the M-Series chassis into smaller virtual disks. Each of these virtual disks is then served up as if they are local PCIe devices to the server nodes within the compute cartridges plugged into the chassis. Steve has explained that concept elaborately in the previous blog. In the next edition, we will go into more details about Storage Profiles and other pertinent UCS Manager features for the M-Series.
Tags: Cisco Data Center, Cisco UCS, Cisco UCS Manager, Cloud Computing, UCS m-series, UCSGrandSlam
It almost feels like this blog entry should start with: Once upon a time…. Because it captures a journey of a young emerging technology and the powerful infrastructure tool it has become. The Cisco UCS journey starts with the tale of Unified Fabric and the Converged Network Adapter (CNA).
Most people think of Unified Fabric as the ability to put both Fiber Channel and Ethernet on the same wire between the server and the Fabric Interconnect or upstream FCoE switchs. That is part of the story, but that part is as simple as putting a Fiber Channel frame inside of an Ethernet frame. What is the magic that makes this happen at the server level? Doesn’t FCoE imply that the Operating System itself would have to know how to present a Fiber Channel device in software and then encapsulate and send the frame across the Ethernet port? Possibly, but that would require OS FCoE software support which would also require CPU overhead and require end users to qualify these new software drivers and compare the performance of software against existing hardware FC HBAs.
For UCS the key to the success of converged infrastructure was due greatly to the very first Converged Network Adapters that were released. These adapters presented existing PCIe Fiber Channel and Ethernet endpoints to the operating system. This required no new drivers or new qualification from the perspective of the operating system and users. However at the heart of this adapter was a Cisco ASIC that provided two key functions:
1.) Present the physical functions for existing PCIe devices to the operating system without the penalty of PCIe switching.
2.) Encapsulate Fiber Channel frames into an Ethernet frame as they are sent to the northbound switch.
Converged Network Adapter
It is the second function that we often focus on because that’s the cool networking portion that many of us at Cisco like to talk about. But how exactly do we convince the operating system that it is communicating with an Intel Dual port Ethernet NIC and a Dual port 4GB Qlogic Fiber Channel HBA? I mean these are the exact same drivers that we use for the actual Intel and Qlogic card, there’s got to be some magic there right?
Well, yes and no. Lets start with the no. Presenting different physical functions (PCIe endpoints) on a physical PCIe card is nothing new. It’s as simple as putting a PCIe switch between the bus and the endpoints. But like all switching technologies a PCIe switch incurs latency and it cannot encapsulate a FC frame into an Ethernet frame. So that’s where the magic comes into play. The original Converged Network Adapater contained a Cisco ASIC that sits on the PCIe bus between the Intel and Qlogic physical functions. From the operating system perspective the ASIC “looks” like a PCIe switch providing direct access to the the Ethernet and Fiber Channel endpoints, but in reality it has the ability to move I/O in and out of the physical functions without incurring the latency of a switch. The ASIC also provides a mechanism for encapsulating the FC Frames into a specific Ethernet frame type to provide FCoE connectivity upstream.
The pure beauty of this ASIC is that we have evolved it from the CNA to the Virtual Interface Card (VIC). These traditional CNAs have a limited number of Ethernet and FC ports available to they system (2 each) based on the chipsets installed on the card. The Cisco VIC provides a variety of vNICs and vHBAs to be created on the card. The VIC not only virtualizes the PCIe switch, it virtualizes the I/O endpoint.
Cisco Virtual Interface Card
So in essence what we have created with the Cisco ASIC, that drives the VIC, is a device that can provide a standard PCIe mechanism to present an end device directly to the operating system. This ASIC also provides a hardware mechanism designed to receive native I/O from the operating system and encapsulate and translate where necessary without the need for OS stack dependencies, for example native Fiber Channel encapsulated into Ethernet.
At the heart of the UCS M-Series servers is the System Link Technology. It is this specific component that provides access to the shared I/O resources in the chassis to the compute nodes. System Link Technology is the 3rd Generation technology behind the VIC and the 4th Generation technology for Unified Fabric within the construct of Unified Computing. The key function of the System Link Technology is the creation of a new PCIe physical function called the SCSI NIC (sNIC) that presents a virtual storage controller to the operating system and maps drive resources to a specific service profile within Cisco UCS.
System Link Technology
It is this innovative technology that provides a mechanism for each compute node within UCS M-Series to have it’s own specific virtual drive carved out of the available physical drives within the chassis. This is accomplished using standard PCIe and not MR-IOV. Therefore it does not require any special knowledge of a change in the PCIe frame format by the operating system.
For a more detailed look at System Link Technology in the M-Series check out the following white paper.
The important thing to remember is that hardware infrastructure is only part of the overall architectural design for UCS M-Series. The other component that is key to UCS is the ability to manage the virtual instantiations of the system components. In the next segment on UCS M-Series Mahesh will discuss how UCS Manager rounds out the architectural design.
Tags: Cisco Data Center, Cisco UCS, Cisco UCS Manager, Cloud Computing, UCS m-series, UCSGrandSlam
Cisco UCS M-Series servers have been purpose built to fit specific need in the data center. The core design principles are around sizing the compute node to meet the needs of cloud scale applications.
When I was growing up I used to watch a program on PBS called 3-2-1 Contact, most afternoons, when I came home from school (Yes, I’ve pretty much always been a nerd). There was an episode about size and efficiency, that for some reason I have always remembered. This episode included a short film to demonstrate the relationship between size and efficiency.
The plot goes something like this. Kid #1 says that his uncle’s economy car, that gets a whopping 15 miles to the gallon (this was the 1980s), is more efficient than a school bus that gets 6 miles to the gallon. Kid #2 disagrees and challenges Kid #1 to a contest. But here’s the rub, the challenge is to transport 24 children from the bus stop to school, about 3 miles a way, on a single gallon of fuel. Long story short, the school bus completes the task with one trip, but the car has to make 8 trips and runs out of fuel before it completes the task. So kid #2 proves the school bus is more efficient.
The only problem with this logic is that we know that the school bus is not more efficient in all cases.
For transporting 50 people a bus is very efficient, but if you need to transport 2 people 100 miles to a concert the bus would be a bad choice. Efficiency depends on the task at hand. In the compute world, a task equates to the workload. Using a 1RU 2-socket E5 server for the distributed cloud scale workloads that Arnab Basu has been describing would be equivalent to using a school bus to transport a single student. This is not cost effective.
Thanks to hypervisors, we can have multiple workloads on a single server so that we achieve the economies of scale. However there is a penalty to building that type of infrastructure. You add licensing costs, administrative overhead, and performance penalties.
Customers deploying cloud scale applications are looking for ways to increase the compute capacity without increasing the cost and complexity. They need all terrain vehicles, not school buses. Small, cost effective, and easy to maintain resources that serve a specific purpose.
Many vendors entering this space are just making the servers smaller. Per the analogy above smaller helps. But one thing we have learned from server virtualization is that there is real value in the ability to share the infrastructure. With a physical server the challenge becomes how do you share components in compute infrastructure without a hypervisor? Power and cooling are easy, but what about network, storage and management. This is where M-Series expands on the core foundations of unified compute to provide a compute platform that meets the needs of these applications.
There are 2 key design principles in Unified Compute:
1.) Unified Fabric
2.) Unified Management
Over the next couple of weeks Mahesh Natarajan and I will be describing how and why these 2 design principles became the corner stone for building the M-Series modular servers.
Tags: Cisco Data Center, Cisco UCS, Cisco UCS Manager, Cloud Computing, UCS, UCS m-series, UCSGrandSlam
Welcome back! Thanks for your interest in our journey from that painful “you wasted our engineering” moment to the product we announced on Sep 4th – UCS M-series Modular Servers.
There’s good pain and there’s bad pain. This pain was the muscle ache after a hard game of flag football. We had wasted some energy but were winning. Our customers were our coaches; they were precise about which parts of our game they loved and which parts they didn’t care for. They loved the management policy engine within UCS Manager but *did not* need the levels of redundancy and resilience in the hardware. And they really … really wished that we added some aspects to our game, specifically improvements to power and space efficiencies. Our customers were either trying to eke out more from their existing data centers or trying to reduce their co-location costs.
So our cloud scale customers,
i) loved our management policy engine
ii) didn’t rely on hardware redundancy/resilience
iii) needed better power and space efficiencies
I’ll note here that during this time we gained incredible respect for our cloud scale customers. These customers are either disrupting traditional industries or are innovators who are reinventing themselves to take advantage of the “internet everywhere” age. That’s a tough business, and whoa is competition fierce! ……. being 2nd best on the internet often means you are a distant loser. Read More »
Tags: Cisco UCS Manager, Cloud Computing, cloud scale, data center, UCS, UCS m-series, UCSGrandSlam
Last week we announced the UCS M-series Modular Servers. The launch represented culmination of an exciting journey for us that started two years ago.
In mid 2012 just as UCS B-series blade servers were taking off in a big way, we noticed a group of our customers using our core technology very differently than customers in our primary market, enterprise IT. In our primary market customers loved UCS’s stateless computing model, virtualization benefits and the converged offerings with our partners EMC and NetApp. In this other category, customers did not consider those same benefits nearly as important. However UCS Manager’s powerful policy engine got them really excited. UCS Manager gave them a programmatic interface to manage thousands of nodes across dozens of sites globally.
Curious, I started to visit some of these customers. During one such visit, I was walking thru the aisles of their data center and I noticed something I had not ever seen at any of our enterprise IT customers data center. This customer had all UCS chassis single homed to a single Fabric Interconnect, I stopped in my tracks – really? Isn’t that kind of dangerous? What happens if there’s a failure? Or you have to upgrade? The customer explained to me how a combination of their application architecture and their application instance placement strategy made sure that outages at the rack level could be handled without service disruption. Wow! so we had engineered all kinds of resiliency, dual ported adapters, dual IOMs, dual chassis controllers, clustered Fabric Interconnects … lots and lots of hard engineering work to make our product robust and resilient, and this customer had thrown it all away with one toss… that really hurt. Read More »
Tags: Cisco UCS Manager, Cloud Computing, data center, UCS, UCS m-series, UCSGrandSlam