Unified Computing System Management Basics
As I have noted, virtualization introduces a couple of distinct challenges that directly impact its ultimate scalability as a solution. While virtualization offers a number of compelling benefits, it also increases operational complexity. Because it is simple to add a new virtual machine (VM) and because we can move VMs across physical infrastructure, we see an uptick in the change requests that the network and storage teams must address to support this new flexibility. Additionally, because we see more use of automated live migration of VMs (e.g., VMware DRS), we need the ability to automate the provisioning of the underlying infrastructure in a much more dynamic way. To us, it was simple: the management framework of the Cisco UCS should simplify and automate server provisioning and the network connectivity that the server requires. This allows companies to accelerate their virtualization plans. However, there was also one other key goal: we did not want to create any data center islands, so the Cisco UCS should easily integrate into the existing environment (no forklifts).
So, let’s start with the UCS Manager (UCSM). Simply put, UCSM is embedded device management software that allows the compute, network, and storage access resources in the Cisco UCS to be managed holistically. Instead of multiple control points (server, network, storage, etc.), you have one control point, so we have delivered on goal #1: simplicity. I am going to skip goal #2 for a second and jump to goal #3: integration into heterogeneous environments. The UCSM can be accessed in a number of ways, including a GUI, a CLI, and an XML API. Because of this, and because of support for other industry-standard interfaces such as SNMP, IPMI, and the SMASH-CLP standards defined by the DMTF, the Cisco UCS can easily integrate into existing enterprise management systems. In addition, OS-based agents from system management tools will run without modification, behaving exactly as they do in traditional server environments. This topic alone is worthy of a longer discussion, so I am going to defer digging into it any deeper until my next blog post.
So, let’s go back and look at goal #2: automation. A single UCSM domain can encompass 320 blade servers and the infrastructure to support them (two 6100 fabric interconnects, 40 UCS 5100 chassis, and 80 UCS 2100 fabric extenders). The UCSM allows you to fully abstract this hardware. Instead of individually managing these resources and the coordination between them, you manage UCS resources through service profiles and service profile templates. A service profile is a complete software definition of a server, including its associated LAN and SAN requirements. Specifically, the service profile includes things like the UUID, NIC (MAC) and HBA (WWN) identities, and network and storage settings such as VLAN and VSAN membership, QoS, and uplink specifications to upstream LAN and SAN switches.
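To make the idea of a server defined in software a bit more concrete, here is a minimal sketch in Python. It is not the UCSM object model or its XML API; the ServiceProfile class, its field names, the slot labels, and all identity values are hypothetical and chosen only for illustration. The point it tries to show is that the identity travels with the profile, not with the blade it happens to be associated with.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ServiceProfile:
    """Hypothetical model of what a service profile carries (not the UCSM schema)."""
    name: str
    uuid: str
    macs: List[str]                                  # NIC identities
    wwns: List[str]                                  # HBA identities
    vlans: List[int] = field(default_factory=list)   # LAN membership
    vsans: List[int] = field(default_factory=list)   # SAN membership
    blade: Optional[str] = None                      # None while unassociated

    def associate(self, blade: str) -> None:
        """Bind the profile to a physical blade slot; the identity is unchanged."""
        self.blade = blade

    def identity(self) -> tuple:
        """Everything the OS, LAN, and SAN see, independent of the blade."""
        return (self.uuid, tuple(self.macs), tuple(self.wwns),
                tuple(self.vlans), tuple(self.vsans))


# Placeholder values, made up for this example.
profile = ServiceProfile(
    name="web-01",
    uuid="00000000-0000-0000-0000-000000000001",
    macs=["00:25:B5:00:00:01"],
    wwns=["20:00:00:25:B5:00:00:01"],
    vlans=[10, 20],
    vsans=[100],
)

profile.associate("chassis-1/slot-3")
identity_before = profile.identity()

# Move the same logical server to different hardware.
profile.associate("chassis-2/slot-1")
identity_after = profile.identity()

print(profile.blade)
print(identity_before == identity_after)
```

In this sketch, re-associating the profile changes only the blade field, which is the property that lets the upstream LAN and SAN configuration stay put when the hardware underneath changes.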
Service profiles are very robust and also include policies that specify the RAID level of on-blade disks, BIOS settings like boot order, and firmware revision levels for the BIOS and converged network adapters (CNAs). As a user, you can choose a specific blade server slot to assign the service profile to, or you can choose from a pool that has specific server characteristics such as CPU speed or amount of memory. Service profiles are the key feature of UCS Manager that we believe will lead to new levels of business agility. Touching on heterogeneous environments again, these service profiles can be accessed, created, modified, and monitored by external management tools via the aforementioned XML API. The API also allows simple integration with external configuration management databases (CMDBs) for inventory population, asset tracking, and granular configuration and state information if required. It’s also important to note that service profiles exist below the operating system (e.g., Windows, Linux, or a hypervisor), so they can be used in conjunction with existing tools that provide management functions like patch management. From a server virtualization perspective, because the UCSM exists below the hypervisor, changes such as VMotion of virtual machines are automatically synchronized with the UCS. Essentially, we are taking the VN-Link concepts we introduced with the Nexus 1000V and extending them across the infrastructure.
The service profile can then be turned into a service profile template to quickly deploy cookie-cutter instances of the service profile. The service profile template does this by taking the parameters that are dynamic or variable in nature and replacing them with pointers to specific resource pools. For example, instead of specifying a specific WWN, a service profile template points to a pool of WWNs that the UCSM manages for that particular profile. This way, you can maintain different pools for different profiles.
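The template-and-pool mechanism can be sketched in a few lines of Python. Again, this is an illustration under assumptions, not how the UCSM implements it: IdentityPool, ServiceProfileTemplate, instantiate, and every address below are hypothetical names and placeholder values. The fixed settings (VLANs, boot order) are copied from the template, while the variable ones (WWN, MAC) are drawn from pools at instantiation time.

```python
from dataclasses import dataclass
from typing import Dict, List


class IdentityPool:
    """Hypothetical pool of identities (WWNs, MACs, ...) handed out one at a time."""

    def __init__(self, name: str, identities: List[str]) -> None:
        self.name = name
        self._free = list(identities)

    def allocate(self) -> str:
        if not self._free:
            raise RuntimeError(f"pool {self.name!r} is exhausted")
        return self._free.pop(0)


@dataclass
class ServiceProfileTemplate:
    """Fixed settings plus pointers to pools for the variable ones."""
    name: str
    wwn_pool: IdentityPool
    mac_pool: IdentityPool
    vlans: List[int]
    boot_order: List[str]

    def instantiate(self, profile_name: str) -> Dict[str, object]:
        # Variable parameters come from the pools; the rest is copied as-is.
        return {
            "name": profile_name,
            "wwn": self.wwn_pool.allocate(),
            "mac": self.mac_pool.allocate(),
            "vlans": list(self.vlans),
            "boot_order": list(self.boot_order),
        }


# Placeholder addresses, made up for this example.
web_wwns = IdentityPool("web-wwns", [f"20:00:00:25:B5:00:00:{i:02X}" for i in range(1, 5)])
web_macs = IdentityPool("web-macs", [f"00:25:B5:00:00:{i:02X}" for i in range(1, 5)])

web_template = ServiceProfileTemplate(
    name="web-tier",
    wwn_pool=web_wwns,
    mac_pool=web_macs,
    vlans=[10, 20],
    boot_order=["san", "lan"],
)

web_01 = web_template.instantiate("web-01")
web_02 = web_template.instantiate("web-02")

print(web_01["wwn"], web_02["wwn"])
```

Keeping a separate pool per template, as in the text, means a second template (say, for a database tier) could draw from its own WWN range without colliding with the web tier.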
So, what’s the benefit of all this? Well, we see a shift from managing individual infrastructure silos to managing policy holistically across silos, which has some obvious operational benefits. Because policies are codified, every server provisioned from them stays aligned with operational best practices, security policy, and compliance policy. While these are all very cool, I think the greatest value comes from being able to create a truly stateless platform. The ability to use software to provision everything from low-level firmware to network and storage configuration, in a way that is transparent to the OS or hypervisor, creates a platform that is elastic and pliable. So, this is the wrap for part 1. Sorry it’s so long, but there is a lot to cover and I have just scratched the surface. For part 2, as I mentioned earlier, I’ll dig a bit more into managing the Cisco UCS in a heterogeneous environment.