UC – How does Cisco IT Support and Manage its UC Services?
Cisco IT supports all its services with a global service management and delivery team. I am the service owner for the IT UC and video team; we own the strategy, planning and delivery of voice and video services throughout the Cisco enterprise.
We have four tiers of support management. The first is self-support: Cisco IT provides extensive web-based and wiki-based resources that show people how to order, install, and configure their basic IT services. At tier 1 is our global helpdesk, the frontline support team for all our IT services, including voice, IP telephony, voicemail and video. They use a number of runbooks to triage and resolve user incidents (forgotten password, phone not registering, etc.). If it becomes clear that the issue is more complex, the helpdesk escalates it to tier 2, which is composed of specific IT teams and Cisco Remote Management Services (RMS) and handles service issues. The top tier in IT is the tier 3 operations level. Our global UC & Video operations team is in the same organization as the network operations team, which enables close collaboration when required. These teams comprise skilled engineers who resolve complex infrastructure platform incidents and who manage and drive operations improvements on our platforms. The Cisco IT tier 3 operations team additionally has the Cisco Technical Assistance Center (TAC) as a resource for deep solution and product troubleshooting support, and to raise the occasional bug. The TAC is Cisco’s customer-facing support service; we are still a Cisco customer, after all.
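To make the escalation path concrete, here is a minimal Python sketch of how an incident might be routed to a tier. It is an illustration only, not our actual tooling: the `Incident` fields, the `route` function and the numbering of self-support as tier 0 are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SELF_SERVICE = 0   # web and wiki self-support (the "0" label is an assumption)
    HELPDESK = 1       # global helpdesk working from runbooks
    SERVICE_TEAM = 2   # specific IT teams and Cisco RMS
    OPERATIONS = 3     # tier 3 engineers, who can also open a TAC case


@dataclass
class Incident:
    # All fields are hypothetical flags used only for this illustration.
    summary: str
    has_self_help_article: bool = False
    matches_runbook: bool = False
    is_platform_fault: bool = False


def route(incident: Incident) -> Tier:
    """Return the first tier expected to resolve the incident."""
    if incident.has_self_help_article:
        return Tier.SELF_SERVICE
    if incident.matches_runbook:
        return Tier.HELPDESK
    if incident.is_platform_fault:
        return Tier.OPERATIONS
    # Anything else is treated as a service issue for tier 2.
    return Tier.SERVICE_TEAM
```

For example, `route(Incident("forgotten password", matches_runbook=True))` returns `Tier.HELPDESK`, while an incident flagged only as a platform fault goes to `Tier.OPERATIONS`.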
Cisco IT has out-tasked much of our infrastructure service monitoring and management to Cisco Remote Management Services (RMS). From their central Network Operations Center (NOC), they monitor all of our WAN networks and circuits, including our voice circuits at remote sites, and alert Cisco IT to problems. If an issue cannot be resolved between RMS and the service provider, the incident is escalated to the tier 3 team for resolution. RMS has its own monitoring platform, but internally we also use some of the management capabilities integrated into the Unified Communications Manager platform, such as the Real-Time Monitoring Tool (RTMT), in combination with our internally developed enterprise management tools, which monitor platform availability and status. The Prime Collaboration platform now offers us a number of additional tools to support and manage networks. We already use Cisco Prime Collaboration to monitor and troubleshoot both our voice and our video services, and our use cases for Prime are growing over time.
Not only does my team manage support for the platforms, we manage the life-cycle of these platforms too. So how do we manage upgrades? With finesse and brilliant engineering, ideally. Managing upgrades is a constant learning experience, and a practice we’ve had plenty of opportunities to improve over the years. Platform capabilities have changed or improved, and so have our IT practices. We’ve become more rigorous about change management, and as a result we’re moving to a more structured and predictable release cycle for feature-related upgrades. When our Unified Communications Manager (UCM) moved to the Linux platform with version 5, we were able to start taking advantage of server partitioning: each server had an active and an inactive partition, and we staged upgrades on the inactive partition to minimize disruption to core processing. We became instant heroes. Well, not literally, but the new system allowed IT to remain nearly invisible, because changes could be made in the background without interrupting business as usual. With our migration to virtualization on the UCS platform, upgrades have become even easier.
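The active/inactive partition idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the UCM upgrade mechanism itself: the `Server` class, its method names and the version strings are hypothetical and exist only to show why staging on the inactive partition leaves the running service untouched.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    # Hypothetical model of one server with two software partitions.
    name: str
    active_version: str
    inactive_version: Optional[str] = None

    def stage(self, new_version: str) -> None:
        """Install a release on the inactive partition; the active one keeps serving."""
        self.inactive_version = new_version

    def switch(self) -> None:
        """Swap partitions so the staged release becomes the active one."""
        if self.inactive_version is None:
            raise RuntimeError(f"{self.name}: nothing staged on the inactive partition")
        self.active_version, self.inactive_version = (
            self.inactive_version,
            self.active_version,
        )


node = Server("node-1", active_version="old-release")
node.stage("new-release")   # the service still runs "old-release" here
node.switch()               # the short cutover step
```

In this sketch the previous release stays on the inactive partition after the switch, so calling `switch()` again would return the server to it.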
We can patch and update Unified Communications Manager software with near-zero downtime from a user perspective. By ‘near-zero downtime’ we mean that the service remains completely “on.” We may still have to restart the phones, so it’s not invisible to the few people who are using phones in the middle of the night, but it’s not an intrusive process either. If you’re not using your phone when it restarts, you don’t notice it. If you’re on a call, the restart waits until you hang up, so again you don’t notice it. However, if you have just finished a call and need to make another one while your phone restarts, it obviously causes a minor disruption. We tend to schedule these types of updates outside normal business hours, when the fewest users are online. A major upgrade just requires more planning from a testing and integration perspective; the actual upgrade process remains the same. To avoid upgrading too often, we tend to bundle key updates. Priority security alerts or incidents will occasionally require us to upgrade urgently, but otherwise we are on a cycle of roughly two feature-related upgrades per year. We are fairly good at upgrading software; hardware, on the other hand, is a bit more difficult.
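The restart behavior described above (restart immediately if the phone is idle, otherwise wait for the call to end) can be modeled as a small state machine. The Python below is a hedged sketch of that logic only; the `Phone` class and its attribute names are invented for illustration and say nothing about how the phone firmware is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Phone:
    # Hypothetical model of a single desk phone.
    extension: str
    on_call: bool = False
    restart_pending: bool = False
    restarts: int = 0

    def request_restart(self) -> None:
        """Restart now if idle; otherwise defer until the current call ends."""
        if self.on_call:
            self.restart_pending = True
        else:
            self._restart()

    def hang_up(self) -> None:
        """End the call and run any restart that was deferred during it."""
        self.on_call = False
        if self.restart_pending:
            self._restart()

    def _restart(self) -> None:
        self.restart_pending = False
        self.restarts += 1
```

The minor disruption mentioned above corresponds to the moment right after `hang_up()`: the deferred restart runs immediately, so a user who wants to place a second call straight away has to wait for it.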
So what happens when the phone hardware changes? This is an area that has challenged us and continues to challenge us. Cisco is a massive enterprise with more than 70,000 employees around the globe. Managing an ongoing refresh cycle for a fleet of over 100,000 physical devices is not easy, whichever way you look at it. If I’m ever challenged on the pace of our phone refresh, I like to point out the tremendous return on investment in our phone fleet: we’ve still got phones that have been running quite happily in our network for more than 10 years (even if a few of their owners occasionally get jealous of a colleague’s new device).
For the last decade, the IT infrastructure team has managed all of its platforms using a life-cycle management approach to budgeting and refresh. We always keep our network infrastructure current to avoid running into operational issues with aging technology or platforms and to maintain agility in enabling new business capabilities. Each year we’ve used some of our allocated funds to update the oldest part of our phone fleet, in conjunction with site visits to upgrade other network infrastructure. To reduce the cost, in addition to providing a self-service phone upgrade tool that lets users replace their own individual phones, we focus on rightsizing our desktop phone deployment: deploying phones only when required and where they get good usage. Because many of our sites are now more flexible and people sit where they like, we deploy fewer phones to individuals and instead deploy them to desks and workspaces within the office.
There is no doubt that video, BYOD, mobile, tablet and software phone clients impact the desk phone market inside Cisco too. Not every desk or employee needs a desktop phone anymore. Some offices or work areas have video units which provide great video phone service. A lot of our employees have their own smartphone or tablet (BYOD), which they use as their primary work phone with Single Number Reach. I stopped using an IP phone at home a year ago, because I can make HD video calls using the Jabber client. Why take up my home office desk space with a phone I don’t need? Some of our employees are doing the same in the office. Going forward, those who prefer their soft client, mobile and Jabber video to a phone don’t need to have a physical phone on their desks. We can reduce hardware and ongoing fleet management costs in this manner. That said, we find that most employees still like to use a hardware IP phone when they’re sitting in the office or in their home office.
For more information about how we migrated to IP telephony or for a general overview of IPT, check out my blog series:
- The Road to Unified Communications – Flexibility, Mobility, Simplicity
- What is Cisco IT’s UC Global Cluster Architecture?
- Migration – How did Cisco IT get from Legacy PBX to Unified Communications?