The rapid growth of distributed work introduced new levels of complexity within the network, increasing the demands on Cisco IT to ensure secure, seamless, and consistent experiences for employees, no matter where they work. By harnessing the AI and automation capabilities within Cisco Catalyst Center, Cisco IT transformed its approach to network management, resulting in a 97% reduction in code vulnerabilities, a 59% reduction in software upgrade time, improved job satisfaction for engineers, and more.

Network quality is a complex, end-to-end problem. Traditional network monitoring focuses on device health but has historically fallen short in answering the age-old question: “How well is the wireless network really performing for the end user?”

Additionally, engineers often spend valuable time completing manual, repetitive, and tedious tasks (software upgrades, maintenance, and device configuration) that leave little room for proactive troubleshooting or innovation.

These toil-heavy tasks not only hinder the ability to monitor and scale effectively but can also expose the network to human error, security gaps, and compliance issues.

Within Cisco IT, we manage more than 200,000 devices spanning both in-office and remote environments, a vast undertaking that has historically forced us to take a reactive approach to network management. For our Hybrid Network Access (HNA) Team, with fewer than 100 engineers managing 40,000 devices, developing automation for repetitive tasks was essential.

For instance, tasks like software upgrades and compliance checks used to be manual, time-consuming processes that required engineers to access each device individually and execute commands one by one. These slow processes frustrated engineers and increased exposure to security vulnerabilities through inconsistent configurations or human error.

Recognizing the growing inadequacy of this approach, our engineers looked for ways to streamline operations with Cisco’s own technology.

Turning to automation to simplify the IT experience

A few years prior, our team deployed the Cisco Catalyst Center controller as part of a multi-site initiative to better manage and maintain our campus and branch networks – using Simple Network Management Protocol (SNMP) and other methods to access devices and manage events and telemetry.

With the native automation capabilities in Catalyst Center, we were initially able to develop automated systems to remove the stress around routine tasks including:

  • Software Image Management: We implemented an automated system that uses Catalyst Center APIs to schedule and activate upgrades efficiently while safeguarding system health. The result was a 59% faster software upgrade time and a 97% reduction in code vulnerabilities on Cisco Catalyst 9000 products.
  • Software Conformance: With Catalyst Center automation, we have central visibility that allows us to assure correct versions of software, timely security patches, and standardized device configurations for consistent performance and improved security.
  • Cisco Network Plug and Play (PnP): We developed templates for fast, standardized device configuration that ensure best-practice compliance, eliminate staging, and reduce operational expenditures. These templates recently allowed us to fully provision our new downtown Austin office in under 10 minutes.
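
As a rough illustration of the software conformance idea, the sketch below compares each device's running version against a "golden" version for its family and lists the devices that need an upgrade. It assumes the inventory has already been retrieved (for example, through Catalyst Center APIs) and flattened into simple records. The `Device` fields, the `GOLDEN_VERSIONS` table, the hostnames, and the version strings are all hypothetical and do not reflect the Catalyst Center data model or our production tooling.

```python
from dataclasses import dataclass

# Hypothetical golden versions per device family (illustrative values only).
GOLDEN_VERSIONS = {
    "Catalyst 9300": "17.9.5",
    "Catalyst 9800": "17.12.3",
}


@dataclass
class Device:
    hostname: str
    family: str
    version: str


def find_nonconformant(devices, golden=GOLDEN_VERSIONS):
    """Return devices whose running version differs from the golden version.

    Devices whose family has no golden version defined are skipped.
    """
    return [
        d for d in devices
        if d.family in golden and d.version != golden[d.family]
    ]


# Hypothetical inventory, standing in for data pulled from the controller.
inventory = [
    Device("sw-aus-01", "Catalyst 9300", "17.9.5"),
    Device("sw-aus-02", "Catalyst 9300", "17.6.4"),
    Device("wlc-aus-01", "Catalyst 9800", "17.12.3"),
]

upgrade_queue = find_nonconformant(inventory)
for device in upgrade_queue:
    print(f"{device.hostname}: {device.version} -> {GOLDEN_VERSIONS[device.family]}")
```

In practice, a list like `upgrade_queue` would feed the scheduling step, so that only out-of-conformance devices are touched during a maintenance window.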

Automation not only saved time but also reduced errors caused by manual tasks and enhanced security through consistent configurations.

To infinity and beyond — automation efficiency

The benefits of automation were undeniable, but we wanted to go even further to unlock true intelligence within the network. We found that it all started by tapping into the power of network data and AI.

Over the last 20 years, Cisco has deployed more than 50 million networks, allowing us to create a large, powerful data platform. Devices on the network act as “sensors” and generate enormous amounts of data. However, the sheer volume of data could be overwhelming, making it hard for help desks and engineers to use it to troubleshoot and resolve cases for end users. We knew advanced analysis of this data would uncover critical insights into the state and performance of the network.

The immense potential of historical and real-time data collected through Catalyst Center was clear, and we realized harnessing it would be the foundation for a complete NetOps transformation.

Cisco’s AI Network Analytics Cloud, a cutting-edge solution developed in 2019 and already seamlessly integrated within Catalyst Center, was the key to unlocking the power of AI/ML within the network. This global, cloud-based data platform enables Catalyst Center to deliver AI-driven alerts, usage patterns, and predictive insights that further contextualize the data and equip our engineers with actionable recommendations. These insights also enhance Catalyst Center’s ability to manage and configure the network.

Ready, set, [AI and intelligent automation] action.

Catalyst Center’s advanced assurance features, powered by Cisco AI Network Analytics, monitor for experiential or volumetric changes within the network. Catalyst Center uses the data from network devices to define a baseline of normal behavior for each specific environment, sending an AI-generated alert via Webex whenever a deviation or issue is detected. The Webex integration helped our team build these AI capabilities into our operational processes, bringing them straight into our workflow.

Once an issue is triggered in Catalyst Center, an AI alert is sent straight to the inbox of a network engineer. 

With the AI engine measuring real-time network trends, our team no longer solely relies on end-users opening service tickets to identify performance issues that, undetected, could turn into major problems.
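
The idea of learning a baseline and alerting on deviations can be shown with a deliberately simple statistical sketch. This is not the algorithm used by Cisco AI Network Analytics, which is not described in this article. It only shows the general principle, using a mean and standard deviation over recent samples. The metric (client onboarding time), the sample values, and the `threshold` parameter are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_deviation(history, current, threshold=3.0):
    """Flag `current` if it is more than `threshold` standard deviations
    away from the mean of `history` (a simple z-score test)."""
    if len(history) < 2:
        return False  # not enough samples to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat baseline: any change counts as a deviation.
        return current != baseline
    return abs(current - baseline) / spread > threshold


# Hypothetical client onboarding times (seconds) observed at one site.
onboarding_history = [2.1, 2.3, 1.9, 2.0, 2.2, 2.4, 2.0]

normal_sample = is_deviation(onboarding_history, 2.2)
slow_sample = is_deviation(onboarding_history, 6.5)
print(normal_sample, slow_sample)
```

Because the baseline is computed per site from that site's own history, the same onboarding time can be normal in one environment and anomalous in another, which is the point of environment-specific baselining.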

Going beyond the alerts

The AI capabilities in Catalyst Center extend beyond these AI alerts and transform many of the daily processes that our engineers handle. One of the most exciting advancements is the AI-enhanced Radio Resource Management (RRM) feature, which automatically adjusts wireless signal strength and channels in real time within offices.

Network engineers have traditionally relied on RRM to optimize radio frequency (RF) performance, manage interference, and improve the user experience by tuning the channel and power plan. With AI RRM, Catalyst Center “remembers” network performance and usage patterns and anticipates future patterns, optimizing the RF before performance degrades. Further, the AI engine will suggest and model additional RF changes (like a wider channel bandwidth), giving us a chance to preview changes before we make them.

“With Catalyst Center and Cisco AI Network Analytics, we’re not just monitoring network device health — we’re unlocking a clearer picture of the user experience and proactively resolving issues for our clients.” – Luke Tainton

Visualizing network health with Splunk Enterprise

Going a step further, our team can now send the Cisco data collected in Catalyst Center to Splunk Enterprise, which serves as our primary platform for data analysis. Used mainly to visualize data and analyze alerts based on how and when they occur, Splunk gives us comprehensive visibility into the state of the network.

Pro tip: Catalyst Center Assurance only retains data for 30 days. Ingesting this data into Splunk can enable a longer, more detailed historical view of network health, allowing us to develop more thought-driven, established processes.

Visualization of Catalyst Center AI alerts created in Splunk Enterprise Dashboard

With Splunk, our team can visualize events by subcategory, criticality, location, support group, and severity. By highlighting “Top Talkers,” devices that frequently generate alerts, network engineers can quickly identify and focus on the most critical issues at hand. If we are experiencing a pervasive issue, we go into Splunk Enterprise and figure out how long it’s been in the network. We can identify the root cause quickly from there.
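
The “Top Talkers” view boils down to counting alerts per device and ranking the result. In Splunk this would normally be expressed as a search, so the Python below is only a sketch of the underlying logic. The alert records, device names, and field names are hypothetical.

```python
from collections import Counter


def top_talkers(alerts, n=3):
    """Return the n devices that generated the most alerts as (device, count) pairs."""
    return Counter(alert["device"] for alert in alerts).most_common(n)


# Hypothetical alert records, standing in for events exported from the controller.
alerts = [
    {"device": "ap-sjc-14", "severity": "P2"},
    {"device": "sw-aus-02", "severity": "P3"},
    {"device": "ap-sjc-14", "severity": "P2"},
    {"device": "wlc-rtp-01", "severity": "P1"},
    {"device": "ap-sjc-14", "severity": "P3"},
    {"device": "sw-aus-02", "severity": "P3"},
]

for device, count in top_talkers(alerts, n=2):
    print(f"{device}: {count} alerts")
```

Ranking by count alone ignores severity. A real dashboard would likely weight or filter by criticality as well, which is why the events are also broken out by the other dimensions listed above.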

Cisco IT also uses the Splunk Enterprise dashboards as a hub for the data generated by the various network controllers used within the organization (such as Meraki Dashboard and Cisco Catalyst SD-WAN Manager). By aggregating all that information in one place, we’re able to monitor the IT landscape in ways we never could before.

A new, improved approach

Building AI and automation into our processes hasn’t just removed complexity; it has fundamentally reshaped the way we approach network management. With automation handling repetitive and routine tasks, and AI constantly monitoring the network and sending alerts when issues are detected, we’ve adopted a proactive monitoring model, enabling:

  • Enhanced visibility: Ingesting syslog and network data into Splunk delivers both real-time and historical insights, empowering proactive troubleshooting and data-driven operations.
  • Simplified IT experience: Automation reduces human error, enhances security with consistent configurations, and saves time — enabling engineers to focus on more complex network issues.
  • Reduced Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR): Manual detection times, which ranged from 2 to 16 hours, are cut to as little as 41 seconds with automated AI alerts. Faster detection translates directly into a significant decrease in overall MTTR.
  • Frictionless user experience: Cisco Catalyst Center provides deep network visibility and AI-driven insights to ensure consistent and optimal employee experiences, anywhere they work.
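
To put the detection-time figures in perspective, the short calculation below converts the reported manual range (2 to 16 hours) and the best-case automated figure (41 seconds) into a speedup factor and a percentage reduction. Only those three numbers come from this article. The function names are ours, and the result describes the best case rather than an average.

```python
AUTOMATED_DETECTION_SECONDS = 41  # best-case automated detection time reported above


def speedup(manual_hours, automated_seconds=AUTOMATED_DETECTION_SECONDS):
    """How many times faster automated detection is than manual detection."""
    return manual_hours * 3600 / automated_seconds


def reduction_percent(manual_hours, automated_seconds=AUTOMATED_DETECTION_SECONDS):
    """Percentage reduction in time to detect relative to the manual baseline."""
    manual_seconds = manual_hours * 3600
    return (manual_seconds - automated_seconds) / manual_seconds * 100


for hours in (2, 16):
    print(
        f"{hours} h manual: {speedup(hours):.0f}x faster, "
        f"{reduction_percent(hours):.2f}% less time to detect"
    )
```

At the low end of the manual range (2 hours, or 7,200 seconds), 41 seconds is roughly 176 times faster, a reduction of about 99.43%. At the high end (16 hours), it is roughly 1,405 times faster, a reduction of about 99.93%.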

What’s next? 

In implementing these changes to our network management processes, we’ve greatly enhanced both the user and IT experience. Reducing toil and repetitive actions through AI and automation frees up our time to focus on the larger problems within our network — a massive step forward from where we were in the past.

Through our use of Catalyst Center, the day-to-day role of a network engineer looks drastically different from how it did five years ago, and we aren’t stopping here. Looking ahead, we plan to further operationalize automation and AI alerts within network monitoring, leveraging both the product platform and Cisco’s internal AI tools to build upon our current foundation.

A key goal for the future is to feed the Cisco IT data from Splunk, which acts as a central repository for network controller data, directly into Cisco’s internal AI assistant. With access to this data, the assistant will become an internal tool that employees can interact with to troubleshoot and resolve network issues independently.

Delivering seamless, secure connections is critical to how our organization operates, and a powerful network is essential to achieving increased productivity, agility, and frictionless experiences. With Catalyst Center’s AI-driven insights and automation capabilities, our team empowers Cisco to support the ever-evolving needs of our employees and workplaces.

Additional resources:

Find more Cisco on Cisco stories here

Authors

Luke Tainton

Site Reliability Engineer

Cisco IT Network Engineering and Operations

Jacob Foxe

Site Reliability Engineer

Cisco IT Network Engineering and Operations