The world is seeing an explosion of data growth. There are countless data-generating devices, digitized video and audio content, and embedded devices such as RFID tags and smart vehicles that have become our new global norm. Cisco is experiencing this dramatic shift as more data sources are being ingested into our enterprise platforms and business models are evolving to harness the power of data, driving Cisco’s growth across Marketing, Customer Experience, Supply Chain, Customer Partner Services and more.
Enterprise Data Growth Impact on Cisco
Enterprise data at Cisco has also grown over the years: our legacy on-premises platforms grew 5x in size over the past five years alone. The appetite and demand for data-driven insights have also grown exponentially as Cisco realized the potential of driving growth and business outcomes with insights from data, revealing new business levers and opportunities.
Cloud Data Transformation Drivers
When Cisco started its migration journey several years ago, its data warehouse footprint was entirely on-premises. With the business pivoting towards an accelerated data-to-insights cycle and the demand for analytics exploding, it quickly became apparent that some of the existing technologies would not allow us to scale to meet data demands.
Why Snowflake and GCP?
Key technology leaders and architects within Data & Analytics conducted market assessments of various data warehousing technologies and reviewed Gartner assessments to shortlist products. We then performed comparative capability assessments and performance-benchmarked POCs using representative workloads from Hadoop. Ongoing operational cost is a critical success factor for any solution, so cost, weighed against performance and ease of use, was a key decision factor.
After significant evaluation, Snowflake and Google Cloud Platform were the chosen Cloud Platforms: Snowflake for our enterprise data and GCP for unstructured data processing and analytics.
Our early POCs indicated that Snowflake was 2-4 times faster than Hadoop for complex workloads. Because Snowflake is ANSI SQL-based, it yielded several advantages, including a larger qualified talent pool, shorter development cycles, and improved time to capability. The platform also offered higher concurrency and lower latency compared to Hadoop. Snowflake was a clear winner!
GCP, by virtue of the rich set of tools it provides for analytics, was the chosen solution across multiple organizations in the enterprise and was a natural choice for analytics with the data residing in Snowflake.
Journey and key success factors
To migrate to Snowflake and GCP, we had to mobilize the enterprise to migrate out of Hadoop within a six-quarter timeline. From a central program management perspective, monumental effort went into planning, stakeholder engagement, vendor selection, and training and enablement of the entire enterprise.
As of December 2020, 100% of the Hadoop workload has been migrated to Snowflake, with key stakeholders like Marketing, Supply Chain, and CX fully migrated and leveraging the benefits of the Cloud Platform.
Some of the key enablers for our successful migration within such a short timeframe include:
- Security certification: The first question from all of our enterprise stakeholders was about the security of storing our data in the Cloud. We worked extensively with InfoSec and the cryptography team to enable security with IP whitelisting and Cisco’s private key encryption with Snowflake’s Tri-Secret Secure feature. A lot of attention also went into the D&A Data foundation architecture to enable Role-Based Access Control (RBAC) and granular role separation to manage applications safely and securely.
- Innovation with foundational capabilities: Right from the start, we knew that accelerating migration for the enterprise depended on a few foundational capabilities: ingesting data from on-prem sources to the cloud, maintaining data quality in the cloud data warehouse, and automating the onboarding of new users and applications. The innovative enabler we are especially proud of is the custom ingestion framework, which ingests data from our on-prem sources into Snowflake at a speed of ~240 MBPS, with an average of 12 TB of incremental data ingested each day.
- Automation, automation, automation: This was our mantra. Our talented team developed APIs to enforce security, such as token/DB credentials rotation, and to automate common administration and data access flows. We also built client-facing tools so application teams could own and meter their performance and costs; cop jobs and self-service warehouse resizing are two such examples.
- Proactive cost management: One key paradigm shift in the Cloud is that platform costs are no longer someone else’s problem, or something you worry about only every few years when planning for capacity. With the ability to track usage and costs at a granular level by application comes the responsibility to manage costs better. Visibility into these usage patterns is key to enabling actionable insights for each application team. Data & Analytics has enabled several dashboards that display costs, usage trends over time, a prediction of costs based on current trends, and more. Alerts are also sent based on customizable criteria, such as a week-over-week spike.
- Enterprise enablement: With the monumental task of having to migrate nearly 300 applications, developed over five years in Hadoop, to Snowflake in six quarters, it was critical to lower the technology barrier right away. Over 25 training sessions were conducted, with over 3,000 participants trained over the course of FY20. This, coupled with numerous working sessions with Snowflake and Data & Analytics architects to share best practices and learnings across the teams, enabled a successful migration for all our stakeholders.
- Enterprise alignment: Last but definitely not least, securing stakeholder buy-in early in the game was critical to the success of a transformation at this scale. We worked at the grassroots level with the execution team, the leadership team, and executives to secure commitment and support for this enterprise-wide program.
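To make the cost alerting described under proactive cost management more concrete, here is a minimal Python sketch of a week-over-week spike check and a naive trend-based projection. It is an illustration only, not our production implementation: the function names, the 25% threshold, and the sample daily cost figures are all hypothetical, and a real setup would read per-application usage from the platform's metering data rather than a hard-coded list.

```python
from statistics import mean


def weekly_totals(daily_costs, days_per_week=7):
    """Sum daily costs into full weeks, aligned to the most recent day.

    Any leading partial week (the oldest days) is dropped.
    """
    start = len(daily_costs) % days_per_week
    return [
        sum(daily_costs[i:i + days_per_week])
        for i in range(start, len(daily_costs), days_per_week)
    ]


def week_over_week_spike(daily_costs, threshold=0.25):
    """Compare the last two full weeks.

    Returns (is_spike, fractional_change). The threshold is a
    hypothetical, customizable criterion (0.25 means +25%).
    """
    weeks = weekly_totals(daily_costs)
    if len(weeks) < 2 or weeks[-2] == 0:
        return False, None  # not enough history to compare
    change = (weeks[-1] - weeks[-2]) / weeks[-2]
    return change > threshold, change


def project_next_week(daily_costs):
    """Naive projection: last week's total plus the average weekly change."""
    weeks = weekly_totals(daily_costs)
    if len(weeks) < 2:
        return None
    deltas = [later - earlier for earlier, later in zip(weeks, weeks[1:])]
    return weeks[-1] + mean(deltas)


# Made-up daily costs for one application: a flat week, then a jump.
sample = [100] * 7 + [140] * 7
is_spike, change = week_over_week_spike(sample)
if is_spike:
    print(f"Alert: weekly cost up {change:.0%}")
print("Projected next week:", project_next_week(sample))
```

Aligning the weekly buckets to the most recent day means the comparison always uses the latest data, and keeping the threshold as a parameter mirrors the idea that each application team can customize its own alert criteria.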
Results observed and testimonials
As a data warehousing platform, Snowflake has significantly surpassed the performance of our previous platform across multiple dimensions, in both reporting and transformations. Transformation jobs that used to take 10 or more hours to run now complete within an hour, a 10x performance improvement. This gives our business teams more current data on their dashboards, allowing for more accurate insights based on the latest data. Reports are now 4 times faster on average, with a 4x concurrency improvement, which gives our analysts the flexibility to run reports in parallel based on business needs.
The simple SQL-based technology has reduced the overall time needed to develop new capabilities or enhance existing ones. Our enterprise stakeholders report a productivity improvement of about 30%, allowing faster time to capability, a key goal of this journey.
- “The Cloud will help us deliver insights to drive business growth, agility needed for faster and for more informed decision making, and improve productivity” — Digital Marketing
- “Customer Service agents can immediately pull case reports and support Cisco customers on average 20x faster than Hadoop” — Customer Experience
- “Virtual Demand Center users on Snowflake receive more accurate customer and partner data and receive leads that are more likely to buy.” — Sales and Marketing
The Cloud Data Platform’s rapidly evolving features also bring additional avenues to improve data governance, enforce more granular data security, harness the power of both public and Cisco data, partner more effectively with our customers and partners, and deliver data-driven outcomes.