Of the more than 300 SQLSaturdays around the world, I am lucky enough to represent Cisco at the one in Barcelona on October 25th. If you’re attending TechEd Europe, we encourage you to also join us at this free one-day event, where IT professionals can learn more about SQL Server and the Cisco Unified Computing System (UCS).
If my experience at a recent SQL event in San Diego is any indication, it is going to be a great event. I was amazed that even after UCS was recognized as the #1 x86 blade server in the Americas, many database administrators still came to our table and asked, “What is Cisco doing here at a SQL Saturday event?” The good news is that these same people left with an understanding of how UCS differs from competing servers and how it can help simplify, standardize and optimize SQL Server deployments.
By now it is clear that big data analytics opens the door to unprecedented analytic opportunities for business innovation, customer retention and profit growth. However, a shortage of data scientists is creating a bottleneck as organizations move from early big data experiments into larger scale adoption. This constraint limits big data analytics and the positive business outcomes that could be achieved.
Click on the photo to hear from Comcast’s Jason Hull, Data Integration Specialist, about how his team uses data virtualization to get what they need done faster.
It’s All About the Data
As every data scientist will tell you, the key to analytics is data. The more data the better, including big data as well as the myriad other data sources both in the enterprise and across the cloud. But accessing and massaging this data, in advance of data modeling and statistical analysis, typically consumes 50% or more of any new analytic development effort.
• What would happen if we could simplify the data aspect of the work?
• Would that free up data scientists to spend more time on analysis?
• Would it open the door for non-data scientists to contribute to analytic projects?
SQL is the key. Because of its ease and power, it has been the predominant method for accessing and massaging data for the past 30 years. Nearly all non-data scientists in IT can use SQL to access and massage data, but very few know MapReduce, the programming model traditionally used to process data in Hadoop sources.
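To make the contrast concrete, here is a rough sketch of the same aggregation written two ways: once as a single SQL statement and once as hand-written map, shuffle and reduce steps. It is only an illustration. The sample data and every function name are made up, SQLite stands in for a real data source, and the map/reduce half is plain Python that mimics the shape of a MapReduce job rather than using Hadoop’s actual API.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, order amount) pairs.
ORDERS = [("east", 120), ("west", 80), ("east", 30), ("north", 50), ("west", 20)]


def total_by_region_sql(rows):
    """Aggregate with one declarative SQL statement (in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM orders GROUP BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


def map_phase(rows):
    """Map step: emit one (key, value) pair per input record."""
    for region, amount in rows:
        yield region, amount


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def total_by_region_mapreduce(rows):
    """The same aggregation, spelled out as map -> shuffle -> reduce."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(total_by_region_sql(ORDERS))
    print(total_by_region_mapreduce(ORDERS))
```

Both functions return the same totals per region. The SQL version states what result is wanted in one line, while the map/reduce version has to spell out how to compute it, which is roughly why far more people in IT are comfortable with the former.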
How Data Virtualization Helps
“We have a multitude of users…from BI to operational reporting, they are constantly coming to us requesting access to one server or another…we now have that one central place to say ‘you already have access to it’ and they immediately have access rather than having to grant access outside of the tool” -Jason Hull, Comcast
Data virtualization offerings, like Cisco’s, can help organizations bridge this gap and accelerate their big data analytics efforts. Cisco was the first data virtualization vendor to support Hadoop integration with its June 2011 release. This standardized SQL approach augments specialized MapReduce coding of Hadoop queries. By simplifying access to Hadoop data, organizations could for the first time use SQL to include big data sources, as well as enterprise, cloud and other data sources, in their analytics.
In February 2012, Cisco became the first data virtualization vendor to enable MapReduce programs to easily query virtualized data sources on demand and with high performance. This allowed enterprises to extend MapReduce analyses beyond Hadoop stores to include diverse enterprise data previously integrated by the Cisco Information Server.
In 2013, Cisco maintained its big data integration leadership with updates to its support for Hive access to the leading Hadoop distributions, including Apache Hadoop, Cloudera’s Distribution Including Apache Hadoop (CDH) and Hortonworks Data Platform (HDP). In addition, Cisco now also supports access to Hadoop through HiveServer2 and to Cloudera CDH through Impala.
Recently I had an opportunity to sit down with the talented data scientists from Cisco’s Threat Research, Analysis, and Communications (TRAC) team to discuss Big Data security challenges, tools and methodologies. The following is part one of five in this series where Jisheng Wang, John Conley, and Preetham Raghunanda share how TRAC is tackling Big Data.
Given the hype surrounding “Big Data,” what does that term actually mean?
John: First of all, because of overuse, the “Big Data” term has become almost meaningless. For us and for SIO (Security Intelligence and Operations) it means a combination of infrastructure, tools, and data sources all coming together to make it possible to have unified repositories of data that can address problems that we never thought we could solve before. It really means taking advantage of new technologies, tools, and new ways of thinking about problems.
Cisco is proud to be a Platinum sponsor and exhibitor at PASS Summit this year. If you aren’t familiar with PASS Summit, it “is the world’s largest, most-focused, and most-intensive conference for Microsoft SQL Server and BI professionals.”
Gary Serda has done an excellent job of detailing what the Cisco UCS team will be sharing with attendees in his blog post Guide to Cisco at the PASS Summit, so I want to call out just one item: the interactive 3D vRack of our Unified Computing System, which is always a highlight at trade shows and will be on display at PASS Summit.
Both the Nexus 1000V and FlexPod won Best of TechEd 2013 awards. This was the third year in a row for a Cisco product to be so honored.
We’re looking forward to seeing you at WPC. Join the conversation on social media using the hashtag #CiscoWPC. If you won’t be able to join us and would like to learn more about how Cisco is changing the economics of the datacenter, I would encourage you to review this presentation on SlideShare or my previous series of blog posts, Yes, Cisco UCS servers are that good. Or visit the Microsoft Cisco UCS portal.
Source: IDC Worldwide Quarterly Server Tracker, Q1 2013 Revenue Share, May 2013