As the leader of Cisco’s Data Virtualization and Analytics Business Units, it is my pleasure to announce Cisco Data Preparation, a new big data and analytics offering for business analysts and IT.
What is Cisco Data Preparation?
Driven by Business’s accelerating demand for analytics, Cisco Data Preparation (Data Prep) makes it easy for non-technical business analysts to gather, explore, cleanse, combine and enrich the data that fuels these analytics.
Primarily designed as a self-service application for business analysts, Data Prep is also a valuable new capability for IT data developers and even data scientists, helping these teams collaborate to achieve the following benefits:
- Faster Insights: New data sets available in minutes, not weeks.
- More Comprehensive Insights: Gain advantage from all your data sources.
- Better Business Outcomes at Scale: Supports hundreds of data preparation projects at big data scale.
- Higher Productivity, with Greater Governance: Both Business and IT gain from stronger collaboration.
Why will Business Analysts like Cisco Data Preparation?
Business Analysts can use Data Prep to address the significant data integration challenges they face when preparing analytic data sets using a self-service approach.
- Every analytic project is different, making every data exploration effort unique. Cisco Data Prep’s Excel-like interface and machine learning let analysts explore data freely.
- Data is messy and everywhere. This results in analysts spending as much as 80% of their time preparing data before analysis can begin. Cisco Data Prep dramatically reduces the time required to prepare data.
- Too few Data Scientists and overly long IT backlogs put the onus on the Business to adopt self-service. Cisco Data Prep empowers Business Analysts to do this work themselves.
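For a sense of the hand-written preparation work described above, here is a minimal Python sketch of the gather, cleanse, combine and enrich steps on two tiny, messy data sets. It is an illustration only and says nothing about how Data Prep works internally. The sample data, column names and function names are all hypothetical.

```python
import csv
import io

# Hypothetical sample data standing in for two messy sources.
ORDERS_CSV = """customer_id,amount
 C001 ,100.50
c002,
C001,20
C003,abc
"""

CUSTOMERS_CSV = """customer_id,region
C001,East
C002,West
"""


def gather(text):
    """Gather: read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def cleanse(rows):
    """Cleanse: normalize IDs and drop rows with a missing or non-numeric amount."""
    clean = []
    for row in rows:
        customer_id = row["customer_id"].strip().upper()
        try:
            amount = float(row["amount"])
        except (TypeError, ValueError):
            continue  # skip rows that cannot be repaired automatically
        clean.append({"customer_id": customer_id, "amount": amount})
    return clean


def combine_and_enrich(orders, customers):
    """Combine and enrich: join orders to customers and add a region column."""
    region_by_id = {
        c["customer_id"].strip().upper(): c["region"] for c in customers
    }
    return [
        {**order, "region": region_by_id.get(order["customer_id"], "UNKNOWN")}
        for order in orders
    ]


def total_by_region(rows):
    """Summarize the prepared rows so they are ready for analysis."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


prepared = combine_and_enrich(cleanse(gather(ORDERS_CSV)), gather(CUSTOMERS_CSV))
totals = total_by_region(prepared)
```

Even in this toy case, most of the code deals with inconsistent IDs and unusable values rather than with analysis, which is the effort a self-service tool aims to take off the analyst.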
Why will IT like Cisco Data Preparation?
IT can use Data Prep to work in concert with the business to intelligently balance self-service needs with governance constraints, while optimizing infrastructure.
- Many requirements are short-lived, in contrast with IT’s industrial-grade orientation. Cisco Data Prep helps IT and Business meet exploratory data needs with the right level of investment and, when needed, even provides working prototypes that IT can quickly reengineer.
- Independent, ungoverned data prep efforts can lead to duplication of effort and inconsistently transformed data sets of unclear origin, resulting in inaccurate analysis and potentially bad business results. Cisco Data Prep’s built-in governance and data set sharing increase trust.
- Rogue data preparation activity in personal sandboxes and myriad tools prevents IT from delivering scalable, secure infrastructure. Cisco Data Prep’s ability to massively scale allows IT to support thousands of users and multiple terabytes of data with a common, cost-effective infrastructure.
A Complete Data Preparation Solution, Only from Cisco
Cisco Data Preparation is a complete software, hardware and services solution that simplifies adoption and accelerates benefits.
- Leveraging an easy-to-learn and use Excel-like interface and powerful machine intelligence algorithms from Cisco partner Paxata, Data Prep removes barriers to adoption and elevates business analysts’ skills.
- Two-way integration with Cisco Data Virtualization helps leverage prior IT investments and closes the loop between the business and IT.
- Data Prep’s massively scalable Hadoop and Spark-based architecture ensures that Data Prep users won’t be constrained by size of data sets or complexity of analysis.
Plus, a complete set of Cisco and Partner provided “Plan” and “Build” services ensures Data Prep implementation success.
Learn More About Cisco Data Preparation
There are lots of ways you can learn more about Data Prep. You can:
- Join us at Strata+Hadoop World this week from September 29 through October 1 at the Javits Center in New York. Stop by Cisco Booth 425. There you can get a Data Prep demo from Cisco Sales Engineer Bill Kellett as well as attend with Cisco Data and Analytics Director Bob Eve and Paxata Product VP Nenshad Bardoliwalla.
- Join us at the 2015 Data and Analytics Conference, October 20-22, at the Hilton Chicago. Register now and join my breakout session, “Data Preparation for Self-Service Analytics.”
- Review the Cisco Data Preparation data sheet to learn more about Data Prep functionality.
- Talk to your Cisco or Cisco partner account manager to arrange a conversation with a Cisco Data Preparation product specialist.
- And look for upcoming blogs relating Cisco Data Preparation with Cisco UCS, Cisco Data Virtualization, Advanced Services and more. Stay tuned!
Join the Conversation
Follow @CiscoDataVirt and @CiscoAnalytics, #CiscoDAC.
Learn More from My Colleagues
Check out the blogs of Mala Anand, Mike Flannagan, Bob Eve and Nicola Villa to learn more.
Tags: analytics, Big Data, Cisco Data Preparation, Data Prep, data virtualization, Paxata
In December 2014, we announced VersaStack, an integrated infrastructure reference solution for enterprise applications that combines technologies from Cisco and IBM. Further extending this partnership, today we are announcing support for IBM BigInsights for Apache Hadoop on our Cisco UCS Integrated Infrastructure for Big Data, an industry-leading platform widely adopted for enterprise big data application deployments. The joint solution combines the disruptive innovations in Cisco UCS with the robust, industry-compatible Apache Hadoop distribution from IBM. It can be installed as a standalone Hadoop cluster with powerful analytical tools, or integrated into existing VersaStack deployments, which benefit from a common fabric and unified management capabilities. Either way, the goal is to deliver the deepest possible insight into your data and help you gain a sustainable competitive advantage.
We are also announcing the availability of a Cisco Validated Design (CVD) that provides step-by-step design guidelines, comprehensively tested and documented, to help ensure faster, more reliable and more predictable deployments at lower total cost of ownership.
- Combines innovations from Cisco UCS, such as programmable infrastructure, with the best of open source software and the enterprise-grade capabilities of IBM BigInsights for Apache Hadoop
- Designed and optimized for common use cases, pre-tested, pre-validated and fully documented by Cisco and IBM engineers to ensure dependable deployments that can scale from small to very large as workloads demand
- Provides enterprises with extensive platform management and data visualization capabilities and integration of big data with other information solutions to help enhance data manipulation and management tasks
- Brings the power of SQL to Hadoop at greater performance and scale than ever before, accelerating data science and analytics by leveraging SQL (arguably the most beautiful programming language) and integrating with business applications to access data stored in HDFS and HBase through JDBC and ODBC
- Deep technical expertise, global resources, and world-class support and services from Cisco, IBM and partners
This solution is built on Cisco UCS infrastructure using Cisco UCS 6200 Series Fabric Interconnects and Cisco UCS C-Series Rack Servers optimized for IBM BigInsights for Apache Hadoop, with scalability to thousands of nodes with Cisco Nexus 9000 Series Switches.
Follow me on Twitter: https://twitter.com/raghu_nambiar for real-time updates.
Tags: Apache Hadoop, Big Data, Cisco UCS, data center, IBM, IBM BigInsights, versastack
According to scientists, the age of smartphones has left humans with such a short attention span that even a goldfish can hold a thought for longer. On average, the human attention span has fallen from 12 seconds in 2000 to 8 seconds in today’s smart world.
What does this mean for Splunk Enterprise?
Tags: analytics, Big Data, Cisco UCS, data center, Integrated infrastructure, Splunk, Splunk Enterprise, Splunkconf, UCS, ucsbigdata
Delivering on the promise of Big Data and Analytics takes an ecosystem of partners who collaborate to integrate the underlying technologies so your organization can turn data into business value – faster. That’s why Cisco and MapR are teaming to deliver integrated solutions that are transforming the way organizations deploy and capitalize on the value of Hadoop technology.
The Cisco UCS Integrated Infrastructure for Big Data with MapR solution combines the MapR Distribution including Apache Hadoop with Cisco UCS Integrated Infrastructure for Big Data, which unifies computing, storage, connectivity, and management capabilities. This validated solution delivers an industry-leading architectural platform for Hadoop-based applications.
Cisco and MapR continue to innovate to enable new customer use cases. MapR Senior Solutions Architect Dr. James Sun provides an excellent example in his latest blog post on Dockerizing Apache Webservers with Cisco UCS, Apache Mesos and MapR.
Tags: Big Data, Hadoop, MapR, Strata Hadoop, UCS
Guest Blog by Ron Graham
Ron Graham has served as a Data Center Architect and Systems Engineer for some of the largest IT companies in the U.S., including Cisco Systems, NetApp, Sun Microsystems, and Oracle. He is currently working for Cisco Systems as a Big Data Analytics Engineer.
What is Data Virtualization? Our definition is: Agile data integration software that makes it easy to access all your data no matter where it’s managed, and query it across the network as if it were in a single place. I like to say it differently – the real value lies in its ability to provide business users with a single high-level view of data that is spread across their infrastructure.
Data Virtualization is essentially middleware that leverages a high-performance query engine and can utilize advanced computer architectures such as Cisco UCS. It’s a virtual data integration layer that can deliver data from multiple sources that are loosely coupled or have little or no knowledge of the other components. Of course, this is done in a logically organized manner, as shown by the diagram below.
This is all nice, but where is the beef, or the sex appeal? The sexy part is in the front-end business intelligence platforms and data visualization tools, such as Tableau, that can access and analyze the data. Tableau can simply access data through Cisco Data Virtualization with an ODBC driver. From here, business users can query data on demand from a single point of access (i.e. a common data model) without having to understand the different schemas or SQL dialects of the original data sources.
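To make the "single point of access" idea concrete, here is a minimal Python sketch. It does not use Cisco Data Virtualization or its ODBC driver. As a stand-in, it uses SQLite's ATTACH DATABASE to expose two independent in-memory databases through one connection, so one query can join across both. The database names (crm, erp), tables, sample rows and the customer_revenue function are all hypothetical.

```python
import sqlite3

# One connection plays the role of the single point of access.
conn = sqlite3.connect(":memory:")

# Two separate in-memory databases stand in for independent data sources
# (assumption: a real deployment would federate different systems instead).
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS erp")

# Each "source" keeps its own schema and data.
conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE erp.orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO erp.orders VALUES (?, ?)",
    [(1, 250.0), (1, 100.0), (2, 75.0)],
)


def customer_revenue(connection):
    """Join across both sources in one query, as if they were in a single place."""
    query = """
        SELECT c.name, SUM(o.amount) AS revenue
        FROM crm.customers AS c
        JOIN erp.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return connection.execute(query).fetchall()


rows = customer_revenue(conn)
for name, revenue in rows:
    print(f"{name}: {revenue}")
```

The caller only needs the connection and the shared names in the query, not the location of each table. That is the same separation a virtual data integration layer offers a BI tool, here at toy scale.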
Tags: Big Data, Cisco, Cisco UCS, data center, Hadoop, Strata Hadoop, Tableau