Big Data remains one of the hottest topics in the industry because of the real dollar value that businesses are deriving from making sense of tons of structured and unstructured data. Virtually every field is adopting a data-driven strategy as people, process, data and things are increasingly being connected (the Internet of Everything). New tools and techniques are being developed that can mine vast stores of data to inform decision making in ways that were previously unimagined. Joining related information and recognizing correlations lets us derive more knowledge, which can inform and enrich numerous aspects of everyday life. There’s a good reason why Big Data is so hot!
This year at Hadoop Summit, Cisco invites you to learn how to unlock the value of Big Data. Unprecedented data creation opens the door to responsive applications and emerging analytics techniques, and businesses need a better way to analyze data. Cisco will be showcasing infrastructure innovations from both Cisco Unified Computing System (UCS) and Cisco Application Centric Infrastructure (ACI). Cisco’s solution for deploying big data applications can help customers make informed decisions, act quickly, and achieve better business outcomes.
Cisco is partnering with leading software providers to offer a comprehensive infrastructure and management solution, based on Cisco UCS, to support our customers’ big data initiatives. By taking advantage of Cisco UCS’s fabric-based infrastructure, Cisco can bring significant advantages to big data workloads.
Tags: ACI, Big Data, blade server, Blade Servers, Cisco UCS, Cisco UCS C240 M3 Rack Server, Cisco Unified Computing System, Cisco Unified Data Center, Cisco Unified Fabric, Cloudera, Hadoop, Hortonworks, MapR, rack server, UCS Central, UCS service profiles
By now it is clear that big data analytics opens the door to unprecedented analytic opportunities for business innovation, customer retention and profit growth. However, a shortage of data scientists is creating a bottleneck as organizations move from early big data experiments into larger scale adoption. This constraint limits big data analytics and the positive business outcomes that could be achieved.
Click on the photo to hear from Jason Hull, Data Integration Specialist at Comcast, about how his team uses data virtualization to get what they need done, faster.
It’s All About the Data
As every data scientist will tell you, the key to analytics is data. The more data the better, including big data as well as the myriad other data sources both in the enterprise and across the cloud. But accessing and massaging this data, in advance of data modeling and statistical analysis, typically consumes 50% or more of any new analytic development effort.
• What would happen if we could simplify the data aspect of the work?
• Would that free up data scientists to spend more time on analysis?
• Would it open the door for non-data scientists to contribute to analytic projects?
SQL is the key. Because of its ease and power, it has been the predominant method for accessing and massaging data for the past 30 years. Nearly all IT staff who are not data scientists can use SQL to access and massage data, but very few know MapReduce, the programming model traditionally used to process data held in Hadoop.
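To make the contrast concrete, here is a minimal sketch of the same aggregation written two ways: once in a hand-rolled map, shuffle and reduce style, and once as a single SQL statement. It is an illustration only. The sample records and function names are hypothetical, the map/reduce version is plain Python rather than Hadoop code, and SQLite stands in for whatever SQL engine or data virtualization layer you actually use.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (customer, amount) order records.
ORDERS = [
    ("alice", 30.0),
    ("bob", 12.5),
    ("alice", 20.0),
    ("carol", 7.5),
    ("bob", 2.5),
]


def total_by_customer_mapreduce(records):
    """Sum amounts per customer in an explicit map/shuffle/reduce style."""
    # Map phase: emit one (key, value) pair per input record.
    mapped = [(customer, amount) for customer, amount in records]
    # Shuffle phase: group all values that share a key.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # Reduce phase: collapse each group to a single total.
    return {key: sum(values) for key, values in groups.items()}


def total_by_customer_sql(records):
    """Compute the same totals with one GROUP BY query (SQLite, in memory)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", records)
        rows = conn.execute(
            "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)


if __name__ == "__main__":
    print(total_by_customer_mapreduce(ORDERS))
    print(total_by_customer_sql(ORDERS))
```

Both functions return the same totals. The difference is in who can write them: the SQL version is a single declarative statement that most IT staff already know how to read.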
How Data Virtualization Helps
“We have a multitude of users…from BI to operational reporting, they are constantly coming to us requesting access to one server or another…we now have that one central place to say ‘you already have access to it’ and they immediately have access rather than having to grant access outside of the tool” -Jason Hull, Comcast
Data virtualization offerings, like Cisco’s, can help organizations bridge this gap and accelerate their big data analytics efforts. Cisco was the first data virtualization vendor to support Hadoop integration with its June 2011 release. This standardized SQL approach augments specialized MapReduce coding of Hadoop queries. By simplifying access to Hadoop data, organizations could for the first time use SQL to include big data sources, as well as enterprise, cloud and other data sources, in their analytics.
In February 2012, Cisco became the first data virtualization vendor to enable MapReduce programs to easily query virtualized data sources, on-demand with high performance. This allowed enterprises to extend MapReduce analyses beyond Hadoop stores to include diverse enterprise data previously integrated by the Cisco Information Server.
In 2013, Cisco maintained its big data integration leadership with updates of its support for Hive access to the leading Hadoop distributions including Apache Hadoop, Cloudera Distribution (CDH) and Hortonworks (HDP). In addition, Cisco now also supports access to Hadoop through HiveServer2 and Cloudera CDH through Impala.
Others, beyond Cisco, recognize this beneficial trend. In fact, Rick van der Lans, noted Data Virtualization expert and author, recently blogged on future developments in this area in Convergence of Data Virtualization and SQL-on-Hadoop Engines.
So if your organization’s big data efforts are slowed by a shortage of data scientists, consider data virtualization as a way to break the bottleneck.
Tags: apache, Big Data, Cisco Data Center, Cisco Data virtualization, Cloudera, Composite Software, data integration, data virtualization, Hadoop, HiveServer2, Hortonworks, mapreduce, query, SQL, video
Industry’s first reference architecture for Hadoop with advanced access control and encryption with IDH, first flash-enhanced reference architecture for Hadoop demonstrated using YCSB with MapR, industry’s first validated and certified solution for real-time Big Data analytics with SAP HANA, and Unleashing IT big data special edition
Built on our vision of shared infrastructure and unified management for enterprise applications, the Cisco UCS Common Platform Architecture (CPA) for Big Data has become a popular choice for enterprise Big Data deployments. It has been widely adopted in the finance, healthcare, service provider, entertainment, insurance, and public sectors. The new Cisco UCS CPA v2 improves both performance and capacity, featuring the Intel Xeon E5-2600 v2 family of processors, industry-leading storage density, and the industry’s first transparent cache acceleration for Big Data.
The Cisco UCS CPA v2 offers a choice of infrastructure options, including “Performance Optimized”, “Balanced”, “Capacity Optimized”, and “Capacity Optimized with Flash” to support a range of workload needs.
Up to 160 servers (3200 cores, 7.6PB storage) are supported in a single switching/UCS domain. Scaling beyond 160 servers can be implemented by interconnecting multiple UCS domains using Nexus 6000/7000 Series switches. Such deployments scale to thousands of servers and hundreds of petabytes of storage, and can be managed from a single pane using UCS Central, whether in one data center or distributed globally.
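The domain figures above lend themselves to a quick back-of-the-envelope calculation. The sketch below derives per-server numbers from the stated maximums and estimates how many UCS domains a given server count would need. It assumes the quoted 3200 cores and 7.6PB apply to a fully populated 160-server domain and that 1 PB = 1000 TB; the function names are hypothetical, and actual figures depend on which configuration you choose.

```python
import math

# Maximums quoted for a single switching/UCS domain.
SERVERS_PER_DOMAIN = 160
CORES_PER_DOMAIN = 3200
STORAGE_PB_PER_DOMAIN = 7.6


def per_server_figures():
    """Return (cores, storage in TB) per server at the quoted maximums.

    Assumes decimal units (1 PB = 1000 TB) and a fully populated domain.
    """
    cores = CORES_PER_DOMAIN // SERVERS_PER_DOMAIN
    storage_tb = STORAGE_PB_PER_DOMAIN * 1000 / SERVERS_PER_DOMAIN
    return cores, storage_tb


def domains_needed(servers):
    """Return how many UCS domains are needed to host `servers` servers."""
    if servers < 0:
        raise ValueError("servers must be non-negative")
    return math.ceil(servers / SERVERS_PER_DOMAIN)


if __name__ == "__main__":
    cores, storage_tb = per_server_figures()
    print(f"Per server: {cores} cores, about {storage_tb:.1f} TB")
    for count in (160, 161, 1000):
        print(f"{count} servers -> {domains_needed(count)} domain(s)")
```

At the quoted maximums this works out to 20 cores and roughly 47.5 TB per server, and a 1000-server deployment would span seven interconnected domains.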
The Cisco UCS CPA v2 solutions are available through the Cisco UCS Solution Accelerator Paks program. The Paks are designed for rapid deployment, tested and validated for performance, and optimized for cost of ownership:
• Performance Optimized half-rack (UCS-SL-CPA2-P): ideal for MPP databases and scale-out data analytics
• Performance and Capacity Balanced rack (UCS-SL-CPA2-PC): ideal for high-performance Hadoop and NoSQL deployments
• Capacity Optimized rack (UCS-SL-CPA2-C): for when capacity matters
• Capacity Optimized with Flash rack (UCS-SL-CPA2-CF): offers the industry’s first transparent caching option for Hadoop and NoSQL
Start with any configuration and scale as your workload demands.
Cisco supports leading Hadoop and NoSQL distributions, including Cloudera, Hortonworks, Intel, MapR, Oracle, Pivotal and others. For more information, visit the Cisco Big Data Portal and the Big Data Design Zone, which offers Cisco Validated Designs (CVDs): pretested and validated architectures that accelerate time to value for customers while reducing risks and deployment challenges.
Cisco UCS Common Platform Architecture Version 2 for Big Data
Cisco Launches the First Flash-Enhanced Solution for Hadoop
Simplifying the Deployment of Real-time Big Data Analytics — UCS + SAP HANA
Also see Maximizing Big Data Benefits with MapR and Informatica on Cisco UCS
Tags: Cisco UCS CPA, Cisco UCS Solution Accelerator Paks, Cloudera, Hortonworks, Intel Hadoop, MapR, Pivotal HD, SAP HANA
With enough hype to rival even the most popular of Super Bowls, Big Data experts will converge on New York City in just a couple of weeks! But big data has good reason for all the hype, as businesses continue to find new ways to leverage the insights derived from vast data pools that keep growing at an exponential rate. A big reason for this is the ability to use Hadoop, with the Hadoop Distributed File System and MapReduce, to analyze the data very quickly and run queries that were not even possible previously but can now be completed in minutes or less. We’ve only just begun to scratch the surface in terms of the financial returns from Hadoop and the infrastructure that supports Hadoop deployments, but one thing we do know: it’s going to be big and it will continue to get bigger!
So how does Cisco fit into this picture?
Cisco is partnering with leading software providers to offer a comprehensive infrastructure and management solution to support customer big data initiatives, including Hadoop, NoSQL and Massively Parallel Processing (MPP) analytics. Leveraging the advantages of fabric computing, the Cisco UCS Common Platform Architecture (CPA) delivers exceptional performance, capacity, management simplicity, and scale to help customers derive value more quickly and with less management overhead for the most challenging big data deployments.
Cisco UCS Common Platform Architecture for big data enables rapid deployment, predictable performance, and massive scale without the need for complex layers of switching infrastructure. In addition, the architecture offers unique data and management integration with enterprise applications hosted on Cisco UCS. This allows big data and enterprise applications to co-exist within a single management domain that simplifies data movement between applications and eliminates the need for unique technology silos in the data center. You can also check out my previous blog, Top Three Reasons Why Cisco UCS is a Better Platform for Big Data, to get an idea of what we’ll be sharing at the show.
Have you considered Cisco UCS for your Big Data projects? I’d like to invite you to come and hear more in a couple weeks at Strata Hadoop World in New York City. We’ll have a number of demos and experts on hand to answer all of your questions.
In addition, Cisco and Cloudera are teaming up to offer you a chance to win some exciting prizes by joining our demo crawl program. Stop by either the Cisco booth (#3) or the Cloudera booth (#403) to learn more.
Stop by and say hello, and let me know if you have any comments or questions, either in person or via Twitter at @CicconeScott.
Tags: Big Data, blade server, Blade Servers, Cisco UCS, Cisco Unified Computing System, Cisco Unified Data Center, Cisco Unified Fabric, Cisco Unified Management, Cloudera, Hadoop, Hortonworks, Intel, MapR, rack server, UCS Manager, UCS service profiles
Speed is everything. Continuing our commitment to make data center infrastructures more responsive to enterprise application demands, today we announced FlexPod Select with Hadoop, formerly known as NetApp Open Solution for Hadoop, broadening our FlexPod portfolio. Developed jointly by Cisco and NetApp, it offers an enterprise-class infrastructure that accelerates time to value from your data. This solution is pre-validated for Hadoop deployments built using Cisco 6200 Series Fabric Interconnects (connectivity and management), C220 M3 Servers (compute), NetApp FAS2220 (namenode metadata storage) and NetApp E5400 series storage arrays (data storage). Following the highly successful FlexPod model of pre-sized rack-level configurations, this solution will be made available through the well-established FlexPod sales engagement and channel.
The FlexPod Select with Hadoop architecture is an extension of our popular Cisco UCS Common Platform Architecture (CPA) for Big Data, designed for applications that require enterprise-class external storage array features such as RAID protection with data replication, hot-swappable spares, proactive drive health monitoring, faster recovery from disk failures and automated I/O path failover. The architecture consists of a master rack and, optionally, up to nine expansion racks in a single management domain, creating a complete, self-contained Hadoop cluster. The master rack provides all of the components required to run a 12-node Hadoop cluster supporting 540TB of storage capacity. Each expansion rack adds 16 Hadoop cluster nodes and 720TB of storage capacity. Unique to this architecture are seamless management integration and data integration capabilities with existing FlexPod deployments, which can help to significantly lower infrastructure and management costs.
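The rack figures above make cluster sizing a matter of simple arithmetic. The sketch below computes total node count and storage capacity for a master rack plus a chosen number of expansion racks, using only the numbers quoted in this post. The function and constant names are hypothetical, and the result is the nominal capacity stated here, not usable capacity after RAID or Hadoop replication.

```python
# Figures quoted for FlexPod Select with Hadoop.
MASTER_NODES = 12
MASTER_STORAGE_TB = 540
EXPANSION_NODES = 16
EXPANSION_STORAGE_TB = 720
MAX_EXPANSION_RACKS = 9


def cluster_size(expansion_racks):
    """Return (nodes, storage in TB) for one master rack plus expansion racks."""
    if not 0 <= expansion_racks <= MAX_EXPANSION_RACKS:
        raise ValueError(
            f"expansion_racks must be between 0 and {MAX_EXPANSION_RACKS}"
        )
    nodes = MASTER_NODES + EXPANSION_NODES * expansion_racks
    storage_tb = MASTER_STORAGE_TB + EXPANSION_STORAGE_TB * expansion_racks
    return nodes, storage_tb


if __name__ == "__main__":
    for racks in (0, 1, MAX_EXPANSION_RACKS):
        nodes, storage_tb = cluster_size(racks)
        print(f"{racks} expansion rack(s): {nodes} nodes, {storage_tb} TB")
```

With all nine expansion racks, a single management domain reaches 156 nodes and 7020TB, which works out to 45TB per node in both the master and expansion racks.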
FlexPod Select has been pretested and jointly validated with leading Hadoop vendors, including Cloudera and Hortonworks.
Tags: Big Data, Cloudera, CPA, FlexPod, FlexPod Select, Hadoop, Hortonworks, netapp