You may have heard that the digital universe is measured in petabytes and that global IP traffic runs to hundreds of exabytes. These are mind-bogglingly large numbers. Big data analytics can play a crucial role in making datasets of this size usable, improving everything from operational efficiency to customer experience to prediction accuracy. Cisco is the global leader in networking (did you know that 85% of the estimated 500 exabytes of global IP traffic in 2012 will pass through Cisco devices?), and the company also builds an innovative family of unified computing products. This enables Cisco to provide a complete infrastructure solution for big data applications, including compute, storage, connectivity and unified management, that reduces complexity, improves agility, and radically improves cost of ownership.
To meet a variety of big data platform demands (Hadoop, NoSQL databases, massively parallel processing databases, etc.), Cisco offers a comprehensive solution stack: the Common Platform Architecture (CPA) for Big Data, which includes compute, storage, connectivity and unified management. Unique to this architecture are its seamless data integration and management integration capabilities with the enterprise application ecosystem, including Oracle RDBMS/RAC, Microsoft SQL Server, SAP and others. See Figure 1.
The CPA is built using the following components:
- Cisco UCS 6200 Series Fabric Interconnects provide high-speed, low-latency connectivity for servers and centralized management of all connected devices through UCS Manager. Deployed in redundant pairs, they offer full redundancy, active-active performance, and exceptional scalability for the large number of nodes typical of big data clusters. UCS Manager enables rapid and consistent server integration using service profiles, ongoing system maintenance such as firmware updates across the entire cluster as a single operation, advanced monitoring, and the option to raise alarms and send notifications about the health of the entire cluster.
- Cisco UCS 2200 Series Fabric Extenders act as remote line cards for the Fabric Interconnects, providing highly scalable and extremely cost-effective connectivity for a large number of nodes.
- Cisco UCS C240 M3 Rack-Mount Servers are 2-RU servers designed for a wide range of compute, I/O and storage capacity demands. They are powered by two Intel Xeon E5-2600 series processors and support up to 768 GB of main memory (typically 128 GB or 256 GB for big data applications) and up to 24 SFF disk drives in the performance-optimized option or 12 LFF disk drives in the capacity-optimized option. They also feature a Cisco UCS VNIC optimized for high-bandwidth, low-latency cluster connectivity, with support for up to 256 virtual devices.
Read More »
Tags: Big Data, Cloudera, Common Platform Architecture, CPA, Greenplum MR, Hadoop, MapR, MarkLogic, MPP Database, NoSQL, Oracle NoSQL Database, ParAccel
Last year, Oracle launched Oracle NoSQL to address the needs of big data and analytics. Since then, this commercial-grade solution has been deployed on Cisco UCS, taking advantage of the high performance and RAS (reliability, availability and serviceability) capabilities of this game changer.
At Oracle OpenWorld, I met both Ashok Joshi, Oracle Senior Director of NoSQL Database Development, and Raghunath Nambiar, Cisco Distinguished Engineer for UCS, who together gave a well-attended session on Tuesday presenting the powerful combination of NoSQL running on Cisco UCS. Check the content-rich slide deck below.
In this short video, Ashok Joshi explains the benefits of a commercial-grade solution as opposed to an open source one, highlighting not only cost reduction and cost containment but also the promise of business continuity with multiple data center support, thanks to the computing and networking capabilities provided by Cisco.
At the speaking session, Raghunath described the huge momentum of Cisco UCS, which now has close to 16,000 unique customers and is present in the data centers of more than half of all Fortune 500 companies.
Check slide 5 for more details on the growth and financial results of a platform that today benefits from over 2,600 channel partners actively selling it.
Read More »
Tags: Big Data, Cisco, NoSQL, Oracle, UCS
Many big data innovations have been developed by Web 2.0 companies, resulting in a growing collection of open source technologies that dramatically change the culture of collaborative software development and the scale and economics of hardware infrastructure. These technologies enable cost-effective data storage, management and analysis in ways that were not possible before with traditional technologies such as relational database management systems.
NoSQL is one such technology that has emerged as an increasingly important part of big data trends for applications that demand large volumes of simple reads and updates against very large datasets (Hadoop is the other innovation, a generic processing framework designed to execute “read only” queries and batch jobs against massive datasets). NoSQL is often characterized by what it is not, and definitions vary. It can be Not Only SQL-based or simply Not a SQL-based relational database management system. NoSQL databases form a broad class of non-relational database management systems that are evolving rapidly, and several solutions are emerging with highly variable feature sets and few standards.
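To make the "simple reads and updates" access pattern concrete, here is a minimal sketch of a key-value store, one common style of NoSQL database. It is a toy that lives in memory and is not the API of Oracle NoSQL Database or any other product; the class name `TinyKeyValueStore`, its methods, and the sample key are all hypothetical and chosen only for illustration. It assumes records are looked up by a single key and spread across partitions by hashing that key.

```python
import hashlib


class TinyKeyValueStore:
    """Toy in-memory key-value store (hypothetical, not a vendor API)."""

    def __init__(self, num_partitions=4):
        # Each dict stands in for a storage node in a real cluster.
        self.partitions = [dict() for _ in range(num_partitions)]

    def _partition_for(self, key):
        # Hash the key so records spread across partitions.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.partitions)

    def put(self, key, value):
        # Simple update: overwrite whatever is stored under this key.
        self.partitions[self._partition_for(key)][key] = value

    def get(self, key, default=None):
        # Simple read: one lookup in one partition, no joins or scans.
        return self.partitions[self._partition_for(key)].get(key, default)


store = TinyKeyValueStore()
store.put("user:42", {"name": "Ada", "visits": 1})

# Read-modify-write of a single record by key.
profile = store.get("user:42")
profile["visits"] += 1
store.put("user:42", profile)

print(store.get("user:42"))
```

Because every operation touches exactly one key in one partition, a store of this shape can in principle add partitions to absorb more traffic. What it gives up is the ad hoc querying across records that a relational database provides.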
While these technologies are attractive from the standpoint of the innovations they can bring, not all products meet enterprise requirements. Many organizations require robust, commercially supported solutions for rapid deployment and the ability to integrate such solutions into existing enterprise application infrastructure.
To address these needs, Cisco and Oracle are the first vendors collaborating to deliver enterprise-class NoSQL solutions. Exceptional performance, scalability, availability and manageability are made possible by the combination of the Cisco Unified Computing System (UCS) and Oracle NoSQL Database. Together, they form a powerful platform for quick deployment, with predictable throughput and latency for the most demanding applications.
Read More »
Tags: Big Data, NoSQL, Oracle NoSQL Database
As discussed in my previous post, application developers and data analysts are demanding fast access to ever larger data sets so they can not only reduce or even eliminate sampling errors in their queries (query the entire raw data set!), but they can also begin to ask new questions that were either not conceivable or not practical using traditional software and infrastructure. Hadoop emerged in this data arms race as a favored alternative to the RDBMS and SAN/NAS storage model. In this second half of the post, I’ll discuss how Hadoop was specifically designed to address these limitations.
Hadoop’s origins derive from two seminal Google white papers from 2003 and 2004, the first describing the Google File System (GFS) for persistent, massively scalable, reliable storage and the second the MapReduce framework for distributed data processing, both of which Google used to ingest and crunch the vast amounts of web data needed to provide timely and relevant search results. These papers laid the groundwork for Apache Hadoop’s implementation of MapReduce running on top of the Hadoop Distributed File System (HDFS). Hadoop gained an early, dedicated following from companies like Yahoo!, Facebook, and Twitter, and has since found its way into enterprises of all types due to its unconventional approach to data and distributed computing. Hadoop tackles the problems discussed in Part 1 in the following ways:
Read More »
Tags: Big Data, Cisco, data center, Hadoop, NoSQL
If you have been a regular reader of just about any technology blog or publication over the last year you’d be hard-pressed to have not heard about big data and especially the excitement (some might argue hype) surrounding Hadoop. Big data is becoming big business, and the buzz around it is building commensurately. What began as a specialized solution to a unique problem faced by the largest of Web 2.0 search engines and social media outlets – namely the need to ingest, store and analyze vast amounts of semi- or unstructured data in a fast, efficient, cost-effective and reliable manner that challenges traditional relational database management and storage approaches – has expanded in scope across nearly every industry vertical and trickled out into a wide variety of IT shops, from small technology startups to large enterprises. Big business has taken note, and major industry players such as IBM, Oracle, EMC, and Cisco have all begun investing directly in this space. But why has Hadoop itself proved so popular, and how has it solved some of the limitations of traditional structured relational database management systems (RDBMS) and associated SAN/NAS storage designs?
In Part 1 of this post, I’ll start by taking a closer look at some of those problems, and tomorrow in Part 2 I’ll show how Hadoop addresses them.
Businesses of all shapes and sizes are asking complex questions of their data to gain a competitive advantage: retail companies want to be able to track changes in brand sentiment from online sources like Facebook and Twitter and react to them rapidly; financial services firms want to scour large swaths of transaction data to detect fraud patterns; power companies ingest terabytes of data from millions of smart meters generating data every hour in hopes of uncovering new efficiencies in billing and delivery. As a result, developers and data analysts are demanding fast access to as large and “pure” a data set as possible, taxing the limits of traditional software and infrastructure and exposing the following technology challenges:
Read More »
Tags: Big Data, Hadoop, NoSQL