You may have heard that the digital universe is measured in zettabytes and that global IP traffic runs to hundreds of exabytes. These are mind-bogglingly large numbers. Big data analytics can play a crucial role in making datasets of this scale usable, improving everything from operational efficiency to customer experience to prediction accuracy. Cisco is the global leader in networking (did you know that 85% of the estimated 500 exabytes of global IP traffic in 2012 will pass through Cisco devices?), and the company also builds an innovative family of unified computing products. This enables Cisco to provide a complete infrastructure solution for big data applications, including compute, storage, connectivity and unified management, that reduces complexity, improves agility, and radically improves cost of ownership.
To meet a variety of big data platform demands (Hadoop, NoSQL databases, massively parallel processing databases, etc.), Cisco offers a comprehensive solution stack: the Common Platform Architecture (CPA) for Big Data, which includes compute, storage, connectivity and unified management. Unique to this architecture are its seamless data integration and management integration capabilities with the enterprise application ecosystem, including Oracle RDBMS/RAC, Microsoft SQL Server, SAP and others. See Figure 1.
The CPA is built using the following components:
- Cisco UCS 6200 Series Fabric Interconnects provide high-speed, low-latency connectivity for servers and centralized management for all connected devices through UCS Manager. Deployed in redundant pairs, they offer full redundancy, active-active performance, and exceptional scalability for the large number of nodes typical of big data clusters. UCS Manager enables rapid and consistent server integration using service profiles, ongoing system maintenance such as firmware updates across the entire cluster as a single operation, advanced monitoring, and options to raise alarms and send notifications about the health of the entire cluster.
- Cisco UCS 2200 Series Fabric Extenders act as remote line cards for the Fabric Interconnects, providing highly scalable and extremely cost-effective connectivity for a large number of nodes.
- Cisco UCS C240 M3 Rack-Mount Servers are 2-RU servers designed for a wide range of compute, I/O and storage capacity demands. They are powered by two Intel Xeon E5-2600 series processors and support up to 768 GB of main memory (typically 128 GB or 256 GB for big data applications) and up to 24 SFF disk drives in the performance-optimized option or 12 LFF disk drives in the capacity-optimized option. They also feature the Cisco UCS VNIC, optimized for high-bandwidth, low-latency cluster connectivity, with support for up to 256 virtual devices.
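To put the server options above in perspective, here is a minimal back-of-the-envelope sketch in Python. Only the drive counts (24 SFF or 12 LFF) and the typical memory sizes (128 GB or 256 GB) come from the component list. The per-drive capacities, the 16-node cluster size, and the replication factor of 3 (the usual HDFS default) are assumptions chosen for illustration, not Cisco specifications or recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    """One rack server option; drive_tb is a hypothetical per-drive capacity."""
    name: str
    memory_gb: int
    drive_count: int
    drive_tb: float

    @property
    def raw_storage_tb(self) -> float:
        # Raw (unreplicated) disk capacity of a single node.
        return self.drive_count * self.drive_tb


def cluster_totals(node: NodeConfig, node_count: int, replication: int = 3) -> dict:
    """Aggregate memory and storage for node_count identical nodes.

    replication=3 is an assumption (the usual HDFS default); usable storage
    is raw storage divided by the number of copies kept of each block.
    """
    raw_tb = node.raw_storage_tb * node_count
    return {
        "memory_gb": node.memory_gb * node_count,
        "raw_storage_tb": raw_tb,
        "usable_storage_tb": raw_tb / replication,
    }


# Drive counts and memory sizes follow the list above; drive sizes are assumed.
PERFORMANCE = NodeConfig("performance-optimized", memory_gb=256, drive_count=24, drive_tb=1.0)
CAPACITY = NodeConfig("capacity-optimized", memory_gb=128, drive_count=12, drive_tb=3.0)

if __name__ == "__main__":
    for node in (PERFORMANCE, CAPACITY):
        print(node.name, cluster_totals(node, node_count=16))
```

Swapping in real drive sizes and node counts gives a quick first estimate of how the two options trade spindle count against capacity.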
Tags: Big Data, Cloudera, Common Platform Architecture, CPA, Greenplum MR, Hadoop, MapR, MarkLogic, MPP Database, NoSQL, Oracle NoSQL Database, ParAccel
Following the successful workshop “Towards an Industry Standard for Benchmarking Big Data Workloads” (WBDB 2012) held in May 2012 in San Jose, the Second Workshop on Benchmarking Big Data Workloads (WBDB2012.in) will be held in Pune, India from 17 to 18 December at the Hinjewadi Campus of Persistent Systems Ltd, colocated with the 18th International Conference on Management of Data (COMAD 2012).
I have the great pleasure to co-chair this workshop with my distinguished colleagues Chaitanya Baru, Meikel Poess, Milind Bhandarkar and Tilmann Rabl with support from the National Science Foundation (NSF.gov).
The objective of the workshop series is to foster the development of industry standards for providing objective measures of the effectiveness of hardware and software systems dealing with Big Data. Several industry experts and researchers are expected to present and debate their vision on benchmarking big data platforms.
 WBDB 2012.in http://clds.ucsd.edu/wbdb2012.in, CFP: http://clds.ucsd.edu/sites/clds.ucsd.edu/files/WBDB.in_.cfp_.pdf
 WBDB 2012 http://blogs.cisco.com/datacenter/towards-an-industry-standard-for-benchmarking-big-data-workloads/, http://clds.ucsd.edu/wbdb2012/
 COMAD 2012 http://comad.in/comad2012
 WBDB 2012.in Program Committee http://clds.ucsd.edu/wbdb2012.in/organizers
Tags: Big Data, Big Data Benchmarks, WBDB
Author’s Note: I have no kids. I have friends with kids, who used to be in diapers. The kids were in diapers, not the friends. I’ve changed a few in my day, but not nearly as many as my friends have. And yes this has some sort of relevance to this story…
At every trade show or conference there’s someone talking about Big Data. They talk about algorithms, CPUs, memory, software stacks, cabling, racks, ROI, TCO, nodes, names, federation, centralization, organization, until you get “the pitch.” I’m not really interested in the pitch for why someone’s product is better than the other; I’m more interested in “What is the problem that you’re trying to solve?” This, to me, gets to the root of Big Data, or the consolidation of a set of diverse data sources with a multitude of data types, among which you’re attempting to determine relationships and patterns. Phew. Got it?
Me neither, but I like to think in examples, and that is when it dawned on me, in the grocery store.
Tags: Big Data, data center, retail
Today, Big Data and Hadoop are arguably the hottest (and most mysterious) subjects in computing for most technology workers. Ask any person in IT about Big Data/Hadoop and you’ll probably get a look of utter confusion. Here at Cisco, I’ve recently taken on the role of Product Manager for Cisco Tidal Enterprise Scheduler (TES), and part of my job is to help you face your fears and put your arms around the Big Data boogeyman.
Big Data’s growth in the market has exploded, and it’s clear why: data-driven decision-making leads to better business outcomes. With Big Data/Hadoop, analyzing massive datasets has become easier, and the new business insights gleaned from them can be a massive competitive advantage.
I just arrived in NYC for Strata Conference + Hadoop World 2012, where I’m part of the Cisco team here to show off the new 6.1 release of Cisco Tidal Enterprise Scheduler, announced yesterday. With 6.1, Cisco TES includes Hadoop integration to help our customers address the Big Data challenge and gain even more value from their infrastructure. The workload automation features provided by TES are an integral part of getting the most out of your Hadoop deployments.
At the Strata event, we’re featuring Cisco UCS servers and Cisco Nexus switches for Big Data as well as our Cisco TES support for Hadoop. To see Cisco TES and Hadoop in action, check out our online demo. The demo runs on UCS and schedules a Hadoop MapReduce job every 15 minutes to track tweets at the conference, revealing the biggest Twitter topics and the most active tweeps in Big Data this week.
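For readers curious about what such a job computes, here is a minimal local sketch in Python of the map, shuffle and reduce steps behind a "top topics and most active users" count. It is not the demo's actual code and uses neither the Hadoop API nor TES. The tweet records, field names and function names are all hypothetical; in the demo, a job like `run_job` would be the unit the scheduler launches every 15 minutes.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")


def map_tweet(tweet):
    """Map phase: emit a (key, 1) pair for the author and for each hashtag."""
    yield ("user:" + tweet["user"].lower(), 1)
    for tag in HASHTAG.findall(tweet["text"]):
        yield ("topic:" + tag.lower(), 1)


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce phase: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def run_job(tweets):
    """Run map, shuffle and reduce over one batch of tweets."""
    pairs = (pair for tweet in tweets for pair in map_tweet(tweet))
    return reduce_counts(shuffle(pairs))


def top(counts, prefix, n=3):
    """Return the n highest-count keys under a prefix, ties broken by name."""
    items = [(k[len(prefix):], v) for k, v in counts.items() if k.startswith(prefix)]
    return sorted(items, key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical sample batch standing in for tweets collected at the conference.
SAMPLE_TWEETS = [
    {"user": "alice", "text": "Great keynote at #strataconf on #Hadoop"},
    {"user": "bob", "text": "#hadoop workload automation demo"},
    {"user": "alice", "text": "More #BigData talks today #hadoop"},
]

if __name__ == "__main__":
    counts = run_job(SAMPLE_TWEETS)
    print("topics:", top(counts, "topic:"))
    print("users:", top(counts, "user:"))
```

On a real cluster the map and reduce functions would run in parallel across nodes. The local version only shows the shape of the computation.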
In addition to our support for Hadoop and Big Data, with TES 6.1 we’ve announced a self-service portal, support for Amazon Web Services’ (AWS) EC2 & S3 features, and an iPhone app. AWS support adds the advantages of cloud-based Hadoop by providing the scalability and agility to expand capacity as needed coupled with Hadoop’s analytical strength. Throwing TES 6.1 into the AWS mix provides automated, efficient provisioning of cloud resources.
Tags: Big Data, data center, Hadoop Summit, Tidal Enterprise Scheduler, unified management
In my last installment on Cisco Tidal Enterprise Scheduler, I talked about how our customers have increased their usage of this end-to-end solution by 2-3x.
Based on the upswing of Big Data usage and adoption in the market, this trend is likely to continue for quite some time. And Cisco finds itself with the right solution at the right time.
I’m live here from Strata Conference + Hadoop World 2012 in New York City, where “Strata Conference explores the changes brought to technology and business by big data, data science, and pervasive computing. This year, Strata has joined forces with Hadoop World to create the largest gathering of the Apache Hadoop community in the world.” It’s THE place to be for the Big Data and Hadoop geek out.
At this sold-out event Cisco is introducing our new 6.1 release of Cisco Tidal Enterprise Scheduler. This release is packed with new features such as a very cool iPhone app, integration with Amazon EC2 and S3, and a self-service portal (stay tuned for more blogs on this later). It also includes a new Hadoop adapter with API integration into Sqoop, Hive, HDFS Data Mover and MapReduce.
What’s that you say? Enterprise workload automation for Hadoop clusters? Why would I need that?
Tags: Big Data, intelligent automation, workload automation