Everybody has been talking about big data over the past few years. Your data continues to grow, in both volume and importance, and you know your company needs better analytics to turn that influx of data into business improvement. As the Internet expands and connects things previously unconnected (a concept referred to as the Internet of Everything, or IoE), consumers gain access to more personalized information that keeps them engaged and delivers efficient services. This means data is pouring in from—well, everywhere. To sort it and put it to work for better user experiences, you first need to ensure your data center can gather and house all this data. And that starts at the foundation.
Raghunath Nambiar, distinguished engineer and Chief Architect of Big Data Solutions at Cisco, talks about “A Unified Platform for Big Data” in our latest edition of Unleashing IT. Recently elected by the Transaction Processing Performance Council (TPC) to lead the development of the industry’s first big data benchmark standard, Nambiar states: “To get the most out of big data, companies need an infrastructure that is tuned for big data workloads, with better performance and scalability than traditional environments.” Read more here.
In fact, the Intel® Xeon® processor-based Cisco® Unified Computing System™ (Cisco UCS®) Common Platform Architecture (CPA) for Big Data is a robust platform built on a unified fabric and based on Cisco Nexus® switches for exceptional availability and scalability. Built specifically with big data in mind, this certified and validated architecture has been adopted by businesses in a variety of industries.
Tags: Big Data, Cisco UCS, Common Platform Architecture, CPA, Intel, Raghunath Nambiar, Unleashing IT
While there is not yet an industry-standard benchmark for measuring the performance of Hadoop systems (though work is in progress: WBDB, BigDataTop100, etc.), workloads like TeraSort have become a popular choice for benchmarking and stress-testing Hadoop clusters.
TeraSort is very simple, consisting of three MapReduce programs: (i) TeraGen, which generates the dataset; (ii) TeraSort, which samples and sorts the dataset; and (iii) TeraValidate, which validates that the output is sorted. With multiple vendors now publishing TeraSort results, organizations can make reasonable performance comparisons when evaluating Hadoop clusters.
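The essence of those three phases—generate, sample-and-range-partition-then-sort, validate—can be sketched in miniature. The following is plain Python rather than Hadoop MapReduce, and the function names and sizes are illustrative only, but it shows why concatenating the sorted partitions yields a globally sorted result:

```python
import random

def teragen(n, seed=42):
    """Generate n pseudo-random integer keys (TeraGen analogue)."""
    rng = random.Random(seed)
    return [rng.randrange(10**9) for _ in range(n)]

def terasort(records, num_partitions=4, sample_size=100):
    """Sample keys to choose partition boundaries, range-partition the
    records, then sort each partition locally (TeraSort analogue)."""
    sample = sorted(random.Random(0).sample(records, min(sample_size, len(records))))
    # Boundaries are evenly spaced quantiles of the sample, so partitions
    # receive roughly equal numbers of records.
    bounds = [sample[(i + 1) * len(sample) // num_partitions - 1]
              for i in range(num_partitions - 1)]
    parts = [[] for _ in range(num_partitions)]
    for r in records:  # "map" step: route each record to its key range
        parts[sum(r > b for b in bounds)].append(r)
    # "reduce" step: sort each range; ranges are disjoint and ordered,
    # so the concatenation is globally sorted.
    return [x for p in parts for x in sorted(p)]

def teravalidate(output):
    """Check that the output is globally sorted (TeraValidate analogue)."""
    return all(a <= b for a, b in zip(output, output[1:]))

data = teragen(10_000)
out = terasort(data)
assert teravalidate(out) and out == sorted(data)
```

On a real cluster, each partition corresponds to one reducer, which is why TeraSort stresses the network (the shuffle) as much as the disks.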
We conducted a series of TeraSort tests on our popular Cisco UCS Common Platform Architecture (CPA) for Big Data rack: 16 Cisco UCS C240 M3 Rack Servers, each equipped with two Intel Xeon E5-2665 processors, running the Apache Hadoop distribution (see figure below). The results demonstrate industry-leading performance and scalability across dataset sizes from 100GB to 50TB. For example, out of the box, our 10TB result is 40 percent faster than HP’s published result on 18 HP ProLiant DL380 servers equipped with two Intel Xeon E5-2667 processors.
While Hadoop offers many advantages for organizations, the Cisco story isn’t complete without the collaborations with our ecosystem partners that enable us to offer complete solution stacks. We support leading Hadoop distributions, including Cloudera, Hortonworks, Intel, MapR, and Pivotal, on our Cisco UCS Common Platform Architecture (CPA) for Big Data. We also just announced our Big Data Design Zone, which offers Cisco Validated Designs (CVDs): pretested and validated architectures that accelerate time to value for customers while reducing risk and deployment challenges.
Cisco Big Data Design Zone
Cisco UCS Demonstrates Leading TeraSort Benchmark Performance
Cisco UCS Common Platform Architecture (CPA) for Big Data
Tags: Big Data, Big Data Benchmarks, Cisco UCS C240 M3 Rack Server, Cisco UCS CPA, CPA, Hadoop, TeraSort, YCSB
Cisco UCS Common Platform Architecture (CPA) for Big Data offers a comprehensive stack for enterprise Hadoop deployments. Today we announce the availability of a Cisco Validated Design (CVD) for Cloudera (CDH) that describes the architecture and deployment procedures, jointly tested and certified by Cisco and Cloudera, to accelerate deployments while reducing risk, complexity, and total cost of ownership.
Together, Cisco and Cloudera are well positioned to help organizations exploit the valuable business insights found in all their data, whether it is structured, semi-structured, or unstructured. The solution offers industry-leading performance, scalability, and advanced management capabilities to address the business needs of our customers.
The rack-level configuration detailed in the document can be extended to multiple racks. Up to 160 servers (10 racks) can be supported with no additional switching in a single UCS domain. Scaling beyond 10 racks can be achieved by interconnecting multiple UCS domains using Nexus 6000/7000 Series switches, scaling to thousands of servers and hundreds of petabytes of storage, all managed from a single pane of glass using UCS Central.
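The scaling arithmetic above is easy to sketch. This is an illustrative sizing helper only, using the two limits quoted in this post (16 servers per rack, 10 racks per UCS domain); it is not a Cisco tool:

```python
import math

SERVERS_PER_RACK = 16   # rack configuration from the CVD
RACKS_PER_DOMAIN = 10   # maximum per UCS domain with no additional switching

def ucs_layout(servers):
    """Racks and UCS domains needed for a given server count.

    Domains beyond the first imply interconnecting with
    Nexus 6000/7000 Series switches, managed via UCS Central.
    """
    racks = math.ceil(servers / SERVERS_PER_RACK)
    domains = math.ceil(racks / RACKS_PER_DOMAIN)
    return racks, domains

# 160 servers fit exactly in one domain (10 racks);
# the 161st server pushes the design into a second domain.
```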
We would like to invite you to our upcoming Journey to Big Data Roadshow in a city near you. It is designed to help you identify where you are on your big data journey, and how to keep that journey going in a low-risk, productive way.
1. Cisco UCS CPA for Big Data with Cloudera
2. FlexPod Select for Hadoop with Cloudera
3. Cloudera Enterprise with Cisco Unified Computing System (solution brief)
Tags: Cisco UCS CPA, Cloudera, CPA, Hadoop, Journey to Big Data
Speed is everything. Continuing our commitment to make data center infrastructures more responsive to enterprise application demands, today we announced FlexPod Select with Hadoop, formerly known as NetApp Open Solution for Hadoop, broadening our FlexPod portfolio. Developed in collaboration between Cisco and NetApp, the solution offers an enterprise-class infrastructure that accelerates time to value from your data. It is pre-validated for Hadoop deployments and built using Cisco UCS 6200 Series Fabric Interconnects (connectivity and management), C220 M3 Rack Servers (compute), NetApp FAS2220 storage (NameNode metadata), and NetApp E5400 series storage arrays (data storage). Following the highly successful FlexPod model of pre-sized, rack-level configurations, this solution will be made available through the well-established FlexPod sales engagement and channel.
The FlexPod Select with Hadoop architecture is an extension of our popular Cisco UCS Common Platform Architecture (CPA) for Big Data, designed for applications that require enterprise-class external storage array features such as RAID protection with data replication, hot-swappable spares, proactive drive health monitoring, faster recovery from disk failures, and automated I/O path failover. The architecture consists of a master rack and, optionally, up to nine expansion racks in a single management domain, creating a complete, self-contained Hadoop cluster. The master rack provides all the components required to run a 12-node Hadoop cluster with 540TB of storage capacity. Each expansion rack adds 16 Hadoop cluster nodes and 720TB of storage capacity. Unique to this architecture are seamless management and data integration capabilities with existing FlexPod deployments, which can significantly lower infrastructure and management costs.
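For capacity planning, the node and storage totals scale linearly with expansion racks. A minimal sketch using only the figures quoted above (the helper name is ours, not part of the product):

```python
MASTER_NODES, MASTER_TB = 12, 540        # master rack: 12 nodes, 540TB
EXPANSION_NODES, EXPANSION_TB = 16, 720  # each expansion rack: 16 nodes, 720TB
MAX_EXPANSIONS = 9                       # up to nine expansion racks per domain

def flexpod_cluster(expansion_racks):
    """Total Hadoop nodes and raw capacity (TB) for a master rack
    plus the given number of expansion racks."""
    if not 0 <= expansion_racks <= MAX_EXPANSIONS:
        raise ValueError("0 to 9 expansion racks supported in one domain")
    nodes = MASTER_NODES + expansion_racks * EXPANSION_NODES
    tb = MASTER_TB + expansion_racks * EXPANSION_TB
    return nodes, tb

# Note: both rack types work out to 45TB per node (540/12 and 720/16),
# so capacity per node is uniform across the cluster.
```

A fully built-out domain (master plus nine expansion racks) therefore reaches 156 nodes and roughly 7PB of raw storage.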
FlexPod Select has been pretested and jointly validated with leading Hadoop vendors, including Cloudera and Hortonworks.
Tags: Big Data, Cloudera, CPA, FlexPod, FlexPod Select, Hadoop, Hortonworks, netapp
Cisco and NetApp have been partners for over a decade, and in January we announced the planned expansion of our partnership. We are always looking to work with our partners in new ways to offer customers greater choice, and Cisco and NetApp are working toward delivering a complete platform for enterprises in data-intensive industries with business-critical SLAs. The solution will offer pre-sized storage, networking, and compute in a highly reliable, ready-to-deploy Hadoop stack, and it is planned to be generally available in summer 2013. But who can wait until summer?! We know we can’t, so we’re offering a demo of the joint reference architecture at Cisco Live! Melbourne, March 5-8, and we hope you’ll stop by to check it out!
To give you more information on the solution: it will be pre-validated for enterprise Hadoop deployments built using Cisco UCS 6296 Fabric Interconnects (connectivity and management), a pair of Nexus 2232 fabric extenders, C220 M3 Rack Servers (compute), and NetApp E5400 and FAS2240 series storage arrays. Following the highly successful FlexPod model of pre-sized, rack-level configurations, this solution will be made available through the well-established FlexPod sales engagement and channel. Field sales and partners from both companies will resell the solution upon general availability.
Tags: Big Data, Cisco UCS CPA, CPA, Hadoop, netapp