Cisco Blogs



Introducing TPCx-HS – first Industry Standard for Benchmarking Big Data Systems

Over the past quarter century, industry standard benchmarks have had a significant impact on the computing industry. Vendors use benchmark standards to illustrate performance competitiveness for their existing products, and to improve and monitor the performance of their products under development. Many buyers use the results as points of comparison when purchasing new computing systems.

Continuing the Transaction Processing Performance Council’s commitment to bring relevant benchmarks to industry, it is my great pleasure to announce TPCx-HS, the first standard that provides verifiable performance, price/performance and energy consumption metrics for big data systems. TPCx-HS can be used to assess a broad range of system topologies and implementation methodologies for Hadoop in a technically rigorous, directly comparable and vendor-neutral manner. And while the modeling is based on a simple application, the results are highly relevant to Big Data hardware and software systems.

Developing an industry standard benchmark for a new environment like Big Data has taken the dedicated efforts of experts across many companies. I would like to acknowledge the contributions of Andrew Bond (Red Hat), Andrew Masland (NEC), Avik Dey (Intel), Brian Caufield (IBM), Chaitanya Baru (SDSC), Da Qi Ren (Huawei), Dileep Kumar (Cloudera), Jamie Reding (Microsoft), John Fowler (Oracle), John Poelman (IBM), Karthik Kulkarni (Cisco), Meikel Poess (Oracle), Mike Brey (Oracle), Mike Crocker (SAP), Paul Cao (HP), Reza Taheri (VMware), Simon Harris (IBM), Tariq Magdon-Ismail (VMware), Wayne Smith (Intel), Yanpei Chen (Cloudera), Michael Majdalany (L&M), Forrest Carman (Owen Media) and Andreas Hotea (Hotea Solutions).

I envision that TPCx-HS will be a useful benchmark standard for buyers as they evaluate new systems for Hadoop deployments in terms of performance, price/performance and energy efficiency, and for vendors as they demonstrate the competitiveness of their products.

Sincerely,
Raghunath Nambiar
(Chair, TPC Big Data Committee)

Additional information
TPCx-HS Portal
TPC Press Release
Slides
TPC takes the measure of big data systems by Joab Jackson
Can it be true? A BIG DATA benchmark? Yes, says TPC by Chris Mellor

TPCTC2014/VLDB2014 paper presentation: “Introducing TPCx-HS: Industry’s First Standard for Benchmarking Big Data Systems”, Raghunath Nambiar (Cisco), Tariq Magdon-Ismail (VMware), Akon Dey (University of Sydney), Paul Cao (HP), Andrew Bond (Red Hat), Da Qi Ren (Huawei), Meikel Poess (Oracle). Hangzhou, China, 9/4/2014


TeraSort Results on Cisco UCS and Announcing Cisco Big Data Design Zone

While there is not yet an industry standard benchmark for measuring the performance of Hadoop systems (yes, there is work in progress -- WBDB, BigDataTop100, etc.), workloads like TeraSort have become a popular choice to benchmark and stress test Hadoop clusters.

TeraSort is very simple and consists of three map/reduce programs: (i) TeraGen, which generates the dataset; (ii) TeraSort, which samples and sorts the dataset; and (iii) TeraValidate, which validates the output. With multiple vendors now publishing TeraSort results, organizations can make reasonable performance comparisons while evaluating Hadoop clusters.
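
To make the three stages concrete, here is a minimal single-process sketch in Python. It is not the Hadoop implementation and says nothing about cluster performance; it only mirrors the generate, sample-and-sort, and validate flow described above. The function names, the partition count and the sample size are assumptions chosen for illustration, and the validation step is deliberately simplified to a record count and an ordering check.

```python
import bisect
import random

KEY_BYTES = 10    # TeraGen records carry a 10-byte key ...
VALUE_BYTES = 90  # ... plus a 90-byte payload, 100 bytes per record


def _random_bytes(rng, n):
    """Return n pseudo-random bytes from the given generator."""
    return bytes(rng.getrandbits(8) for _ in range(n))


def teragen(num_records, seed=0):
    """Stage (i): generate a dataset of random (key, value) records."""
    rng = random.Random(seed)
    return [(_random_bytes(rng, KEY_BYTES), _random_bytes(rng, VALUE_BYTES))
            for _ in range(num_records)]


def terasort(records, num_partitions=4, sample_size=100, seed=1):
    """Stage (ii): sample keys, range-partition the records, sort each partition."""
    if not records:
        return []
    rng = random.Random(seed)
    sampled = rng.sample(records, min(sample_size, len(records)))
    sample_keys = sorted(key for key, _ in sampled)
    # Choose num_partitions - 1 split points from the sorted sample.
    splits = [sample_keys[i * len(sample_keys) // num_partitions]
              for i in range(1, num_partitions)]
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        partitions[bisect.bisect_right(splits, record[0])].append(record)
    # Partitions cover disjoint, increasing key ranges, so sorting each one
    # independently and concatenating them gives a globally sorted result.
    return [record for part in partitions for record in sorted(part)]


def teravalidate(records, expected_count):
    """Stage (iii), simplified: no records lost and keys in non-decreasing order."""
    if len(records) != expected_count:
        return False
    return all(a[0] <= b[0] for a, b in zip(records, records[1:]))


if __name__ == "__main__":
    data = teragen(1000)
    output = terasort(data)
    print("valid:", teravalidate(output, len(data)))
```

The sampling step is what lets the sort scale out: because the split points are fixed up front, each partition can be sorted by a separate worker with no merge step at the end.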

We conducted a series of TeraSort tests on our popular Cisco UCS Common Platform Architecture (CPA) for Big Data rack with 16 Cisco UCS C240 M3 Rack Servers, each equipped with two Intel Xeon E5-2665 processors and running the Apache Hadoop distribution. The results (see figure below) demonstrate industry-leading performance and scalability over a range of data set sizes from 100GB to 50TB. For example, out of the box, our 10TB result is 40 percent faster than HP’s published result on 18 HP ProLiant DL380 Servers equipped with two Intel Xeon E5-2667 processors.

[Figure TS1: TeraSort results on Cisco UCS CPA for Big Data]

While Hadoop offers many advantages for organizations, the Cisco story isn’t complete without the collaborations with our ecosystem partners that enable us to offer complete solution stacks. We support leading Hadoop distributions including Cloudera, Hortonworks, Intel, MapR, and Pivotal on our Cisco UCS Common Platform Architecture (CPA) for Big Data. We just announced our Big Data Design Zone, which offers Cisco Validated Designs (CVD) -- pretested and validated architectures that accelerate the time to value for customers while reducing risks and deployment challenges.

Additional Information:
Cisco Big Data Design Zone
Cisco UCS Demonstrates Leading TeraSort Benchmark Performance
Cisco UCS Common Platform Architecture (CPA) for Big Data


Announcing 5th International Conference on Performance Evaluation and Benchmarking

The Transaction Processing Performance Council today announced its fifth International Conference on Performance Evaluation and Benchmarking (TPCTC 2013). I have had the great privilege of chairing the TPCTC series since 2009. This year’s conference will be collocated with the 39th International Conference on Very Large Data Bases (VLDB 2013) on August 26, 2013 in Riva del Garda, Italy. With this conference we are encouraging researchers and industry experts to submit ideas and methodologies in performance evaluation, measurement and characterization. Additional information on TPCTC 2013 is available online at http://www.tpc.org/tpctc/tpctc2013/.


One More Step Closer “Towards an Industry Standard for Benchmarking Big Data Workloads”

Following the successful workshop “Towards an Industry Standard for Benchmarking Big Data Workloads” (WBDB 2012) held in May 2012 in San Jose [2], the Second Workshop on Benchmarking Big Data Workloads (WBDB2012.in) [1] will be held in Pune, India from 17 to 18 December 2012 at the Hinjewadi Campus of Persistent Systems Ltd, collocated with the 18th International Conference on Management of Data (COMAD 2012) [3].

I have the great pleasure of co-chairing this workshop with my distinguished colleagues Chaitanya Baru, Meikel Poess, Milind Bhandarkar and Tilmann Rabl, with support from the National Science Foundation (NSF.gov).

The objective of the workshop series is to foster the development of industry standards for providing objective measures of the effectiveness of hardware and software systems dealing with Big Data. Several industry experts and researchers are expected to present and debate their vision on benchmarking big data platforms.

[1] WBDB 2012.in http://clds.ucsd.edu/wbdb2012.in, CFP: http://clds.ucsd.edu/sites/clds.ucsd.edu/files/WBDB.in_.cfp_.pdf
[2] WBDB 2012 http://blogs.cisco.com/datacenter/towards-an-industry-standard-for-benchmarking-big-data-workloads/, http://clds.ucsd.edu/wbdb2012/
[3] COMAD 2012 http://comad.in/comad2012
[4] WBDB 2012.in Program Committee http://clds.ucsd.edu/wbdb2012.in/organizers


Towards an Industry Standard for Benchmarking Big Data Workloads

Industry standard benchmarks have played, and continue to play, a crucial role in the advancement of the computing industry. Demand for them has existed since buyers were first confronted with the choice between purchasing one system over another. Over the years, industry standard benchmarks have proven critical to both buyers and vendors: buyers use benchmark results when evaluating new systems in terms of performance, price/performance and energy efficiency, while vendors use benchmarks to demonstrate the competitiveness of their products and to monitor release-to-release progress of their products under development [1]. Historically we have seen that industry standard benchmarks enable healthy competition that results in product improvements and the evolution of brand new technologies.

Over the past quarter-century, industry standard bodies like the Transaction Processing Performance Council (TPC) and the Standard Performance Evaluation Corporation (SPEC) have developed several industry standards for performance benchmarking, which have been a significant driving force behind the development of faster, less expensive, and/or more energy efficient system configurations.

The world has been in the midst of an extraordinary information explosion over the past decade, punctuated by rapid growth in the use of the Internet and the number of connected devices worldwide. Today, we’re seeing a rate of change faster than at any point in history. Both enterprise application data and machine-generated data, known as Big Data, continue to grow exponentially, challenging industry experts and researchers to develop innovative techniques to evaluate and benchmark hardware and software technologies and products.

I am co-chairing a workshop with my distinguished colleagues Chaitanya Baru, Meikel Poess, Milind Bhandarkar, Tilmann Rabl and others entitled Workshop on Big Data Benchmarking (WBDB 2012) [2], supported by the National Science Foundation (NSF.gov). This is a crucial initial step towards the development of an industry standard benchmark for providing objective measures of the effectiveness of hardware and software systems dealing with Big Data. Several industry experts and researchers have been invited to present and debate their vision on benchmarking big data platforms.

A report from this workshop will be presented at the just-announced 4th International Conference on Performance Evaluation and Benchmarking (TPCTC 2012) [3], organized by the TPC, which will be collocated with the 38th International Conference on Very Large Data Bases (VLDB 2012), a premier forum for data management and database researchers, vendors and users. With this conference, we encourage industry experts and researchers to submit ideas and methodologies in performance evaluation, measurement and characterization in areas including, but not limited to: big data, cloud computing, business intelligence, energy and space efficiency, hardware and software innovations, and lessons learned in practice using TPC and other benchmark workloads [4].

Cisco has been an active member of the TPC since 2010 and of SPEC since 2009.

[1] R. Nambiar, N. Wakou, P. Thawley, A. Masland, M. Lanken, M. Majdalany, F. Carman: Shaping the Landscape of Industry Standard Benchmarks: Contributions of the Transaction Processing Performance Council: Springer 2011
[2] Workshop on Big Data Benchmarking: http://clds.ucsd.edu/wbdb2012/
[3] TPC Press Release: http://finance.yahoo.com/news/transaction-processing-performance-council-announces-150000511.html
[4] TPCTC 2012 Call for Papers: http://www.tpc.org/tpctc2012/
