Towards an Industry Standard for Benchmarking Big Data Workloads
Industry standard benchmarks have played, and continue to play, a crucial role in the advancement of the computing industry. Demands for them have existed since buyers were first confronted with the choice between purchasing one system over another. Over the years, industry standard benchmarks have proven critical to both buyers and vendors: buyers use benchmark results when evaluating new systems in terms of performance, price/performance and energy efficiency, while vendors use benchmarks to demonstrate competitiveness of their products and to monitor release-to-release progress of their products under development . Historically we have seen that industry standard benchmarks enable healthy competition that results in product improvements and the evolution of brand new technologies.
Over the past quarter-century, industry standard bodies like the Transaction Processing Performance Council (TPC) and the Standard Performance Evaluation Corporation (SPEC) have developed several industry standards for performance benchmarking, which have been a significant driving force behind the development of faster, less expensive, and/or more energy efficient system configurations.
The world has been in the midst of an extraordinary information explosion over the past decade, punctuated by rapid growth in the use of the Internet and the number of connected devices worldwide. Today, we’re seeing a rate of change faster than at any point throughout history, and both enterprise application data and machine generated data, known as Big Data, continue to grow exponentially, challenging industry experts and researchers to develop new innovative techniques to evaluate and benchmark hardware and software technologies and products.
I am co-chairing a workshop with my distinguished colleagues Chaitanya Baru, Meikel Poess, Milind Bhandarkar, Tilmann Rabl and others entitled Workshop on Big Data Benchmarking (WBDB 2012) , supported by the National Science Foundation (NSF.gov). This is a crucial initial step towards the development of an industry standard benchmark for providing objective measures of the effectiveness of hardware and software systems dealing with Big Data. Several industry experts and researchers have been invited to present and debate their vision on benchmarking big data platforms.
A report from this workshop will be presented at the just-announced 4th International Conference on Performance Evaluation Benchmarking (TPCTC 2012) , organized by the TPC, which will be collocated with the 38th International Conference of Very Large Data Bases (VLDB 2012), a premier forum for data management and database researchers, vendors and users. With this conference, we encourage industry experts and researchers to submit ideas and methodologies in performance evaluation, measurement and characterization in areas including, but not limited to: big data, cloud computing, business intelligence, energy and space efficiency, hardware and software innovations and lessons learned in practice using TPC and other benchmark workloads .
Cisco has been an active member of the TPC since 2010 and the SPEC since 2009.
 R. Nambiar, N. Wakou, P. Thawley, A. Masland, M. Lanken, M. Majdalany, F. Carman: Shaping the Landscape of Industry Standard Benchmarks: Contributions of the Transaction Processing Performance Council: Springer 2011
 Workshop on Big Data Benchmarking: http://clds.ucsd.edu/wbdb2012/
 TPC Press Release: http://finance.yahoo.com/news/transaction-processing-performance-council-announces-150000511.html
 TPCTC 2012 Call for Papers: http://www.tpc.org/tpctc2012/