Towards an industry standard for benchmarking AI
After decades of struggle and disappointing results, Artificial Intelligence (AI) is finally coming into its own. Recent advances in computational power, mathematical refinements that enable much deeper neural networks, and dramatic improvements in the techniques used to train machine learning systems have combined to produce applications with real practical value. IBM’s Watson defeating the Jeopardy! champions and Google DeepMind’s AlphaGo defeating the world’s top-ranked Go player are two recent high-profile examples.
But what is Artificial Intelligence? In the broadest terms, AI is the attempt to create human-level intelligence in machines. This is not something we’ve achieved, and some argue we never will (though I wouldn’t bet against human innovation). Nevertheless, AI research has spawned many subfields, such as computer vision, robotics and natural language processing. Many of these fields make use of Machine Learning.
Machine Learning is easier to define. A machine learning system is one whose output or performance improves as it is given more data to process. Machine learning systems are not programmed with task-specific logic the way traditional computer systems are. Instead, they identify relationships and patterns in the data, build a model of the problem, and use that model to make predictions on new data. Contrast this with data mining, which shares some techniques with machine learning but differs in significant ways: data mining applies pre-programmed techniques to identify patterns in the data as part of a human-directed effort to find meaningful insights. A machine learning system requires neither that task-specific programming nor that human direction to produce its output.
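To make the "improves with more data" idea concrete, here is a minimal sketch in Python. It assumes a toy relationship (y = x squared) and uses a 1-nearest-neighbour predictor, which is only one of many possible learners and was chosen purely for brevity. The function names, training-set sizes and test points are all hypothetical and are not taken from any benchmark.

```python
def target(x):
    """Hypothetical 'true' relationship; the learner only sees sampled examples of it."""
    return x * x


def make_training_set(n):
    """Return n evenly spaced labelled examples (x, y) on the interval [0, 1]."""
    return [(i / (n - 1), target(i / (n - 1))) for i in range(n)]


def predict(training_set, x):
    """1-nearest-neighbour prediction: reuse the label of the closest stored example."""
    _, nearest_y = min(training_set, key=lambda pair: abs(pair[0] - x))
    return nearest_y


def mean_abs_error(training_set, test_points):
    """Average absolute gap between predictions and the true values."""
    errors = [abs(predict(training_set, x) - target(x)) for x in test_points]
    return sum(errors) / len(errors)


# Fixed, illustrative inputs used to score every training-set size.
TEST_POINTS = [0.05 + 0.1 * i for i in range(10)]


def learning_curve(sizes=(3, 9, 33)):
    """Score the same predictor as it is given progressively more examples."""
    return [(n, mean_abs_error(make_training_set(n), TEST_POINTS)) for n in sizes]


if __name__ == "__main__":
    for n, err in learning_curve():
        print(f"{n:>3} examples -> mean absolute error {err:.4f}")
```

Nothing in `predict` encodes the squaring rule. The prediction error shrinks only because more examples are supplied, which is the property the definition above describes.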
The United States Congress recognizes the need to better understand Artificial Intelligence and its impact on society. Senators from five states have drafted a bill to establish a committee on Artificial Intelligence to advise the government on how to implement, regulate and promote the development of Artificial Intelligence.
Artificial Intelligence is currently seeing everyday use in applications as diverse as natural language processing (speech recognition, sentiment analysis and language translation), computer vision and image recognition, autonomous vehicles, and recommendation engines. It is a rapidly growing area that is being evaluated for a broad array of use cases across consumer, enterprise, and government markets.
Alongside this transformation comes the need for industry standards for benchmarking hardware and software systems and how they handle different workloads. Such standards allow systems to be compared and, more importantly, drive innovation, fueling an iterative process that yields higher-performing systems at lower cost and with more efficient energy usage. The unique qualities of AI introduce new challenges, in particular how to characterize performance and total cost of ownership (TCO). It is therefore critical for organizations like the Transaction Processing Performance Council (TPC) to develop standards that vendors, customers and researchers can use.
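As a rough illustration of how performance, cost and energy might be combined when comparing systems, the sketch below computes a price-performance ratio and a performance-per-watt ratio. It is not a metric defined by the TPC-AI Working Group: the class name, field names, units and all figures are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class SystemResult:
    """One hypothetical benchmark result; every field is illustrative."""
    name: str
    throughput: float       # work units completed per second (assumed unit)
    tco_dollars: float      # assumed total cost of ownership over the pricing period
    avg_power_watts: float  # assumed average power draw during the run

    def price_performance(self):
        """Dollars of TCO per unit of throughput; lower is better."""
        return self.tco_dollars / self.throughput

    def performance_per_watt(self):
        """Throughput per watt; higher is better."""
        return self.throughput / self.avg_power_watts


def rank_by_price_performance(results):
    """Order systems from best (lowest) to worst price-performance."""
    return sorted(results, key=lambda r: r.price_performance())


if __name__ == "__main__":
    # Made-up figures for two imaginary systems.
    systems = [
        SystemResult("System A", throughput=1000.0, tco_dollars=500_000.0, avg_power_watts=2000.0),
        SystemResult("System B", throughput=1500.0, tco_dollars=600_000.0, avg_power_watts=4000.0),
    ]
    for s in rank_by_price_performance(systems):
        print(s.name, s.price_performance(), s.performance_per_watt())
```

With these invented numbers the two ratios rank the systems differently, which hints at why characterizing AI systems with a single figure is difficult.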
To this end, the TPC has announced the formation of a new Working Group (TPC-AI), and I am honored to have been elected chairman. The TPC-AI Working Group is tasked with developing industry standard benchmarks for both the hardware and the software platforms that run Artificial Intelligence workloads. We will be working to define a level playing field for vendors, identify the areas with the greatest potential for improvement through performance optimization, and understand the key factors that customers weigh when making their purchase decisions.
I encourage organizations that are interested in participating in the benchmark development process to join the TPC.
Raghu Nambiar (Chairman, TPC AI)