As Hadoop becomes the center of data management and OpenStack the platform for private clouds, many organizations have begun to ask two questions: When should Hadoop be virtualized? What is the role of OpenStack in Hadoop?
When to virtualize Hadoop? While most Hadoop deployments in the coming years are expected to be on bare-metal environments, there are two main reasons to virtualize Hadoop: (i) offering Hadoop as a Service (HaaS) to internal (or external) customers by consolidating multiple Hadoop clusters on the same physical cluster, which improves infrastructure utilization and provides access controls and security isolation between tenants; and (ii) running both production (with a stable version of the software stack) and test (experimenting with beta or the latest versions of software stacks) environments at the same scale on a single underlying infrastructure platform. Because workloads that work well with smaller datasets on smaller clusters can often fail as you scale up to larger clusters (for various reasons), customers may find that a collocated approach (using virtualization to logically separate production and test environments) achieves more predictable results.
What is the role of OpenStack in Hadoop? OpenStack provides the operating system for clouds of all types, whether public, private, or hybrid. It enables self-service provisioning, elastic scaling, and support for multi-tenancy, all of which are critical for enabling Hadoop as a Service (HaaS).
While Hadoop and OpenStack are attractive from the standpoint of their respective innovations, deploying enterprise-class solutions with such new technologies can be very challenging.
Today, we are announcing the availability of a Cisco Validated Design (CVD) for HaaS with Cisco UCS Common Platform Architecture (CPA v2) for Big Data. The solution uses Hortonworks Data Platform and Canonical OpenStack Platform on Cisco UCS CPA v2 for Big Data. The objective of the CVD is to provide step-by-step instructions that help ensure fast, reliable, and predictable deployments should a customer decide that the time is right to virtualize Hadoop.