This has been an exciting week. Further expanding its Big Data portfolio, Cisco has announced a collaboration with Intel, its long-term partner, on a next-generation open platform for data management and analytics. The joint solution combines Intel® Distribution for Apache Hadoop Software with Cisco’s Common Platform Architecture (CPA) to deliver performance, capacity, and security for enterprise-class Hadoop deployments.
As described in my blog posting, the CPA is a highly scalable architecture designed to meet a variety of scale-out application demands. It includes compute, storage, connectivity and unified management, and is already being deployed in a range of industries including finance, retail, service provider, content management and government. Unique to this architecture are the seamless data integration and management integration capabilities between big data applications and enterprise applications such as Oracle Database, Microsoft SQL Server, SAP and others.
The current version of the CPA offers two options depending on use case: Performance optimized -- balances compute power with I/O bandwidth, optimized for price/performance, and Capacity optimized -- for low cost per terabyte. The Intel® Distribution is supported for both the performance optimized and capacity optimized options, and is available at single-rack and multiple-rack scale.
The Intel® Distribution is a controlled distribution based on Apache Hadoop, with feature enhancements, performance optimizations, and security options that give the solution its enterprise quality. The combination of the Intel® Distribution and Cisco UCS joins the power of big data with a dependable deployment model that can be implemented rapidly and scaled to meet the performance and capacity needs of demanding workloads. Enterprise-class services from Cisco and Intel can help with design, deployment, and testing, and organizations can continue to rely on these services through controlled and supported releases.
Today Paul Perez, Vice President and CTO of Cisco’s Data Center Group, joined Boyd A. Davis, Intel Architecture Group Vice President and GM, Data Center Software Division, on stage in downtown San Francisco to announce a proposed extension of the alliance between Cisco and Intel into Big Data.
Over the past months, our readers have had the opportunity to appreciate Cisco’s growing investment in this market, frequently articulated by our experts Raghunath Nambiar and Jacob Rapp through blog postings and talks at industry events.
Cisco and Intel have worked together for years to deliver enterprise solutions that improve performance and enable organizations to deliver new services. As we have stated several times recently, Intel has been a critical partner and significant contributor to the phenomenal success of Cisco UCS. So it will not come as a surprise to anybody that Cisco and Intel are looking to partner again to offer you a leading Big Data solution.
In this video, Cisco’s Paul Perez and Intel’s Boyd Davis explain how Cisco will support the Intel Distribution for Apache Hadoop on UCS, and how both companies intend to collaborate to address the growing Big Data needs of our joint customers.
Our Common Platform Architecture (CPA) for Big Data has been gaining momentum as a viable platform for enterprise big data deployments. The newest addition to the portfolio is EMC’s new Pivotal HD™, which natively integrates Greenplum MPP database technology with Apache Hadoop, enabling SQL applications and traditional business intelligence tools to run directly on the Hadoop framework. Extending support for Pivotal HD on Cisco UCS, Satinder Sethi, Vice President at Cisco’s Datacenter Group, said “Hadoop is becoming a critical part of enterprise data management portfolio that must co-exist and complement enterprise applications, EMC’s Pivotal HD is an important step towards that by enabling native SQL processing for Hadoop”.
Building on our 3+ years of partnership around the Greenplum database and Hadoop distributions, the joint solution offers all the architectural benefits of the CPA, including: Unified Fabric -- fully redundant active-active fabric for server clustering, Fabric Extender technology -- highly scalable and cost-effective connectivity, Unified Management -- holistic management of infrastructure through a single pane of glass using UCS Manager, and High performance -- high-speed fabric along with Cisco UCS C240 M3 Rack Servers powered by Intel® Xeon® E5-2600 series processors. Unique to this solution are the management integration and data integration capabilities between Pivotal HD based Big Data applications running on CPA and enterprise applications running on Cisco UCS B-Series Blade Servers connected to enterprise SAN storage from EMC, or enterprise applications running on integrated solutions like Vblock.
The Cisco solution for Pivotal HD is offered as a reference architecture and as Cisco UCS SmartPlay solution bundles that can be purchased by ordering a single part number: UCS-EZ-BD-HC -- a rack-level solution optimized for low cost per terabyte, and UCS-EZ-BD-HP -- a rack-level solution that balances compute power with I/O bandwidth, optimized for price/performance.
When customers look to deploy their Hadoop solutions, one of the first questions they ask is, which distro should we run it on? For many enterprise customers, the answer has been MapR. For those of you not familiar with MapR, it offers an enterprise-grade Hadoop software solution that provides customers with a robust set of tools for running Big Data workloads. A few months ago, Cisco announced the release of Tidal Enterprise Scheduler (TES) 6.1 and, with it, integrations for Hadoop software distributions such as Cloudera and MapR, as well as adapters to support Sqoop, Data Mover (HDFS), Hive, and MapReduce jobs. All of these run through the same TES interface as customers’ other enterprise workloads.
Today, I’m pleased to announce that with the upcoming 6.1.1 release of Cisco’s Tidal Enterprise Scheduler, Cisco’s MapR integration will deepen further. Efforts to leverage Big Data for competitive advantage, along with a rise in innovative product offerings, are changing how enterprises store, manage, and analyze their most critical asset -- data. Enterprises need solutions like Hadoop to process large amounts of data, and the difficulty of managing Hadoop clusters will continue to grow. Cisco Tidal Enterprise Scheduler enables more efficient management of those environments because it is an intelligent solution for integrating Big Data jobs into an existing data center infrastructure. TES has adapters for a range of enterprise applications, including SAP, Informatica, Oracle, PeopleSoft, MSSQL, JD Edwards, and many others.
Stay tuned for additional blog posts on Cisco’s Tidal Enterprise Scheduler version 6.
A little over a month ago we had a chance to present a session with Eric Sammer of Cloudera, Designing Hadoop for the Enterprise Data Center, along with our findings, at Strata + Hadoop World 2012.
Taking a look back, we started this initiative in early 2011, as the demand for Hadoop was on the rise and we began to notice a lot of confusion among our customers about what Hadoop would mean for their Data Center Infrastructure. This led us to our first presentation at Hadoop World 2011, where we shared an extensive testing effort with the goal of characterizing what happens when you run a Hadoop Map/Reduce job. Further, we illustrated how different network and compute considerations would change these characteristics. As Hadoop deployments gained traction in the enterprise, we found a need to develop a network reference architecture for Hadoop. This led us to another round of testing, concluded earlier this year and presented at Hadoop Summit, which examined design considerations such as architectures, availability, capacity, scale and management.
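For readers less familiar with the phases such testing looks at, here is a minimal sketch in plain Python of what happens inside a Map/Reduce job, assuming a simple word-count workload. This is not Hadoop code and not the test harness from the presentations; the function names and the toy partitioner are hypothetical and only meant to show where intermediate records move between mappers and reducers (the shuffle), which is the step that tends to exercise the network.

```python
from collections import defaultdict


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def partition_for(key, num_reducers):
    """Toy deterministic partitioner (hypothetical, not Hadoop's own)."""
    return sum(key.encode("utf-8")) % num_reducers


def shuffle_phase(mapped_outputs, num_reducers):
    """Shuffle: route every intermediate pair to the reducer that owns its key.

    Also counts the records moved, a rough stand-in for shuffle traffic.
    """
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    records_moved = 0
    for pairs in mapped_outputs:
        for key, value in pairs:
            partitions[partition_for(key, num_reducers)][key].append(value)
            records_moved += 1
    return partitions, records_moved


def reduce_phase(partition):
    """Reduce: sum the values collected for each key in one partition."""
    return {key: sum(values) for key, values in partition.items()}


def run_job(splits, num_reducers=2):
    """Run map, shuffle and reduce in sequence over in-memory splits."""
    mapped = [map_phase(split) for split in splits]
    partitions, records_moved = shuffle_phase(mapped, num_reducers)
    counts = {}
    for partition in partitions:
        counts.update(reduce_phase(partition))
    return counts, records_moved
```

Calling `run_job([["big data big"], ["data center"]])` returns the word counts together with the number of intermediate records that went through the shuffle. On a real cluster those records cross the network between nodes, which is why network and compute choices change how a job behaves.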
Finally, this brings us to last month and our presentation at Strata + Hadoop World 2012. We met with Cloudera in the months leading up to the event and discussed what we could share with the Hadoop community. We reviewed all the previous rounds of testing and concluded that, by combining them with customer experiences and another round of testing that examined multi-tenant environments, we could put together a talk that really addressed the fundamental design considerations of Hadoop in the Enterprise Data Center.