Big data is rapidly getting bigger. That in itself isn’t a problem. The issue is what Gartner analyst Doug Laney describes as the three Vs of Big Data: volume, velocity, and variety.
Volume refers to the ever-growing amount of data being collected. Velocity is the speed at which the data is being produced and moved through the enterprise information systems. Variety refers to the fact that we’re gathering information from multiple data sources such as sensors, enterprise resource planning (ERP) systems, e-commerce transactions, log files, supply chain info, social media feeds, and the list goes on.
Data warehouses weren’t made to handle this fast-flowing stream of wildly dissimilar data. Using them for this purpose has led to resource drain and sluggish response times as workers attempt to perform numerous extract, transform, and load (ETL) operations to make stored data accessible and usable for the task at hand.
Constructing Your Hub
An EDH addresses this problem. It serves as a central platform that enables organizations to collect structured, unstructured, and semi-structured data from slews of sources, process it quickly, and make it available throughout the enterprise.
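To make the hub idea concrete, here is a minimal, hypothetical sketch in Python of the hub's core job: ingesting structured, semi-structured, and unstructured records, tagging each with its source and format, and exposing them in one common shape. The source names and field names are invented for illustration; a real EDH would do this at cluster scale.

```python
import csv
import io
import json
from datetime import datetime, timezone

def to_envelope(source, fmt, payload):
    """Wrap any record in a common envelope so downstream consumers
    can query structured and semi-structured data side by side."""
    return {
        "source": source,        # hypothetical source name, e.g. "erp", "weblog"
        "format": fmt,           # "structured" | "semi" | "unstructured"
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }

def ingest_json(source, line):
    # Semi-structured input: one JSON log event per line
    return to_envelope(source, "semi", json.loads(line))

def ingest_csv(source, text):
    # Structured input: e.g. rows exported from an ERP system
    rows = csv.DictReader(io.StringIO(text))
    return [to_envelope(source, "structured", dict(row)) for row in rows]

def ingest_text(source, text):
    # Unstructured input: free text such as a social media post
    return to_envelope(source, "unstructured", {"text": text})

hub = []
hub.append(ingest_json("weblog", '{"user": 7, "page": "/cart"}'))
hub.extend(ingest_csv("erp", "sku,qty\nA-100,3\nB-200,1"))
hub.append(ingest_text("social", "Loving the new release!"))
```

Because every record lands in the same envelope, a consumer can filter by source or format without knowing how each feed was originally shaped.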
Building an EDH begins with selecting the right technology in three key areas: infrastructure, a foundational system to drive EDH applications, and the data integration platform. Obviously, you want to choose solutions that fit your needs today and allow for future growth. You’ll also want to ensure they are tested and validated to work well together and with your existing technology ecosystem. In this post, we’ll focus on selecting the right hardware.
The Infrastructure Component
Big data deployments must be able to handle continued growth, from both a data and user load perspective. Therefore, the underlying hardware must be architected to run efficiently as a scalable cluster. Important features such as the integration of compute and network, unified management, and fast provisioning all contribute to an elastic, cloud-like infrastructure that’s required for big data workloads. No longer is it satisfactory to stand up independent new applications that result in new silos. Instead, you should plan for a common and consistent architecture to meet all of your workload requirements.
Big data workloads represent a relatively new model for most data centers, but that doesn’t mean best practices must change. Handling a big data workload should be viewed from the same lens as deployments of traditional enterprise applications. As always, you want to standardize on reference architectures, optimize your spending, provision new servers quickly and consistently, and meet the performance requirements of your end users.
Cisco Unified Computing System to Run Your EDH
The Cisco Unified Computing System™ (Cisco UCS®) Integrated Infrastructure for Big Data delivers a highly scalable platform that is proven for enterprise applications like Oracle, SAP, and Microsoft. It also brings the same required enterprise-class capabilities (performance, advanced monitoring, simplified management, and QoS guarantees) to big data workloads. With lower switch and cabling infrastructure costs, lower power consumption, and lower cooling requirements, you can realize a 30 percent reduction in total cost of ownership. In addition, its service profiles give you fast and consistent time-to-value: provisioning templates let you instantly set up a new cluster or add many new nodes to an existing one.
And when deploying an EDH, the MapR Distribution including Apache™ Hadoop® is especially well suited to take advantage of the compute and I/O bandwidth of Cisco UCS. Cisco and MapR have worked together for the past two years and have developed Cisco Validated Design guides to provide customers the most value for their IT expenditures.
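Hadoop's MapReduce model is what lets a distribution like MapR soak up a cluster's compute and I/O bandwidth: mappers run in parallel across input splits, and reducers aggregate the grouped results. Below is a minimal, illustrative word count in that style, run in-process here rather than on a cluster; the function names and sample lines are invented for this sketch.

```python
from collections import Counter
from itertools import chain

def map_phase(line):
    """Mapper: emit a (word, 1) pair for each word in one input line.
    On a real cluster, this runs in parallel across input splits."""
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs):
    """Reducer: sum the counts per word.
    Hadoop would deliver these pairs already grouped by key."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

lines = [
    "big data is rapidly getting bigger",
    "big data workloads scale out",
]
counts = reduce_phase(chain.from_iterable(map_phase(l) for l in lines))
# counts["big"] == 2 and counts["data"] == 2
```

The same two-phase shape applies whether the input is two strings or terabytes spread across a cluster, which is why node count and I/O bandwidth translate so directly into throughput.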
Cisco UCS for Big Data comes in optimized power/performance-based configurations, all of which are tested with the leading big data software distributions. You can customize these configurations further, or use the system as is. Utilizing one of Cisco UCS for Big Data’s pre-configured options goes a long way to ensuring a stress-free deployment. All Cisco UCS solutions also provide a single point of control for managing all computing, networking, and storage resources, for any fine tuning you may do before deployment or as your hub evolves in the future.
I encourage you to check out the latest Gartner video to hear Satinder Sethi, our VP of Data Center Solutions Engineering and UCS Product Management, share his perspective on how powering your infrastructure is an important component of building an enterprise data hub.
It has been just a year since Cisco announced its acquisition of Insieme Networks and the Application Centric Infrastructure (ACI) approach to data center SDN deployments. The promise was a simpler, highly automated, and more secure data center built on a policy-driven approach.
Now, with more than double the number of APIC customers, over 900 Nexus 9000 customers, and nearly 35 ecosystem partners, this promise is fast becoming a reality. Nothing is more convincing than customers willing to speak about their deployments and the tangible benefits they are beginning to derive, or ecosystem partners willing to integrate into an open architecture and engage in joint go-to-market activities.
Today we’d like to invite you to a set of special webcasts where you can hear different organizations’ perspectives on how Application Centric Infrastructure transforms IT.
You’ll also hear from ACI ecosystem partners about how their solutions leverage the open architecture to customize and extend ACI deployments, and how joint customers are benefiting from multi-vendor innovation.
Global Participation. A few weeks ago, the OpenStack community gathered in Paris for the Juno release of the OpenStack platform. The Foundation reported record attendance of 4,600, with many attending a Summit for the first time. Of those visiting Cisco’s booth, 68% came from Europe/Middle East/Africa/Russia, 18% from the Americas, 12% from Asia Pacific/Japan, and 1% from Greater China.
Cisco Presenters. In his premiere breakout, “A World of Many OpenStack Clouds,” Lew Tucker, VP and CTO for Cisco Cloud Computing and Vice-Chair of the OpenStack Foundation, spoke about the future of cloud and how past experience building the Internet might be applied to building the Intercloud—the universe of clouds. Additional comments can be viewed in eWeek’s interview with Lew Tucker, where he discusses Cisco’s expanding support for OpenStack.
Cisco was a Gold Sponsor of the Summit and delivered eight technical presentations. If you missed the Summit, check out the video recordings of the Cisco sessions.
Hot Topics. Four years in, OpenStack has growing global support, a strong ecosystem of vendors, end-users running in production, and leading-edge companies willing to talk about their experience scaling OpenStack in their organizations. The number of new OpenStack core projects being initiated has tapered off and the Foundation reports that ten times as many bug fixes were contributed as new features in this most recent cycle, producing a stronger focus on stability.
Other topics trending up at the Paris Summit were:
Software defined ‘X’ (with X equaling a wide variety of technologies and services)
Docker/Linux containers
Juno Release Highlights
Formation of the new Sahara data processing project, which allows users to run Hadoop big data clusters in an OpenStack cloud
Full integration of the Swift cloud storage project, which allows users to define and apply different cloud storage policies
Key improvements to the Neutron networking project to support Distributed Virtual Routing and IPv6 protocols, increasingly important in large scale deployments
Enhancement to the Nova compute project to simplify recovery after a server failure
Extension of the Keystone access and identity project to accommodate federated multi-cloud environments.
Keynote Speakers. Matt Haines, VP Cloud Engineering and Operations for Time Warner Cable (TWC), reviewed the work he oversaw in 2014 to deploy OpenStack infrastructure at scale to support the company’s strategy of delivering “any content, on any device, anywhere.” At TWC, an OpenStack cloud provides self-service IT infrastructure for devops activity. Haines consulted with Cisco in the planning phase when he was determining which IT services would be offered, the number of data centers that would be involved, how to design the provider networks, and how to stabilize the environment. His infrastructure is currently operating with what Matt called “enterprise stability at service provider scale.”
Other keynotes included Dr. Stefan Lenz, Manager of Data Center IT Infrastructure for BMW, who spoke about how BMW’s internally built private cloud had failed to provide the stability required across all of the projects that OpenStack covers, and Jose Maria San Jose Juarez, Chief of Innovation in Technology for BBVA Bank, who shared how a programmatic (software driven) approach to delivering IT infrastructure was critical for achieving the agility, speed, and reliability required to deliver industry-leading customer applications.
In addition to the customer keynotes, technical leaders from CERN, Comcast, Ericsson, Expedia, Intel, and Tapjoy delivered presentations on their OpenStack deployments.
Cisco Contributions. Cisco continues to lead contributions to the OpenStack Neutron networking project, summarized in the blog Cisco and OpenStack Juno Release, Part 1. Enhancements included improvements to Neutron plugin and driver integration and metering of key network services through OpenStack Ceilometer in order to support service level agreements and monitor performance. Cisco also contributed to other core OpenStack projects, summarized in the blog Cisco and OpenStack Juno Release, Part 2, which includes enabling configuration of IPv6 through OpenStack Horizon and increasing flexibility and security of SAN access through OpenStack Cinder. In addition to core work, Cisco also contributes code to incubation projects via GitHub/Stackforge. A high visibility project this cycle has been the effort to enable programmability of network infrastructure and services through group based policy. This approach is the basis of Cisco’s implementation of Application Centric Infrastructure (ACI), which delivers new levels of agility, speed and control for IT.
In addition to direct code contributions, Cisco provides plugins that ensure OpenStack distributions run smoothly on Cisco UCS servers and Cisco Nexus physical and virtual switches. The recent acquisition of Metacloud also allows Cisco to deliver OpenStack clouds-as-a-service, onsite at the customer, providing customers an alternative to building and managing private clouds themselves. In another step forward, Cisco is also building a global Intercloud ecosystem, in which clouds built on diverse hardware platforms can be easily connected to form highly-secure and efficient hybrid clouds.
For more information on OpenStack at Cisco, visit www.cisco.com/go/openstack and mark your calendars for the next OpenStack Summit May 18-20 in Vancouver, British Columbia.
Have a bit of free time this Wednesday morning? If so, please feel free to sit in on a Cisco keynote delivered by Mark Balch, Director of Cisco UCS Product Management, as he outlines the challenges faced, the discoveries made with the UCS family, and how UCS has driven revolutionary change and business benefits for today’s modern datacenter.
The Cisco keynote kicks off WindowsITPro’s virtual trade show, “Optimizing Your Virtual Infrastructure.” The event brings top Microsoft industry experts together in an online forum, giving attendees the opportunity to learn about key datacenter optimization topics and trends.
Our UCS family has been a leader in datacenter optimization since its initial release to market five years ago. Designed for virtualization from the beginning, UCS is an integrated system configured through unified, model-based management to simplify deployment of enterprise-class applications and services running in bare-metal, virtualized, and cloud-computing environments.