Want to get the most out of your big data? Build an enterprise data hub (EDH).
Big data is rapidly getting bigger. That in itself isn’t a problem. The issue is what Gartner analyst Doug Laney describes as the three Vs of Big Data: volume, velocity, and variety.
Volume refers to the ever-growing amount of data being collected. Velocity is the speed at which data is produced and moved through enterprise information systems. Variety refers to the breadth of sources we gather from: sensors, enterprise resource planning (ERP) systems, e-commerce transactions, log files, supply chain information, social media feeds, and more.
Data warehouses weren’t made to handle this fast-flowing stream of wildly dissimilar data. Using them for this purpose has led to resource-draining, sluggish response times as workers attempt to perform numerous extract, load, and transform (ELT) functions to make stored data accessible and usable for the task at hand.
Constructing Your Hub
An EDH addresses this problem. It serves as a central platform that enables organizations to collect structured, unstructured, and semi-structured data from slews of sources, process it quickly, and make it available throughout the enterprise.
Building an EDH begins with selecting the right technology in three key areas: infrastructure, a foundational system to drive EDH applications, and the data integration platform. Obviously, you want to choose solutions that fit your needs today and allow for future growth. You’ll also want to ensure they are tested and validated to work well together and with your existing technology ecosystem. In this post, we’ll focus on selecting the right hardware.
The Infrastructure Component
Big data deployments must be able to handle continued growth, from both a data and user load perspective. Therefore, the underlying hardware must be architected to run efficiently as a scalable cluster. Important features such as the integration of compute and network, unified management, and fast provisioning all contribute to an elastic, cloud-like infrastructure that’s required for big data workloads. No longer is it satisfactory to stand up independent new applications that result in new silos. Instead, you should plan for a common and consistent architecture to meet all of your workload requirements.
Big data workloads represent a relatively new model for most data centers, but that doesn’t mean best practices must change. Handling a big data workload should be viewed from the same lens as deployments of traditional enterprise applications. As always, you want to standardize on reference architectures, optimize your spending, provision new servers quickly and consistently, and meet the performance requirements of your end users.
Cisco Unified Computing System to Run Your EDH
The Cisco Unified Computing System™ (Cisco UCS®) Integrated Infrastructure for Big Data delivers a highly scalable platform that is proven for enterprise applications like Oracle, SAP, and Microsoft. It brings the same required enterprise-class capabilities, including performance, advanced monitoring, simplified management, and QoS guarantees, to big data workloads. With lower switch and cabling infrastructure costs, lower power consumption, and lower cooling requirements, you can realize a 30 percent reduction in total cost of ownership. In addition, its service profiles give you fast and consistent time-to-value: provisioning templates let you instantly set up a new cluster or add many new nodes to an existing one.
And when deploying an EDH, the MapR Distribution including Apache™ Hadoop® is especially well suited to take advantage of the compute and I/O bandwidth of Cisco UCS. Cisco and MapR have been working together for the past two years and have developed Cisco Validated Design guides to provide customers the most value for their IT expenditures.
Cisco UCS for Big Data comes in optimized power/performance-based configurations, all of which are tested with the leading big data software distributions. You can customize these configurations further or use the system as is. Using one of the pre-configured options goes a long way toward ensuring a stress-free deployment. All Cisco UCS solutions also provide a single point of control for managing all computing, networking, and storage resources, for any fine-tuning you may do before deployment or as your hub evolves.
I encourage you to check out the latest Gartner video to hear Satinder Sethi, our VP of Data Center Solutions Engineering and UCS Product Management, share his perspective on how powering your infrastructure is an important component of building an enterprise data hub.
In addition, you can read the MapR Blog, Building an Enterprise Data Hub, Choosing the Foundational Software.
Let me know if you have any comments or questions here, or reach me on Twitter at @CicconeScott.
Tags: Big Data, blade server, blades servers, C240 M3 Rack Server, Cisco UCS, Cisco Unified Computing System, Cisco Unified Data Center, Cisco Unified Fabric, Enterprise Data Hub, Gartner, Hadoop, MapR, rack server, UCS Central, UCS service profiles
Have a bit of free time this Wednesday morning? If so, please feel free to sit in on a Cisco keynote delivered by Mark Balch, Director of Cisco UCS Product Management, as he outlines the challenges faced, the discoveries made with the UCS family, and how it has driven revolutionary change and business benefits for today’s modern datacenter.
The Cisco keynote opens WindowsITPro’s virtual trade show, “Optimizing Your Virtual Infrastructure.” The event brings top industry Microsoft experts together in an online forum, giving attendees the opportunity to learn about key datacenter optimization topics and trends.
Our UCS family has been a leader in data center optimization since its initial release to market five years ago. Designed for virtualization from the beginning, UCS is an integrated system configured through unified, model-based management to simplify deployment of enterprise-class applications and services running in bare-metal, virtualized, and cloud-computing environments.
Download the UCS Family poster
Tags: ACI, Cisco, Cisco Data Center, Cisco UCS, datacenter, Hardware Optimization, Microsoft, network optimization, nexus
My final observation from my days at the London Gartner Data Center Conference is related to SDN and ease of network management – or otherwise. Hopefully this discussion will give you some ideas for good questions to ask at the Las Vegas conference, which is running as I write this.
Cisco UCS on show at the Gartner Data Center Conference
Before I start, if you are at the conference in Las Vegas, please do take time out to visit the Cisco stand, #305, to find out more about Cisco solutions including Unified Computing and ACI. Also take some time to say hello to our new, exciting team members from the Metacloud acquisition; it’s fantastic to have such OpenStack and DevOps expertise as part of the Cisco team.
To catch up on my earlier questions, see my part 1 and part 2 blogs – questions you can ask at any SDN conference or of any vendor, since this blog series is not just about the Gartner conference. Now, on to more SDN questions to ask…
Tags: ACI, API, Cisco UCS, cisco_services, network_management, nms, SDN
The Internet of Everything continues to gain momentum and every new connection is creating new data. Cisco UCS Integrated Infrastructure for Big Data is helping customers convert that data into powerful intelligence, and we’re working with a number of new partners to bring exciting new solutions to our customers.
Today, I want to spotlight Elasticsearch, Inc. and welcome them to the Cisco Solution Partner Program.
Elasticsearch excels at providing real-time insight into data, whether structured or unstructured, human- or machine-generated, by bringing a search-based architecture to data analytics. By combining the ELK stack with Cisco UCS, organizations get a turnkey underlying infrastructure that provides real-time search and analytics for a variety of applications: log analysis; searches across structured, semi-structured, or unstructured data; and web back ends for custom applications that use search-based analytics as core functionality.
Mozilla is just one of the companies already benefiting from the joint solution, with real-time search and analysis of data powering its defense platform, MozDef. The ELK stack leverages the fast connectivity of Cisco UCS for query, indexing, and replication traffic, and Elasticsearch handles the full scale of event storage, archiving, indexing, and searching of the data logs. Together, the ELK stack and Cisco UCS help protect Mozilla’s network, services, systems, and audit data from attackers.
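To make the log-analysis pattern concrete, here is a minimal sketch of the kind of query body an application might send to Elasticsearch's search API. The index and field names (`@timestamp`, `message`) and the `logs-*` index pattern are illustrative assumptions, not MozDef's actual schema.

```python
import json

def build_log_query(term, minutes=15, size=50):
    """Build an Elasticsearch query body matching log lines that
    contain `term` within the last `minutes` minutes, newest first.
    (Field names here are assumptions for illustration.)"""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "must": [{"match": {"message": term}}],
                "filter": [
                    {"range": {"@timestamp": {"gte": "now-%dm" % minutes}}}
                ],
            }
        },
    }

# An application would POST this body to http://<es-host>:9200/logs-*/_search
body = build_log_query("authentication failure")
print(json.dumps(body, indent=2))
```

The same query-DSL shape scales from a laptop to a UCS-backed cluster; only the index layout and node count change.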
Partners like Elasticsearch are just one reason that Cisco UCS Integrated Infrastructure can help your company capitalize on the IoE data avalanche and deliver powerful and cost-effective analytics solutions throughout your enterprise.
Find out more at www.cisco.com/go/bigdata, or register for a webinar entitled, “Learn How Mozilla Tackles their Security Logs with Elasticsearch and Cisco”.
Thursday, November 13th
9:00 AM PST / 12:00 PM EST / 5:00 PM GMT
Are you interested in learning how to build enterprise applications on top of Elasticsearch and Cisco’s Unified Computing System (UCS) infrastructure? We’re holding a webinar to delve more deeply into how to optimize ELK on Cisco UCS infrastructure.
Cisco UCS unites compute, network, and storage access into a single cohesive system. By combining the ELK stack with Cisco UCS, businesses benefit by having a turnkey hardware-software solution for their search and analytics applications. In this webinar you’ll learn about the various UCS hardware profiles you should consider when deploying ELK and how Mozilla built MozDef, their custom SIEM application, using ELK on Cisco UCS.
- Introduction – Jobi George, Elasticsearch (5 minutes)
- Overview of UCS + ELK reference architectures – Raghunath Nambiar, Distinguished Engineer, Data Center Business Group, Cisco (10 minutes)
- How Mozilla Built MozDef on ELK and Cisco UCS – Jeff Bryner, Security Assurance, Mozilla (25 minutes)
- Q&A – Jobi George, Elasticsearch (~20 minutes)
Tags: analytics, Big Data, Cisco, Cisco UCS, Cisco Unified Computing System, elasticsearch, UCS
A Guest Blog by Partner Rick Heiges of Scalability Experts: Rick is a SQL Server Microsoft MVP and Senior Solutions Architect. He primarily works with Enterprise customers on their Data Platform strategies. Rick is also very involved in the SQL Server Community primarily through PASS and events such as the PASS Summit, SQL Saturdays, and 24 Hours of PASS. His tenure on the PASS Board of Directors saw the annual Summit triple in size from 2003 to 2011. You can find his blog at www.sqlblog.com.
So far, it has been another great week here at the PASS Summit 2014, SQL Server’s largest annual user and partner conference. Judging by yesterday’s keynote address, there is still very much a focus on getting to the cloud and on new investments in cloud technology in general. Microsoft is extending its data collection and storage technologies both in the cloud and on-premises. One of the coolest features discussed was the concept of a “stretch table,” where a table that lives on your on-premises SQL Server can be “stretched” into SQL Azure databases. The data can be split so that the “hot” data stays local while the “cold” data lives in the cloud. There were also some great demos of using the Kinect device to create a heat map of customer activity in a physical store (similar to tracking what shoppers linger on and search for online). You can watch the PASS Summit 2014 keynote here on PASStv.
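The hot/cold split behind the stretch-table idea can be sketched in a few lines. The 90-day cutoff and the routing function below are illustrative assumptions, not SQL Server's actual policy mechanism, which is declarative and handled by the engine.

```python
from datetime import datetime, timedelta

HOT_WINDOW = timedelta(days=90)  # illustrative cutoff; the real feature is policy-driven

def storage_tier(row_timestamp, now=None):
    """Sketch of the stretch-table split: recent ('hot') rows stay on
    the local SQL Server, older ('cold') rows live in SQL Azure."""
    if now is None:
        now = datetime.utcnow()
    return "local" if now - row_timestamp <= HOT_WINDOW else "cloud"

now = datetime(2014, 11, 6)
print(storage_tier(datetime(2014, 10, 1), now))  # recent row -> "local"
print(storage_tier(datetime(2013, 2, 14), now))  # old row -> "cloud"
```

The appeal for the application is that queries still target one logical table; the engine decides which tier each row is served from.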
As a Senior Solutions Architect with Scalability Experts, I work with large enterprise (Fortune 500-type) customers on a regular basis. There is more and more interest in leveraging the public cloud for some workloads and in consuming “on-prem” resources in a cloud-like way. That means deploying your internal resources the way public cloud resources are deployed, for example via Cisco’s Microsoft Fast Track-certified FlexPod or VSPEX integrated infrastructure solutions, with a similar chargeback (or “showback”) model, automated self-service deployment of infrastructure, and monitoring of the entire stack.
One of the things that I really like about Microsoft’s products is the focus on ease of use, tight integration, and low TCO. This is important to a lot of the customers I interact with, and it is why I have seen a surge in Cisco UCS products in my customer base over the past few years. Cisco has a similar goal of keeping things simple and TCO low; read the Total Economic Impact report from Forrester on UCS ROI/TCO. Cisco also provides management pack plug-ins for Microsoft’s System Center suite, so you can manage the entire stack (hardware, hypervisor, application, and even public cloud) with a single tool. It is great to see how this partnership between Microsoft and Cisco benefits the customers I work with.
Microsoft’s SQL Server 2014 also brings in-memory technology to OLTP in a cost-effective manner by not forcing a complete rewrite of the application. In a recent case study of Cisco UCS with Microsoft SQL Server 2014, Progressive Insurance was able to take advantage of this technology to further its competitive advantage: ease of use.
Eventually, I see the Public Cloud taking on a more “primary” role in the future. Similar to the “Everything on a VM unless there is a reason not to” mantra, I see an “Everything on a Public Cloud VM unless there is a reason not to” mantra on the long-term horizon. Until then, the Hybrid Cloud will be the default stance for many large enterprises.
Tags: Big Data, Cisco, Cisco UCS, FlexPod, Microsoft, Microsoft Hyper-V, Microsoft SQL Server, Microsoft SQL Server2014, Nexus 1000v, PASS Summit 2014, SQL PASS, vspex