According to EMA, “sophisticated users and applications, along with less expensive hardware and software, better and faster technology, and new valuable data sources are causing a shift away from single-platform solutions such as enterprise data warehouses toward a more diverse or hybrid ecosystem focused on matching data type, workload, and platform capabilities to execute these workloads.
Hybrid strategies allow for deeper business insights and more sophisticated workloads but often demand more than traditional data integration tools can deliver. As more platforms are utilized and data is more diverse and geographically separated, data virtualization becomes a solution that’s critical for many companies to utilize.”
So that is the case for data virtualization. But why combine data virtualization and networking? The report addresses this point directly: “As data virtualization has matured to meet these new demands, the networks sitting at either end of the data virtualization technology have become the bottlenecks to speed and scale.
Data virtualization technologies access data where it resides rather than physically moving it to other platforms. This model allows for a more agile environment, saving time and money when managing data in complex environments. Data virtualization platforms must rely upon query optimization along relatively fixed network paths to enable the transmission of this data.” Check out this video dialog with Shawn Rogers, VP of Research at Enterprise Management Associates.
“The acquisition of Composite Software is a wise move for Cisco as it combines the functionality of data virtualization and Cisco’s ability to understand the network as well as enact change within it to allow data virtualization to be executed at an even higher level than previous technologies allowed.”
Big Data is one of the most talked-about topics of today across industry, government, and research. It is becoming the center of Investments, Innovations, and Improvisations (the 3I’s), and it is no exaggeration to say that Big Data is transforming the world. Considering its potential, the IEEE Computer Society is conducting the IEEE International Conference on Big Data 2013, a premier forum to disseminate and exchange the latest and greatest in Big Data. The main theme of the conference will be the 5V’s: Volume, Velocity, Variety, Value, and Veracity. The conference will take place in Santa Clara, CA from October 6th to 9th.

I have the great privilege to co-chair the Industry and Government Program with my distinguished colleagues Rayid Ghani (Obama Campaign), Wei Han (Noah’s Ark Lab), and Ronny Lempel (Yahoo! Labs), along with Xiaohua Tony Hu (Drexel University), who is chairing the Steering Committee. The 4-day program includes about 50 presentations selected from over 300 paper submissions by more than 1000 authors from 40 countries, four keynotes (Amr Awadallah, Mike Franklin, Hector Garcia-Molina, and Roger Schell), 12 workshops, and two tutorials.

I have the great pleasure to deliver the opening and welcoming speech on behalf of the industry and government committee. I am also chairing Amr Awadallah’s keynote session on Key Usage Patterns for Apache Hadoop in the Enterprise and co-presenting a paper titled A Look at Challenges and Opportunities of Big Data Analytics in Healthcare at the workshop on Big Data in Bioinformatics and Healthcare Informatics. This workshop will be very interesting, with sessions such as Big Data Solutions for Predicting Risk-of-Readmission for Congestive Heart Failure Patients and Colon Cancer Survival Prediction Using Ensemble Data Mining on SEER Data.
Cisco is a proud sponsor of the conference. Additional Information:
Fast changing business conditions require agility, a difficult challenge in your distributed on-premises, big data and cloud environments. Data virtualization makes it easy for you to access your data, no matter where it resides.
Cisco’s integrated data platform optimizes query, compute, and network infrastructure, so you can access and query all types of data across the network as if it were in a single place.
You get the benefits of greater business insight and the flexibility you need in IT, with significant cost savings. You can then adapt to change more quickly and make better decisions in real time, without physically moving your data.
Data virtualization makes it possible to:
Empower your people with instant access to all the data they want, the way they want it
Respond faster to your changing analytics and business intelligence needs
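To make the pattern concrete, here is a minimal, hypothetical sketch of the federated-access idea that data virtualization is built on: a single query layer routes each query to the source system that owns the data, so only results travel over the network and no data is bulk-copied into a central warehouse. The `DataVirtualizer` class and the in-memory SQLite stand-ins are illustrative assumptions, not Cisco's actual platform.

```python
import sqlite3

class DataVirtualizer:
    """Illustrative sketch: route each query to the source that owns
    the table, instead of copying all data into one central store."""
    def __init__(self):
        self.sources = {}              # table name -> connection

    def register(self, table, conn):
        self.sources[table] = conn

    def query(self, table, sql, params=()):
        # The data stays in its source system; only results travel.
        return self.sources[table].execute(sql, params).fetchall()

# Two independent "sources" (stand-ins for remote platforms).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (region TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales VALUES (?, ?)",
                      [("west", 120.0), ("east", 75.5)])

crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (name TEXT, region TEXT)")
crm.execute("INSERT INTO customers VALUES ('acme', 'west')")

dv = DataVirtualizer()
dv.register("sales", warehouse)
dv.register("customers", crm)

# One access layer, two physical locations, no bulk data movement.
west_sales = dv.query("sales",
                      "SELECT SUM(amount) FROM sales WHERE region = ?",
                      ("west",))
west_customers = dv.query("customers",
                          "SELECT name FROM customers WHERE region = ?",
                          ("west",))
print(west_sales[0][0], west_customers[0][0])   # prints: 120.0 acme
```

A production platform adds what this sketch omits: cross-source joins, query optimization, and awareness of the network paths between sources.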
While there is not yet an industry-standard benchmark for measuring the performance of Hadoop systems (yes, there is work in progress -- WBDB, BigDataTop100, etc.), workloads like TeraSort have become a popular choice to benchmark and stress-test Hadoop clusters.
TeraSort is very simple and consists of three MapReduce programs: (i) TeraGen, which generates the dataset; (ii) TeraSort, which samples and sorts the dataset; and (iii) TeraValidate, which validates the output. With multiple vendors now publishing TeraSort results, organizations can make reasonable performance comparisons while evaluating Hadoop clusters.
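The reason TeraSort scales is its sampling step: it samples the input to choose partition boundaries, so each reducer receives a disjoint key range and the concatenated reducer outputs are globally sorted with no final merge. Here is a single-machine Python sketch of that idea (illustrative only, not the actual Hadoop implementation):

```python
import bisect
import random

def terasort_sketch(records, n_partitions=4, sample_size=100):
    """Sample-then-partition sort, the idea behind TeraSort's partitioner."""
    # 1. The "TeraGen" stage has already produced `records`.
    # 2. Sample the input and pick n_partitions-1 boundary keys.
    sample = sorted(random.sample(records, min(sample_size, len(records))))
    step = max(1, len(sample) // n_partitions)
    boundaries = sample[step::step][:n_partitions - 1]
    # 3. "Map": route each record to the partition owning its key range.
    partitions = [[] for _ in range(n_partitions)]
    for r in records:
        partitions[bisect.bisect_left(boundaries, r)].append(r)
    # 4. "Reduce": sort each partition independently; the concatenation
    #    is globally sorted because partitions cover disjoint key ranges.
    out = []
    for p in partitions:
        out.extend(sorted(p))
    return out

data = [random.randrange(10**6) for _ in range(10_000)]
assert terasort_sketch(data) == sorted(data)
```

Because the sample approximates the key distribution, the partitions end up roughly equal in size, which is what lets the real TeraSort keep all reducers busy.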
We conducted a series of TeraSort tests on our popular Cisco UCS Common Platform Architecture (CPA) for Big Data rack with 16 Cisco UCS C240 M3 Rack Servers, each equipped with two Intel Xeon E5-2665 processors, running the Apache Hadoop distribution (see figure below), demonstrating industry-leading performance and scalability over a range of dataset sizes from 100GB to 50TB. For example, out of the box, our 10TB result is 40 percent faster than HP’s published result on 18 HP ProLiant DL380 Servers equipped with two Intel Xeon E5-2667 processors.
While Hadoop offers many advantages for organizations, the Cisco story isn’t complete without the collaborations with our ecosystem partners that enable us to offer complete solution stacks. We support leading Hadoop distributions including Cloudera, Hortonworks, Intel, MapR, and Pivotal on our Cisco UCS Common Platform Architecture (CPA) for Big Data. We just announced our Big Data Design Zone, which offers Cisco Validated Designs (CVDs) -- pretested and validated architectures that accelerate time to value for customers while reducing risks and deployment challenges.
Your smart sprinkler system is happily pumping water to your lawn in highly efficient sprays that are “aware” of the soil, the climate, the weather, the time of day, and even whether or not your kids are playing in the backyard on a Saturday. Suddenly, a faulty valve bursts and an uncontrolled geyser erupts. One part of your property is about to be ruined by flooding while the rest of the lawn is left to yellow in the sun.
You and your family are miles away, yet you know all about it. Sensors throughout the system alert your smartphone. At the same time, machine-to-machine signals shut down the pumps, and an expert from the sprinkler company is dispatched to your home with the precise replacement part and the real-time knowledge to fix the system.
It’s a great example of how the Internet of Everything (IoE) may soon funnel precise information in real time to the people — or machines — that need it most. Many of these “remote expert” technologies are either already here or on the horizon.