Big Data is one of the most talked-about topics of today across industry, government, and research. It is becoming the center of Investments, Innovations, and Improvisations (3I's), and it is no exaggeration to say that Big Data is transforming the world. Recognizing its potential, the IEEE Computer Society is conducting the IEEE International Conference on Big Data 2013, a premier forum to disseminate and exchange the latest and greatest in Big Data. The main theme of the conference will be the 5V's of Big Data: Volume, Velocity, Variety, Value, and Veracity. The conference will take place in Santa Clara, CA from October 6th to 9th.

I have the great privilege to co-chair the Industry and Government Program with my distinguished colleagues: Rayid Ghani (Obama Campaign), Wei Han (Noah's Ark Lab) and Ronny Lempel (Yahoo! Labs), along with Xiaohua Tony Hu (Drexel University), who is chairing the Steering Committee. The 4-day program includes about 50 presentations selected from over 300 paper submissions by more than 1000 authors from 40 countries, four keynotes (Amr Awadallah, Mike Franklin, Hector Garcia-Molina and Roger Schell), 12 workshops, and two tutorials.

I have the great pleasure to deliver the opening and welcoming speech on behalf of the industry and government committee. I am also chairing Amr Awadallah's keynote session on Key Usage Patterns for Apache Hadoop in the Enterprise and co-presenting a paper titled A Look at Challenges and Opportunities of Big Data Analytics in Healthcare at the workshop on Big Data in Bioinformatics and Healthcare Informatics. This workshop will be very interesting, with sessions like Big Data Solutions for Predicting Risk-of-Readmission for Congestive Heart Failure Patients, Colon Cancer Survival Prediction Using Ensemble Data Mining on SEER Data, and more.
Cisco is a proud sponsor of the conference. Additional Information:
Tags: 3Is, 5Vs, Big Data, bigdatatop100, Hadoop, IEEE, TPCTC, WBDB
Fast-changing business conditions demand agility, which is difficult to achieve across distributed on-premises, big data, and cloud environments. Data virtualization makes it easy for you to access your data, no matter where it resides.
Recognizing this accelerating customer requirement, Cisco acquired data virtualization market leader Composite Software in July 2013.
Cisco’s integrated data platform optimizes query, compute, and network infrastructure, so you can access and query all types of data across the network as if it were in a single place.
You get the benefits of greater business insight and the flexibility you need in IT, with significant cost savings. You can then adapt to change more quickly and make better decisions in real time, without physically moving your data.
Data virtualization makes it possible to:
- Empower your people with instant access to all the data they want, the way they want it
- Respond faster to your changing analytics and business intelligence needs
- Reduce complexity and save money
To learn how business and IT leaders use data virtualization to address big data and cloud challenges and drive business advantage, you can attend Cisco-sponsored Data Virtualization Day 2013 in New York City on Wednesday, October 9, 2013 from 8:30 am to 4:00 pm ET.
This year’s powerful speaker lineup includes:

- Executives from Goldman Sachs, BMO Financial and Sky, who will describe how they used data virtualization to address competitive, cost, and compliance challenges
- Analyst speakers, including Forrester Research’s Noel Yuhanna and R20/Consultancy’s Rick van der Lans, who will describe the state of data virtualization today while projecting its future
- Cisco executives, who will communicate the vision surrounding the acquisition of Composite Software and the synergies that the combination will provide to the data virtualization user community
Read More »
Tags: analytics, Big Data, cloud, data virtualization
While there is not yet an industry-standard benchmark for measuring the performance of Hadoop systems (work is in progress -- WBDB, BigDataTop100, etc.), workloads like TeraSort have become a popular choice for benchmarking and stress-testing Hadoop clusters.
TeraSort is very simple. It consists of three MapReduce programs: (i) TeraGen, which generates the dataset; (ii) TeraSort, which samples and sorts the dataset; and (iii) TeraValidate, which validates the output. With multiple vendors now publishing TeraSort results, organizations can make reasonable performance comparisons when evaluating Hadoop clusters.
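For readers who want to try this on their own cluster, the three programs ship with the standard Hadoop MapReduce examples jar. A minimal sketch of a run, assuming a working Hadoop cluster and HDFS paths of your choosing (the jar name and location vary by distribution and version):

```shell
# TeraGen: write 10,000,000,000 rows of 100 bytes each (~1 TB) to HDFS.
# The row count and the /user/hdfs/tera-* paths below are illustrative.
hadoop jar hadoop-mapreduce-examples.jar teragen 10000000000 /user/hdfs/tera-in

# TeraSort: sample the input to build a partitioner, then sort the dataset.
hadoop jar hadoop-mapreduce-examples.jar terasort /user/hdfs/tera-in /user/hdfs/tera-out

# TeraValidate: confirm the output is globally sorted; writes a report directory.
hadoop jar hadoop-mapreduce-examples.jar teravalidate /user/hdfs/tera-out /user/hdfs/tera-report
```

The elapsed time of the terasort step is the figure typically reported; tuning parameters such as the number of reduce tasks have a large effect on it, which is why published results should always note the cluster configuration.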
We conducted a series of TeraSort tests on our popular Cisco UCS Common Platform Architecture (CPA) for Big Data rack, with 16 Cisco UCS C240 M3 Rack Servers equipped with two Intel Xeon E5-2665 processors each, running the Apache Hadoop distribution. The results (see figure below) demonstrate industry-leading performance and scalability over a range of data set sizes from 100GB to 50TB. For example, out of the box, our 10TB result is 40 percent faster than HP’s published result on 18 HP ProLiant DL380 Servers equipped with two Intel Xeon E5-2667 processors.
While Hadoop offers many advantages for organizations, the Cisco story isn’t complete without including collaborations with our ecosystem partners, which enable us to offer complete solution stacks. We support leading Hadoop distributions, including Cloudera, Hortonworks, Intel, MapR, and Pivotal, on our Cisco UCS Common Platform Architecture (CPA) for Big Data. We just announced our Big Data Design Zone, which offers Cisco Validated Designs (CVDs) -- pretested and validated architectures that accelerate time to value for customers while reducing risks and deployment challenges.
Cisco Big Data Design Zone
Cisco UCS Demonstrates Leading TeraSort Benchmark Performance
Cisco UCS Common Platform Architecture (CPA) for Big Data
Tags: Big Data, Big Data Benchmarks, Cisco UCS C240 M3 Rack Server, Cisco UCS CPA, CPA, Hadoop, TeraSort, YCSB
Your smart sprinkler system is happily pumping water to your lawn in highly efficient sprays that are “aware” of the soil, the climate, the weather, the time of day, and even whether or not your kids are playing in the backyard on a Saturday. Suddenly, a faulty valve bursts and an uncontrolled geyser erupts. One part of your property is about to be ruined by flooding while the rest of the lawn is left to yellow in the sun.
You and your family are miles away, yet you know all about it. Sensors throughout the system alert your smartphone. At the same time, machine-to-machine signals shut down the pumps, and an expert from the sprinkler company is dispatched to your home with the precise replacement part and the real-time knowledge to fix the system.
It’s a great example of how the Internet of Everything (IoE) may soon funnel precise information in real time to the people — or machines — that need it most. Many of these “remote expert” technologies are either already here or on the horizon.
Read More »
Tags: Big Data, Cisco, Cisco Consulting Services, collaboration, Internet of Everything, IoE, M2M, Machine to Machine, remote expert
A common cornerstone of both the Internet of Things and Internet of Everything concepts is the idea of a future with billions, if not trillions, of connections to the Internet. As the Internet of Everything connects objects, data, people and processes, the future of connected things will not be traditional computers or smartphones. Rather, it may be your refrigerator, or a traffic light, or even a litter box. Basically, anything that can have a status change that will interest someone has the potential to be connected to the Internet in order to alert you to that change.
The idea of being alerted to important information automatically is appealing. After all, if your refrigerator is having a cooling issue and it can send you a text alert, you can save money by taking corrective action before your milk and other products go bad. However, not all of the data generated by the Internet of Everything will be of high value. In fact, most of it will be of little value at all.
Read More »
Tags: Big Data, Cisco, connections, Internet of Everything, internet of things, IoE, IoT, sensors, smart things