Cisco Blogs
Big Data is better than a sharp stick in the eye

March 5, 2015

Big Data is better than a sharp stick in the eye. I can say this with great authority, since I missed the first half of Strata+Hadoop World 2015 in San Jose because of the latter. Eye injuries have never kept me offline for long, though, and I was able to follow online what I didn’t see in person. I was very happy to make it in to the show on Friday, and even got a seat at about row 6 in the main hall for the keynotes.

One of the most provocative things that came up during the event, and in Friday’s lightning keynotes, was the Open Data Platform announcement from Hortonworks and Pivotal. This looks to be an interesting move by Hortonworks; while it’s not immediately as impressive as the big announcement Hortonworks made at the Hadoop Summit in 2011 (I think I was in row 6 for that keynote as well, albeit in a much smaller hall), we’ll be looking forward to seeing how the industry moves forward with this initiative. Pivotal contributing its big data platform to the open source community is definitely a big deal.

Another topic that’s been picking up steam is security around big data. One of the biggest concerns around the Internet of Everything has been security, and anyone who’s bought anything in the past couple of years has probably wondered how secure their data is (especially after the high profile financial and healthcare breaches of recent years). I touched on this topic in my first Cisco blog post in 2013.

Security as an afterthought has been a hallmark of UNIX, for example, as anyone who ran Sendmail in the mid to late 1990s would be able to tell you (open relays anyone?). At Strata this year, Eddie Garcia, chief security architect of Cloudera, talked about moving to a model of “secure by default,” which I believe will be essential to making the Big Data era sustainable and survivable. Apache Sentry and the like are already out there, but as developers, data scientists, and solutions providers start deploying secure-by-default solutions, customers and end users will be more confident in the use of their data, and everyone will be better off.

It is a relief to me to see that we’ve moved beyond the “Oh, Hadoop? I’ve heard of that” phase. At Strata last year in New York City we heard Mike Olson say that Hadoop would disappear in the next year. The point wasn’t that Hadoop is going away, and contrary to what some advocates are preaching, even MapReduce is going to be around for the foreseeable future. Remember, people still make money providing Usenet service, probably 20 years after it was first declared dead.

But the focus has moved toward analytics, and making business value of your Hadoop data environment, rather than putting all your attention on the Hadoop environment itself. Hadoop is becoming the plumbing, the infrastructure, the foundation of your data world rather than the entirety of it.

All of that aside, what I really enjoy most about these events is the opportunity to talk with distant coworkers, our software partners, other people in the industry, and even former coworkers whom I may have lost track of in recent months. I did miss seeing Jim McHugh, whose coverage of his Strata experience is definitely worth a look. I got some quality time with the folks at MapR, finally meeting Ellen Friedman in person and getting a look at Jim Scott’s cluster-in-a-suitcase.

And while my Roving Reporter schedule was cut back a bit, I did catch up with Clint Sharp for a quick conversation. Clint is director of product management for Splunk, who are sometimes described as a world-famous t-shirt company that also produces an analytics software platform.

I’ve known about Splunk since before the infamous “Take the SH out of IT” tee shirts took the techie world by storm in the mid 2000s. Some of my coworkers from Looksmart were early employees there, with at least one of them still being there (hi Ledio!), and I haven’t worked many places that didn’t use Splunk in some capacity.

But Splunk has come a long way since their “Google for log files” beginnings nearly a decade ago. Check out this video to see what Clint had to say about Splunk’s viral infiltration of enterprises—not as sinister as it sounds, trust me—and why Splunk likes Cisco UCS Integrated Infrastructures for deploying their software.

Read more about what Cisco and Splunk have been doing lately at our latest Splunk blog post, written by Cisco’s Raghu Nambiar.

What did you think was great or not-so-great about Strata+Hadoop World this year? What do you expect to hear about in New York this fall? Find me on Twitter at @gallifreyan–look for the Big Data Safari hat–and let me know.

2 Comments

    Excellent article... I have just started reading "Data Science for Business" by Foster Provost and Tom Fawcett. In the past, the widest applications of data-mining techniques were in marketing, for tasks such as "targeted marketing, online advertising, and recommendations for cross-selling." But more and more contact centers have started to look at data in a totally new way.

      Hi Ibrahim, Thanks for sharing those thoughts. Data science, and big data itself, has definitely changed scope dramatically since I started working with the concepts a decade ago. It's definitely intriguing to see how various industries that might not have even used computers regularly in the past are now making great use of big data and analytics.