Big Compute, Big Data, Big VMworld 2011 session

August 31, 2011

This is a review post for the Big Compute and Big Data, NoSQL session. When signing up for the session I was concerned about being a little lost on the topic, but the panel was very good.

It consisted of members from SAS, Cloudera, Data Tactics, VMware and EMC.

So, right off the bat, we are talking about Big Compute and Big Data. I am not sure how these phrases got started, but someone coined them and they now seem to be used a lot for large and unstructured data sets. The best definition I heard was:

“Data just bigger than you can handle with current tool set, not just size, but difficult decisions from a volume or type perspective”

It really put the “Big Data” concept in perspective.

The NoSQL part of the title really meant “Not only SQL”. These companies are looking for freedom of language for manipulating and structuring data. Traditionally, data always needed a schema. We are seeing a trend toward bringing the data to the compute, so that when you process or retrieve the data you apply a lens or filter to achieve the desired result.
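To make the "lens or filter" idea concrete, here is a minimal Python sketch of applying structure at read time rather than at write time. Nothing here was shown in the session: the sample records, the field names and the read_with_lens helper are all made up for illustration.

```python
import json

# Hypothetical raw records, stored once with no schema enforced at write time.
RAW_RECORDS = [
    '{"user": "alice", "text": "loving #vmworld", "lang": "en"}',
    '{"user": "bob", "amount": 42.5, "store": "SF-01"}',
    '{"user": "carol", "text": "big data panel was great", "lang": "en"}',
]


def read_with_lens(raw_lines, fields):
    """Parse each record at read time; keep only records that have every requested field."""
    for line in raw_lines:
        record = json.loads(line)
        if all(field in record for field in fields):
            # Project the record down to just the fields this "lens" asks for.
            yield {field: record[field] for field in fields}


# Two different lenses over the same stored data.
tweets = list(read_with_lens(RAW_RECORDS, ["user", "text"]))
purchases = list(read_with_lens(RAW_RECORDS, ["user", "amount"]))
print(tweets)
print(purchases)
```

The same stored records serve both queries; the structure lives in the reader, not in the store.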

The trend is to gain value from data compilation:

  • Store once
  • Less control
  • More programmer centric
  • Let me decide how to analyze
  • Actionable

Similar to traditional scale-up in some ways, “Big Data” could be petabytes, lots of small files, or data loaded in RAM for real-time analytics. The main commonality is the need for parallel computing.
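As a rough illustration of why parallel computing is the common thread, the sketch below splits a data set into chunks, processes each chunk concurrently, and merges the partial results. It is a toy example of my own, not something from the panel: the function names and sample lines are invented, and a thread pool on one machine stands in for what would really be many nodes in a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: count the words within one chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, workers=4):
    """Split lines into chunks, count each chunk concurrently, then merge the partial counts."""
    size = max(1, len(lines) // workers)
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    # Reduce step: combine the per-chunk counts into one total.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


lines = ["big data big compute", "big data nosql", "compute next to data"]
totals = parallel_word_count(lines, workers=2)
print(totals)
```

Each chunk is counted independently, so the work scales out by adding workers; only the small partial counts need to travel between them.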

New use cases include:

  • Unstructured data – like Twitter/Facebook – filtering for trending
  • Real-time applications in financial and retail
  • Data pipeline requirements have been greatly extended – from unstructured to advanced analytics
  • Dating sites use Hadoop
  • Satellite imaging analysis – How many cars are parked at a local chain
  • Credit modeling history with longer data retention targets reaching seven years

Big Compute and Big Data companies are moving to a federated model with larger local drives located next to the compute. The amount of interconnect traffic is a very big issue. Cisco has a number of products for implementation in Big Data and Big Compute environments.

This was a really great session with a really good panel. It is really interesting to see the areas where port density and ultra-low-latency devices are being positioned by our customers.
