Cisco Blogs



Data Abstraction: The Lingua Franca for Data Silos

Enterprises are seeking ways to improve their overall profitability, cut costs, reduce risk and more through better leverage of their data assets.

Significant volumes of complex, diverse data spread across various technology and application silos make it difficult for organizations to achieve these business outcomes. To further complicate matters, these silos bring a range of problems, such as:

  • Separate access mechanisms, syntax, and security for each source
  • Lack of proper structure for business user or application consumption and reuse
  • Incomplete or duplicate data
  • A mixture of latency issues

Data abstraction overcomes these challenges by transforming data from its native structure and syntax into views and data services that are much easier for business intelligence and analytics developers to use when creating new decision-making applications.

Enterprises can approach data abstraction three ways:

  • Manual data abstraction
  • Data warehouse schemas
  • Data virtualization

Of the three approaches, data virtualization is the superior solution for data abstraction because it enables the most flexibility and agility when you need to provide simple, consistent, business-formatted data from different data locations and sources.

As a complement to Cisco’s Data Virtualization software and services, Cisco also provides data abstraction best practices that help you accelerate your data abstraction activities. Composed of three distinct layers (application layer, business layer and physical layer), these best practices support a data reference architecture that rationalizes multiple, diverse data silos for a range of BI and analytic applications. The architecture aligns closely with analyst best practices mapped out by both Forrester and Gartner on the topic of data virtualization. Using these best practices will enable your company to access the right data for the business, gain agility and efficiency, maintain end-to-end control, and increase security of your data across all your data silos.
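To make the three layers concrete, here is a minimal sketch in Python, using the standard library's sqlite3 module as a stand-in for a data virtualization server. The table, column and view names are hypothetical and invented for illustration, and this is not Cisco's product syntax. It only shows how each layer can be expressed as a view over the one beneath it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Physical layer: tables as they exist in two hypothetical source silos.
CREATE TABLE crm_cust (cust_id INTEGER, cust_nm TEXT, rgn_cd TEXT);
CREATE TABLE erp_ord (ord_no INTEGER, cust_ref INTEGER, amt_usd REAL);

INSERT INTO crm_cust VALUES (1, 'Acme', 'EU'), (2, 'Globex', 'US');
INSERT INTO erp_ord VALUES (100, 1, 250.0), (101, 1, 50.0), (102, 2, 75.0);

-- Business layer: source-neutral views with consistent, readable names.
CREATE VIEW customer AS
    SELECT cust_id AS customer_id, cust_nm AS customer_name, rgn_cd AS region
    FROM crm_cust;
CREATE VIEW customer_order AS
    SELECT ord_no AS order_id, cust_ref AS customer_id, amt_usd AS amount
    FROM erp_ord;

-- Application layer: a view shaped for one consuming report.
CREATE VIEW revenue_by_customer AS
    SELECT c.customer_name, c.region, SUM(o.amount) AS revenue
    FROM customer AS c
    JOIN customer_order AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.customer_name, c.region;
""")


def revenue_report(connection):
    """Query only the application-layer view; source details stay hidden."""
    return connection.execute(
        "SELECT customer_name, region, revenue "
        "FROM revenue_by_customer ORDER BY customer_name"
    ).fetchall()


rows = revenue_report(conn)
```

The report code never mentions crm_cust or erp_ord. If a source silo is renamed or replaced, only the business-layer view that wraps it needs to change.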

To learn more about data abstraction best practices using Cisco Data Virtualization, check out our white paper.


Active Archiving with Big Data

Historical data is now an essential tool for businesses as they struggle to meet increasingly stringent regulatory requirements, manage risk and perform predictive analytics that help improve business outcomes. While recent data is readily accessible in operational systems and some summarized historical data is available in the data warehouse, the traditional practice of archiving older, detail-level data on tape makes analysis of that data challenging, if not impossible.

Active Archiving Uses Hadoop Instead of Tape

What if the historical data on tape were loaded into a similarly low-cost, yet accessible, storage option such as Hadoop? And what if data virtualization were then applied to access and combine this data with the operational and data warehouse data, in essence intelligently partitioning data access across hot, warm and cold storage options? Would it work?

Yes, it would! In fact, it does every day at one of our largest global banking customers. Here’s how:

Adding Historical Data Reduces Risk

The bank uses complex analytics to measure risk exposure in its fixed income trading business by industry, region, credit rating and other parameters. To reduce risk while making more profitable credit and bond derivative trading decisions, the bank wanted to identify risk trends using five years of fixed income market data rather than the one month (400 million records) it currently stored online. This longer time frame would allow the bank to better evaluate trends and use that information to build a solid foundation for smarter, lower-risk trading decisions.

As a first step, the bank installed Hadoop and loaded five years of historical data that had previously been archived on tape. Next, they installed Cisco Data Virtualization to integrate the data sets, providing a common SQL access approach that made it easy for the analysts to integrate the data. Third, the analysts extended their risk management analytics to cover five years. Up and running in just a few months, the bank was able to use this long-term data to better manage fixed income trading risk.
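The common SQL access idea can be illustrated with a small Python sketch. It assumes two in-memory SQLite databases standing in for the online store and the Hadoop archive. The schema, values and function name are hypothetical, and this is not the bank's actual implementation or Cisco's API. It shows only how a single query can span hot and cold data.

```python
import sqlite3

# "main" stands in for the online (hot) store with recent data.
conn = sqlite3.connect(":memory:")
# "archive" stands in for the Hadoop store holding older history.
conn.execute("ATTACH DATABASE ':memory:' AS archive")

conn.executescript("""
CREATE TABLE main.trades (trade_date TEXT, industry TEXT, exposure REAL);
CREATE TABLE archive.trades (trade_date TEXT, industry TEXT, exposure REAL);

INSERT INTO main.trades VALUES
    ('2014-05-02', 'Energy', 120.0),
    ('2014-05-03', 'Banking', 80.0);
INSERT INTO archive.trades VALUES
    ('2010-03-15', 'Energy', 200.0),
    ('2012-11-20', 'Banking', 40.0),
    ('2013-07-01', 'Energy', 60.0);
""")

# One SQL statement combines both stores; the analyst sees a single data set.
FEDERATED_QUERY = """
SELECT industry, SUM(exposure) AS total_exposure, COUNT(*) AS trade_count
FROM (
    SELECT trade_date, industry, exposure FROM main.trades
    UNION ALL
    SELECT trade_date, industry, exposure FROM archive.trades
)
WHERE trade_date >= ?
GROUP BY industry
ORDER BY industry
"""


def exposure_by_industry(connection, since):
    """Aggregate exposure per industry for trades on or after `since` (ISO date)."""
    return connection.execute(FEDERATED_QUERY, (since,)).fetchall()


one_month = exposure_by_industry(conn, "2014-05-01")
five_years = exposure_by_industry(conn, "2010-01-01")
```

Widening the time window changes only the date parameter, not the query or the analyst's tooling, which is the practical appeal of putting one SQL layer over both stores.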


To learn more about Cisco Data Virtualization, check out our Data Virtualization Video Portal.


Save Big Money with Big Data

Data in data warehouses doubles every 2.5 years. For users, this means more data to analyze, leading to better business outcomes. That’s the good news. The bad news is that this extra storage capacity and computing power comes at a cost. A high cost, it turns out.

So what is an enterprise to do?

Keep writing bigger and bigger checks to the data warehouse vendor? At least then the business can take advantage of the extra data.

Or should they move some of the lesser-used data to tape? That will save money. But it will also limit business access to this now “off-line” data which may mean missed business opportunities.

What if there was a third option that would preserve the on-line access for the business analysts and control these escalating costs for IT?

Cisco’s new Big Data Warehouse Expansion solution announced this week at Cisco Live provides this third option.

Log in here to access the presentations at Cisco Live on Cisco’s new Big Data Warehouse Expansion.

Cisco Big Data Warehouse Expansion is a new offering that combines hardware, software and services to help customers control the costs of their ever-expanding data warehouses by offloading infrequently used data to low-cost big data stores. Analytics are enriched as more data is retained and all data remains accessible.

Components in the solution include:

  • Cisco UCS optimized for big data stores.
  • Cisco Data Virtualization for federating multiple data sources.
  • Appfluent Visibility™ to deliver analytics on business activity and data usage across Teradata, Oracle / Exadata, IBM DB2, IBM Netezza, IBM® PureData™ for Analytics and Hadoop.
  • Cisco Services methodology for assessing, migrating, virtualizing and operating a logically expanded warehouse.
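As a rough illustration of the assessment step, the Python sketch below picks offload candidates from a query log. Everything in it is an assumption made for illustration: the log format, the table names, the 90-day window and the one-query threshold. It is not how Appfluent Visibility or the Cisco Services methodology actually works.

```python
from collections import Counter
from datetime import date


def offload_candidates(query_log, table_sizes_gb, as_of,
                       window_days=90, max_queries=1):
    """Return (table, size_gb) pairs queried at most `max_queries` times
    in the last `window_days` days, largest tables first.

    query_log      : iterable of (table_name, date_queried) pairs
    table_sizes_gb : dict mapping table_name -> size in GB
    """
    # Count only the queries that fall inside the look-back window.
    recent = Counter(
        table for table, day in query_log
        if 0 <= (as_of - day).days <= window_days
    )
    # Tables absent from the counter have a count of zero.
    candidates = [
        (table, size) for table, size in table_sizes_gb.items()
        if recent[table] <= max_queries
    ]
    # Largest tables first, since they free the most warehouse capacity.
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Hypothetical usage data.
log = [
    ("sales_2014", date(2014, 5, 1)),
    ("sales_2014", date(2014, 5, 10)),
    ("sales_2009", date(2013, 1, 5)),
    ("sales_2011", date(2014, 4, 20)),
]
sizes = {"sales_2014": 500, "sales_2009": 900, "sales_2011": 700}

result = offload_candidates(log, sizes, as_of=date(2014, 5, 19))
```

In this sample, the frequently queried sales_2014 table stays in the warehouse, while the two older tables are flagged for a move to the lower-cost store.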

If you are looking for a solution to your rising enterprise data warehouse costs, look no further than Cisco.

Follow us @CiscoDataVirt to stay up to date on the latest news!


Data Virtualization: Live at Cisco Live! San Francisco

It has been a great year for Data Virtualization at Cisco Live!   Milan, Melbourne, and Toronto were fantastic opportunities to introduce Data Virtualization to Cisco customer and partner audiences.  And we have saved the best for last with multiple activities at Cisco Live! San Francisco.

We kick things off on Monday May 18 with a by-invitation program for Cisco Data Virtualization customers and prospects.  We start the day at 3:00 with a special pass to John Chambers’ keynote address.  This is followed by a reception, data virtualization demo and tour in the World of Solutions hall.  And we close the evening with a dinner at one of San Francisco’s finest restaurants. Participants in this program return on Wednesday night for a special performance by Lenny Kravitz.   If you would like to join us, please contact Paul Torrento at ptorrent@cisco.com.

For those of you attending the full event, Data Virtualization is also featured in two sessions, both entitled Driving Business Outcomes for Big Data Environment. I will lead a quick summary session on Thursday at 11:15am, with Jim Green providing a deeper-dive technical session from 11:30-12:30 that day. In these sessions we will address one of the major issues organizations face as a consequence of exponential data growth: the huge expense required to upgrade capacity in their enterprise data warehouses. To avoid this spend, customers are looking for lower-cost alternatives such as offloading infrequently used data to Hadoop. In these sessions you will find out about Cisco’s complete solution, which combines Unified Computing System hardware, Data Virtualization software and a Services methodology.

Please also stop by the Data Virtualization booth in the Cisco Services pavilion where we can chat about your business outcome objectives and how data virtualization can help.

And if you can’t make it to Cisco Live! San Francisco, then no worries.  Just check out the recording of my colleague Peter Tran’s session, Utilizing Data Virtualization to Create More Business Agility and Better Decision-Making, from Cisco Live! Milan.  It’s a great crash course intro to data virtualization.


How Data Virtualization Helps Data Scientists

By now it is clear that big data analytics opens the door to unprecedented analytic opportunities for business innovation, customer retention and profit growth. However, a shortage of data scientists is creating a bottleneck as organizations move from early big data experiments into larger scale adoption. This constraint limits big data analytics and the positive business outcomes that could be achieved.


Click on the photo to hear from Comcast’s Jason Hull, Data Integration Specialist, about how his team uses data virtualization to get what they need done, faster.

It’s All About the Data

As every data scientist will tell you, the key to analytics is data. The more data the better, including big data as well as the myriad other data sources both in the enterprise and across the cloud. But accessing and massaging this data, in advance of data modeling and statistical analysis, typically consumes 50% or more of any new analytic development effort.

• What would happen if we could simplify the data aspect of the work?
• Would that free up data scientists to spend more time on analysis?
• Would it open the door for non-data scientists to contribute to analytic projects?

SQL is the key. Because of its ease of use and power, it has been the predominant method for accessing and massaging data for the past 30 years. Nearly all non-data scientists in IT can use SQL to access and massage data, but very few know MapReduce, the programming model traditionally used to access data in Hadoop sources.
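To see why this matters, compare the same aggregation written both ways. The sketch below is in Python with hypothetical sample data. The SQL version runs against an in-memory SQLite table, and the MapReduce version is a simplified, single-process imitation of the map, shuffle and reduce phases rather than real Hadoop code.

```python
import sqlite3
from collections import defaultdict

# Hypothetical click counts per region.
EVENTS = [("north", 3), ("south", 5), ("north", 4), ("east", 1)]


def total_with_sql(events):
    """Declarative version: one GROUP BY statement states the result wanted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (region TEXT, clicks INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", events)
    rows = conn.execute(
        "SELECT region, SUM(clicks) FROM events GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return rows


def total_with_mapreduce(events):
    """Procedural version: the developer spells out every phase by hand."""
    # Map phase: emit a (key, value) pair for each input record.
    mapped = [(region, clicks) for region, clicks in events]

    # Shuffle phase: group all values that share a key.
    grouped = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)

    # Reduce phase: collapse each key's values into a single total.
    return sorted((key, sum(values)) for key, values in grouped.items())


sql_result = total_with_sql(EVENTS)
mapreduce_result = total_with_mapreduce(EVENTS)
```

Both functions return the same totals. The difference is that the SQL version is something most IT staff can already read and write.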

How Data Virtualization Helps

“We have a multitude of users…from BI to operational reporting, they are constantly coming to us requesting access to one server or another…we now have that one central place to say ‘you already have access to it’ and they immediately have access rather than having to grant access outside of the tool” -Jason Hull, Comcast

Data virtualization offerings, like Cisco’s, can help organizations bridge this gap and accelerate their big data analytics efforts. Cisco was the first data virtualization vendor to support Hadoop integration with its June 2011 release. This standardized SQL approach augments specialized MapReduce coding of Hadoop queries. By simplifying access to Hadoop data, organizations could for the first time use SQL to include big data sources, as well as enterprise, cloud and other data sources, in their analytics.

In February 2012, Cisco became the first data virtualization vendor to enable MapReduce programs to easily query virtualized data sources, on-demand with high performance. This allowed enterprises to extend MapReduce analyses beyond Hadoop stores to include diverse enterprise data previously integrated by the Cisco Information Server.

In 2013, Cisco maintained its big data integration leadership with updates of its support for Hive access to the leading Hadoop distributions including Apache Hadoop, Cloudera Distribution (CDH) and Hortonworks (HDP). In addition, Cisco now also supports access to Hadoop through HiveServer2 and Cloudera CDH through Impala.

Others, beyond Cisco, recognize this beneficial trend. In fact, Rick van der Lans, noted Data Virtualization expert and author, recently blogged on future developments in this area in Convergence of Data Virtualization and SQL-on-Hadoop Engines.

So if your organization’s big data efforts are slowed by a shortage of data scientists, consider data virtualization as a way to break the bottleneck.
