Cisco Blogs
Share

Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization


December 3, 2014 - 1 Comment

More data allows for better and more expansive analysis. And better analysis is a critical success factor for businesses today.

But most data warehouses use the once-in-never-out principle when storing data. So whenever new business activities occur, new data is added without removing old data to make room. New data sources, such as data from social media networks, open data sources, and public web services further expand the warehouse. Unfortunately, all this growth comes at a cost.

Is there a way you can have your cake and eat it too?

With Hadoop and Cisco Big Data Warehouse Expansion, you can.

Disadvantages of More Data

While everyone understands the business advantage that can be derived from analyzing more data, not everyone understands the disadvantages that can occur including:

  • Expensive data storage: Data warehouse costs include hardware costs, management costs, and database server license fees.  These grow in line with scale.
  • Poor query performance: The bigger the database tables, the slower the queries.
  • Poor loading performance: As tables grow, loading new data also slows down.
  • Slow backup/recovery: The larger the database, the longer the backup and restore process.
  • Expensive database administration: Larger databases require more database administration including tuning and optimizing the database server, the tables, the buffer, and so on.

Three Options to Control Costs

The easiest way to control data warehouse costs is to simply remove data, especially the less-frequently used or older data. But then this data can no longer be analyzed.

Another option is to move the lesser-used data to tape. This option provides cost savings, and in an emergency, the data can be reloaded from tape. But analysis has now become EXTREMELY difficult.

The third option is to offload lesser-used data to cheaper online data storage, with Hadoop the obvious choice. This provides a 10x cost savings over traditional databases, while retaining the online access required for analysis.

This is the “have your cake and eat it too” option.

The Fast Path to Transparent Offloading

Cisco provides a packaged solution called Cisco Big Data Warehouse Expansion, which includes the data virtualization software, hardware, and services required to accelerate all the activities involved in offloading data from a data warehouse to Hadoop.

And to help you understand how it works, Rick van der Lans, data virtualization’s leading independent analyst, recently wrote a step-by-step white paper, Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization, that explains everything you need to do.

Read The White Paper

Download Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization here.

 

Learn More

To learn more about Cisco Data Virtualization, check out our page.

Join the Conversation

Follow us @CiscoDataVirt.

In an effort to keep conversations fresh, Cisco Blogs closes comments after 60 days. Please visit the Cisco Blogs hub page for the latest content.

1 Comments

  1. Hadoop remains an overly complex beast for most enterprises, one reason that 43 per cent of Hortonworks’ revenue derives from low, low-margin services. While many people know how to pronounce Hadoop, far fewer know how to implement it, or why they should. For hadoop training bangalore