It is an exciting day for Cisco Data Virtualization, our data integration software that connects all kinds of data from across the network and makes it appear as if it is in one place and in one consolidated view. To see it in action, check out this video on how we replaced Denodo with our own data virtualization technology at Cisco.
Today at Data Virtualization Day, in New York City, I will be joined by customers, partners and industry experts as we launch a major update to our flagship data virtualization platform, Cisco Information Server (CIS). CIS 7.0 will enable IT departments to deliver self-service data access and enable business agility like never before.
My favorite part of Data Virtualization Day is the time I get to spend with our customers and partners, talking about shared successes and upcoming product enhancements. Since I joined Cisco through our acquisition of Composite Software in July 2013, data virtualization has been a key piece of our portfolio and a vital solution to the challenges our customers face from the Internet of Everything (IoE), Cloud and Big Data trends.
Data is growing faster than ever before. The majority of data is generated automatically by connected devices, with up to 50 billion devices expected by the year 2020. The data explosion is the result of the IoE, the hyperconnection of people, process, data, and things that will create new capabilities, richer experiences, and unprecedented economic opportunities for businesses, individuals and countries that have ‘IoE Ready’ strategies, infrastructure and technical capabilities in place.
Cisco Data Virtualization is a key part of being ‘IoE Ready’ because it connects device data, big data, data in the cloud and traditional enterprise data in new and extraordinary ways. Organizations that tap into this data pool will be able to use it strategically to monitor customer sentiment and behaviors, identify market and competitive changes, and anticipate market transitions, while optimizing the performance of assets and operations and achieving the utmost business agility. This will separate the market leaders from the rest of the pack and turn the challenges of the IoE, Cloud and Big Data into amazing opportunities.
Many organizations are shifting traditional data center environments to cloud data environments in order to optimize data center investment, leading to more hybrid IT environments. Cisco Data Virtualization truly enables a hybrid IT model by helping our customers live in a “world of many clouds” – connecting people, communities and organizations with intelligent networking capabilities that unify resources within and between data centers and across clouds. Now our customers can deploy any hybrid IT mix they desire while retaining the access and insights they require, free from the constraints of traditional data center operations and economics.
With the pace of worldwide data growth accelerating, organizations that use innovative methods for storing, accessing and analyzing data will thrive amongst their competition. There has never been a more exciting time in the history of technology, and data virtualization is at the heart of how our customers are gaining a business advantage from all of the new data at their fingertips.
Happy Data Virtualization Day!
To learn more about Cisco Data Virtualization, check out our page.
Join the Conversation
Follow @CiscoDataVirt #DVDNYC
Tags: analytics, Big Data, cloud, data, data virtualization, Internet of Everything
At the June Hadoop Summit in San Jose, Hadoop was re-affirmed as the data center “killer app,” riding an avalanche of Enterprise Data, which is growing 50x annually through 2020. According to IDC, the Big Data market itself is growing six times faster than the rest of IT. Every major tech company, old and new, is now driving Hadoop innovation, including Google, Yahoo, Facebook, Microsoft, IBM, Intel and EMC, building value-added solutions on open source contributions by Hortonworks, Cloudera and MapR. Cisco’s surprisingly broad portfolio will be showcased at Strataconf in New York on Oct. 15 and at our October 21st executive webcast. In this third post of a blog series, we preview the power of Application Centric Infrastructure for the emerging Hadoop eco-system.
Why Big Data?
Organizations of all sizes are gaining insight and creativity into use cases that leverage their own business data.
The use cases grow quickly as businesses realize their “ability to integrate all of the different sources of data and shape it in a way that allows business leaders to make informed decisions.” Hadoop enables customers to gain insight from both structured and unstructured data. Data types and sources can include 1) business applications (OLTP, ERP, CRM systems), 2) documents and emails, 3) web logs, 4) social networks, 5) machine/sensor-generated data and 6) geo-location data.
IT operational challenges
Even modest-sized jobs require clusters of 100 server nodes or more for seasonal business needs. While Hadoop is designed to scale out on commodity hardware, most IT organizations face the challenge of extreme demand variations in bare-metal (non-virtualizable) workloads. Furthermore, these workloads are requested by multiple Lines of Business (LOB), with increasing urgency and frequency. Ultimately, 80% of the costs of managing Big Data workloads will be OpEx. How do IT organizations quickly finish jobs and re-deploy resources? How do they improve utilization? How do they maintain security and isolation of data in a shared production infrastructure?
And with the release of Hadoop 2.0 almost a year ago, cluster sizes are growing due to:
- Expanding data sources and use-cases
- A mixture of different workload types on the same infrastructure
- A variety of analytics processes
In Hadoop 1.x, compute performance was paramount. But in Hadoop 2.x, network capabilities will be the focus, due to larger clusters, more data types, more processes and mixed workloads. (see Fig. 1)
ACI powers Hadoop 2.x
Cisco’s Application Centric Infrastructure is a new operational model enabling Fast IT. ACI provides a common policy-based programming approach across the entire ACI-ready infrastructure, beginning with the network and extending to all its connected end points. This drastically reduces cost and complexity for Hadoop 2.0. ACI uses Application Policy to:
– Dynamically optimize cluster performance in the network
– Redeploy resources automatically for new workloads for improved utilization
– Ensure isolation of users and data as resource deployments change
Let’s review each of these in order:
Cluster Network Performance: It’s crucial to improve traffic latency and throughput across the network, not just within each server.
- Hadoop copies and distributes data across servers to maximize reliability on commodity hardware.
- The large collection of processes in Hadoop 2.0 is usually spread across different racks.
- Mixed workloads in Hadoop 2.0 support interactive and real-time jobs, resulting in the use of more on-board memory and different payload sizes.
As a result, server IO bandwidth is increasing, which will place heavier loads on 10 gigabit networks. ACI policy works with deep telemetry embedded in each Nexus 9000 leaf switch to monitor and adapt to network conditions.
Using policy, ACI can dynamically 1) load-balance Big Data flows across racks on alternate paths and 2) prioritize small data flows ahead of large flows (which use the network much less frequently but consume bandwidth and buffer). Both of these can dramatically reduce network congestion. In lab tests, we are seeing flow completion nearly an order of magnitude faster (for some mixed workloads) than without these policies enabled. ACI can also estimate and prioritize job completion. This will be important as Big Data workloads become pervasive across the Enterprise. For a complete discussion of ACI’s performance impact, please see a detailed presentation by Samuel Kommu, chief engineer at Cisco for optimizing Big Data workloads.
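To see why serving small flows ahead of large ones shortens flow completion, consider a toy model in which several flows share one link that transmits one flow at a time. This is a sketch for intuition only, not ACI’s actual mechanism; the flow sizes, the link rate and the function names are assumptions made up for the example.

```python
# Toy model (an assumption for illustration, not ACI's implementation):
# flows share one link that serves one flow at a time at a fixed rate.

def completion_times(flow_sizes_mb, link_rate_mb_per_s):
    """Return the completion time of each flow, served in the given order."""
    elapsed = 0.0
    times = []
    for size in flow_sizes_mb:
        elapsed += size / link_rate_mb_per_s
        times.append(elapsed)
    return times


def mean(values):
    return sum(values) / len(values)


# Hypothetical mixed workload: one bulk transfer queued ahead of small flows.
flows = [1000, 10, 10, 10]  # sizes in MB
rate = 100                  # link rate in MB/s

arrival_order = completion_times(flows, rate)
small_first = completion_times(sorted(flows), rate)

print("mean completion, arrival order:", mean(arrival_order))
print("mean completion, small flows first:", mean(small_first))
```

In this made-up case the bulk flow finishes only slightly later when it goes last (about 10.3 s instead of 10.0 s), while the three small flows finish in well under a second instead of waiting more than ten seconds behind it.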
Resource Utilization: In general, the bigger the cluster, the faster the completion time. But since Big Data jobs are initially infrequent, CIOs must balance responsiveness against utilization. It is simply impractical for many mid-sized companies to dedicate large clusters for the occasional surge in Big Data demand. ACI enables organizations to quickly redeploy cluster resources from Hadoop to other sporadic workloads (such as CRM, Ecommerce, ERP and Inventory) and back. For example, the same resources could run Hadoop jobs nightly or weekly when other demands are lighter. Resources can be bare-metal or virtual depending on workload needs. (see Figure 2)
How does this work? ACI uses application policy profiles to programmatically re-provision the infrastructure. IT can use a different profile to describe each application’s needs, including those of the Hadoop eco-system. The profile contains the application’s network policies, which the Application Policy Infrastructure Controller translates into a complete network topology. The same profile contains compute and storage policies used by other tools, such as Cisco UCS Director, to provision compute and storage.
Data Isolation and Security: In a mature Big Data environment, Hadoop processing can occur between many data sources and clients. Data is most vulnerable during job transitions or re-deployment to other applications. Multiple corporate databases and users need to be correctly isolated to ensure compliance. A patchwork of security software such as perimeter security is error-prone, static and consumes administrative resources.
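As a rough sketch of the idea, a profile can be pictured as one record whose network part goes to the network controller and whose compute and storage parts go to a provisioning tool. Everything below is hypothetical: the class, the function names and the policy keys are invented for illustration and do not reflect the APIC object model or the Cisco UCS Director API.

```python
from dataclasses import dataclass, field

# Hypothetical structure for illustration only. It is not the APIC object
# model or the Cisco UCS Director API.


@dataclass
class ApplicationProfile:
    name: str
    network_policies: dict = field(default_factory=dict)
    compute_policies: dict = field(default_factory=dict)
    storage_policies: dict = field(default_factory=dict)


def network_plan(profile):
    """Part of the profile a network controller would turn into a topology."""
    return {"application": profile.name, "network": profile.network_policies}


def compute_storage_plan(profile):
    """Part of the profile a compute and storage provisioning tool would use."""
    return {
        "application": profile.name,
        "compute": profile.compute_policies,
        "storage": profile.storage_policies,
    }


def redeploy(profile):
    """Re-provision the shared resources by applying a different profile."""
    return network_plan(profile), compute_storage_plan(profile)


# All policy names and values below are made up.
hadoop = ApplicationProfile(
    name="hadoop-nightly",
    network_policies={"prioritize_small_flows": True, "isolated": True},
    compute_policies={"nodes": 100, "bare_metal": True},
    storage_policies={"local_disks_per_node": 12},
)
crm = ApplicationProfile(
    name="crm-daytime",
    network_policies={"isolated": True},
    compute_policies={"nodes": 20, "bare_metal": False},
    storage_policies={"shared_volume_gb": 500},
)

for active in (hadoop, crm):
    net, infra = redeploy(active)
    print(net["application"], net["network"], infra["compute"])
```

Keeping all three policy sets in a single profile means that swapping workloads on the same hardware is a matter of applying a different record rather than reconfiguring each layer by hand.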
In contrast, ACI can automatically isolate the entire data path through a programmable fabric according to pre-defined policies. Access policies for data vaults can be preserved throughout the network when the data is in motion. This can be accomplished even in a shared production infrastructure across physical and virtual end points.
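One way to picture isolation by pre-defined policy is a default-deny table of permitted group pairs that is checked for every hop along a data path. The sketch below is a simplified assumption for illustration; the group names and functions are hypothetical and are not ACI configuration.

```python
# Hypothetical policy table: (source group, destination group) pairs that
# may communicate. Anything not listed is denied by default.
ALLOWED = {
    ("analysts", "hadoop-cluster"),
    ("hadoop-cluster", "sales-db"),
}


def is_allowed(source_group, destination_group, allowed=ALLOWED):
    """A pair may communicate only if the policy lists it explicitly."""
    return (source_group, destination_group) in allowed


def path_is_allowed(hops, allowed=ALLOWED):
    """Every consecutive pair of hops along the data path must be permitted."""
    return all(is_allowed(a, b, allowed) for a, b in zip(hops, hops[1:]))


print(path_is_allowed(["analysts", "hadoop-cluster", "sales-db"]))
print(path_is_allowed(["analysts", "hadoop-cluster", "hr-db"]))
```

Because the check is driven by the policy table rather than by where a workload happens to run, the same rules still apply after resources are redeployed.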
As organizations of all sizes discover ways to use Big Data for business insights, their infrastructure must become far more performant, adaptable and secure. Investments in fabric, compute and storage must be leveraged across multiple Big Data processes and other business applications with agility and operational simplicity.
Leading the growth of Big Data, the Hadoop 2.x eco-system will place particular stresses on data center fabrics. New mixed workloads are already using 10 Gigabit capacity in larger clusters and will soon demand 40 Gigabit fabrics. Network traffic needs continuous optimization to improve completion times. End to end data paths must use consistent security policies between multiple data sources and clients. And the sharp surges in bare-metal workloads will demand much more agile ways to swap workloads and improve utilization.
Cisco’s Application Centric Infrastructure leverages a new operational and consumption model for Big Data resources. It dynamically translates existing policies for applications, data and clients into fully provisioned networks, compute and storage. Working with Nexus 9000 telemetry, ACI can continuously optimize traffic paths and enforce policies consistently as workloads change. The solution provides a seamless transition to the new demands of Big Data.
To hear about Cisco’s broader solution portfolio, be sure to register for the October 21st executive webcast ‘Unlock Your Competitive Edge with Cisco Big Data Solutions.’ And stay tuned for the next blog in the series, from Andrew Blaisdell, which showcases the ability to predictably deliver intelligence-driven insights and actions.
Tags: ACI, analytics, Big Data, Cisco Application Centric Infrastructure, Nexus 9000, UCS, UnlockBigData
As we think of Healthcare and Big Data Analytics, some of the topics that come to the forefront are personalized medicines, managing readmissions, identifying health risk indexes and many more. While each of these is an important area that benefits from the power of Big Data Analytics, one area that is table stakes in Healthcare is protecting critical care systems. Can the power of big data analytics provide us a protective shield?
Before we dive in, the questions that come up are why Healthcare Security is any different, and why we should use Big Data Analytics instead of the traditional approaches to protection that we have today.
This was the topic of my presentation at the recently concluded COM.BigData 2014 conference in Washington DC: ‘Dynamic Protection for Critical Care Systems using Cisco Cloud web security (CWS): Unleashing the power of Big Data Analytics’.
While the Health IT transitions are opening up healthcare access in newer ways that have significant security implications, there are additional trends that are making Healthcare a prime target.
Targeting Healthcare Industry
According to the World Privacy Forum, the street value of a stolen healthcare record is about $50, compared with $1 for a stolen social security number. The Ponemon Institute, in its third annual report on medical identity theft (2012), estimates the economic impact of medical identity theft at $41.3 billion per year, a significant increase from $30.9 billion per year in 2011. In addition, new attack models such as ransomware can capitalize on the sensitivity of the situation, where the question is not about losing your data, but your life. Adding all these up, the healthcare industry is an attractive target.
The expanded boundaries
Tags: analytics, Big Data, Cloud web security, critical systems, Dynamic Protection, healthcare IT, security
Finding a molecule with the potential to become a new drug is complicated. It’s time-consuming. Fewer than 10 percent of molecules or compounds discovered are promising enough to enter the development pipeline. And fewer of those ever come to market. At Pfizer, if it were not for data virtualization, it would be even more challenging.
Years of Data, Thousands of Decisions
The pipeline from discovery to licensing occurs in phases over 15-20 years, and few compounds complete the journey. The initial study phase represents a multimillion-dollar investment decision. Each succeeding phase – proof-of-concept study, dose range study, and large-scale population study – represents a magnitude-larger investment and risk than the one before.
Senior management and portfolio managers need to know:
- Which projects should the company fund?
- Which compounds are meeting Pfizer’s high standards for efficacy and safety?
- What are scientists discovering in clinical trials?
Portfolio and project managers routinely make complex tactical decisions such as:
- How to allocate scarce R&D resources across different projects?
- How to prioritize multiple development scenarios?
- What is the impact of a clinical trial result on downstream manufacturing?
Before Pfizer adopted Cisco Data Virtualization, getting useful data to answer these questions took weeks or months. Why so long? The problem has several dimensions. First, each phase of development generates massive amounts of data and requires extensive analysis to provide an accurate picture. Second, data comes from Pfizer research scientists all over the world; from physicians; clinical trials; product owners and managers; marketing teams; and hundreds of different back-end systems. Third, the scientific method is based on trial and error, with unpredictable results. Thus no two decisions are alike, and the specific data required for each decision is unique.
Data Virtualization Provides the Solution
To support their decision-making needs, Pfizer needed a solution that would allow them to pull all this diverse information together in an agile, ad hoc way. Cisco Data Virtualization – agile data integration software that makes it easy to access and gather relevant data, no matter where data sources reside – provided the solution.
With Cisco Data Virtualization, Pfizer’s research and portfolio data resides in one virtual place and provides “one version of the truth” that is available for everyone to use to address the myriad decisions that arise. Further, by applying virtualization instead of consolidation, infrastructure costs are also reduced.
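The general idea of a virtual view, as opposed to a consolidated copy, can be sketched as follows: each source is queried where it lives at the moment the view is read, and the rows are joined on the fly. This is a minimal illustration using Python’s standard library, not Cisco Data Virtualization itself; the two sources, the table and column names, and all sample values are made up.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a relational system holding project budgets.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE projects (compound TEXT, budget_musd REAL)")
db.executemany(
    "INSERT INTO projects VALUES (?, ?)",
    [("CMP-1", 12.5), ("CMP-2", 40.0)],
)

# Source 2 (hypothetical): trial results exported as CSV by another system.
TRIALS_CSV = (
    "compound,phase,outcome\n"
    "CMP-1,proof-of-concept,met endpoint\n"
    "CMP-2,dose range,ongoing\n"
)


def portfolio_view():
    """Query both sources at read time and join rows on the compound name."""
    budgets = dict(db.execute("SELECT compound, budget_musd FROM projects"))
    for row in csv.DictReader(io.StringIO(TRIALS_CSV)):
        yield {**row, "budget_musd": budgets.get(row["compound"])}


for record in portfolio_view():
    print(record)
```

Because nothing is copied into a separate store, a change made in either source shows up the next time the view is read.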
According to Pfizer, “data virtualization is far less expensive than building specialized data marts to answer questions. With Cisco Data Virtualization, our portfolio teams get answers in hours or days for about one-tenth the cost.”
This data virtualization progress has not gone unnoticed. At Data Virtualization Day 2012, Pfizer was awarded the “Data Virtualization Champion” award for consistently achieving and promoting data virtualization value within the organization and across the industry.
Learn from other leaders in the industry and see who wins this year’s Data Virtualization Leadership Awards at Data Virtualization Day 2014 on October 1. Register now!
To read more about this Pfizer case study click here.
To learn more about Cisco Data Virtualization, check out our page.
Join the Conversation
Follow us @CiscoDataVirt #DVDNYC
Tags: analytics, Big Data, cloud, data, data virtualization, Internet of Everything
Responses in a recent Cisco-sponsored Cloud Security Alliance survey illustrate that many data privacy challenges previously cast in the “too hard” basket can be more readily navigated through focusing on universal principles across Cloud, IoT and Big Data. Survey responses showed a surprisingly strong level of interest in a global consumer bill of rights, and responses were overwhelmingly in favor of the view that the OECD data privacy principles facilitate the trends of Cloud, IoT and Big Data.
Following are the most significant findings:
Data Residency and Sovereignty
Data residency and sovereignty challenges continue to emerge. However, there was a common theme of respondents identifying “personal data” and Personally Identifiable Information (PII) as the data that is required to remain resident in most countries.
73 percent of respondents indicated that there should be a call for a global consumer bill of rights and saw the United Nations as fostering that. This is of great significance given the harmonization efforts taking place in Europe, with a single EU data Privacy Directive to represent 28 European member states, as well as the renewed calls for a Consumer Bill of Privacy Rights in the United States and cross-border privacy arrangements in Australia and Asia.
Finally, we explored whether the OECD privacy principles, which have been very influential in the development of many data privacy regulations, also facilitate popular trends in cloud, IoT and big data initiatives or create room for tension. The responses were very much in favor of facilitating the various trends.
The survey report includes an executive summary from Dr. Ann Cavoukian, former Information and Privacy Commissioner of Ontario, Canada, and commentary from other industry experts on the positive role that privacy can play in developing new and innovative Cloud, IoT and Big Data solutions. Read the Data Protection Heat Index survey report.
Tags: Big Data, cloud, IoE, privacy, report, security, survey