To help organizations that aspire to apply the power of big data enterprise-wide, Cisco provides a powerful, efficient, and secure infrastructure and a wide array of analytics solutions. In our previous blogs, others have highlighted the benefits Cisco provides: scalability, the ability to process both real-time and historical data with predictable, high performance, and the comprehensive management automation enterprises will need to keep pace with big data in the IoE era. Today, I’d like to begin a conversation about how enterprises can secure their increasingly distributed networks, and the data being transported across them, as we move toward an environment comprising 50 billion connected devices (just five years from now).
One of the key drivers of Big Data is the Internet of Things (IoT), in which every connected ‘thing’ will be capable of producing data. IoT has become a popular topic of discussion amongst security company executives, analysts, and other industry pundits. As they discuss the technical details, it quickly becomes evident that many of the most experienced security professionals still approach IoT with an IT-centric mindset. Of course, they are partially correct. Securing an escalating volume of data requires rethinking our approach to security. Not only do security devices need to be faster, they need to navigate issues very specific to data centers and complex data flows. They need to be inserted as close to the traffic flow as possible, such as being positioned inline in East/West traffic flowing across the data center. They need to be able to track and secure asymmetric traffic, often across multiple locations. They need to be able to blend corporate policy with public standards. Finally, they need to move seamlessly across physical, virtual, and cloud environments in order to ensure consistent policy enforcement. Gone are the days when we could just hairpin traffic out of the data center to be inspected elsewhere. Speed and agility do not allow for that sort of bottleneck.
However, IoT is not only about the billions of new connected objects and inspecting the data they are producing. While the dramatic increase in the number and types of connected objects certainly expands the attack surface and dramatically increases the diversity of threats, these are only part of the IoT security challenge. Another new challenge is the convergence of the organization’s existing IT network with the operational technology (OT) network (e.g., manufacturing floors, energy grids, transportation systems, and other industrial control systems). These new environments, usually omitted from traditional IT thinking, expand the depth of security challenges and make threat remediation remarkably more complex.
Big Data is not just being generated by web-enabled toothbrushes or smart appliances. For Big Data to be useful, the data that is collected needs to be actionable. Converging data needs to be able to turn on or off water supplies, ramp up manufacturing floors, redirect traffic, or manage the flow of electricity during peak usage. As a result, while IT and OT were once separate networks, they are now simply different environments within a single extended network, but by no means are they the same! The architectures, operational needs, platforms, and protocols are vastly different for each of them, and drive radically different security requirements. Consequently, security architectures, solutions, and policies that have proven effective for years in the IT world often don’t apply in OT environments, so attempting to enforce identical security policies across the extended network is doomed to failure.
Protecting data confidentiality, especially at high volume, is IT’s primary concern, so when faced with a threat, a common immediate response is to quarantine or shut down the affected system. But OT runs critical, 24×7 processes, including critical infrastructures, so data availability is their primary concern. Shutting down these processes can cost an organization millions of dollars, and actually put the public at risk, so the cost of remediation may be greater than simply dealing with the aftermath of an infection. In addition, because OT is a human-based operation in what can often be dangerous working conditions, their focus is also on the safety of their operation as well as their employees. Because of these main differences, IT and OT teams have traditionally approached security in completely different ways. While IT uses a variety of cybersecurity controls to defend the network against attack and to protect data confidentiality, OT views security more in terms of secure physical access, as well as operational and personnel safety.
Securing IoT networks that need to participate in and respond to the demands of Big Data must go beyond today’s thinking. Rather than focusing on individual security devices, solutions need to be networked so they can collaborate to process increasing volumes of data into comprehensive, actionable security intelligence. By combining numerous systems, including cyber and physical security solutions, IoT-enabled security driven by Big Data can protect the entire interconnected environment from outside threats, monitor and secure critical data and infrastructure inside specific domains, and even improve employee safety. As a best practice, IT should maintain centralized management over the entire security solution, including the use of open standards in order to see and coordinate with public standards, but IT also needs to develop a high level of sensitivity to and understanding of the specific needs of OT. This will allow them to enforce differentiated security policies that meet the specific needs of the different parts of their network and provide localized control over critical OT systems while dealing with the operational demands of Big Data.
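To make the idea of differentiated policies under centralized management a little more concrete, here is a minimal Python sketch. It assumes a single, centrally maintained table that maps each part of the network to a default threat response. The zone names and action names are hypothetical illustrations, not features of any Cisco product.

```python
from enum import Enum


class Zone(Enum):
    IT = "it"
    OT = "ot"


# Hypothetical, centrally managed policy table. The action names are
# illustrative only and do not correspond to any real product feature.
RESPONSE_POLICY = {
    # IT puts confidentiality first, so cutting the system off is acceptable.
    Zone.IT: "quarantine_host",
    # OT puts availability and safety first, so nothing is shut down automatically.
    Zone.OT: "alert_operators_and_keep_running",
}


def respond_to_threat(zone: Zone) -> str:
    """Return the default response for a threat detected in the given zone."""
    return RESPONSE_POLICY[zone]


print(respond_to_threat(Zone.IT))
print(respond_to_threat(Zone.OT))
```

The point of the sketch is only that one team can own the table while each environment keeps a response suited to its own priorities.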
At the end of the day, IT and OT need to work together for the common good of the entire IoT implementation, locally and globally, thereby driving truly pervasive, customized security across the extended network.
Cisco can help organizations deliver the security they need to succeed in the IoT and IoE eras. To hear more about Cisco’s big data story, join us for a webcast at 9 AM Pacific time on October 21st entitled ‘Unlock Your Competitive Edge with Cisco Big Data and Analytics Solutions.’ #UnlockBigData
As the pace of big data adoption increases, speeding delivery of new big data and analytics solutions will become increasingly important. To find out how Cisco is helping our customers do just that, watch for Mike Flannagan’s upcoming blog “Aligning Solutions to Meet Our Customers’ Data Challenges” this Thursday. #UnlockBigData
In our previous big data blogs, a number of my Cisco associates have talked about the right infrastructure, the right sizing, the right integrated infrastructure management and the right provisioning and orchestration for your clusters. But, to gain the benefits of pervasive use of big data, you’ll need to accelerate your big data deployments and make a seamless pivot of your “back of the data center” science experiment into the standard data center operational processes to speed delivery of the value of these new analytics workloads.
If you are using a “free” (hint: nothing’s free), or open source workload scheduler, or even a solution that can manage day-to-day batch jobs, you may run into problems right off the bat. Limitations may come in the form of dependency management, calendaring, error recovery, role-based access control and SLA management.
And really, this is just the start of your needs for full-scale, enterprise-grade workload automation for Big Data environments! As the number of your mission-critical big data workloads increases, predictable execution and performance will become essential.
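As a rough illustration of two of the capabilities listed above, dependency management and error recovery, the following Python sketch orders a set of jobs so that each one runs only after the jobs it depends on, and retries a failing step a limited number of times. The job names and the retry limit are made up for the example; this is a sketch of the general technique, not how any particular scheduler works.

```python
from graphlib import TopologicalSorter

# Hypothetical nightly pipeline: each job maps to the set of jobs it depends on.
JOBS = {
    "ingest_logs": set(),
    "ingest_crm": set(),
    "clean": {"ingest_logs", "ingest_crm"},
    "aggregate": {"clean"},
    "report": {"aggregate"},
}


def run_order(jobs):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(jobs).static_order())


def run_with_retry(action, max_attempts=3):
    """Call action() until it succeeds or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except RuntimeError:
            # Re-raise only when the final attempt has failed.
            if attempt == max_attempts:
                raise


order = run_order(JOBS)
print(order)
```

Calendaring, role-based access control and SLA tracking are deliberately left out; each would add a further layer on top of this ordering logic.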
Lucky for you, Cisco has exactly what you need!
Enterprises are challenged to keep pace. That’s a big problem in a world where data and analytics form the competitive battlefield. With trends like Big Data, Cloud and the Internet of Everything transforming our world, the possibilities are staggering.
Unfortunately, these possibilities also come with challenges:
With 25 billion connected devices by the end of 2015 and another 25 billion by 2020, data is and will continue to be sprawled out over billions of devices.
Data distribution reaches far beyond the traditional enterprise data warehouse; today data is everywhere, across hybrid IT environments that span on-premise, private cloud and public cloud.
With data the lifeblood of today’s modern enterprise, business and IT stakeholders must radically change how they partner together in order to extract the most useful insight for all users. An “all hands on deck” approach to data and analytics is needed.
These are challenges CIS 7.0 is engineered to address.
Self-Service Data Gateway for Business
Leveraging the new class of easy-to-use business intelligence tools such as Qliktech, Spotfire and Tableau, as well as the increasingly powerful and ubiquitous Excel, business users have become adept at visualizing and analyzing data without IT’s help. However, finding and accessing that data remains a big challenge, with long IT lead times frequently the only option. That is, until CIS 7.0.
CIS 7.0’s Business Directory is the first data virtualization offering designed exclusively for business self-service. Users apply search and categorization techniques to quickly find the data they’re looking for, and then use their business intelligence (BI) tool of choice to query it. The result is far faster time to insight and translates to better business outcomes sooner.
With Business Directory, IT creates a new partnership with the business. IT provides secure, curated data sets to the business. Then the business adds domain knowledge and analytic value on the path to insight. Using CIS 7.0’s scalable platform, IT can manage security profiles ensuring users see only the data for which they’re authorized. And as new business needs arise, IT can use the CIS Studio to quickly add new data sets, often in a day or less.
Connecting Data Globally, with Control
Data virtualization’s ability to connect and deliver data with agility is well understood. As data has become increasingly distributed, data virtualization adoption has skyrocketed. This has led to highly complex, global-scale data virtualization deployments.
CIS 7.0’s Deployment Manager simplifies management of mega-scale CIS 7.0 deployments. Deployment Manager automates the transfer of views, data services, caches, policies and more across multiple CIS 7 instances. These faster, lower-risk deployments provide the scalability desired, ensure compliance with software development life cycle processes and other governance practices, and reduce the cost of IT administration.
More Data Sources For Better Analysis
Everyone knows that if you can access more data sources, you can drive better, data-driven business outcomes. However, the rise of new fit-for-purpose data stores, from graph databases to Hadoop, as well as highly specialized industry solutions such as the PI System in upstream energy, has made it difficult for IT to integrate all these sources at business pace.
CIS 7.0’s Data Source Software Development Kit (SDK) accelerates data adapter development. Using Data Source SDK, Cisco, system integrators and customers can build high-performance data virtualization adapters for emerging and industry-specific data sources quickly and in a way that leverages Cisco development best practices and market-leading query optimization techniques.
Winning on the Data Battlefield
With Big Data, Cloud and the Internet of Everything disrupting data integration, the time is right for CIS 7.0. Business Directory addresses business demand for self-service data. Deployment Manager provides global-scale data virtualization with control. Data Source SDK extends data virtualization’s reach. These breakthroughs, on top of the industry’s leading data virtualization platform, will help our customers drive better business outcomes and outpace their competition.
CIS 7.0’s time is now! So get ready!
For those of you attending Data Virtualization Day, check out a sneak preview at the Solution Showcase. General availability via Cisco Support is scheduled for next month. Cisco Advanced Services and our many ATP partners are set to provide migration assistance.
To learn more about Cisco Data Virtualization, check out our page.
It is an exciting day for Cisco Data Virtualization, our data integration software that connects all kinds of data from across the network and makes it appear as if it is in one place and in one consolidated view. To see it in action, check out this video on how we replaced Denodo with our own data virtualization technology at Cisco.
Today at Data Virtualization Day, in New York City, I will be joined by customers, partners and industry experts as we launch a major update to our flagship data virtualization platform, Cisco Information Server (CIS). CIS 7.0 will enable IT departments to deliver self-service data access and enable business agility like never before.
My favorite part of Data Virtualization Day is the time I get to spend with our customers and partners, talking about shared successes and upcoming product enhancements. Since we joined Cisco through its acquisition of Composite Software in July 2013, data virtualization has been a key piece of our portfolio and a vital solution to our customers’ challenges brought on by the Internet of Everything (IoE), Cloud and Big Data trends.
Data is exploding now more than ever before. The majority of data is generated automatically by connected devices, with up to 50 billion devices expected by the year 2020. The data explosion is the result of the IoE, the hyperconnection of people, process, data, and things that will create new capabilities, richer experiences, and unprecedented economic opportunities for businesses, individuals and countries with ‘IoE Ready’ strategies, infrastructure and technical capabilities in place.
Cisco Data Virtualization is a key part of being ‘IoE Ready’ by connecting device data, big data, data in the cloud and traditional enterprise data in new and extraordinary ways. Organizations that tap into this data pool will be able to leverage it strategically to monitor customer sentiment and behaviors, identify market and competitive changes, and anticipate market transitions while optimizing performance of assets and operations and achieving the utmost business agility. It will separate the market leaders from the rest of the pack and will turn the challenges of the IoE, Cloud and Big Data into amazing opportunities.
Many organizations are shifting traditional data center environments to cloud data environments in order to optimize data center investment, leading to more hybrid IT environments. Cisco Data Virtualization truly enables a hybrid IT model by helping our customers live in a “world of many clouds” – connecting people, communities and organizations with intelligent networking capabilities that unify resources within and between data centers and across clouds. Now our customers can deploy any hybrid IT mix they desire while retaining the access and insights they require and free from the constraints of traditional data center operations and economics.
With the pace of worldwide data growth accelerating, organizations using innovative methods for storing, accessing and analyzing data will thrive amongst their competition. There has never been a more exciting time in the history of technology, and data virtualization is at the heart of how our customers are gaining a business advantage from all of the new data at their fingertips.
Happy Data Virtualization Day!
At the June Hadoop Summit in San Jose, Hadoop was re-affirmed as the data center “killer app,” riding an avalanche of Enterprise Data, which is growing 50x annually through 2020. According to IDC, the Big Data market itself is growing six times faster than the rest of IT. Every major tech company, old and new, is now driving Hadoop innovation, including Google, Yahoo, Facebook, Microsoft, IBM, Intel and EMC, building value-added solutions on open source contributions by Hortonworks, Cloudera and MapR. Cisco’s surprisingly broad portfolio will be showcased at Strataconf in New York on Oct. 15 and at our October 21st executive webcast. In this third post of a blog series, we preview the power of Application Centric Infrastructure for the emerging Hadoop eco-system.
Why Big Data?
Organizations of all sizes are gaining insight and creativity into use cases that leverage their own business data.
The use cases grow quickly as businesses realize their “ability to integrate all of the different sources of data and shape it in a way that allows business leaders to make informed decisions.” Hadoop enables customers to gain insight from both structured and unstructured data. Data types and sources can include 1) Business applications (OLTP, ERP, CRM systems), 2) Documents and emails, 3) Web logs, 4) Social networks, 5) Machine/sensor-generated data, 6) Geo location data.
IT operational challenges
Even modest-sized jobs require clusters of 100 server nodes or more for seasonal business needs. While Hadoop is designed for scale-out on commodity hardware, most IT organizations face the challenge of extreme demand variations in bare-metal (non-virtualizable) workloads. Furthermore, these workloads are requested by multiple Lines of Business (LOB), with increasing urgency and frequency. Ultimately, 80% of the costs of managing Big Data workloads will be OpEx. How do IT organizations quickly finish jobs and re-deploy resources? How do they improve utilization? How do they maintain security and isolation of data in a shared production infrastructure?
A mixture of different workload types on the same infrastructure
A variety of analytics processes
In Hadoop 1.x, compute performance was paramount. But in Hadoop 2.x, network capabilities will be the focus, due to larger clusters, more data types, more processes and mixed workloads. (see Fig. 1)
ACI powers Hadoop 2.x
Cisco’s Application Centric Infrastructure is a new operational model enabling Fast IT. ACI provides a common policy-based programming approach across the entire ACI-ready infrastructure, beginning with the network and extending to all its connected end points. This drastically reduces cost and complexity for Hadoop 2.0. ACI uses Application Policy to:
- Dynamically optimize cluster performance in the network
- Redeploy resources automatically for new workloads for improved utilization
- Ensure isolation of users and data as resource deployments change
Let’s review each of these in order:
Cluster Network Performance: It’s crucial to improve traffic latency and throughput across the network, not just within each server.
Hadoop copies and distributes data across servers to maximize reliability on commodity hardware.
The large collection of processes in Hadoop 2.0 is usually spread across different racks.
Mixed workloads in Hadoop 2.0 support interactive and real-time jobs, resulting in the use of more on-board memory and different payload sizes.
As a result, server IO bandwidth is increasing, which will place loads on 10 gigabit networks. ACI policy works with deep telemetry embedded in each Nexus 9000 leaf switch to monitor and adapt to network conditions.
Using policy, ACI can dynamically 1) load-balance Big Data flows across racks on alternate paths and 2) prioritize small data flows ahead of large flows (which use the network much less frequently but use up bandwidth and buffer). Both of these can dramatically reduce network congestion. In lab tests, we are seeing flow completion nearly an order of magnitude faster (for some mixed workloads) than without these policies enabled. ACI can also estimate and prioritize job completion. This will be important as Big Data workloads become pervasive across the Enterprise. For a complete discussion of ACI’s performance impact, please see a detailed presentation by Samuel Kommu, chief engineer at Cisco for optimizing Big Data workloads.
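To see why putting small flows ahead of large ones can shorten flow completion, consider a deliberately simplified Python model. It assumes a single link that sends flows back to back, with made-up flow sizes and link rate. It is not ACI’s algorithm and does not reproduce the lab results mentioned above; it only shows the arithmetic behind the idea.

```python
def completion_times(flow_sizes_mb, link_mb_per_s):
    """Completion time of each flow when flows are sent back to back in the given order."""
    elapsed = 0.0
    times = []
    for size in flow_sizes_mb:
        elapsed += size / link_mb_per_s
        times.append(elapsed)
    return times


def mean(values):
    return sum(values) / len(values)


# Illustrative numbers only: one 1000 MB bulk transfer queued ahead of four 10 MB flows.
flows = [1000, 10, 10, 10, 10]
link = 1000.0  # MB/s, hypothetical link rate

fifo = mean(completion_times(flows, link))
small_first = mean(completion_times(sorted(flows), link))

print(f"FIFO mean completion: {fifo:.3f} s")
print(f"Small-first mean completion: {small_first:.3f} s")
```

With these invented numbers, the mean completion time drops from about 1.02 s to about 0.23 s, while the bulk transfer itself finishes only 0.04 s later than it would have in arrival order.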
Resource Utilization: In general, the bigger the cluster, the faster the completion time. But since Big Data jobs are initially infrequent, CIOs must balance responsiveness against utilization. It is simply impractical for many mid-sized companies to dedicate large clusters for the occasional surge in Big Data demand. ACI enables organizations to quickly redeploy cluster resources from Hadoop to other sporadic workloads (such as CRM, Ecommerce, ERP and Inventory) and back. For example, the same resources could run Hadoop jobs nightly or weekly when other demands are lighter. Resources can be bare-metal or virtual depending on workload needs. (see Figure 2)
How does this work? ACI uses application policy profiles to programmatically re-provision the infrastructure. IT can use a different profile to describe each application’s needs, including those of the Hadoop eco-system. The profile contains the application’s network policies, which the Application Policy Infrastructure Controller translates into a complete network topology. The same profile contains compute and storage policies used by other tools, such as Cisco UCS Director, to provision compute and storage.
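One way to picture a profile is as a single record with a network, a compute and a storage section, each consumed by a different tool. The Python sketch below is a hypothetical stand-in along those lines: the class, field names and values are invented for illustration and are not the ACI object model or any Cisco API. It simply compares two profiles to see which sections would need re-provisioning when resources move from one workload to another.

```python
from dataclasses import dataclass, field


@dataclass
class ApplicationProfile:
    """Hypothetical stand-in for an application policy profile (not the ACI object model)."""
    name: str
    network: dict = field(default_factory=dict)
    compute: dict = field(default_factory=dict)
    storage: dict = field(default_factory=dict)


def plan_redeployment(current, target):
    """List the policy sections that differ and would therefore need re-provisioning."""
    sections = ("network", "compute", "storage")
    return [s for s in sections if getattr(current, s) != getattr(target, s)]


# Invented example values for two workloads sharing the same resources.
hadoop = ApplicationProfile(
    name="hadoop-nightly",
    network={"isolation": "tenant-a", "priority": "small-flows-first"},
    compute={"nodes": 100, "bare_metal": True},
    storage={"replication": 3},
)
erp = ApplicationProfile(
    name="erp-daytime",
    network={"isolation": "tenant-a", "priority": "default"},
    compute={"nodes": 20, "bare_metal": False},
    storage={"replication": 3},
)

changes = plan_redeployment(hadoop, erp)
print(changes)
```

In this toy example only the network and compute sections change, so a real controller and a compute provisioning tool would each have work to do while storage stays as it is.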
Data Isolation and Security: In a mature Big Data environment, Hadoop processing can occur between many data sources and clients. Data is most vulnerable during job transitions or re-deployment to other applications. Multiple corporate databases and users need to be correctly isolated to ensure compliance. A patchwork of security software, such as perimeter security, is error-prone and static, and it consumes administrative resources.
In contrast, ACI can automatically isolate the entire data path through a programmable fabric according to pre-defined policies. Access policies for data vaults can be preserved throughout the network when the data is in motion. This can be accomplished even in a shared production infrastructure across physical and virtual end points.
As organizations of all sizes discover ways to use Big Data for business insights, their infrastructure must become far more performant, adaptable and secure. Investments in fabric, compute and storage must be leveraged across multiple Big Data processes and other business applications with agility and operational simplicity.
Leading the growth of Big Data, the Hadoop 2.x eco-system will place particular stresses on data center fabrics. New mixed workloads are already using 10 Gigabit capacity in larger clusters and will soon demand 40 Gigabit fabrics. Network traffic needs continuous optimization to improve completion times. End to end data paths must use consistent security policies between multiple data sources and clients. And the sharp surges in bare-metal workloads will demand much more agile ways to swap workloads and improve utilization.
Cisco’s Application Centric Infrastructure leverages a new operational and consumption model for Big Data resources. It dynamically translates existing policies for applications, data and clients into fully provisioned networks, compute and storage. Working with Nexus 9000 telemetry, ACI can continuously optimize traffic paths and enforce policies consistently as workloads change. The solution provides a seamless transition to the new demands of Big Data.
To hear about Cisco’s broader solution portfolio, be sure to register for the October 21st executive webcast ‘Unlock Your Competitive Edge with Cisco Big Data Solutions.’ And stay tuned for the next blog in the series, from Andrew Blaisdell, which showcases the ability to predictably deliver intelligence-driven insights and actions.