Recently I had an opportunity to sit down with the talented data scientists from Cisco’s Threat Research, Analysis and Communications (TRAC) team to discuss Big Data security challenges, tools and methodologies. The following is part one of five in this series where Jisheng Wang, John Conley, and Preetham Raghunanda share how TRAC is tackling Big Data.
Given the hype surrounding “Big Data”, what does that term actually mean?
John: First of all, because of overuse, the “Big Data” term has become almost meaningless. For us and for SIO (Security Intelligence and Operations) it means a combination of infrastructure, tools, and data sources all coming together to make it possible to have unified repositories of data that can address problems that we never thought we could solve before. It really means taking advantage of new technologies, tools, and new ways of thinking about problems.
Tags: analytics, API, Big Data, Cisco, database, Hadoop, HDFS, innovation, Intelligence, java, mapreduce, NoSQL, operations, security, Shark, Spark, SQL, telemetry, TRAC
This has been an exciting week. Further expanding its Big Data portfolio, Cisco has announced a collaboration with Intel, its long-term partner, on a next-generation open platform for data management and analytics. The joint solution combines Intel® Distribution for Apache Hadoop Software with Cisco’s Common Platform Architecture (CPA) to deliver performance, capacity, and security for enterprise-class Hadoop deployments.
As described in my blog posting, the CPA is a highly scalable architecture designed to meet a variety of scale-out application demands. It includes compute, storage, connectivity and unified management, and is already being deployed in a range of industries including finance, retail, service provider, content management and government. Unique to this architecture are its seamless data integration and management integration capabilities between big data applications and enterprise applications such as Oracle Database, Microsoft SQL Server, SAP and others, as shown below:
The current version of the CPA offers two options depending on use case: performance optimized, which balances compute power with I/O bandwidth for price/performance, and capacity optimized, which targets low cost per terabyte. The Intel® Distribution is supported for both the performance optimized and capacity optimized options, and is available at single-rack and multiple-rack scale.
The Intel® Distribution is a controlled distribution based on Apache Hadoop, with feature enhancements, performance optimizations, and security options that are responsible for the solution’s enterprise quality. The combination of the Intel® Distribution and Cisco UCS joins the power of big data with a dependable deployment model that can be implemented rapidly and scaled to meet the performance and capacity needs of demanding workloads. Enterprise-class services from Cisco and Intel can help with design, deployment, and testing, and organizations can continue to rely on these services through controlled and supported releases.
A performance optimized CPA rack running Intel® Distribution will be demonstrated at the Intel Booth at O’Reilly Strata Conference 2013 this week.
Tags: Big Data, Cisco UCS CPA, CPA, Hadoop, HBase, Intel, NoSQL
You may have heard that the digital universe is measured in petabytes and global IP traffic in hundreds of exabytes. These are mind-bogglingly large numbers. Big data analytics can play a crucial role in making datasets of this size usable, improving everything from operational efficiency to customer experience to prediction accuracy. Cisco is the global leader in networking (did you know that 85% of the estimated 500 exabytes of global IP traffic in 2012 will pass through Cisco devices?), and the company also builds an innovative family of unified computing products. This enables the company to provide a complete infrastructure solution for big data applications, including compute, storage, connectivity and unified management, that reduces complexity, improves agility, and radically improves cost of ownership.
To meet a variety of big data platform demands (Hadoop, NoSQL databases, massively parallel processing databases, etc.), Cisco offers a comprehensive solution stack: the Cisco UCS Common Platform Architecture (CPA) for Big Data, which includes compute, storage, connectivity and unified management. Unique to this architecture are its seamless data integration and management integration capabilities with the enterprise application ecosystem, including Oracle RDBMS/RAC, Microsoft SQL Server, SAP and others. See Figure 1.
The Cisco UCS CPA for Big Data is built using the following components:
- Cisco UCS 6200 Series Fabric Interconnects provide high-speed, low-latency connectivity for servers and centralized management for all connected devices with UCS Manager. Deployed in redundant pairs, they offer full redundancy, active-active performance, and exceptional scalability for the large number of nodes typical in big data clusters. UCS Manager enables rapid and consistent server integration using service profiles, ongoing system maintenance activities such as firmware updates across the entire cluster as a single operation, advanced monitoring, and the option to raise alarms and send notifications about the health of the entire cluster.
- Cisco UCS 2200 Series Fabric Extenders act as remote line cards for the Fabric Interconnects, providing highly scalable and extremely cost-effective connectivity for a large number of nodes.
- Cisco UCS C240 M3 Rack-Mount Servers are 2-RU servers designed for a wide range of compute, I/O and storage capacity demands. Each is powered by two Intel Xeon E5-2600 series processors and supports up to 768 GB of main memory (typically 128 GB or 256 GB for big data applications) and up to 24 SFF disk drives in the performance optimized option or 12 LFF disk drives in the capacity optimized option. Each also features a Cisco UCS VNIC optimized for high-bandwidth, low-latency cluster connectivity, with support for up to 256 virtual devices.
Tags: Big Data, Cisco UCS CPA, Cloudera, CPA, Greenplum MR, Hadoop, Hortonworks, MapR, MarkLogic, MPP Database, NoSQL, Oracle NoSQL Database, ParAccel, Pivotal HD
Last year, Oracle launched Oracle NoSQL to address the needs of Big Data and analytics. Since then, this commercial-grade solution has been deployed on Cisco UCS, taking advantage of the high performance and RAS capabilities of this game-changing platform.
At Oracle OpenWorld, I met both Ashok Joshi, Oracle Senior Director, NoSQL Database Development, and Raghunath Nambiar, Cisco Distinguished Engineer, UCS, who together gave a well-attended session on Tuesday to present the powerful combination of NoSQL running on Cisco UCS. Check the content-rich slide deck below.
In this short video, Ashok Joshi explained to us the benefits of a commercial-grade solution as opposed to an open source one, highlighting not only cost reduction and cost containment, but also the promise of business continuity with multiple data center support, thanks to the computing and networking capabilities provided by Cisco.
At the speaking session, Raghunath described the huge momentum of Cisco UCS, which now has close to 16,000 unique customers and is present in the data centers of more than half of all Fortune 500 companies.
Check slide 5 for more details on the growth and financial results of a platform that today benefits from over 2,600 channel partners actively selling it.
Tags: Big Data, Cisco, NoSQL, Oracle, UCS
Many Big Data related innovations have been developed by Web 2.0 companies, resulting in a growing collection of open source technologies that dramatically change the culture of collaborative software development and the scale and economics of hardware infrastructure. These technologies enable cost-effective data storage, management and analysis in ways that were not possible before with traditional technologies such as relational database management systems.
NoSQL is one such technology that has emerged as an increasingly important part of big data trends for applications that demand large volumes of simple reads and updates against very large datasets (Hadoop is the other innovation, a generic processing framework designed to execute “read only” queries and batch jobs against massive datasets). NoSQL is often characterized by what it is not, and definitions vary. It can be Not Only SQL-based or simply Not a SQL-based relational database management system. NoSQL databases form a broad class of non-relational database management systems that are evolving rapidly, and several solutions are emerging with highly variable feature sets and few standards.
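The contrast between the two access patterns can be made concrete with a small sketch. The `ToyKeyValueStore` class and the `count_by_country` function below are hypothetical names invented for illustration. They use a plain in-memory dictionary and are not the API of Oracle NoSQL Database, Hadoop, or any other product. The sketch only shows the shape of the workloads: simple reads and updates that each touch one record by key, versus a read-only batch job that passes over every record.

```python
from collections import defaultdict


class ToyKeyValueStore:
    """In-memory stand-in for a key-value store (hypothetical, illustration only)."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        # Simple update: touches exactly one record, addressed by its key.
        self._rows[key] = value

    def get(self, key, default=None):
        # Simple read: a single lookup by key, with no scan of other records.
        return self._rows.get(key, default)

    def scan(self):
        # Full pass over every record, which is what a batch job needs.
        return self._rows.items()


def count_by_country(store):
    """Read-only batch aggregate, in the spirit of a map/reduce job."""
    counts = defaultdict(int)
    for _key, profile in store.scan():   # "map": one country per record
        counts[profile["country"]] += 1  # "reduce": sum per country
    return dict(counts)


store = ToyKeyValueStore()
store.put("user:1", {"name": "Ana", "country": "BR"})
store.put("user:2", {"name": "Bo", "country": "SE"})
store.put("user:3", {"name": "Caio", "country": "BR"})

# NoSQL-style workload: overwrite one record, then read it back by key.
store.put("user:2", {"name": "Bo", "country": "NO"})
print(store.get("user:2")["country"])

# Hadoop-style workload: read-only pass over the whole dataset.
print(count_by_country(store))
```

In a real deployment the keyed operations would be spread across many nodes so that each request stays cheap as the dataset grows, while the batch job would be split into parallel tasks over partitions of the data.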
While these technologies are attractive from the standpoint of the innovations they can bring, not all products meet enterprise requirements. Many organizations require robust, commercially supported solutions for rapid deployment and the ability to integrate such solutions into their existing enterprise application infrastructure.
To address these needs, Cisco and Oracle are the first vendors collaborating to deliver enterprise-class NoSQL solutions. Exceptional performance, scalability, availability and manageability are made possible by the combination of the Cisco Unified Computing System (UCS) and Oracle NoSQL Database. Together, they provide a platform for quick deployment along with predictable throughput and latency for the most demanding applications.
Tags: Big Data, NoSQL, Oracle NoSQL Database