Ron Graham has served as a Data Center Architect and Systems Engineer for some of the largest IT companies in the U.S., including Cisco Systems, NetApp, Sun Microsystems, and Oracle. He is currently working for Cisco Systems as a Big Data Analytics Engineer.
What I mean is: is your data not being used that much, or is the temperature of the data going from hot to cold? Hot data is used a lot and cold data is used sparingly. I think everyone runs into this problem at some point, where they are storing cold or frozen data on high-performance compute resources. Does it make sense to move unused data to an archive directory, as long as it is still in the same cluster and can still be accessed? In the majority of cases it does.
We have hot data and cold data, so what about warm data? Warm data gives off a moderate degree of heat: it is used less frequently than hot data and more frequently than cold data. Take a look at the graph below. I interpolated the graph based on tech postings from eBay and interviews with a former Disney admin.
On the business side, my analysis showed a 15.9% saving in CAPEX for a 1 petabyte (PB) Hadoop cluster with a hot-to-cold storage ratio of 4:1, which means that 80% of my data is on high-performance storage platforms and 20% is on storage-optimized platforms.
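The 4:1 ratio above can be turned into capacity per tier with a few lines of arithmetic. This is a minimal sketch, assuming decimal units (1 PB = 1000 TB). The function names are illustrative, and the saving function takes per-terabyte prices as inputs because the prices behind the 15.9% figure are not given here.

```python
from fractions import Fraction


def split_capacity(total_tb, hot_parts, cold_parts):
    """Split total capacity between hot and cold tiers by a hot:cold ratio."""
    hot_share = Fraction(hot_parts, hot_parts + cold_parts)
    hot_tb = total_tb * hot_share
    return hot_tb, total_tb - hot_tb


def capex_saving(hot_tb, cold_tb, hot_price_per_tb, cold_price_per_tb):
    """Fractional saving of a tiered cluster versus an all-hot cluster.

    Prices are caller-supplied assumptions, not figures from the analysis.
    """
    all_hot = (hot_tb + cold_tb) * hot_price_per_tb
    tiered = hot_tb * hot_price_per_tb + cold_tb * cold_price_per_tb
    return (all_hot - tiered) / all_hot


# 1 PB cluster with a 4:1 hot-to-cold ratio.
hot_tb, cold_tb = split_capacity(1000, 4, 1)
print(f"hot: {float(hot_tb):.0f} TB, cold: {float(cold_tb):.0f} TB")
```

The saving depends only on the cold share and on how much cheaper the cold tier is per terabyte, so the same functions can be reused to try other ratios.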
It seems people sometimes have this view of SDN as addressing rather esoteric use cases and situations. However, the reality is that while there are instances of ‘out there stuff’ happening, there are many situations where we see customers leverage the technology to address pretty straightforward issues. And these issues are often similar across different business/vertical/customer types.
Aftab Rasool is Senior Manager, Data Center Infrastructure and Service Design Operations for Du. I recently had the chance to talk with him about Cisco’s flagship SDN solution – Application Centric Infrastructure (ACI) – and Du’s experience with it. I found there were many instances of Du using ACI to simply make traditional challenges easier to deal with.
Du is an Information & Communications Technology (ICT) company based in Dubai. They offer a broad range of services to both consumer and business markets, including triple play to the home, mobile voice/data, and hosting. The nature of their business means the data center, and thus the data center network, is critical to their success. They need a solution that effectively handles the challenges of both deployment and operations, and that’s where ACI comes in.
I’ll quickly use the metaphor of driving to summarize the challenges Aftab covers in the video. He addresses issues that are both ‘in the rear view mirror’ as well as ‘in the windshield’ – with both being generalizable to lots of other customers. What I mean is that there are issues from the past that, though they are largely behind the car and visible in the mirror, still impact the driving experience. There are also issues on the horizon that are visible through the windshield, but are just now starting to come into focus and have effect.
Rear view mirror issues – These are concepts as basic as scalability limits associated with spanning tree, or suboptimal use of bandwidth, also due to spanning tree limitations. ACI addresses these issues: there is no spanning tree in the fabric, and Equal Cost Multi Pathing (ECMP) allows use of all links. Additionally, use of BiDi allows the existing 10G fiber plant to be used for 40G upgrades, thus obviating the expense and hassle of fiber upgrades. As a result, the ACI fabric, based on Nexus 9000s, provides all the performance and capacity Du needs.
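To illustrate how ECMP can put all links to work, here is a minimal sketch of hash-based path selection. The 5-tuple key, the choice of SHA-256, and the uplink names are illustrative assumptions; real switches compute this in hardware with their own hash functions.

```python
import hashlib


def pick_uplink(flow, uplinks):
    """Map a flow 5-tuple to one of several equal-cost uplinks.

    Hashing the whole tuple keeps every packet of a flow on one path
    (avoiding reordering) while different flows spread across all uplinks.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(uplinks)
    return uplinks[index]


uplinks = ["spine1", "spine2", "spine3", "spine4"]  # hypothetical names
# (src IP, dst IP, protocol, src port, dst port) for 1000 sample flows.
flows = [("10.0.0.1", "10.0.1.1", 6, 40000 + i, 443) for i in range(1000)]
used = {pick_uplink(flow, uplinks) for flow in flows}
print(sorted(used))
```

With spanning tree, redundant links would be blocked; here every uplink is a candidate for every flow, which is the behavior the paragraph above describes.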
Windshield issues – These are represented by a range of things that result from the business’s need for speed, a need that is directly opposed by the complexity of most data centers. The need for speed through automation is becoming more and more critical, as is simplifying the operating environment, particularly as the business must scale. Within this context, Aftab mentioned both provisioning and troubleshooting.
Provisioning: Without ACI, provisioning involved getting into each individual switch and making the requisite changes – configuring VLANs, L3, etc. It also required going into L4-7 services devices to ensure they were configured properly and worked in concert with the L2 and L3 configurations. This device-by-device configuration was not only time consuming, but also created the potential for human error. With ACI, these and other types of activities are automated and happen with a couple of clicks.
Troubleshooting: Before ACI, troubleshooting was complicated and time consuming, in part because they had to trawl through each switch and look at various link-by-link characteristics to check for errors. With ACI, health scores make it easy and fast to pinpoint where the problem is.
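The idea of using health scores to narrow down a fault can be sketched as a simple filter and sort. The node names and scores below are made-up sample data, the threshold is an arbitrary assumption, and the sketch does not call any ACI interface.

```python
def worst_first(health_scores, threshold=90):
    """Return (name, score) pairs below the threshold, lowest score first."""
    degraded = [(name, score) for name, score in health_scores.items()
                if score < threshold]
    return sorted(degraded, key=lambda pair: pair[1])


# Hypothetical per-node scores on a 0-100 scale.
sample = {"leaf-101": 98, "leaf-102": 62, "spine-201": 100, "leaf-103": 85}
for name, score in worst_first(sample):
    print(f"{name}: {score}")
```

Instead of inspecting every switch link by link, the operator starts at the lowest-scoring object and works down the list.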
Please take a few minutes to check out what Aftab has to say about these, and other aspects of his experience with ACI at Du.
As the leader of Cisco’s Data Virtualization and Analytics Business Units, it is my pleasure to announce Cisco Data Preparation, a new big data and analytics offering for business analysts and IT.
What is Cisco Data Preparation?
Driven by Business’s accelerating demand for analytics, Cisco Data Preparation (Data Prep) makes it easy for non-technical business analysts to gather, explore, cleanse, combine and enrich the data that fuels these analytics.
Primarily designed as a self-service application for business analysts, Data Prep is also a valuable new capability for IT data developers and even data scientists, helping these teams collaborate to achieve the following benefits:
Faster Insights: New data sets available in minutes, not weeks.
More Comprehensive Insights: Gain advantage from all your data sources.
Better Business Outcomes at Scale: Supports hundreds of data preparation projects at big data scale.
Higher Productivity, with Greater Governance: Both Business and IT gain from stronger collaboration.
Why will Business Analysts like Cisco Data Preparation?
Business Analysts can use Data Prep to address the significant data integration challenges they face when preparing analytic data sets using a self-service approach.
Every analytic project is different, making every data exploration effort unique. Cisco Data Prep’s Excel-like interface and machine learning let analysts explore data freely.
Data is messy and everywhere. As a result, analysts spend as much as 80% of their time preparing data before analysis can begin. Cisco Data Prep dramatically reduces the time required to prepare data.
Too few Data Scientists and overly long IT backlogs put the onus on the Business to adopt self-service. Cisco Data Prep empowers Business Analysts to do this work themselves.
Why will IT like Cisco Data Preparation?
IT can use Data Prep to work in concert with the business to intelligently balance self-service needs with governance constraints, while optimizing infrastructure.
Many requirements are short-lived, in contrast with IT’s industrial-grade orientation. Cisco Data Prep helps IT and Business meet exploratory data needs with the right level of investment and, when needed, even provide working prototypes that IT can quickly reengineer.
Independent, ungoverned data prep efforts can lead to duplication of effort and inconsistently transformed data sets of unclear origin, resulting in inaccurate analysis and potentially bad business results. Cisco Data Prep’s built-in governance and data set sharing increase trust.
Rogue data preparation activity in personal sandboxes and myriad tools prevents IT from delivering scalable, secure infrastructure. Cisco Data Prep’s ability to massively scale allows IT to support thousands of users and multiple terabytes of data with a common, cost-effective infrastructure.
A Complete Data Preparation Solution, Only from Cisco
Cisco Data Preparation is a complete software, hardware and services solution that simplifies adoption and accelerates benefits.
Leveraging an easy-to-learn and use Excel-like interface and powerful machine intelligence algorithms from Cisco partner Paxata, Data Prep removes barriers to adoption and elevates business analysts’ skills.
Two-way integration with Cisco Data Virtualization helps leverage prior IT investments and closes the loop between the business and IT.
Data Prep’s massively scalable Hadoop and Spark-based architecture ensures that Data Prep users won’t be constrained by size of data sets or complexity of analysis.
Plus, a complete set of Cisco and Partner provided “Plan” and “Build” services ensure Data Prep implementation success.
Learn More About Cisco Data Preparation
There are lots of ways you can learn more about Data Prep. You can:
Join us at Strata+Hadoop World this week from September 29 through October 1 at the Javits Center in New York. Stop by Cisco Booth 425. There you can get a Data Prep demo from Cisco Sales Engineer Bill Kellett as well as attend with Cisco Data and Analytics Director Bob Eve and Paxata Product VP Nenshad Bardoliwalla.
Join us at the 2015 Data and Analytics Conference, October 20-22, at the Hilton Chicago. Register now and join my breakout session, “Data Preparation for Self-Service Analytics.”
Cisco Prime Network Analysis Module (NAM) has been integrated with Nexus 7k/7700 Series using Cisco® Remote Integrated Services Engine (RISE) technology providing a powerful story for data center integration. RISE with Prime NAM provides high performance monitoring and packet analysis on multiple virtual device contexts along with switch interface statistics for all modules.
Cisco RISE is being used by a large number of customers to tightly integrate the Cisco Nexus series switches with the Cisco Prime NAM to provide VDC awareness and SPAN traffic across multiple VDCs without burning slots on the switch. RISE overcomes the limitation of applying SPAN configuration only in the VDC to which the management cable is attached by intelligently managing the movement of NAM data ports and SPAN configuration to other VDCs as needed. The integration includes the following main features:
NAM appliance acts as a module on Nexus switches
One NAM appliance can receive traffic from multiple Nexus VDCs without re-cabling
One NAM appliance can collect interface statistics for multiple VDCs
Dynamic VDC-aware SPAN configuration on Nexus switches using NAM GUI
Up to 4 NAM ports can be automatically assigned to Nexus VDCs using NAM GUI
Graph of per-interface ingress and egress statistics for multiple VDCs
Auto-discovery and bootstrap of NAM appliance from Nexus switch
Health monitoring of NAM appliance
Visibility to multiple VDCs from one NAM appliance with ongoing VDC configuration updates
Configurable timer intervals and VDC list for interface statistics collection
User-friendly error handling for SPAN creation/deletion/modification
Order of magnitude OPEX and CAPEX savings: reduction in configuration, simplified provisioning and data-path optimization
Figure 1. RISE Physical and logical topology
Cisco RISE supports attachment to the NAM appliance in the following modes:
Direct Attach mode with single NAM: The appliance has a management link that is directly attached to the Nexus switch. Up to 4 data links on the NAM can be attached to one or more VDCs on the Nexus switch to send SPAN traffic (Figure 2).
Figure 2. Direct Attach Mode with single NAM
Direct Attach mode with multiple NAMs: Each appliance has a management link that is directly attached to the Nexus switch. Up to 4 data links on each NAM can be attached to one or more VDCs on the Nexus switch to send SPAN traffic (Figure 3).
Figure 3. Direct Attach mode with multiple NAMs
Indirect Attach mode with multiple NAMs: Each appliance has a management link that is attached via an L2 network to the Nexus switch. Up to 4 data links on each NAM can be attached to one or more VDCs on the Nexus switch to send SPAN traffic (Figure 4).
Cisco RISE with NAM provides the following key features that allow the solution to provide traffic and performance analysis across all the VDCs on the Nexus switch without changing the wiring connections.
Dynamic VDC-aware SPAN Configuration
Configure SPAN sessions for up to 4 NAM data ports from NAM GUI.
Create, edit, and delete SPAN sessions, and select destination ports and source ports for the SPAN sessions.
SPAN sessions can be configured in other VDCs by selecting the VDC and data ports from NAM GUI. The data port is automatically moved to the required VDC.
The options of SPAN configuration available to N7K CLI users are available via NAM GUI using RISE.
Provides visibility to all VDCs from one NAM.
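The data-port movement described in this list can be modeled in a few lines. This is a conceptual sketch only: the class, method names, port labels, and VDC names are hypothetical and do not correspond to RISE or NAM interfaces. The one fact taken from the text is the limit of 4 data ports per NAM.

```python
class NamDataPorts:
    """Toy model of NAM data ports being moved between VDCs on demand."""

    MAX_PORTS = 4  # from the text: up to 4 data ports per NAM

    def __init__(self):
        # Port label -> VDC currently owning it (None means unassigned).
        self.owner = {f"data{i}": None for i in range(1, self.MAX_PORTS + 1)}

    def create_span(self, vdc, port):
        """Place `port` in `vdc` for a SPAN session; return True if it moved."""
        if port not in self.owner:
            raise ValueError(f"unknown data port: {port}")
        moved = self.owner[port] != vdc
        self.owner[port] = vdc
        return moved

    def ports_in(self, vdc):
        """List the data ports currently assigned to `vdc`."""
        return sorted(p for p, v in self.owner.items() if v == vdc)


nam = NamDataPorts()
nam.create_span("vdc-prod", "data1")
nam.create_span("vdc-test", "data2")
nam.create_span("vdc-test", "data1")  # data1 moves from vdc-prod to vdc-test
print(nam.ports_in("vdc-test"))
```

The point of the model is that the operator only names the target VDC and port; the reassignment happens as a side effect, with no re-cabling step.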
Multi-VDC Interface Statistics
Retrieve interface statistics of all VDCs on N7K via RISE
Set short term and long term polling intervals for getting interface statistics
Set the list of VDCs from which statistics need to be retrieved
Statistics can be viewed on a per-interface basis as a graph or data points
Enhanced application availability via simplified provisioning and efficient manageability.
Data path optimization: ADC off-load, low latency policy engine.
Dynamic VDC-aware SPAN configuration: Create SPAN sessions on any VDC
Multi-VDC awareness: Deliver traffic and performance reports in multiple VDCs
Cisco RISE provides significant savings in capital expenditures (CapEx) and operating expenses (OpEx) through simplified provisioning and data-plane optimizations
Dramatic OpEx savings: Reduction in configuration time and ease of deployment
Dramatic CapEx savings: Reduced wiring, power, and rack-space needs
The solution provides enhanced business resiliency and stickiness to Cisco products.
Cisco RISE is supported in Cisco NX-OS Software Release 7.x and requires the Enhanced Layer 2 Package license. Please contact firstname.lastname@example.org if you have any questions.
In December 2014, we announced VersaStack, an integrated infrastructure reference solution for enterprise applications that combines technologies from Cisco and IBM. Further extending this partnership, today we are announcing support for IBM BigInsights for Apache Hadoop on our Cisco UCS Integrated Infrastructure for Big Data – an industry-leading platform widely adopted for enterprise big data application deployments. The joint solution encompasses disruptive innovations in Cisco UCS and the robust and industry-compatible Apache Hadoop distribution from IBM. This solution can be installed as a standalone Hadoop cluster with powerful analytical tools, or it can be integrated into existing VersaStack deployments, which benefit from a common fabric and unified management capabilities. Either way, the aim is to deliver the deepest possible insight into your data and help you gain a sustainable competitive advantage.
We are also announcing the availability of a Cisco Validated Design (CVD) that provides step-by-step design guidelines, comprehensively tested and documented, to help ensure faster, more reliable and predictable deployments at lower total cost of ownership.
Combines innovations from Cisco UCS, such as programmable infrastructure, with the best of open source software and the enterprise-grade capabilities in IBM BigInsights for Apache Hadoop
Designed and optimized for common use cases, pre-tested, pre-validated and fully documented by Cisco and IBM engineers to ensure dependable deployments that can scale from small to very large as workloads demand
Provides enterprises with extensive platform management and data visualization capabilities and integration of big data with other information solutions to help enhance data manipulation and management tasks
Brings the power of SQL to Hadoop at greater performance and scale than ever before, accelerating data science and analytics by leveraging SQL – arguably the most beautiful programming language – and integrating with business applications to access data stored in HDFS and HBase with JDBC and ODBC
Deep technical expertise, global resources, and world-class support and services from Cisco, IBM and partners
This solution is built on Cisco UCS infrastructure using Cisco UCS 6200 Series Fabric Interconnects and Cisco UCS C-Series Rack Servers optimized for IBM BigInsights for Apache Hadoop with scalability to thousands of nodes with Cisco Nexus 9000 Series Switches: