New learning lab teaches you data analysis skills

The volume, velocity, and variety of data are increasing by the day! We can capture thousands of security events each month, but what can we do with all this data? This is where data analysis comes into the picture. It helps us analyze very large amounts of data to perform critical functions, including:

  • Anomaly detection
  • Network traffic analysis
  • Path visualization

To analyze this data and extract meaningful information from it, we need the right skills. In this blog post, we explore how data analysis can help with cybersecurity, and we introduce a brand-new learning lab on Cisco Secure Network Analytics that shows you what Python Pandas is all about. Read on to find out!

Cisco Secure Network Analytics

At Cisco, we have a solution called Secure Network Analytics (formerly known as Cisco Stealthwatch). It helps us gain visibility into the network, perform network traffic analysis, and detect anomalies.

Secure Network Analytics is a visibility and network traffic analysis solution that uses telemetry from the enterprise network. As an administrator, you get end-to-end visibility into traffic and total visibility across every security touchpoint. Moreover, Secure Network Analytics runs multi-layered machine learning models and advanced behavioral analytics, which allows us to always know who is on our network and what they are doing.

Using machine learning, we can establish a baseline of normal behavior for a particular user or host, which enables instant alerts when that behavior changes. Using Python Pandas, you can recreate this anomaly detection in a simplified way.

Python Pandas

Python Pandas is an open-source Python library for data analysis and data wrangling. What makes Pandas unique is that it transforms data into a Python object with rows and columns, called a DataFrame, that looks very similar to a table. This proves to be much easier to work with than the usual lists and dictionaries that we commonly see in Python. Look how easy the example below is:

[Figure: example of working with a Pandas DataFrame]
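To make the comparison with lists and dictionaries concrete, here is a minimal sketch that turns a list of dictionaries into a DataFrame. The records and field names (src_ip, dst_ip, port, bytes) are made up for illustration and are not the actual Secure Network Analytics export schema.

```python
import pandas as pd

# Hypothetical flow records; the field names are illustrative only.
flows = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.1.20", "port": 443, "bytes": 1200},
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.1.21", "port": 53, "bytes": 80},
    {"src_ip": "10.0.0.9", "dst_ip": "10.0.1.20", "port": 443, "bytes": 56000},
]

# Each dictionary becomes a row; each key becomes a column.
df = pd.DataFrame(flows)

# Column-wise operations replace explicit loops over the list.
total_bytes = df["bytes"].sum()

# Boolean indexing selects the rows that match a condition.
https_flows = df[df["port"] == 443]

print(df)
print("Total bytes:", total_bytes)
print(https_flows)
```

With plain lists and dictionaries, both the sum and the filter would need a loop or a comprehension. With a DataFrame, each is a single expression on a column.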

Using Pandas, you can easily clean up the data and convert columns into the desired data types. Moreover, you can perform statistical analysis on the columns and transform columns by grouping them by a certain parameter. Overall, it helps us quickly clean up, visualize, and analyze the data. Below is an example of how to clean the data and then visualize it:

[Figure: example of cleaning and visualizing data with Pandas]
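The cleanup, type conversion, and grouping steps described above could look roughly like the sketch below. The raw values and column names are hypothetical. Plotting is left out to keep the example dependency-free; the built-in DataFrame.plot method needs matplotlib installed.

```python
import pandas as pd

# Hypothetical raw export: everything arrives as text, some values are bad.
raw = pd.DataFrame({
    "src_ip": ["10.0.0.5", "10.0.0.5", "10.0.0.9", "10.0.0.9", None],
    "start": ["2021-06-01 10:00:00", "2021-06-01 10:02:00",
              "2021-06-01 10:03:00", "2021-06-01 10:07:00",
              "2021-06-01 10:08:00"],
    "bytes": ["1200", "80", "56000", "n/a", "300"],
})

clean = raw.copy()

# Convert text columns into the data types we want to work with.
clean["start"] = pd.to_datetime(clean["start"])
# errors="coerce" turns unparseable values such as "n/a" into NaN.
clean["bytes"] = pd.to_numeric(clean["bytes"], errors="coerce")

# Drop rows that are missing a source address or a byte count.
clean = clean.dropna(subset=["src_ip", "bytes"])
clean["bytes"] = clean["bytes"].astype("int64")

# Group by host and compute simple statistics per group.
per_host = clean.groupby("src_ip")["bytes"].agg(["count", "sum", "mean"])

print(clean.dtypes)
print(per_host)
```

Coercing bad values to NaN and dropping them in a separate step makes it obvious how many rows were discarded, which matters when the results feed into a baseline.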

Pandas proves to be a great library for analyzing the vast amounts of data that we collect on our networks. Using Pandas, we can quickly clean up, visualize, and analyze the data. We have created a Learning Lab where you can try out Pandas yourself using data collected from Cisco Secure Network Analytics.

New learning lab teaches you how to work with datasets using Python and Pandas

In the learning lab, we use a dataset that is derived from a Cisco Secure Network Analytics flow search query export. Because Secure Network Analytics has a database with all transactions (flows) that happened in the network, you can perform precise search queries. This precision can be crucial during forensic research into cyber attacks.

For this Learning Lab, a flow search was performed without any filter parameters except for the chosen time window (10 minutes). In the Lab exercises, you use this unfiltered flow dataset to calculate some baselines using Pandas. These exercises teach you how to use Pandas and give you a feel for how the Secure Network Analytics algorithms work. Note that Cisco Secure Network Analytics uses a complex combination of baselining, anomaly detection, and other machine learning algorithms; this Lab is an extreme simplification for educational purposes.
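As a rough idea of what calculating a baseline with Pandas can look like, the sketch below sums the bytes sent per host, uses the mean and standard deviation across hosts as the baseline, and flags hosts that deviate strongly from it. This is our own toy illustration, not the algorithm used by Secure Network Analytics and not necessarily the exact exercise in the Lab. The data, the column names, and the threshold of 2.0 are assumptions.

```python
import pandas as pd

# Hypothetical ten-minute flow export: one row per flow.
flows = pd.DataFrame({
    "src_ip": ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4",
               "10.0.0.5", "10.0.0.6", "10.0.0.7", "10.0.0.8", "10.0.0.8"],
    "bytes": [400, 600, 1200, 900, 1100, 1050, 950, 1000, 20000, 30000],
})

# Total bytes sent by each host during the window.
per_host = flows.groupby("src_ip")["bytes"].sum()

# Toy baseline: mean and sample standard deviation across all hosts.
baseline_mean = per_host.mean()
baseline_std = per_host.std()

# Z-score: how many standard deviations a host sits from the mean.
z_scores = (per_host - baseline_mean) / baseline_std

# Assumed cutoff; a real system would tune this and use richer features.
THRESHOLD = 2.0
anomalous_hosts = z_scores[z_scores > THRESHOLD].index.tolist()

print(z_scores.round(2))
print("Anomalous hosts:", anomalous_hosts)
```

One caveat of this approach: a single extreme host inflates the mean and the standard deviation, so with very few hosts the z-score can never get large. Robust statistics such as the median are a common alternative.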

In the lab you will learn how to work with datasets using Python and Pandas. You will also learn how Cisco Secure Network Analytics works at a high level. You will see how to clean up data, and how to filter and sort the data to find the top outliers, or anomalies, that could be an indication of a cyberattack.
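Filtering and sorting to surface the top outliers can be sketched as follows. Again, the flows and column names are invented for illustration and do not come from the Lab dataset.

```python
import pandas as pd

# Hypothetical flows; column names are illustrative only.
flows = pd.DataFrame({
    "src_ip": ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.2", "10.0.0.4"],
    "dst_port": [443, 22, 443, 22, 53],
    "bytes": [1500, 98000, 700, 45000, 120],
})

# Filter: keep only the flows to a port of interest (here SSH, port 22).
ssh_flows = flows[flows["dst_port"] == 22]

# Sort: largest flows first, then keep the top three.
top3 = flows.sort_values("bytes", ascending=False).head(3)

# Top talkers: total bytes per host, largest two hosts.
top_talkers = flows.groupby("src_ip")["bytes"].sum().nlargest(2)

print(ssh_flows)
print(top3)
print(top_talkers)
```

A host that dominates both the largest flows and the top talkers list, as 10.0.0.2 does in this made-up data, is a natural candidate for a closer look.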

Want to learn more? Check out the learning lab!

Additional security developer resources

Please visit the DevNet Security Dev Center. There you’ll find videos, webinars, weekly tips & tricks, and docs for using Cisco APIs, as well as a link to join the DevNet Security Developer Community.

Authors

Simon Fang

Technical Solutions Specialist - DevNet

GVE DevNet