April 5, 2016 22 Comments

Cisco employee pays it forward in Honduras

We all face hardships in life, some more than others. There was a time in my life where I needed the help of those around me, and in return, paying that forward has become my passion. With Cisco’s recent announcement that they were giving every employee five days to give back to the global community, it struck an even deeper chord in me. Now, instead of using my own personal PTO to volunteer, Cisco was encouraging that passion in me and celebrating the giving spirit of its employees. I decided to put some of this time Cisco had given me towards my days in Honduras at an orphanage called La Finca.

Before my colleague Karla St. Louis and I left, we stuffed bags with necessary supplies. And for those who could not join us in Honduras, like my former manager Angie Miller, Amazon Prime was making regular deliveries to my house and entire suitcases were generously being filled that way. As much as the 14 of us headed to Honduras took, we all knew La Finca still needed more: basic necessities that many of us may not even think about like shower curtains, to the more obvious – clothes and shoes.

There was a little boy whose feet were so tender because there was no sole on the bottom of his shoe, and he had been running around on just the foam. There is only cold water to shower with, and the older girls are all sharing one tube of lipstick. The week we were there, four more mothers approached the gate of La Finca wanting to drop off their children and had to be turned away.

With a staff of 13 throughout the orphanage, including their school – the man who runs it even grew up there – our group brought them all outfits too. We could simply not leave anyone out.

And in our time there, we got to see the sweet smiles of those who were handed shoes and outfits. The older girls got to experience what shopping must be like when they tried on their clothes. Members of our group made all the kids a lasagna dinner one evening, and after their fifth helping, the boys had to be cut off as they will all eat until they are sick.

Photography is another passion of mine, and so I also went to La Finca to take and give the children childhood photos of themselves. Every smile is a melancholy beauty. Every photo I took made me wonder if this child knew that they were amazing. That they can be anything they want to be.

While I was there I photographed a young man named Samuel, who weighed 4lbs at 7 months old when he was dropped off. His mother said “If he lives – he lives, if he dies – bury him.” He’s now 22, and I watched as he heard that story for the first time, and saw the photo of that day.

As he processed his story, he said that Sunday was always the worst day of the week for him because it was family day. He would sit in a corner and sing to himself with no visitors.

We went back to that corner and as he sang I realized the song with the “my life will never be the same” lyrics, might be the most beautiful song I have ever heard. Tears fell from my eyes as I sat behind my camera and I snapped this photo.

Samuel is now in college, funded by the Honduras government, for agriculture. He is looking to return to the orphanage to help teach them how to farm the land that was donated to them. Usually the children only have rice and beans to eat each day.

When I arrived at the orphanage I saw a mural of a lion and told Samuel that he is that lion. He is an inspiration to so many, including many young ones from Honduras who can see in Samuel what is possible.

I was able to photograph all 66 children. Most of the time we couldn’t communicate and oddly enough we didn’t need to – smiles and hugs were our language.

I’m not sure why, but I was really surprised to see how happy the children all were. Love really does conquer all. I realized in all of them that hope was very much alive. As a photographer, you often focus on the eyes. I photograph children with cancer and in the NICU and I have seen eyes that do not have hope left in them. However, these children…the hope was there.

Now that I am home, I have hugged my 7 year old son till it annoys him. I know I will feel no guilt when he asks me for his next toy and I say “No.”

The boys in the orphanage snatched my lens cap to use it as a Frisbee – and it made me wonder where true creativity and innovation comes from. Am I removing it from my son by buying him toys or should I let him have some time to be “bored” and see what he invents on his own?

I once interviewed a CIO that said, “You spend more time with your co-workers than you do your own family, you ought to know something about them.” And, he’s right.

The time Cisco gave me to continue my passion in giving back, may have been some of the most rewarding. I realized that the days Cisco has given us to give back is not just a benefit, but it is actually a beautiful gift and I am so very grateful.

It just so happens that one of our customer executives, whom I got to know, had a direct report that grew up in a small orphanage in Honduras.

That orphanage was called La Finca.

It is just a piece of what inspired Karla and me to travel to Honduras – and maybe it will inspire future Cisconians to continue giving back and perhaps even to help La Finca.

Please consider helping by donating to La Finca.

Want to work for a company that encourages you to give back to your global community? We’re hiring!

Authors

Jodi Swanson

Project Manager Delivery Lead

Commercial Experience Team

4 Comments

What is Email Spoofing and How to Detect It

Kevin Floyd

What is Email Spoofing?

Email spoofing is the creation of email messages with a forged sender address for the purpose of fooling the recipient into providing money or sensitive information. For example: a sender 401k_Services@yourcompany.com sends a message to your business email address stating that you have one day to log into your account to take advantage of new stock investments. The message uses your company’s letterhead, looks as legitimate as the 401k notices you’ve received before and has a login link.

Spoofing attacks became very noticeable in mid-late 2015 and more prevalent in early 2016.

What is it costing companies?

In August 2015, the FBI issued a public notice indicating that Business Email Compromise (BEC) is estimated to have cost companies over $1.2 billion between October 2013 and August 2015. Spoofing is one of many forms of BEC. See table below for more details.

A Recent Attack

In one recent example, employees of a Silicon Valley tech giant were warned to be on the lookout for email phishing scams directed at finance and HR personnel. A week later, on March 1, 2016, the company suffered an email attack impersonating the organization’s CEO that requested all of the employee W-2 forms. Subsequently, the 2015 W-2 tax form information for current and former U.S.-based employees was sent to an unauthorized third party. The employee responding to the request believed it was a legitimate internal business email.

Typically, the forged address is owned by a fellow employee that the attack victim trusts. A particular category of spoofing, called whaling, involves a victim who is a high value target such as a CFO, CEO or CIO. Criminals involved in whaling perform extensive social media research prior to their attack. Figure 1 is a fictitious example of an attack on Alan who is an executive at the company called Alpha. The attacker sends this message to Alan from outside of the company. The forged address is owned by Alice, Alpha’s Global Sales Director, who is in Asia at the time of the attack.

The spoof in Figure 1 is created by the attacker, Adam, in Figure 2. Adam, who lives in the domain outside.com, opens a telnet session to the victim’s mx record for alpha.com. Typing into his PC, he introduces his domain with “helo outside.com”, his real “mail from” address adam@outside.com and then the recipient’s address. In figure 2, this is identified as the SMTP envelope. Like a physical envelope in the mail system, with the city and street address on the outside, its purpose is similar for determining where the digital message is delivered. Email servers and gateways read the SMTP envelope, and people read the message body. The fact that Mail From doesn’t agree with the Friendly From makes this a forgery. The challenge to the mail recipient is that SMTP envelope information is not readily available, particularly on smartphones.

How can you protect yourself?

Spoofers play on an employee’s fear of consequences, a sense urgency that leads to dismissing careful verification in favor of quick reaction. In these situations it is better to slow down and use available tools. Mail user agents, like MS Outlook, can compare the Friendly From address in the body with the Mail From address in the envelope. In Outlook 2010 you can do this with the following two easy steps:

Select the message and then select Message Options in upper-left tool bar, as shown in Figure 3.
Inside the Message Options menu, you can compare the Friendly From and Mail From addresses as shown in Figure 4.

All mail user agents have this ability. If this message were received by a Cisco customer, the next step would be to notify the company’s mail administrator and they would interact with Cisco support, or their Content Security sales team, to configure the needed filters that compare these address fields.

What about smartphones?

Typically mobile devices are too small to compare the Mail From and Friendly From addresses. If you suspect that your mobile device received a spoof, then you should wait until you have access to your laptop for more detailed verification.

To conclude…

We at Cisco continue to educate our email security customers and partners on how to use the current product features to combat this trend. And our email security engineers are building an even stronger automatic defense against spoofing and whaling attacks and integrating these into our Email Security solution. We offer a suite of anti-spoofing tools that include: DKIM, SPF, DMARC and content filters that verify the authenticity of the sender and allow administrators a choice of remediation.

For those of you who already own our products, please visit our new Email Security Best Practices page to obtain in depth information about how our solutions work to help you combat spoofing.

To learn more about Cisco’s email security solutions, please visit www.cisco.com/go/emailsecurity.

Authors

Kevin Floyd

Technical Marketing Engineer

Security Business Group

1 Comment

Guest Blog: Never Over or Underestimate Cloud Security, Integration or APIs

Xander Uyleman

Today’s Guest Blog comes to us courtesy of Dante Orsini, SVP of Business Development at iland.

Taking advantage of all that the cloud has to offer is no small order. That’s because extending your on-premise data center resources into the cloud completely changes how you scale performance, assure reliability, secure data and applications, and manage the costs of your day-to-day operations.

For many mid-market organizations and Software-as-a-Service companies, meeting stringent compliance and security requirements proves the toughest of these challenges. Developing or acquiring in-house security expertise is often difficult and costly, and also diverts resources away from the company’s core business.

The key to securely extending your operations into the cloud quickly and reliably is to build on proven technology. That means understanding exactly what a cloud platform includes, what costs extra and what you’ll need to do to ensure it optimally meets your business and technical requirements. Working with the right integrated cloud platform and Application Programming Interface (API) allows you to leverage the cloud the way you want while simplifying your cloud experience:

Quickly implement enterprise-class security: Gartner predicts that through 2020, 95 percent of cloud security failures will be the customer’s fault. More than anything, that figure speaks to the fact many cloud providers deliver raw infrastructure and leave their customers fully responsible for making it work correctly and securely. Further, it implies many cloud customers do not even know they are at risk. Whether you are building or delivering applications, using a cloud platform with integrated advanced security can give you a head start in protecting your business. With the right platform, developers should be able to immediately secure applications with features such as dual factor authentication, IPSEC and SSL-VPN, antivirus/malware, intrusion prevention, encrypted storage and backups, and more.

Leverage advanced management capabilities: In addition to advanced security, evaluate cloud platforms based on their native management capabilities, including performance and cost monitoring, compliance reporting, resource management, backup, and disaster recovery. When these capabilities are integrated with security on the same platform, you can easily manage and protect every facet of your operations.

Improve profitability through extensive visibility: Perhaps one of the biggest complaints about cloud is that it doesn’t offer the same transparency IT is accustomed to from an on-premise environment. When choosing a cloud provider, consider whether their platform allows you to monitor and manage performance at every level – globally and all the way down to individual Applications and VMs. Advanced cost controls and analytics can enable company leaders to accurately assess and manage costs. And access to rich performance and security data allows you to more easily prove to regulators and your customers that you comply with their strictest requirements.

Interface with cloud on your own terms: No matter how secure, integrated and transparent a cloud is, you will not be successful unless you are able to access its functionality in a way that best fits your business needs – without being a cloud expert. For many, that means relying on a management console that gives you access to all the cloud’s features in a single pane of glass. However, in some cases, DevOps, IT Ops and even managed service providers need to programmatically access the features of a cloud via an API. Make sure a provider gives you the flexibility to choose how you interface with its cloud platform.

Use the language of your choice: Choose an API based on ReST (Representational State Transfer) backed by language-specific software development kits (SDKs) such as Python, Java, and Golang. This gives you the flexibility to use the development language of your choice to accelerate time-to-market while minimizing your development costs.

iland understands that you need to be confident in your cloud solution. That’s why we deliver a cloud platform with innovation, intelligent management, and advanced security built in.

Our Enterprise Cloud Services – Advanced Security platform is designed to deliver a hassle-free cloud that doesn’t compromise the security of your data or applications. Its ReST-based API enables developers to easily access cloud data, deploy workloads, and integrate the iland cloud into their own management tools.

Overall, ECS-AS is easy to use, bringing operation management and security to a single, all-in-one platform. And you’ll enjoy the peace of mind knowing your operations are running on a world-class backend that is built on proven iland and Cisco technology.

Meet with an iland cloud specialist and learn how you can experience all the benefits of the public cloud with proven reliability and simplified advanced security.

Dante Orsini has more than 20 years of business development experience across enterprise software, managed services and technology consulting fields. He’s regarded as a thought leader in cloud infrastructure, working closely with customers and partners to anticipate and address global market needs.

Authors

Xander Uyleman

Senior Manager

Global Partner Marketing

8 Comments

Are traditional software architectures hindering your digital transformation?

Kerrie Holley

Evolve your software architectures for a hyper-connected world.

The next big phase of the Internet is the move to connect previously unconnected people, processes, data and things – potentially millions of them – in new and more productive ways. Advancements in cloud computing, distributed analytics, machine learning, robotics, the Internet of Things (IoT) and the emerging API economy are setting the stage for entirely new data-driven business models. For example:

Redefine the customer experience. For example, retailers can deliver truly personalized offers to any touch point based on your web browsing, past transactions, location awareness, in-store shopping patterns and more. Then swipe to pay without having to stop at a register.
Redefine industrial processes. With data from embedded sensors, actuators and digital identification tags, manufacturers can maintain full visibility into the location and condition of equipment, inventory, vehicles, energy and other assets.
Redefine health care. Imagine your doctor having access to data aggregated from medical tests, procedures done at any care facility, from wearable monitoring devices and global specialists to pinpoint a diagnosis and care plan for your specific genetic profile.

Sense a common theme? It’s about gaining big-picture perspective with real immediacy. Business insight is derived from what we learn right now – in real time as customers interact with us, as internal and third-party systems stream the latest data, and as the business environment changes around us. Digital business requires real-time optimization across a value chain to instantly respond to “always on” customer and business partners.

**Digital business calls for hyper-distributed computing, where the heart of the network can be everywhere and at the edges and where the network and application planes blur.**

Hyper-distributed computing (distributed computing on steroids) shifts the focus from the data center and out into a network of data sources across the wider business ecosystem – connections with suppliers, business partners, third-party data providers, customers and more. It is a pervasive computing model that can sense what is happening across the network and orchestrates timely and informed action. It is a business model where apps or algorithms do more than just provide an interface; they deliver highly personalized, context-aware and timely experiences.

This digital transformation calls for new ways of thinking about application and service software architectures. The traditional approach of moving data to a centralized data store for analysis and processing doesn’t work in today’s IoT world. Data is too widely dispersed, has too short a shelf life, and there is just too much of it to move fast enough.

In the research, “The 12 Principles of Application Architecture for Digital Business and IoT,” Gartner sets forth guidelines for successfully transitioning to the hyper-distributed computing model. Of the 12 principles, seven address design considerations pertaining to service-oriented architecture (SOA), virtualized applications, access and security, modularity and event processing. I believe those seven design considerations align neatly with Cisco’s approach to the challenges of the hyper-connected, Internet of Things (IoT) world.

Our software platform supports the seemingly contradictory requirements of hyper-distributed computing.

In an IoT world, you need to: (A) Capture more data than ever from more devices at the edge of the network, and (B) Make sense of it all and act on it without backhauling it to a data lake or analytics hub.

Cisco’s architectural approach:

Moves analytics out to where the data is – at the edge of the network – to create rapid insights.
Orchestrates complex many-to-many relationships among transient users, processes, devices and a hybrid cloud.
Maintains control over IT processes despite growing reliance on external services you don’t control.
Transcends traditional boundaries among the organization, third-parties and end users.
Supports rapid development and deployment of these new capabilities.

If you are responsible for the software foundation of your organization’s digital transformation, you’ll want to read our latest publication on the subject, “A New Software Architecture for a Hyper-Connected World: Rethinking application and service design.”

This publication presents Cisco’s response to the seven Gartner design guidelines, with a particular focus on the App Services subsystem, where data and analytic insights become action. You’ll also see some practical examples of hyper-distributed computing in action across industries.

Download this complimentary Cisco publication featuring Gartner research.

For more information about the Cisco Automation and Integration platform: www.cisco.com/go/aip

For more about the broader portfolio: Analytics and Automation Software, visit: www.cisco.com/go/analytics-automation-software

Authors

Kerrie Holley

VP, Chief Technology Officer

Software Platform Group

1 Comment

The Time Is Now for Co-Innovation: Three Reasons Why

Maciej Kranz

Traditionally, Cisco’s innovation strategy has been built around four foundational pillars: build, buy, partner, and invest. Recently, however, a fifth pillar—co-innovation —has become more prominent. In my last blog, I introduced the idea that innovation doesn’t happen in a vacuum—it requires diverse skill sets and multiple points of view. But it’s only been in the last few years that co-innovation as a formal, structured process with customers and partners has burst on the scene.

Why now? Three reasons come to mind:

First, our customers demand it. They want to partner with us, not just buy from us. They don’t want to be passive recipients of our products, occasionally consulted by a product manager or asked to test our beta. They want to bring their own needs and know-how to the table and work on solutions together. And I don’t think that our customers are making this demand simply because they can’t let go of control or don’t have enough to do on their own. They want to be actively involved because it’s a way to speed the innovation process (see point three below). They want to see faster return on investment, more ease-of-use, and seamless integration with a variety of applications relevant for them and their industry. And, by the way, it’s not just customers who want to co-innovate. Suppliers, startups, and academia are also looking for ways to partner to develop innovative solutions more quickly.

Second, line-of-business (LoB) organizations have become a major buying center for technology. Traditionally, the technology industry has focused on selling to three buying centers: service providers, consumers, and IT organizations. However, over the past 10 years, LoBs—the people who run operations, logistics, or other business functions—have emerged as a fourth buying center. This trend has increased as IT and operational technology (OT) converge to accommodate the requirements of Internet of Things (IoT) deployments. With powerful IoT use cases on the rise, the entire enterprise needs to get connected and work together. Continue reading “The Time Is Now for Co-Innovation: Three Reasons Why”

Authors

Maciej Kranz

Vice President and General Manager

Corporate Strategic Innovation Group

April 4, 2016 5 Comments

A BIGGER HELPING OF INTERNET PLEASE!

David Ward

Co-written with Maciek Konstantynowicz, Distinguished Engineer, Chief Technology & Architecture Office

“Holy Sh*t, that’s fast and feature rich” is the most common response we’ve heard from folks that have looked at some new code made available in OpenSource. A few weeks back, the Linux Foundation launched a Collaborative Project called FD.io (“fido”). One of the foundational contributions to FD.io is code that our engineering team wrote, called Vector Packet Processing or VPP for short. It was the brainchild of a great friend and perhaps the best high performance network forwarding code designer and author, Dave Barach. He and the team have been working on it since 2002 and it’s on its third complete rewrite (that’s a good thing!).

The second most common thing we’ve heard is “why did you OpenSource this?” There are a couple main reasons. First, the goal is to move the industry forward wrt virtual network forwarding. The industry is missing a feature rich, high performance/scale virtual switch-router that runs in a user space and has all the modern goodies from hardware accelerators; built on a modular architecture. VPP can run either as a VNF or as a piece of virtual network infrastructure in OpenStack, OpNFV, OpenDayLight or any of your other fav *open*. The real target is container and micro-services networking. In that emerging technology space, the networking piece is really really early and before it goes fubar; we’d like to help and avoid getting “neutroned” again. More on container networking in a follow-on conversation.

Additionally, 5 or so years ago I had been evangelizing this diagram as a critical target for SDN and a trajectory for work to move the industry forward.

Towards that goal, Services orchestration includes SDN controllers … check. Network == VPP, check. Analytics is immediately the next push in OSS and been in the works for a while and this year, we are also setting our goal to be relevant in the PaaS layer. We have focused on working with Cloud Foundry. All that’s for another blog, this one is on VPP.

So, why is the forwarding plane so important and where cool kids are hanging out? Today’s MSDCs, Content delivery networks and Fintech are operators of some of the largest and most efficient data centers in the world. In their journey to get there they have demonstrated the value of four things: 1) using speed of execution as a competitive weapon, 2) taking an industrial approach to HW&SW infrastructure, 3) automation as a tool for speed and efficiency and 4) full incorporation of a devops deployment model. Service providers have been paying attention and looking to apply these lessons to their own service delivery strategies. VPP enables not only all the features of Ethernet L2, IP4&6, MPLS, Segment Routing, Service Chaining, all sorts of L2 and IP4&6 tunneling, etc., but it does it out of the box. Unbelievably fast on commodity compute hardware and in full compliance with IETF RFC networking specs.

Most of the efforts to towards SDN, NFV, MANO, etc. have been on control, management and orchestration. FD.io aims to focus where the rubber hits the road: the data plane. VPP is straight up bit banging forwarding of packets, in real-time, at real linerates, zero packet loss. It’s enabled w/ high performance APIs northbound but not designed for any specific SDN protocol; it loves them all. It’s not designed for any one controller; it also loves them all. To sum up, VPP fits here:

What we mean is a “network” running on virtual functions and virtual devices, a network running as Software on computers. Therefore, VPP-based virtualized devices, multi-functional virtualized routers, virtualized firewalls, virtualized switches, host-stacks and often function-specific NAT or DPI virt-devices as software bumps in a wire, running in computers, and in virtualized compute systems, building a bigger and better Internet. The question service designers & developers, engineers and operators have been asking is how functional and deterministic are they? How scalable? How performant – and is it enough to be cost effective?

Further, how much their network behavior and performance depend on the underlying compute hardware, whether it’s x86 (x86_64), ARM (AArch64) or other processor architectures?

If you’re still following my leading statements, you would know answer to that, but lot’s of us would be guessing and both will be right! Of course it does depend on hardware. Internet services are all about processing packets in real-time, and we love a ubiquitous and infinite Internet. More Internet. So this SW-based network functionality must be fast, and for that the underpinning HW must be fast too. Duh.

Now, here is where reality strikes back (Doh)! Clearly, today there is no single answer:

Some VNFs are feature-rich but with non-native compute data planes, as they’re just a retro-fittings or reincarnations of their physical implementation counterparts: the old lift and shift from bare metal to hypervisor
Some are better performing with native compute data planes, but lack required functionality, and are still a long way to go to implement required network functions coverage to realize the levels of network service richness used to and demanded by the network service consumers.
VPP tries to answer what can be done, what technology or rather set of technologies and techniques can be used to progress virtual networking towards the actual fit-for-purpose functional, deterministic and performant service platform it needs to be to realize the promises of fully-automated network service creation and delivery.

Can we start talking about Deterministic Virtual Networking?

A quick tangent on what I mean by reliable and deterministic network. Our combined Internet experience taught us few simple things: fast is good and never fast enough, high scale is good and never high enough and delay and losing packets is bad. We also know that bigger buffer arguments are a tiresome floorwax-and-dessert-topping answer to everything. Translating this into the best practice measurements of network service behavior and network response metrics:

packet throughput,
delay, delay variation,
packet loss,
all at scale befitting Internet and today’s DC networks,
all per definitions in [RFC 2544] [RFC 1242] [RFC 5481].

So here is what we need and are to expect by these simple metrics:

Repeatable linerate performance, deterministic behavior, no (aka 0, null) packet loss, realizing required scale, and no compromise.

If this can be delivered by virtual networking technologies, then we’re in business; as an industry. Now that’s easy to say and possible for implementations on physical devices (forwarding asics have an excellent reason to exist and continue into the future), built for the sole purpose of implementing those network functions, but can it be done on COTS general purpose computers? The answer is: it depends for what network speed, for 10Mbps, 100Mbps, 1Gbps today’s computers work… ho, hum. Thankfully COTS computing now has low enough cost and enough clock cycles, cores and fast enough NICs for 10GE. Still a yawn for building real networks. For virtualized networking to be economically viable, we need to make these concepts work for Nx10 | 25 | 40GE, …, Nx100, 400, 1T and faster.

Why Optimization Matters

At a very basic level, a virtualized network function is simply another workload. Like any workload, it consumes some CPU cycles, memory and IO. However, the typical compute workload mostly consumes CPU and memory with relatively little IO. Often application workloads assume that IO is ubiquitous and infinite. Network workloads are the inverse. They are by definition reading and writing a lot network data (i.e. packets going in and out of the physical or virtual NIC), but because processing is typically only done on the packet header; the demand on CPU is relatively modest. There are network functions like deep packet inspection (DPI) that are more CPU intensive, but for the purposes of this conversation, we will consider network workloads primarily IO-centric. Measuring performance is pretty easy: all that matters is packets-per-second (PPS) throughput. Because the expected behavior and resource consumption of a network workload is well defined, it makes it easier to plan optimizations to increase packet-per-second performance.

The following compute hardware resources significantly affect the performance of network workloads:

CPU cores – executing packet processing instructions and packet I/O operations.
CPU cache hierarchy – caching packet processing instruction and data, without need for main memory access, because there is no time for such expensive operations.
RAM memory – reads/writes of packet processing instructions and data, and packet header data.
PCIe bus lanes – network I/O connectivity to and from NICs.
NICs – network packet I/O for physical interfaces, packet processing offload (e.g., checksum), and direct memory and cache access via PCIe.
NUMA locality for CPU core communication to PCI devices and memory resources.

We can measure network workload performance as follows:

 execution_time          = seconds/packet
                         = instructions/packet * clock_cycles/instruction *
                                      seconds/clock_cycle

The main performance contributors then are:

How many instructions are required to process the packet.
How well encoded are instructions on computer hardware, including efficiency of parallel execution and minimal dependencies between execution of different instructions or instances of instructions.
How fast is the actual computer hardware (seconds/clock_cycle)

Making this specific to virtual networking, we can re-write this as

execution_time       = seconds/packet
                     = instructions/packet * clock_cycles/instruction *
                                     seconds/clock_cycle
                     = clock_cycles/packet * seconds/clock_cycle

or simply

packet_processing_time = clock_cycles/packet * seconds/clock_cycle

Let’s propose a cycles per packet (CPP) metric to be a fundamental measure of NFV system efficiency. To make this even more specific to NFV:

packet_processing_time = CPP / core_frequency.

If we start filling in real-world numbers, we begin to see why efficiency matters. If we look at processing a line rate stream of packets on a 10GE interface we are looking at needing to process ~14.88 million packets-per-second (MPPS, assume 64 Byte Ethernet Layer-2 frames + 20B of preamble + inter-frame gap).

Simple math tells us each packet must be processed within 67.2 nanoseconds (ns). On a CPU core clocked at 2GHz has a core clock cycle of 0.5nsec. That leaves a budget of 134 CPU clock cycles per packet (CPP) on a single 2.0 Gigahertz (GHz) CPU core. For 40GE interfaces, the per packet budget is 16.7 ns with 33.5 CPP and for 100GE interfaces it is 6.7 ns and 13 CPP. Even with the fastest modern CPUs, there is very little time to do any kind of meaningful packet processing. That’s your total budget per packet, to receive (Rx) the packet, process the packet, and transmit (Tx) the packet. And always remember that the actual main memory RAM is 70nsec away! You can’t use main memory access – remember the 67nsec per packet budget – when you need to access memory it’s too late. Dammit! You need to predict what would be needed. You need to predict to create the future. Wait, what?

One simple answer we all knock about is to throw more cores at the problem to spread out the workload or just run faster CPU and chase GHz, but a well engineered solution can take a better approach that uses all the resources in a CPU complex:

Remember von Neumann bottleneck, limited throughput between the CPU and the memory compared to the CPU speed and amount of memory – it still applies!
Offloading some packet processing functions to dedicated hardware (e.g., NIC cards)
Optimal use of CPU caches to address memory access bottlenecks required for moving packet bits between network-workloads. Don’t wait for main memory.
Optimized packet processing software to take advantage of multiple cores (multi-threading)
Parallel processing that spreads the packet processing across multiple CPU cores
- Addressed with efficient use of CPU caching hierarchy and pre-fetching.
  - No time to afford cache line misses.
  - Pipeline vectors of packets, auto-adjust vector size for consistent processing latency.
  - Pre-fetch instructions from memory ahead of time, watch the byte boundaries of cpu prefetches, use every one of them.
  - Watch those cpu core cache lines, chase cache line misses, dups, evictions.
  - Watch the data flow within the cpu socket chip, it’s a “network” after all.

Enter VPP

As you might guess, this is where VPP comes in and has incorporated all the industries hard work over the years figuring out how to fling packets really fast.. Ok, we’ve rambled for a while so let’s quickly list the elements of a virtual network:

Physical NIC (pNIC) – connect the compute platform to the physical network.
Virtual Switch (vSwitch) – responsible for forwarding L2/IP4/IP6 packets or all sorts of other L2/IP4/IP6 tunneling encapsulations. This might be between pNICs and VNFs, between VNFs’ vNICs, and so on. vSwitches can support number of network functionalities including security features.
Virtual Network Functions (VNFs) – logical network functions that perform any number of packet processing functions including Layer 2 switching, IPv4 and IPv6 routing, quality of service (QoS), stateless and stateful security features and multi-protocol routing, and so on. Use Virtual NIC (vNIC) to connect to the virtual or physical network.

Virtual networking typically needs to support three types of logical links connecting the elements of a virtual network:

vSwitch-to-pNIC, VNF-to-pNIC – logical link connecting the vSwitch or VNF to a physical NIC interface using either dedicated I/O hardware mapping (PCI or PCI-passthrough drivers) or shared I/O hardware mapping (Single Route I/O Virtualization [SRIOV] drivers.
VNF-to-vSwitch ― logical link connecting VNF virtual interfaces to vSwitch virtual interfaces. VNF virtual interfaces (vNICs) on x86 machines rely on Intel virtualization technologies (VTs), including emulated or paravirtualized network I/O. vSwitch virtual interfaces provide a backend to VNF vNICs.
VNF-to-VNF ― logical link connecting VNF virtual interface to VNF virtual interface. In most cases the link is connected indirectly via vSwitch but direct connectivity options also exist.

For compute systems running say Linux, there is a clear trend to run vSwitches in the host user-mode versus kernel-mode, as this provides much better control over the packet processing performance. It also decouples networking from the electronic state of the computer which is controlled by the kernel. For example, as part of our broader VPP contribution to FD.io, we include a VPP vSwitch. This is a user-mode packet processing and forwarding software stack for commodity hardware. It is high performance (~14 Mpps on a single core) while providing rich multi-layer networking functionality ― including Layer 2, IPv4, and IPv6 forwarding with large tables ― multi-context VRFs, multiple types of tunneling, stateless security, QoS policers (all of RFCs incl. color-aware ones), and is extensible through the use of plugins.

But moving packet processing to user-space is about more than performance; its about flexibility, availability and speed of innovation. The flexibility to put upgrade your packet processing without rev-ing your kernel, provide distinct application specific networking features to different applications running on the same box, or complex packet processing for VNFs into containers. The packet processing graph architecture of VPP together with its powerful plugin capabilities allow anyone to write new features, or support for hardware acceleration, and deploy them simply by dropping a plugin binary into a directory. Never has innovation in networking been more open and focused on easy, rapid development of features (and potentially new interoperable standards).

I’ll touch a little more on FD.io and its goals towards the end of this conversation.

What is VPP?

Originally born from a Barach and developed by a team at Cisco, Vector Packet Processing (VPP) is a proven technology shipping in many Cisco existing networking products. Combined with a compute-efficient network I/O (e.g., DPDK), VPP enables very efficient and high performing design and offers a number of attractive attributes:

User-space network packet processing stack for commodity hardware.
High performance:
- Multiple Mpps (Millions packets per second) from a single x86-64 CPU core.
- 120 Gbps Ethernet full-duplex on a single physical compute host with 2-Socket-CPU and 12@10GE.
- On a single physical compute host with 4-Socket-CPU with 18@40GE, the current box-full bandwidth number (1518B L2 packet size): 462Gbps.
- Current box-full PPS number (64B L2 packet size): 297Mpps.
The same code runs on baremetal hosts, in VMs, or in Linux containers.
Integrated vhost-user virtio backend for high speed VM-to-VM connectivity.
Layer 2 and Layer 3 functionality and multiple tunnel/encapsulation types.
Uses best-of-breed open source network I/O driver technology: Intel DPDK.
Extensible by use of plug-ins.
Control-plane and orchestration plane via standards-based /common APIs and SDN protocols (via Honeycomb).
It is multi-platform, running on x86_64, AArch64, PowerPC, MIPS processor types and is Endian clean and 32-bit and 64-bit clean. It uses widely available kernel modules (such as uio and igb_uio) and supports DMA-safe memory.
It has full spec-compliant error checks for all network encapsulations, extensive counter support, export of telemetry and full input checks on by default (many other implementations skip this one).
Full counter support and telemetry “push” export without degradation of performance.

VPP runs a Linux user space process. The same image works in hosts, guests in VMs, and containers. When running in hosts it supports driver access to NICs over PCI. When running in guests in KVM or ESXi virtual environments it can access the NICs over PCI-pass-through or SRIOV. VPP integrates DPDK PMD device drivers that support a wide range of NICs. In addition, VPP natively supports the following NIC devices:

Intel 82599 physical function, 82599 virtual function, Intel e1000, virtio.
HP rebranded Intel Niantic MAC / PHY.
Cisco virtual interface cards (VICs).
Integration with DPDK and all SOCs/NICs that support DPDK. With strong support by DPDK community and Intel is the communities get out of jail for having to support every NIC ever produced.

VPP supports improved fault tolerance compared to kernel code. A VPP crash seldom (read never) requires more than a process restart. In-Service Software Upgrade (ISSU) updates do not require a system reboot, just a process restart or a feature unload/load.

From a security perspective, VPP code has been reviewed and tested by a white-box security consulting team and I expect it to be more subjected to even more inspection now that it is part of FD.io VPP implements the following security functionality:

Image segment base address randomization
Shared-memory segment base address randomization
Stack bounds checking
Debug CLI “chroot”

Why is Vector Packet Processing Faster?

VPP relies on processing multiple packets at a time to optimize the use of commodity hardware resources. It uses a completely different software architecture compared to the traditional scalar approach of processing one packet at a time.

Here is a brief comparison of scalar and VPP approaches and their impact on compute resources:

Scalar Packet Processing Operation

The traditional approach that processes one packet at a time, generates per-packet-interrupt:

process.A
            process.B
                  process.C
                          {stuff happens}
                   return
            return
return

… then return the interrupt. This approach requires considerable stack depth and introduces to issues:

Issue 1: Thrashing the I-cache
- When code path length exceeds the primary I-cache size
- Each packet incurs an identical set of I-cache misses
- The only workaround is bigger caches
- Remember very tight per packet time budget!
Issue 2: Dependent read latency on big MAC or IP forwarding tables
- Example: 4 x 8 mtrie walk
  - Assume tables do not fit in cache
  - Lookup: 5.6.7.8.
  - Read: root_ply[5], then ply_2[6], the ply_3[7], the ply_4[8].
- Each read will stall the CPU for ~170 cycles
- Every CPU cycle counts to get low cycle-per-packet and high packets-per-second!

Vector Packet Processing Operation

VPP does process more than one packet at a time:

Grab all available packets from device RX ring buffer
Form a frame (vector) comprising packet indices in received order
Process frames using a directed graph of nodes

This approach results in a number of performance optimizations

Fixes the I-cache thrashing problem so Issue 1 is solved!
Mitigates the dependent read latency problem (e.g., for mtrie lookup). Issue 2 solved!
Reduces stack depth and addresses D-cache misses on stack addresses
Circuit time reaches a stable equilibrium value based on the offered load

VPP Graph and Graph Scheduler

The VPP network packet processing stack is built on a packet processing graph. A graph modular approach makes it easy to add new graph nodes to add functionality and address specific purposes. I fully expect the community of developers to be working on fido sub-projects on these feature graph nodes. They can write features to their heart’s content … standardized features, research new ones, write them as they write standards … evolve the possibilities of the internet afap without hassle of modifying a boatload of infrastructure.

At runtime, VPP code grabs all available packets from receive (RX) rings to form a vector of packets. A packet processing graph is applied node by node (including plugins) to the entire packet vector. Graph nodes are small, modular and loosely coupled. This makes it easy to both introduce new graph nodes and relatively easy to rewire existing graph nodes.

A plugin can introduce new graph nodes or rearrange the packet processing graph. You can also build a plugin independently of the VPP source tree, which means you can treat it as an independent component. A plugin can be installed by adding it to a plugin directory. It’s the best thing for researchers, protocol developers, network designers since sliced bread. TBH, it’s really the desire from the original SDN work in Gates 104 as well.

VPP code can be used to build any kind of packet processing application. It can be used as the basis for a load balancer, a firewall, an intrusion detection system (IDS), or a host stack. You could also create a combination of applications. For example, you could add load balancing to a vSwitch. Therefore, the combinations of feature pipe-lines are completely deployment specific. Any kind of network function, any feature, any order running on any compute platform; L/M/S and 4-/2-/1-CPU-Socket. And if you want to on XS ones too, there are quite a few ARM 64-bit platforms.

The VPP graph scheduler is there to ensure that you always process as many packets as possible. The simple logic behind it is that as vector size increases, processing cost per packet decreases and I-cache misses are amortized over a larger set of packets. There are pros and cons depending on type of traffic and service being forwarded so, it’s not as trivial as “set as large as possible and forget.” Thankfully like Luke, you can “use the API.”

The VPP graph scheduler calculates in runtime a table vector size equilibrium following a simple algorithm:

Start with vector size in equilibrium.
Add a delay such as clock-tick interrupt or similar.
Vector size increases but cost per packet decreases.
Computation returns to the previous equilibrium vector size.
The result is an optimal vector size and consistent per-packet latency.

VPP Instruction-cache Optimizations

CPU cache usage optimization and the increasing number of cache entry hits versus misses are among the most important factors reducing the average time to access instructions (code) and data from the memory. It is also fundamental for I/O intensive packet processing operations as outlined earlier. Most CPUs have different independent caches for instruction and data. VPP code optimizes the use of both.

VPP optimizes I-cache usage by making each VPP graph node process more than one packet. The first packet warms up the I-cache and all other packets hit in the I-cache. In addition, dedicated code is used to warm up the CPU branch predictor to avoid branch misses. VPP node dispatch functions typically comprise ~150 lines of code and easily fit in the Layer 1 I-cache in cases where they are not distributed across multiple VPP graph nodes.

VPP Pipelining

Many packet processing tasks require multiple dependent reads, usually including buffer metadata, packet data, a database entry derived from packet data (e.g., a forwarding table lookup), and rewrite information. Context (or packet) vectorization means that nodes can hide dependent read/write latency by breaking up the required work into pipelined stages. Applying pipelined stage processing to an 8-8-8-8 IPv4 trie lookup results in two operations executed in sequence: (1) read a previously-prefetched trie ply, (2) use the read results to prefetch the next ply.

Schematically, nodes process contexts as follows. In the following table, C0…C4 comprise 5 packet contexts.

```
    Stage 0 C0  C1  C2  C3  C4  C5  x   x   x
    Stage 1 x   C0  C1  C2  C3  C4  C5  x   x
    Stage 2 x   x   C0  C1  C2  C3  C4  C5  x
    Stage 3 x   x   x   C0  C1  C2  C3  C4  C5
```

Analyzing it in terms of concurrent execution order:

Stage 0 processes C0.
Stage 0 processes C1. Stage 1 processes C0.
Stage 0 processes C2. Stage 1 processes C1. Stage 2 processes C0.
Stage 0 processes C3. Stage 1 processes C2. Stage 2 processes C1. Stage 3 processes C0.
Etc.

This way VPP can handle large MAC, IPv4, and IPv6 forwarding tables as well as large network address translation (NAT) or any other binding and forwarding tables (e.g. IP4-over-IP6-softwire, security white-/black-lists, access control filters) with minimal/almost undetectable performance impacts. This is clearly visible in high-scale vNet benchmark results reported in the benchmark sections in this paper.

The latest testing being done with the VPP code sitting in the FD.io repo, we expect to see up to 720Gbps of IMIX from 4-CPU – Socket computer. At this point NICs, PCIe and CPU uncore are the bottlenecks, and we’ve found a lot of bugs and marketing numbers in commodity NICs, but the industry is working out the issues. VPP is really the first forwarder to exercise the limits of the NICs. That’s an expectation of 180Gbps per socket, let’s see what we get. Add another 4 sockets to become an 8 socket computer; we are targeting well over 1Tbps from a COTS computer. That’s reaching OMFG-type numbers. And the VPP code is not completely utilizing the CPU cores, with lots of cycle capacity to keep developing more network or compute functionality onto the very same sockets.

So, if we put this all together, we can see two things: 1) VPP has the raw packet handling capacity to work well beyond 100GbE without breaking a sweat; and 2) the efficiency of VPP is such that it provides a LOT OF the CPP headroom to do some really interesting and useful packet processing. Given VPP is multi-thread (and safe), multi-core; it’s extremely close to linear scaling when adding more cores – it follows Amdahl’s law of course (can’t be faster than the slowest element) and multi-core speedup is almost linear. This also means, from a compute perspective; you have deterministic IO/PPS and can place VPP and other compute workloads efficiently.

VPP Vignettes

VPP performance data points (measured)

Hopefully you’ve read this far and even taken an interlude to check out the code. You might find it nicely written and it all makes perfect sense, seems indeed like the right way forward. But does the VPP code actually work and does it do all of those nice and cool things? What does it actually mean to raw bit flinging that VPP is fast and efficient when running on general purpose compute nodes.

Well, let’s look at some benchmarking data, the actual measured test data. We start with

VPP packet throughput: some baselines of VPP packet throughput (Non Drop Rate, NDR), and CPU system utilization (Cycles Per Packet, CPP) while VPP is doing some meaningful work – IPv4 and IPv6 routed forwarding with large forwarding tables. We then add some stateless security: access-list filters and white-lists.
VPP packet delay soak test – 7 days of work: VPP packet delay over a long-ish period, a week of non-stop work forwarding packets, no lunch breaks, no weekend.
We finish the conversation with VPP binary API programming rate – a speedy one.

Clearly we could share here many more areas and data points, including CPP efficiency per processed packet per coded packet processing function, PCI3.0 connection utilization efficiency, x86_64 L1, L2, L3(LLC) cache utilization efficiency, memory lanes utilization efficiency and so on. Not to mention measured data points for other CPU architectures like Aarch_64 and Power8, the number of combination is bit overwhelming to a human reader, and we are running into a risk of exceeding attention span of the most patient readers. Hopefully you’re still with us. Hence we are not going to do this in this initial blog, which sole purpose is to warm you up to the idea of software based Internet infrastructure platform like VPP. We want everyone to get familiar with the basic properties and fundamentals of FD.io VPP, before unveiling the full picture. The full picture is going to take more than one blog, clearly. J

Let’s start the speed-and-feeds Fest then…

VPP performance benchmarking approach

How do you test performance of a piece of software that claims to be the fastest optimized software for packet processing on computers? Well, we thought, we are busy, we don’t have much time to invent new benchmarking approaches, so we start with applying best practice of testing huge effing routers, like the one below, and see what we get.

Figure 1. That’s the image you get when you type “hfr” into your favorite Internet search engine, and search for images.

And that’s exactly what we did. And we were surprised. We were ACTUALLY nicely surprised. Because suddenly SUT (System Under Test) wasn’t the computer with VPP running on it, it was our hardware routing tester IXIA. Running RFC2544 and RFC2442 tests, we were actually testing IXIA. And the first rules of measuring while testing a system of many components is to make sure that it is your target you’re measuring, not other devices or systems involved. We fixed that by working closely with our hardware routing tester supplier (IXIA) and arrived to the configuration that actually worked and tested computer plus VPP. The interesting tidbit is that a single computer, was requiring many many 10GE or 40GE ports in IXIA chassis. We were filling the IXIA chassis J. Here goes the high-level detail.

We tested FD.io VPP on two different computers, 2-CPU and 4-CPU, looking like these specs and these photos:

Both computers running the same open-source software stack:

VPP isn’t as pretty as a real router, as it is software, and you can see it in FD.io VPP git repo. But, only reading skills required if you want to know WHAT it does – it’s commented – search for functional spec references – the IETF RFCs. C-code reading skills required if you want to know HOW it does it, but in a nut-shell it runs CPU architecture optimized C-code. C-code writing skills required if you want to know WHAT it actually does to your packets.

All measured data points are from benchmark tests performed in Cisco labs in San Jose.

The basics – Ethernet L2 MAC switching

Starting with the basics – benchmarking Ethernet L2 MAC switching performance – across simple virtual network topologies: NIC-to-NIC, NIC-to-VM and VM-to-VM. Benchmark tests were performed back in September 2015 by EANTC, and compared VPP and OVSDPDK throughput. Configuration was the simplest you can imagine, single core used for running all VPP or OVSDPDK threads. Results are shown in the slide below. And they speak for themselves. VPP wins with larger MAC forwarding tables. Expected. No surprises.

VPP packet throughput

The testbed topology with the UCS C240M4 2-CPU 120GE compute node looks like this.

Nothing special you say, bunch of 10GE ports hooked up together. The extraordinary thing is the quantity of ports and the aggregate bandwidth – 12 of IXIA 10GE ports pumping packets at linerate into 2-CPU compute node, that is running VPP code and processing packets at rate. In this case, it is actually 120GE linerate, as we are not exceeding any of server hardware I/O limits, that is there is enough PCIe3.0 bandwidth and per NIC capacity (each NIC has packets-per-second and bandwidth limits) to support IMIX line-rate. (IPv4 IMIX = sequence of 7x 64B, 4x 570B, 1x 1518B; IPv6 customIMIX = sequence of 7x 70B, 4x 570B, 1x 1518B; all packet sizes are Ethernet L2 untagged).

Here are some sample benchmark results:

VPP and OVSDPDK running with large IPv4 routed forwarding tables

VPP does 120GE IMIX linerate at 2 Million IPv4 routes

VPP running with large IPv6 routed forwarding table

VPP does 120GE cIMIX linerate at 2 Million IPv6 routes

OVSDPDK not tested for stability reasons

Let’s look at the four socket x86_64 server – the testbed topology with the UCS C460M4 4-CPU 720GE (18x 40GE) compute node looks like this.

Again, nothing extraordinary, bunch of 40GE ports hooked up together. The extraordinary thing is the quantity of ports and the aggregate bandwidth – 18 of IXIA 40GE ports pumping packets at high rate into 4-CPU compute node, that is running VPP code and processing packets at rate.

Here are some sample benchmark results:

VPP running with large IPv4 routed forwarding tables, up to 8M IPv4 routes

VPP does 297Mpps 64B and 462GE 1518B linerate

Ok, put that into perspective for a second. Back in 2004 a router I was co-architect of, the CRS-1 (aka HFR) was a 16 @40Gbps LC, 7’ chassis router (see pic of router porn above). We just flung 12 LCs worth of traffic in a 2 RU computer … roughly 10 years later. I love technology advancements. Also note that this test went all the way to 8M routes in the FIB. The table size just doesn’t matter for IP, L2, etc.

Adding stateless security (VPP iACL nodes or VPP white-list/black-list nodes in the packet processing graph of nodes) does not make any difference to system throughput for IMIX packet size.

But clearly, there is no free lunch – the white-list has a CPP cost. CPU cores are doing work filtering packets, and the amount of work, the CPPs, is measured by VPP code and can be read using binary API and also with VPP CLI per outputs here.

The CPP cost is measured for every VPP node, every feature and function implemented and running in VPP graph of nodes. We always note what is the actual real-time cost of processing packets; the actual cost of network service expressed in CPPs. What this means is that if flinging 462 Gbps of packets takes 10% of the capacity of the VPP worker thread and 90% of the time for work is left over for feature work … there’s a ton of headroom for features before we see any degradation. Also note in the benchmarking that all cores aren’t used. In the V4|6 tests only 36 cores are being used for 462Gbps and 36 are free for other work. Add ACLs and still only 36 core in use w/ 36 spare. Biz economists should be smiling at this point.

VPP packet delay soak test – 7 days of work

Talking about soak-testing – we did that with 8M IPv4 routes – pounded at 18x 17Mpps for one week. Here are the results:

That’s right – 18 x 7.7trillion packets forwarded – and maximum packet delay registered by IXIA tester is less than 3.5msec. And this includes all outliers. Not bad for a modern x86_64 computer with Linux. Running specs as before – the 4-CPU server (UCS C460M4).

VPP binary API programming rate

What if we tell you VPP is faster than HW routers for programming the forwarding table, the FIB? Well, as it happens, it actually is. (though removing the requirement for an asic driver helps or some might say is cheating with the comparison). Here is a raw output from a running VPP code instance benchmarked with 8 Million IPv4 entries, while programming those entries into the FIB on the 4-CPU UCS460M4 server with 18x40GE interfaces:

Assuming 0.408 usec/route, converging all of the Internet table (Internet BGP table size as of 21 Mar 2016 was 600612 per http://www.cidr-report.org/as2.0/) would take 245 msec. LESS THAN A QUARTER OF SECOND. Mic drop.

Next steps in benchmarking: VPP and CPU systems

Here is a teaser of a few things we are testing and exploring with VPP running the servers and computers, running them hard, sweating on processing packets:

PCIe3.0 sub-system benchmarking of 8x and 16x lanes, switching and DMA, DCA.
CPU microarchitectures efficiency, core and uncore subsystems.
CPU cache and memory coherence.
Multi-core x86_64, AArch64, other – for L/M/S/XS deployments.

In other words squeezing the packet processing efficiency from commodity CPU platforms and their sub-systems that themselves are networks of connected things. Making networking where the cool kids hang out. Getting a huge helping of Internet, to everyone.

The Consoritum

FD.io Overview

While VPP is remarkable technology, it is only part of the solution for the industry, so this is where FD.io comes in. The FD.io Project is a Linux Foundation project to provide an IO services framework for the next wave of network and storage software. In another blog we’ll talk about using VPP for storage subsystems. The design of FD.io is hardware, kernel, and deployment (bare metal, VM, container) agnostic.

The FD.io project parses IO services into three layers:

I/O – the movement of packets to/from hardware/vHardware to cores and threads
Processing – classify, transform, prioritize, forward and terminate actions
Management agents – controlling and managing I/O processing

Within the FD.io framework, VPP fits into the Packet Processing layer. This layering and the modular architecture of the project allow you to quickly build upon existing modules or easy add new ones. VPP heavily uses DPDK today. The reason being, VPP doesn’t want to be in the NIC support business.

Why Another Consortium?

We are all suffering from OpenSource Foundation Fatigue. This being said, there was a problem in the area of opensource data plane efforts, including the Linux kernel. There was a lack of:

support for diverse/custom hardware
portability across compute platforms
optimal use of compute microarchitectures
network level instrumentation
- Few debugability features
- Few if any Statistics/Counters exposed
basic Governance of the community
leads to unpredictable system behavior
consideration of the data plane for mid-point forwarders vs solely hosts
performance and scaling
features and feature velocity
automated end-to-end system testing frameworks
- leads to unpredictable system behavior

Therefore, to solve these issues: Fd.io was created. There really are no other efforts around dataplane technology that really covers all these bases.

Continuous Performance Lab (CPL)

An aspect of the FIDO community is that at the outset given the focus on performance and scaling the group included support for a continuous performance lab. It includes another contribution we made that is a fully automated testing infrastructure. It tests both the data plane and the northbound APIs. The goal is to support continuous verification of code and features for both functionality and performance. This will catch code breakage and performance degradations before code/patch review by the community. This full framework was used for generating and testing by multiple industry partners and we’ll be standardizing it at the IETF and open sourcing the whole thing.

Is there room for more projects?

It’s expected that the vast majority of the community will focus on writing feature (graph nodes). There’s plenty of room the community to develop new and extend functionality. As well as have commercial feature nodes as well. Examples include: Load balancers, Firewalls, Intrusion Detection, Additional hardware accelerators, more RFC support (e.g. OAM, BFD), spanning tree, Deep packet inspection, more test tools, support your favorite SDN or P4, etc. There’s a lot done already but there’s always more to do. A great example is one of telemetry. You’ll find in the code support for IPFIX v10, binary export and other manners of streaming data out of VPP. Features can be written to export different combinations of data to enable the reactive part of SDN.

What’s currently supported

Out of the box, VPP has more than all the basic functionality to be a deployed virtual switch and router. The highlighted box in the lower left around the mandatory input checks seems like basic functionality, but VPP is the only available forwarder that includes them in race-track testing and everyday functionality. As well as counters, counters, counters…. Key to note is all the IP/MPLS features and encapsulations. Indirect nexthops was just added which is key to be a big-boy router.

What also become critical is being able to debug and develop high performance code so we also contributed a developer environment and toolset to help the community write highly performing features. It’ll give all sorts of information down to impact of CPP.

FD.io & VPP Resources

The best part of open source projects is the opportunity to work with the code. In the open. You can get instructions on setting up a dev environment here and look at the code here. Finally check out the develop wiki here. There are a ton of getting started, tutorials, videos, docs on not only how to get started but how everything works. Even videos of Dave Barach. Please don’t mind the choice of language; we don’t let him out of his cave very much.

The code is solid and completely public and under the Apache license. We’re already using an “upstream first” model at Cisco and continuing to add more features and functionality to the project. I encourage you to check it out as well as consider participating in the community as well. There’s room for newbies and greybeards. We have a great set of partners and individuals contributing to and utilizing FD.io already in the open source community and a good sized ecosystem already emerging. Clearly from this conversation you can tell I think VPP is a great step forward for the industry. It has a great potential for fixing a number of architectural flaws in current SDN, orchestration, virtual networking and NFV stacks and systems. Most importantly it enables a developer community around the data plane to emerge and move the industry forward by having a modular, scalable, high performance data plane with all the goodies; readily available.

Authors

David Ward

Senior Vice President

Chief Architect & CTO of Engineering

Research Spotlight: Enabling Evil for Pocket Change

Talos Group

This post is authored by Tazz.

EXECUTIVE SUMMARY

At the end of February, one of the researchers on the team received a solicitation email from a domain reseller, which she reviewed the first week of March. The email was from Namecheap offering deeply discounted domains for .88 cents. The timing of the email couldn’t have been more ironic as it overlapped with some current research into determining if there is a relationship between domain pricing and an aggregation of domains related to malware/phishing/spamming. This article will discuss the relationship between deeply discounting domains and nefarious activities. For the purpose of discussion in this article, the word malicious will include malware, phishing, and spam activities.

BACKGROUND

Talos has previously investigated the magnetic relationship between bad guys and cheap/free services. When it comes to the Internet, undoubtedly you get what you pay for, and when it’s cheap/free it’s bound to be infested with bad guys. We saw this philosophy ring true with dynamic DNS, and saw bad guys leveraging cheap services when actors migrated to dynamic DNS which you can read more on https://blogs.cisco.com/security/dynamic-detection-of-malicious-ddns.

Any businessman, good or bad, seeks to make money, fast. To do this, one must maximize return on investment and/or find a market with a low cost of entry. If it costs $5,000 to get started, that might not be feasible for many, especially a criminal, but if it only costs $50 or even $5, well there’s obviously a greater chance that many people will seek out that market. These rules are no different when it comes to bad guys doing bad things on the Internet. So, given this email offering deeply discounted TLDs, the team formed a hypothesis and we began digging.

Hypothesis: When domain prices are <= $1 there will be an increase in registrations and a corresponding increase in malicious activities associated with the TLDs.

Authors

Talos Group

Talos Security Intelligence & Research Group

Join us for a #CiscoChat: Closing the OT& IT Knowledge Gap in Manufacturing

Douglas Bellin

IoT represents a $14.4 trillion business opportunity for manufacturers. While some manufacturers are increasing plant efficiency and production by harnessing the power of IoT, others are struggling to keep their industrial operations running. Why the disparity? IoT, despite representing a sizeable opportunity, also presents challenges.

As more devices are connected, manufacturers are experiencing a deficit in skilled workers able to maintain the equipment. Manufacturers are simply not able to keep up with the IoT’s exponential growth, and are left unable to address these challenges. The skills gap will continue to widen as more and more devices get connected to the network.

Also, nearly 3.5 million manufacturing jobs will need to be filled over the next decade – and the skills gap is expected to result in 2 million of those jobs going unfilled. Building a skilled workforce will be challenging in the face of aging demographics and the changing roles and responsibilities on the plant floor. Continue reading “Join us for a #CiscoChat: Closing the OT& IT Knowledge Gap in Manufacturing”

Authors

Douglas Bellin

Global Lead, Industries

Manufacturing and Energy

2 Comments

You Deserve a Flexible and Modern Network

Brian Levin

Ask any network administrator and you’ll hear the same plea: “I simply need my wireless network to work smarter!”

One of the key trends we’ve been seeing is more and more 5GHz clients joining the network. This is driven by the higher data rates, wider channels, and more free channel bandwidth which 802.11ac provides. It’s not just happening in one field either, this trend has spread across vertical such as: higher education, enterprise, hospitality, and retail. It is most noticeable at large trade shows such as Mobile World Congress (87% 5GHz clients), and Cisco Live (90% 5GHz clients), in which we do the show network. We are also seeing majority of the clients at these shows supporting 802.11ac (greater than 65%).

Other Wi-Fi vendors seem to be ignoring this trend. As they roll their new 802.11ac wave 2 access points, more and more seem to be fixed on a 2007 technology in the 2.4GHz band, 802.11n. One Access Point with two radios, one 2.4GHz 802.11n, one 5GHz, means 50% of your radios are operating in 2.4GHz mode. This is the perfect design for a network with 50% 2.4GHz clients—and that is a design of the past.

You know what else was cutting edge in 2007? MySpace. When was the last time you logged into your MySpace account to see what Tom was up to?

The new Cisco Aironet 2800 and 3800 Access Points brings intelligence and flexibility into your network via Flexible Radio Assignment (FRA). FRA gives your access point the flexibility to dynamically change between 2.4GHz and 5GHz depending on your client load. As more clients join your network, you can add additional capacity by providing more 5GHz coverage. As the client count falls, the flexible radio will dynamic adjust to provide the best coverage.

The best part is that this is all done automatically; you won’t have to do a thing.

FRA is the type of forward-thinking innovation that differentiates Cisco from the rest of the Wireless companies stuck in 2007. To learn more about FRA, click on the Cisco Aironet 2800 or 3800 pages.

Authors

Project Manager Delivery Lead

Commercial Experience Team

What is Email Spoofing?

What is it costing companies?

A Recent Attack

How can you protect yourself?

What about smartphones?

To conclude…

Authors

Technical Marketing Engineer

Security Business Group

Authors

Senior Manager

Global Partner Marketing

Evolve your software architectures for a hyper-connected world.

Digital business calls for hyper-distributed computing, where the heart of the network can be everywhere and at the edges and where the network and application planes blur.

Our software platform supports the seemingly contradictory requirements of hyper-distributed computing.

Authors

VP, Chief Technology Officer

Software Platform Group

Authors

Vice President and General Manager

Corporate Strategic Innovation Group

Can we start talking about Deterministic Virtual Networking?

Why Optimization Matters

Enter VPP

What is VPP?

Why is Vector Packet Processing Faster?

Scalar Packet Processing Operation

Vector Packet Processing Operation

VPP Graph and Graph Scheduler

VPP Instruction-cache Optimizations

VPP Pipelining

VPP Vignettes

VPP performance data points (measured)

VPP performance benchmarking approach

The basics – Ethernet L2 MAC switching

VPP packet throughput

VPP packet delay soak test – 7 days of work

VPP binary API programming rate

Next steps in benchmarking: VPP and CPU systems

The Consoritum

FD.io Overview

Why Another Consortium?

Continuous Performance Lab (CPL)

Is there room for more projects?

What’s currently supported

FD.io & VPP Resources

Authors

Senior Vice President

Chief Architect & CTO of Engineering

EXECUTIVE SUMMARY

BACKGROUND

Authors

Talos Security Intelligence & Research Group

Authors

Global Lead, Industries

Manufacturing and Energy

Authors

Product Manager, Engineering

Platform WLAN - US

**Digital business calls for hyper-distributed computing, where the heart of the network can be everywhere and at the edges and where the network and application planes blur.**