The Napkin Dialogues: “Open”-ing up to SDN
I consider myself to be a reasonably intelligent individual. Well, perhaps “reasonably” is a debatable term; just ask my friends. Or my wife. (Then again, don’t ask my wife.)
Reasonable or not, though, I’ve been trying to wrap my head around what all this “software defined” stuff is supposed to mean, and I have to confess it’s been a bit circular: it’s almost as if you have to already know the information you’re trying to learn.
So where are the Napkin Dialogues written for people like me? Is everyone a super-genius programmer-cum-networker-cum-programmer and I just missed the boat? People are throwing around these “Open” terms left and right (e.g., OpenStack, OpenFlow, OpenDaylight, etc.) as if it’s an “open” and shut case.
Well shut. The. Front. Door. I’m going to have to be on the receiving end of my own napkin then. For me, it’s been feeling like I’ve been dropped into the middle of a maze with the lights turned off. Yeah, kinda like that.
If you already ‘get’ this stuff, feel free to help a poor storage networking guy along in his journey, because I already know this iceberg goes all the way down.
As someone who is familiar with tried-and-true Data Center designs, I’m just having a hard time getting my head wrapped around 1) getting from here to there, and 2) just where there is!
Talk to Me Like a Kindergartner
Seems to me that I want to start off with the basics – but where to start? There’s a smorgasbord of terminology that’s thrown around, like dumping Halloween candy on the table after a particularly good haul: DevOps, OpenFlow, Puppet, Chef, JSON, XML, NetConf, SNMP, SMI-S, Python, API, Automation, OpenDaylight, OpenStack, etc.
The reality is, however, that these are not discrete elements that fall into an “all-things-being-equal” category. So, like the proverbial candy choices, where do you start?
I need to start in a place where the “terminology” isn’t used in the definition of the word you’re trying to define. I need this explained to me without using the words “automation,” “orchestration” or “controller”.
No, I’m not making this up. I asked someone to define “orchestration” for me and the answer I was given was, “Orchestration is when you want to orchestrate something.”
“What is something?”
“Whatever it is you want to orchestrate.”
So I went to my friend Jim.
[Note: Jim is not a real person. Jim is an amalgamation of several people who were kind enough to sit me down and explain very patiently what this stuff means, but I’m smushing them together to make one person for the purpose of this dialogue.]
[Note 2: Yes, “smushing” is a real word. I looked it up: http://dictionary.reference.com/browse/smushing?s=t]
[Note 3: Am I oversimplifying this? Yes. Because I have to. Even if it’s just for me.]
So, now I present to you…
Breaking “Open” Down into Plain English
Or, a napkin dialogue with Jim. Thankfully, I’ve helped Jim out before with some of my FCoE napkins, so he was willing to return the favor.
“Where do I start?” I whined.
I think that Jim was afraid that if I didn’t get answers soon, I’d actually start to pout. Then comes the foot stomping. That wouldn’t be pretty.
“Okay,” he said, trying to calm me down. “Where are you getting stuck, mostly?”
“I’m embarrassed to say,” I confessed, “I keep getting all this ‘Open’ stuff mixed up. Like most of these terms, I thought I knew what they meant and then they are used in ways that I don’t quite understand. OpenFlow, OpenDaylight, OpenStack, etc. Then you throw in ‘northbound’ and ‘southbound’ APIs, and different programming tools…”
“Good,” he said, cutting me off. “Before we get into the programming tools, let’s try to take a bigger picture of what all of this looks like.”
“Right,” I agreed.
“Let’s start with the three terms you began with: OpenFlow, OpenDaylight, and OpenStack,” he said, pulling out his napkin. “We’ll begin with OpenFlow.”
He drew a box on the napkin. “Keep in mind,” he said as he drew, “that these are just some of several ways to do things. This is only to help understand and describe some of the terminology, and help you place yourself into the larger context of what’s going on.
“At a high level, OpenFlow is all about separating the data plane from the control plane. In other words, if you have a switch or a router, a packet goes in one port and out another – this is ‘the data plane’ – but you have to tell it to do that, and that happens in ‘the control plane.’”
“Okay,” I said, “but isn’t that how switches and routers are supposed to work?”
“Of course,” he agreed. “But there are at least two reasons why OpenFlow can make this easier.
“First, the traditional way that they do that is by configuring all that information inside the same box, usually through some sort of CLI. OpenFlow is a protocol that allows you to tell the box what to do from outside the box.
“Second, not all boxes do this the same way. But if all those boxes talk the OpenFlow protocol…”
“Then you only have to worry about writing to the protocol, rather than to each specific switch,” I finished.
“Exactly,” he said. “OpenFlow gives you flexibility here. The switches can talk ‘OpenFlow’ to allow those commands to work regardless of what the underlying infrastructure is, or who makes it. That’s just the beginning. You can take these building blocks and build them upon one another to create some very interesting behaviors for the switch.”
“For example?” I prompted.
“Okay, let’s go one step deeper,” he suggested. “Suppose that you want to do a deep-packet inspection on Ethernet traffic going across the switch, and you want to prioritize it, and possibly drop it if it’s not something that’s important. Or perhaps you want to take traffic for a particular application and handle it a specific way, or even take a specific route. Maybe some traffic needs to go over a secure route or path, or perhaps it’s not important at all.
“This is what OpenFlow can let you do: match on some particular criteria and then do some sort of action. Then every switch behaves the same way, taking on characteristics across the board that are consistent.”
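[Note: For those of us who think better in code than on napkins, here is how I would sketch Jim's "match on some criteria, then do some action" idea in Python. This is only a toy model. The field names, the addresses, and the action strings are all made up by me for illustration; none of it is the actual OpenFlow protocol or any vendor's API.]

```python
# Toy model of a flow table: "match on some criteria, then do some action."
# Every field name and action string below is hypothetical.

FLOW_TABLE = [
    # (match criteria, action) pairs, checked top to bottom; the first match wins.
    ({"tcp_dst": 443}, "forward:secure_path"),
    ({"ip_proto": "udp", "udp_dst": 5004}, "set_priority:high"),
    ({"ip_src": "10.0.0.99"}, "drop"),
]

# What happens to traffic that no rule cares about.
DEFAULT_ACTION = "forward:normal"


def matches(packet, criteria):
    """A packet matches when every field in the criteria has the same value in the packet."""
    return all(packet.get(field) == value for field, value in criteria.items())


def lookup(packet, table=FLOW_TABLE):
    """Return the action of the first rule the packet matches, or the default action."""
    for criteria, action in table:
        if matches(packet, criteria):
            return action
    return DEFAULT_ACTION
```

Calling lookup() with a packet described as a dictionary (say, one with "tcp_dst" set to 443) returns the action from the first rule that fits. If every switch is handed the same table, every switch makes the same decision, which is the consistency Jim was describing.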
“And the Nexus switches can do this now?” I asked.
“Some of these things, yes,” he said. “There are different versions of what OpenFlow can do, just like any other software release. The next release will add quite a few capabilities.”
He started scribbling on a new napkin, making a list. “You’ll have IPv6, MPLS, Equal-Cost Multipathing, multiple parallel channels between switches and the controller…”
“I see,” I said. “Slow down, I’m just a storage guy, remember, trying to get my head wrapped around this. I’ve only heard about half of what you have written down there.”
He smiled. “No worries. We can come back to those later at some other time.”
He shook his head. “No, no,” he corrected me. “That’s what some people thought when it was first announced, but that’s not true. In fact, Kyle Mestery already wrote about this specifically. You should look that up and read what he has to say.”
“Okay,” I said, “but can you give me a quick breakdown about what it is in the meantime?”
“Sure,” he said. “Remember that OpenFlow tells a switch what to do with a packet, right?”
“Right,” I agreed.
“What it doesn’t do, though, is understand what that switch should do based upon things like performance, security, or scale – just to name a few.”
“So OpenFlow is about the flow, but not about how the quality of the flow is handled?” I asked.
“Precisely,” he said. “All of those decisions have to be made outside of an OpenFlow device.”
I pointed to the napkin. “You mentioned a controller,” I suggested.
“Yup,” he said. “And that controller becomes responsible for all of those aspects of quality of the flows. All of those qualities – performance, security, scale – belong to various policies that are handled by that controller. This can be fine for a few dozen or even a few hundred devices, but when you get to thousands –”
“You can overburden the controller,” I finished for him.
“Exactly.” He took one of the earlier napkins and started drawing on it a bit more.
“So what this means is that you have an embedded agent inside of a device, whether it’s a switch or a load-balancer or whatever,” he said. “And that means that you’re taking some of that burden off the controller and giving some autonomy to the devices as well.”
“I think I’m going to need some more explanation about this later,” I said. “I mean, I think I’m starting to get it, but I’ll need a bit more at some point down the road.”
“Not a problem,” he said, and started drawing again. “We can do that. Let’s move on, then.”
“So you have a good idea of what OpenFlow is dealing with, correct?” he summarized. “It’s an Esperanto-like software layer between the switch and the instructions for that switch, instructions that can be part of an external controller that, well, controls the behavior of the switch. This is where OpenDaylight comes in.”
“Nice mustache,” I teased.
He gave me a look, and I shut up. Ahem.
“At its basic level,” he continued, “it’s probably easier to think of OpenDaylight as a way of programming a policy across multiple switches at one time.”
“So,” I said, starting to understand, “you take what you can do at a switch level, and apply it to the fabric level.”
“Exactly,” he said, smiling. I felt as if I had gotten a gold star on my homework. “It’s about programming multiple network components. It is an ‘SDN controller.’”
“If I understand this, then,” I said, “this OpenDaylight controller talks with OpenFlow in order to get these different switches to do what you want.”
“Correct,” Jim said. “That’s the ‘Southbound’ API you were hearing about.”
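[Note: Here is my own napkin-in-code for the "switch level versus fabric level" idea. It is a sketch under heavy assumptions: the Switch and Controller classes and their method names are invented by me for illustration, and they are not OpenDaylight's or OpenFlow's real interfaces. The only point is the shape of it: one policy goes into the controller, and the controller talks "south" to every switch, no matter who made it.]

```python
class Switch:
    """Hypothetical stand-in for any vendor's switch that speaks a common southbound protocol."""

    def __init__(self, name, vendor):
        self.name = name
        self.vendor = vendor
        self.flow_table = []

    def install_rule(self, match, action):
        # In real life this would be a protocol message arriving from the controller.
        self.flow_table.append((match, action))


class Controller:
    """Hypothetical stand-in for an SDN controller: one policy in, many switches programmed."""

    def __init__(self, switches):
        self.switches = list(switches)

    def push_policy(self, match, action):
        """Install the same rule on every switch and report how many were programmed."""
        for switch in self.switches:
            switch.install_rule(match, action)
        return len(self.switches)


# A tiny three-switch "fabric" built from two made-up vendors.
fabric = [
    Switch("leaf1", "vendor_a"),
    Switch("leaf2", "vendor_b"),
    Switch("spine1", "vendor_a"),
]
controller = Controller(fabric)
programmed = controller.push_policy({"tcp_dst": 443}, "forward:secure_path")
```

After that last line, all three switches hold the same rule, and I never had to touch a single CLI.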
“Is this the only way to do this?” I asked. “I mean, do you have to use OpenDaylight to control these switches?”
“No, not at all. In fact, like I said before, there is no ‘one’ way to do any of this,” he reminded me. “The agents don’t have to be OpenFlow, and the controller doesn’t have to be OpenDaylight. OpenDaylight is Open Source, meaning it’s licensed a certain way. It actually has a strong license that is user-, corporate-, and customer-friendly, it can work across multiple vendors’ hardware, and it’s starting to be used pretty broadly, so it was the best example to learn from.
“We’ll focus on the core technologies now versus options that we can discuss later. As you’ve found out, it’s very easy to get deep in the weeds in a heartbeat.”
He took out another napkin. “Now, the last one is OpenStack,” he said. “So far we’ve been talking about network switches and what we can do to control them. But as you know Data Centers are comprised of two other major areas: Compute and Storage.”
He pointed to each of the three categories in turn. “What OpenStack is designed to allow you to do is look at the entire picture and provision all three of these parts of the Data Center at one time.
“The reason you may want to do this is that you have an application you want to deploy in a data center.”
“For example?” I asked. I like examples.
He shrugged. “Let’s just take a very straightforward one. If you look around online, you’ll see one particular example used over and over, so let’s use that and place it into the context that we’ve been talking about.
“Let’s suppose that you want to start up or “spin-up” (to use some lingo), a web server, like Apache, in your data center. Think of all the things that you need to do – all the little bits and pieces that have to get done – to get it to work. As an infrastructure or network guy you have to find room on a server that can handle the workload first.
“Then you have to set up the configuration and security permissions for the developer and their application, plus any additional requirements that the application may have. For example, there are different workloads for different types of traffic – from straight data to video streaming, which is highly bandwidth-intensive. From there you have to determine where it accesses the data, that it can access the data and that it has the proper network settings.”
I understood. In a previous life, I’d set up numerous web servers. There are a lot of needles to thread. If you’re trying to do something simple, such as static pages, it’s easy, but the world has changed over the past 20 years and most sites today are complex, dynamic, database-driven websites with additional requirements, especially if the website is delivering video to its end users, or is highly transactional like an e-commerce site or financial trading website.
“So if I wanted to create a WordPress site to sell something online and have everything done for me OpenStack has the ability to align the pieces together for me?” I asked.
“WordPress is a great example, and that’s the one I was telling you is used often to demonstrate this stuff. It’s a great and simple blogging software, publishing platform and CMS, and easy to understand and widely used,” Jim said. “Although it’s funny you should mention that. Mark Collier, the COO of the OpenStack Foundation, did a video on that very thing. In it you can see just what it looks like.
“Here’s where one of those terms comes into it that often gets thrown around,” he said. “All of these bits and pieces take an awful lot of coordination to get to the end goal. So you have all of these pieces that need to be lined up in order to make it all work together.”
I saw where he was going with this. “So what you’re saying is that OpenStack allows for the orchestration across the computing, networking, and storage components to get the application – in this case WordPress – to work.”
“Right,” he said.
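[Note: Since I had banned the word "orchestration" from the definitions, here is what I now picture when I hear it, written as a few lines of Python. Everything here is hypothetical: the function names, the dictionaries, and the default sizes are mine, and none of these are real OpenStack calls. The sketch only shows the idea of one request fanning out into compute, network, and storage steps in the right order.]

```python
def provision_compute(app):
    """Pretend to find room on a server for the application."""
    return {"server": f"{app}-vm", "cpus": 2}


def provision_network(app, server):
    """Pretend to give that server its network settings."""
    return {"attached_to": server["server"], "open_ports": [80, 443]}


def provision_storage(app, server, size_gb):
    """Pretend to carve out storage and attach it to that server."""
    return {"attached_to": server["server"], "size_gb": size_gb}


def orchestrate(app, size_gb=20):
    """Run the steps in dependency order: compute first, then the pieces that attach to it."""
    server = provision_compute(app)
    network = provision_network(app, server)
    volume = provision_storage(app, server, size_gb)
    return {"app": app, "compute": server, "network": network, "storage": volume}


site = orchestrate("wordpress")
```

The person asking for the WordPress site calls one thing, orchestrate(), and the lining up of all the bits and pieces happens behind it.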
“How does all this fit in with what I hear about Cinder,” I asked, trying to tie this back to my storage background. “I know that Cinder is the block storage component of OpenStack.”
“Glad you asked,” he said, smiling. “It makes sense that you would probably hear about Cinder first, as it’s the part of OpenStack that deals with iSCSI, Fibre Channel, and also Fibre Channel over Ethernet.”
“Finally,” I said, sighing. “Something I can understand.”
His smile faded a little. “Yeah, well, J, I hate to tell you, but the majority of the work in OpenStack storage has been in object-based storage.”
“That project is called Swift,” Jim concluded, writing on the napkin.
“The reason for this is that much of the work that’s being done is consistent with IP-based technologies, and iSCSI, NFS, and object-based storage fit in nicely with these types of initiatives,” he explained.
“That makes sense,” I said. “But why object-based storage, specifically?”
Jim shrugged. “It lends itself to some pretty flexible deployment options,” he said. “Object-based storage allows OpenStack to decide where to store the information rather than needing to know exactly where information sits on a specific drive. This way, you simply refer to a file, and OpenStack decides where to keep it based on the best option possible. Developers don’t have to keep track of how much space or capacity they have, how to back it up, how to scale, or what happens in case of a network outage.
“OpenStack coordinates all of that for the developer so they don’t have to worry about it. That right there is what people mean by orchestration. OpenStack has the ability to do this for the user versus the user having to do this themselves. That’s automation,” he said.
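[Note: As a block storage guy, the "you refer to it by name and the system decides where it lives" part took me a minute, so here is a toy version in Python. This is my own illustration and not how Swift is implemented: the ObjectStore class, the node names, and the hash-and-modulo placement are all assumptions I picked to keep it short. Real object stores do considerably more, such as keeping multiple copies.]

```python
import hashlib


class ObjectStore:
    """Toy object store: callers use names, the store decides where the data lives."""

    def __init__(self, node_names):
        # Each node is just a dictionary of object name -> data.
        self.nodes = {name: {} for name in node_names}
        self._names = sorted(self.nodes)

    def _pick_node(self, object_name):
        # Hash the object's name and use it to choose a node deterministically.
        digest = hashlib.sha256(object_name.encode("utf-8")).hexdigest()
        return self._names[int(digest, 16) % len(self._names)]

    def put(self, object_name, data):
        self.nodes[self._pick_node(object_name)][object_name] = data

    def get(self, object_name):
        return self.nodes[self._pick_node(object_name)][object_name]


store = ObjectStore(["node-a", "node-b", "node-c"])
store.put("photos/cat.jpg", b"meow")
data = store.get("photos/cat.jpg")
```

Notice that the caller never says which node or which drive. That decision sits entirely inside the store, which is the part Jim says the developer no longer has to worry about.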
“Ah,” I replied. Eloquent and Shakespeare-like, I’m not. “What about the networking and compute part?”
“Absolutely,” he said, scribbling furiously. “Keep in mind that there are many parts to OpenStack. These are just the basic ones that we’re talking about here, and we started off with the simplest application. There’s more.”
“After all, we’re talking the tip of the iceberg,” he said, as he finished breaking everything down for me.
That made a great deal of sense, and the bigger picture was starting to come into focus. “But what’s with all this ‘Havana’ and ‘Grizzly’ stuff I hear about?” I said, wondering if I was digging myself deeper into a complex puzzle.
Jim drew a few lines over the napkin. “Those are the releases for OpenStack,” he explained. “Nova is compute. Neutron is the network component, and Cinder and Swift are for Storage.
“OpenStack releases basically follow an alphabet-based ordering. The current release is Havana, the eighth,” he explained. “Basically it’s the release that each of the individual projects contributes towards in terms of milestones. There are improvements to infrastructure on the Nova compute side that improve resource management, improvements to the Swift object storage side to optimize disk operations and caching, Cinder block storage improvements for migration, performance, and scalability, and the Neutron networking side got VPN and Firewall capabilities.”
“Is that all?” I joked, impressed.
He took me at face value, though. “Of course not,” he said. “Those are just some of the improvements. According to the OpenStack website there are more than 400 new features that deal with security, user interface, plug-in compatibilities, and role-based access controls. With each release these features are improved, and once the community has them working together (Nova + Neutron + Cinder), a new version, such as Havana, is released.”
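[Note: The alphabet-based ordering Jim mentioned is easy to check for yourself. Jim and I only talked about Grizzly and Havana; the six earlier names in the list below are not from our conversation but from the OpenStack release history as I understand it. The helper functions are just mine.]

```python
# OpenStack release names, oldest first, up to the one Jim called current.
RELEASES = ["Austin", "Bexar", "Cactus", "Diablo", "Essex", "Folsom", "Grizzly", "Havana"]


def release_number(name):
    """Return the 1-based position of a release in the series."""
    return RELEASES.index(name) + 1


def is_alphabetical(names):
    """True when the first letters are in strictly increasing alphabetical order."""
    initials = [name[0] for name in names]
    return all(earlier < later for earlier, later in zip(initials, initials[1:]))
```

With this list, release_number("Havana") comes out as 8, which lines up with Jim calling it the eighth release, and is_alphabetical(RELEASES) confirms the A-through-H pattern.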
The Big Picture
“Does Cisco have this…” I was at a loss for the right word, waving at all the napkins around us, “stuff then, in our Nexus switches?”
“Some,” Jim confessed. “Different areas within Cisco focus on different parts of the larger picture. We have a lot of OpenStack work being done on our UCS environment, for instance, and our Nexus 3000 series switches have a lot of this orchestration and provisioning that we’ve talked about. Cisco is a major contributor to OpenDaylight and has quite a few hands-on, real-world-type solutions under its belt. Not only that, but we’re working hand in hand with the community, customers and the Linux Foundation to get more code out there, and of course we’re also utilizing OpenDaylight with our own products as well.”
“Fair enough,” I said. “I definitely have a clearer understanding of what this stuff is now – much better than I did before. But where does all this stuff about Puppet, Chef, Python, etc., come into play?”
Jim shook his head. “Patience, child,” he scolded. “Let this sink in for now, and we’ll return to the napkins a little later.”
I frowned. I think I had used the same phrase on someone else once before. I didn’t like it as much when it was used on me.
“Okay,” I said. I had a feeling that I was on a roll, though, and if I didn’t cram all the rest of this stuff into my brain it might, well, leak.
“Don’t worry,” he said, seeing my hesitation. “There’s a lot of stuff to get your head around. Most of this stuff is community-centric, which is why you’ve been having a lot of the problems you’ve been having understanding it.”
I cocked my head. “What do you mean?”
“Do you remember when the hobbyists got ahold of a few chips in the 1970s?” he asked.
The shift in gears caught me by surprise. I nodded.
“Well, don’t read too closely into the analogy,” he said, “but it’s that kind of paradigm shift. Back then it was the big companies that had the control over the computing power. Once these chips got into the hands of hobbyists, groups like the Homebrew Computer Club began to breed a whole new way of thinking about how computers could work.
“We’re seeing the same thing now,” he said. “We have a lot of smart people who are more interested in finding out what’s possible and sharing with people whose technical abilities they respect – and whose respect they want to receive – than in following very rigid procedures that are typically ‘corporate’ in nature. It’s the same conversation happening all over again.
“The question isn’t really whether Cisco is going to support this, but when and in what fashion,” he continued. “There is a great deal of merit in the idea that the customer knows best what they want to do with their data center. It is, however, a different paradigm than the way customers used to ask for our help.
“It used to be that customers would come to us – or any other vendor – and look at the box that we had to offer and try to determine where it fit (if at all) into their data center. It was like selecting a menu at a restaurant. Sure, you could order off the menu from time to time, but for the most part customers were used to fitting their expectations into whatever the vendors could provide.
“But now,” he said, growing intense, “it’s a different story. Customers are a lot more comfortable with the technology. They’re far more technically savvy than in the past, and they want to cooperate with vendors like Cisco in their solutions. They want to go back into the kitchen – continuing on with that analogy – and help us make the meal. They want a toolkit that they can then go build things with.
“Or, to put it another way,” he said, a light bulb coming on as he thought of a new example. “You’re rebuilding a Jeep, right?”
I nodded. “Two, actually.”
“Okay then, let’s use this example. Imagine that you go to Sears, and you say that you want a tool kit. Sears tells you that you don’t need that toolkit, because they’ve got this pre-assembled Jeep right here for you.”
I raised an eyebrow. “Okay, okay,” he said. “I know Sears doesn’t sell Jeeps. But work with me here.”
“I think I see your point,” I said. “I may not even know why I want a tool, but I know that I’ll need it eventually. I don’t necessarily want to bring in a Jeep every time I need a bolt tightened, or oil changed. I just want to do it myself.”
“Exactly,” Jim said, smiling. “This is a fundamental shift in the customer-vendor relationship. And Cisco is working on all kinds of those tools for customers to use. Cisco’s architectures are shifting to become more open, interoperable and flexible. They’re also providing lots of choices in tools for people that may need a different tool for the language or code they are working on or have expertise in. The Nexus platform, in particular, has all kinds of tools. Python, Puppet, Chef, OpenFlow, onePK…”
“Yeah,” I said, “What’s that all about?”
He held up a hand, smiling. “That’s my fault,” he said, contrite. “I got ahead of myself there. We’ll talk more about this later. For now, I’ll leave you with this diagram to dig into interoperability and what that means. We’ll continue this conversation soon.”
He handed me a piece of paper from his bag:
“What,” I asked. “You have this thing just… lying around?”
“Always be prepared,” he grinned.
Every Journey Starts With A Single Step
Okay, so I’m not running a marathon quite yet, but I feel like I’ve at least gotten to the starting gate. To Jim’s credit, he never used the word “cloud” once, and left me with the idea that I could at least get a foundation for understanding what was going to happen next.
I also know that people who deal with this stuff on a daily basis are going to look at me as if I just recited the alphabet back to them. For now, though, I’m okay with that. At least for the moment, I think there are a lot more people who could stand to learn what the alphabet is before they start reading the SDN-equivalent of Marshall McLuhan or S. I. Hayakawa.
I did find out a lot about some of these terms that people use to describe “software-defined data centers.” At the very least, this is a decent starting point. I feel ready to start attacking this whole “programmability” juggernaut, too.