
Welcome back to my series on how we are putting network automation to work in the DevNet Sandbox. This time around I wanted to dive into some of the deep technical bits on how we’re using automation to configure and manage our network in a “Network Service” based model. For this entry, I couldn’t decide whether to do a detailed written blog or a video, so I crowdsourced the decision with a Twitter poll. The votes were very much for a video option, so here it is!

And as implied by Part 1, there was just too much to cover in a single video.  Part 2, where I look at how the network service was designed and coded, is now available as well.

Oh, and don’t forget to check out the previous blogs in the series.

If the idea behind using Cisco NSO and Network Services has you intrigued, don’t forget you can download Cisco NSO for FREE from DevNet to test out for non-production use right now.  And here are some other handy links for more resources.

Thanks for stopping by, please do let me know what you think about the work we are doing, and any questions you might have. You can use the comments here, or find me over on Twitter @hfpreston or on LinkedIn.  Until next time!

 

Video Transcript

And a final bonus for the blog… while most folks were looking for a video, I know there are some who prefer to read than watch… so here is a transcript for the video above.  You can also use it to jump in to specific points in the video if you’d like.

Hank Preston: 00:08 Hey everybody. Hank with DevNet Sandbox here with the latest update on how things go with our new data center build where the network is completely being driven through network automation. Now in today’s update, I want to talk a bit about network services and how that fits into our experience of managing our data center going forward. You may be asking yourself, “Hank, what’s a network service?” Well, let’s think about it this way. Take our data center. A data center network is made up of many different individual device components. We’ve got spine switches, we’ve got access switches, we’ve got border switches, or maybe access, distribution, core, depending on how you set up. We’ve got routers that provide access in and out of the data center. There are security devices, firewalls that provide security as they go through, and then even inside the network, the network extends out into things like the compute environment.

Hank Preston: 00:59 Here in our data center, clearly we’re going to use Cisco UCS, and so Cisco UCS is a big part of our data center network as is the Hypervisor platform. We use VMware for a lot of our Hypervisor work. Inside of there, the actual network inside of VMware is part of the network as it goes through. Now traditionally with network automation, you might manage individual device by device and you articulate what you’re trying to accomplish across all of those pieces. The idea behind network services is rather than tackling your network and automating it device by device, we actually work out can we talk about what we want to accomplish inside of the network as a whole? So, for our example is we wanted to be able to enable services like Layer 2 connectivity, Layer 3 connectivity. We want to make sure that traffic can get into these different network segments through the firewalls as they go through.

Hank Preston: 01:50 Every time I add a new network, every time I add a new VLAN or Layer 3 segment, I don’t want to have to go out and articulate individual changes device by device. The network service is a way of designing your automation to wrap everything up so you’re managing the network as a whole. Now, enough talk. Let’s actually see this and break this apart how we’re using it inside of our data center environment. Here, what we’re taking a look at is a representation of our data center network as it goes through. If you’ve been following along, you know that we’ve actually used Cisco VIRL’s platform to build an entire simulation for our data center topology. Inside of here at the top, we have our internet edge point, or we’ve got our internet as it comes through, and then we have a DMZ zone inside of our network here in the middle. We’ve got our firewall layer that provides our secure edge as we come from the internet through the DMZ.

Hank Preston: 02:41 And, then we’ve got our internal zone back here, which has got all of the workstations that are there. Now the first component of our network service based automation is I need to be able to provide and carve up Layer 2 segments across this environment. I’m going to call those Layer 2 fabrics. In our environment, I actually have three different zones of Layer 2 fabrics. At the edge out here, we’ve got … and I’ll zoom in so it’s easier to see … So, at our edge we have our edge fabric that goes through and then underneath that we have our DMZ fabric that’s in here, and then the largest one of our fabrics is down here at the bottom. It’s our internal fabric that lives behind our firewall zone as it goes through. And what goes into these fabrics are all of the Layer 2 switching devices.

Hank Preston: 03:28 In our environment, we’ve got a series of Nexus 9000s that will provide the basic access, distribution, core or spine-leaf type of an architecture for our fabric. But, it’s also composed of the virtual switches that provide connectivity to VMs. So, our firewalls here are actual VMs, so we have a DVS that provides the connectivity for those. And then we can see vSwitches down here for the different VMs that represent it. So our first piece that we want to manage is I don’t want to have to go through and handcraft the configuration device by device. I want to just consider this an entire fabric as it goes in. Now, what do I layer on top of the fabric? Well, this is where the idea of tenants is going to come through. Inside of our internal fabric that’s down here, I need to bring up a couple of internal tenants.

Hank Preston: 04:20 So, what we can see is off of our fabric, I’ve got our admin tenant that’s down here and we’ve got a shared services tenant. These are going to be parts of our network where we put things like active directory servers, where we put our management components, where we put all the applications that drive Sandbox, our portal, the website. All of those pieces will live inside of this tenant that is going to be baked on top of the fabric inside, and then you can also see where the tenants get exposed up on the firewalls themselves. Now, in addition to the admin tenants that go through, every time you reserve a Sandbox, you get access to a pod where all of your Sandbox resources will be provisioned.

Hank Preston: 04:59 So, off of our internal fabric, we also have these pod tenants that are going to spin up. You can see here I’ve got pod one, which has a firewall that’ll be attached to it, as well as the resources that need to be connected to these. So in these cases, VMs or maybe physical servers and resources that we plug in to those tenants. But again, on top of the VLAN fabric that’s there, and you can see we’ve also got a couple other pods in the area. Now, no matter how you’re tackling your automation strategy, a big part of your goal is likely to make it a more pleasant experience, to make the user experience of actually configuring, operating and managing your network easier than it has been in the traditional fashion. And, network services are no different.

Hank Preston: 05:41 Let’s see how network services will make designing this tenant and this Layer 2 structure inside of our data center a more enjoyable experience. We’re going to focus on the end state, what is actually delivered when we go down the network services path. Now, in order to build network services, you need some sort of a network automation framework to tackle that. Now, the tool that we’re using is Cisco NSO, Network Service Orchestrator. It’s designed with network service based use cases in mind. Now, the concept around managing your network as a service rather than device by device, it doesn’t matter the tool you’re using. You can do that whether you’re using Ansible or Raw Python or anything else that’s out there.

Hank Preston: 06:17 For our use case, I really like NSO as it goes through. One of the reasons I like NSO is it feels like a network engineer’s tool. Also, sometimes I feel a bit guilty because it doesn’t feel like I’m actually doing automation because it’s so much feels like a network engineer’s tool. Let me show you what I mean. Here I’ve connected in to NSO server and I’m going to enter into config mode. Now, remember our use case here is we want to add a new network into our admin tenant that’s there, so I’m going to go ahead and enter into my VLAN tenant admin. Now, the tenant already exists. I’m just going to add a new network segment that’s there.

Hank Preston: 06:55 The network segment we’re going to add is going to be called ESXi Management, and I need to provide some details about this new network segment that’s there. So, we’ll say the VLAN ID for this network segment will be 30. Its network, so the prefix, is 10.101.6.0/23. That’ll set up the actual Layer 2 / Layer 3 pieces for it. But remember, we’ve got different hosts that we’re going to connect. We actually have ESXi servers that need to physically connect into this fabric that’s out there, so the next piece we’re going to say is give all of the connection information. So, connections, switch pair, LEAF02. That was the pair of switches that these new servers are going to connect to. So, on interface 1/20. We’ll give it a description of ESXI01, and we’ll say that this needs to be a trunk interface, right?

Hank Preston: 07:52 We’re going to trunk in the VLANs here for this tenant that goes through. Now, we’ll jump in and we’ll do the next one. We’ll say interface 1/21. Description, ESXI02, mode trunk, exit, and then we’ll say interface. Our last one here, 1/23, mode trunk and description of ESXI03. All right, I’ve got them all in there. So, I’ve gone ahead and I’ve configured, updated NSO with what my goal is here. Now let’s see exactly what did I put in there. I put in a bunch of pieces, but if I just do a show configuration, it’ll actually print out exactly what are the new updates that need to be built in. Now, you’ll notice I am calling out the individual switch pair that’s there, but I don’t have to talk about the switches inside of the fabric that aren’t explicitly connected to the physical hosts.

Hank Preston: 08:49 We’ll still get those configured. Now, if I’m curious what the actual configuration, the automation, like where does the automation come in, well, I can do a check and say, “Okay, well show me what are we going to actually push out.” I can say I’m going to do a commit dry-run and I want to see the actual final native configuration from all of the devices that will be touched based on this small chunk of configuration. The network service instance that I’m going to push out. So, we’ll get and run this, and now what we’ve done is we’ve actually consumed that network service definition, or that configuration I put in, and it’s generated the full configuration that needs to be pushed out across the fabric that this tenant exists on. So, if we take a look device by device, what’s actually going to be rendered and pushed out? We can see on the Admin V switch, we’ll go ahead and create this new VLAN 30, give it the proper name.

Hank Preston: 09:39 Now, LEAF01, remember I didn’t call out LEAF01 at all in our configuration, but those are the border leafs, the core or the area where all the SVIs, the routing to the outside world take place, so on LEAF01, we’re going to create the VLAN, but we’re also going to go ahead and create the interface for that VLAN, and then configure it appropriately including putting it into the VRF for the tenant. This new network is being added to the admin tenant, and so there’s a VRF that goes along for that. You can see the IP information, the OSPF, HSRP. All of these have been configured based on the standard templates that our network engineering team designed for how we deploy new tenants and network segments into the environment. We can see LEAF01-2, the secondary in that pair, is also set up for the SVI.

Hank Preston: 10:26 And, if I scroll down to LEAF02-1 and 2, that’s the pair of switches where these ESXi hosts are connected. You can see that interfaces 1/20, 1/21 and 1/23 have been indeed configured as you would expect for these ESXi hosts. And then, finally down here at the bottom, the rest of the fabric. Remember, we only talked about a couple of switches in the fabric, but this fabric entirely needs to be updated with the configuration in case in the future I need to add something else to this network on one of those switches, we can make sure that that VLAN becomes configured everywhere it needs to be configured across the environment. All of that configuration gets put together. So, to push this out into the network, we just do a commit and this will actually generate it.
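The fabric-wide rendering described in the dry-run walkthrough can be sketched in a few lines of Python. This is only an illustration of the idea, not the NSO service code or its templates: the `FABRIC` inventory, the `leaf03` pair, the class names and the exact config lines are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from ipaddress import ip_network


@dataclass
class Connection:
    switch_pair: str      # e.g. "leaf02"
    interface: str        # e.g. "1/20"
    description: str = ""
    mode: str = "trunk"


@dataclass
class Network:
    name: str
    vlan_id: int
    prefix: str
    connections: list = field(default_factory=list)


# Hypothetical inventory: switch pair name -> member devices.
FABRIC = {
    "leaf01": ["leaf01-1", "leaf01-2"],   # border leafs, where the SVIs live
    "leaf02": ["leaf02-1", "leaf02-2"],
    "leaf03": ["leaf03-1", "leaf03-2"],   # assumed extra pair, no hosts attached
}
BORDER_PAIR = "leaf01"


def render(tenant: str, net: Network) -> dict:
    """Return {device: [config lines]} for one network in one tenant."""
    ip_network(net.prefix)  # raises ValueError on a malformed prefix
    config = {}
    for pair, devices in FABRIC.items():
        for device in devices:
            # Every switch in the fabric gets the VLAN, named or not.
            lines = [f"vlan {net.vlan_id}", f"  name {net.name}"]
            if pair == BORDER_PAIR:
                # Only the border leafs get the routed interface in the VRF.
                lines += [f"interface Vlan{net.vlan_id}", f"  vrf member {tenant}"]
            for conn in net.connections:
                if conn.switch_pair == pair:
                    lines += [
                        f"interface Ethernet{conn.interface}",
                        f"  description {conn.description}",
                        f"  switchport mode {conn.mode}",
                        f"  switchport trunk allowed vlan add {net.vlan_id}",
                    ]
            config[device] = lines
    return config
```

Calling `render("admin", Network("esxi-management", 30, "10.101.6.0/23", [...]))` with three connections on `leaf02` returns configuration for all six devices, even though only one switch pair was named in the input. That is the point of the service model: the operator describes the network once and the per-device detail is derived.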

Hank Preston: 11:09 Now before I hit enter, you might be wondering, “Hank, prove it to me. Is this actually going to go through?” So, let’s dive into one of these switches and see if this information is there. We’ll look at LEAF02-2. I’m going to jump over here and we will say virl console LEAF02-2. All right, so I’m connected into LEAF02-2 and then we’ll check on a couple of things. We’ll look at those interfaces, so we’ll show run interface ethernet 1/21-23, and we’ll see that there’s no configuration on them currently.

Hank Preston: 11:49 I’m going to go back and we will finalize and commit this configuration out there. All right. The commit shows complete. Let’s go double check and see if the configuration is indeed on that switch. So, I’m right back here where we were before. I’ll retake a look at those interfaces and lo and behold … Oh, it looks like one of them actually did not go. I’m looking at my notes. We’ll see on there. If I go back and we look, so where is 1/22? Why does that one not look right? And, it’s because my actual configuration wasn’t 21, 22 and 23. When I typed it in, it was 20, 21 and 23. So, let’s go take a look at that one. Okay, show me 20 through 23 and there we go. Now we can see 20, 21 and 23, and my typo in the live demo, I skipped over 22 as we went in.

Hank Preston: 12:40 But we can see our configuration was indeed pushed out as it went through. All right, so we’ve seen how we can add a new network into an existing tenant that’s there. But, as I go through this use case, let’s think about what else do I need if I’m adding ESXi hosts? Well, I need a vMotion network so that I can send traffic between the hosts that are out there. Now, that vMotion traffic could be just part of the admin tenant, but vMotion doesn’t have to be routeable to the internet. These are just private traffic as it goes through. So, what I want to do is I want to create an entirely new tenant, a new Layer 3 zone where we can put things like vMotion, traffic and other cluster and heartbeat networks in the future, so let’s go ahead and add a new tenant right from scratch as it goes through onto our fabric.

Hank Preston: 13:27 We’ll say VLAN tenant admin … We’ll call this an admin private, so it’s our administrative private network, stuff that’s not going to go out to the public. We’ll say fabric internal, or what do we call the fabric? Production. Fabric production. That’s the inside fabric that makes up in there. And, then we’ll say network, vMotion management one, so this is the network we’re going to go ahead and create, and then we will say a VLAN ID 32. Oops. And, then we’ll say this one’s network 10.225.16.128/25. All right, so that’ll go ahead and create the piece that’s in place, so let’s go ahead and push this in and see what actual configuration gets generated. And, we’ll use that same flow we did before. We’ll do commit, dry-run, outformat native. I want to see what’s rendered based on this network service.

Hank Preston: 14:27 All right, so what gets rendered when we create a new tenant is out here. So, I’ll scroll up and we’ll say, “Okay, well, admin vSwitch. We need a new VLAN created.” But LEAF01-1 and 01-2, this is where all the Layer 3 stuff goes, and we can see that a new VRF context called admin private is indeed being set up, and then the interface VLAN for this is going to be configured as you expect in the proper VRF that’s out there.

Hank Preston: 14:53 Now, if I go down and look, LEAF02, our LEAF02 pair, the VLAN’s set up but if I need to vMotion for these hosts, well I need to make sure that that VLAN actually gets configured to those trunks that we set up, so let’s go ahead and actually add those same interfaces for ESXi 01, 02 and 03 into this new network that we’re setting up. So, we will go ahead and say connections, switch pair, LEAF02, so it’s the same LEAF pair, and it’s actually the same interfaces as well. Now, I will make sure we stay consistent with my earlier demo, so we’ll stick with interface 1/20 and we will say mode trunk, and then we will say exit, interface 1/21, mode trunk, and then we will say interface 1/23, mode trunk.

Hank Preston: 15:52 Now, let’s think a bit. When we add a new network onto a trunk interface, what’s the mistake that we often make as we go through, right? It’s that dreaded add word when we need to add those through. So, to make sure that we don’t make that mistake, that human error, we bake that into the service templates as they go in. And so now if I go ahead, let’s actually see what’s the full configuration that we’ve gone ahead and we’ve built for this new tenant. We’ll say show config, so just show me the part that we built. This is it. This is what we’re going to do to create a brand new tenant as well as a network for vMotion across our environment, tying it into the same interfaces that were there. So, that’s the only piece I have to configure.

Hank Preston: 16:32 But what gets rendered for this, if I do my commit dry-run outformat native … All right, so what’s getting configured? We saw a lot of this before. The new piece is these interfaces. So, here on LEAF02-1, we can see interface ethernet 1/20, and we’re getting switchport trunk allowed VLAN add, so it’s going to add VLAN 32 along with VLAN 30, which was the ESXi management one as it goes through. I haven’t pushed this out. Let’s go ahead and see what our status is, so I’m going to jump back into our host over here on LEAF02. We’ll look at our interfaces, and indeed this is what we’ve got so we’ll make sure that we get the new VLAN added. We don’t want to see that one taken away, so we will do our commit.
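The reason the add keyword matters is easy to show with a small simulation. The Python below is a sketch that models only the allowed-VLAN list of a single trunk port; the function names are hypothetical, and the command strings follow the `switchport trunk allowed vlan` form mentioned in the transcript.

```python
def parse_vlans(allowed: str) -> set:
    """Expand an allowed-VLAN string such as "10,30-32" into a set of ints."""
    vlans = set()
    for part in allowed.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def apply(allowed: set, command: str) -> set:
    """Return the allowed-VLAN set after one 'switchport trunk allowed vlan' command."""
    words = command.split()
    # words = ["switchport", "trunk", "allowed", "vlan", <add|remove|list>, ...]
    keyword_or_list = words[4]
    if keyword_or_list == "add":
        return allowed | parse_vlans(words[5])
    if keyword_or_list == "remove":
        return allowed - parse_vlans(words[5])
    # No keyword: the given list replaces whatever was allowed before.
    return parse_vlans(keyword_or_list)


before = {30}  # ESXi management is already on the trunk
with_add = apply(before, "switchport trunk allowed vlan add 32")
without_add = apply(before, "switchport trunk allowed vlan 32")
```

Here `with_add` keeps VLAN 30 and gains VLAN 32, while `without_add` contains only VLAN 32, which is exactly the outage-causing mistake the service template is designed to prevent.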

Hank Preston: 17:17 All right. The commit finished pushing that configuration out to all of the devices in this fabric, and if we go back and we look in here at LEAF02-2 and we look at those interfaces, and indeed we do have our additional VLAN. So, network automation, there’s a lot of things we go after for it. One of them is to make sure that we stick to our templates and we don’t make configuration mistakes mistakenly. That’s a big part of that configuration management piece and we can see how network services, we design that stuff up front and it just gets consumed and deployed out across our entire network, and it makes deploying of services and things like VLAN’s and networks and tenants across our data center in Sandbox much easier because if I scroll up here to … Where was that configuration that we just did? So, right here.

Hank Preston: 18:04 To go ahead and add this bit in, just enable a new tenant with one single network that goes through, we can see the hundreds of lines of config that needed to be done. Now, before we finish this one up, there’s a couple more things I want to dive into but I can already tell I’m going to have to do more videos on network services to dive into them, but there’s a couple more I want to do before we call it a day. Now, let’s think about we’ve seen how we’ve pushed out configurations and how network services make that easier. What about the reverse? When things go away? There’s that dreaded, “I have to un-provision some service in my data center across my network.” What do we do a lot of the times? Well, we just leave stuff there because it’s hard to keep track of what actually was required for that specific service or that short-term network that had to go through.

Hank Preston: 18:52 We just get a lot of cruft that sticks around, right? Nothing ever gets removed from an ACL, or policy-based routing or even just interfaces that are out there. Now with the network services, everything is packaged up with that service and remembered. So, I can always go back through and look and see what did it take to deploy that service that was there. Let’s look at something like that new tenant that we just did. If I do VLAN-tenant, admin-private, so that’s the tenant and I want to say, “You know what? Show me all of the modifications to my entire network that had to be done in order to provision this new tenant that was out there.” And so, here we go. It shows me exactly what was configured across the environment. I’m just going to scroll here to the top.

Hank Preston: 19:41 All right, there we go. This breaks down device by device, so we can see on the admin vSwitch we created the VLAN, and LEAF01. All of the components that were necessary on LEAF01 to configure this, so the new VRF, the interfaces, all of the components in OSPF that needed to be updated. It’s all been packaged up and ready to go and keep track of it. Everything is remembered. So, if I ever need to remove this tenant, it can all be removed at once. In fact, it’s so easy to remove something that we’re going to go ahead and we’re going to remove, not the tenant this time, but this last demo I want to run through is I’m going to go ahead and say VLAN-tenant admin. And, we’re going to remove the network for ESXi management from the first demo. So, I’m going to do no network ESXi management.

Hank Preston: 20:40 That’s it. Now, remember, with ESXi management, there was all the configuration on the LEAFs that were there. And if I say, “Okay, well, what is it going to take to remove this?” If I do my commit dry-run outformat native, what is actually going to be pulled off these devices just by removing that network? All right, these are the commands that are going to be run across my environment. Some of the interesting ones here, so LEAF02-1, we can see it’s going to remove the VLAN, and then it’s going to go into all of those interfaces for the ESXi hosts and use the remove keyword to pull that VLAN off it. We can see it on those and go through.
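The idea that everything a service touched is remembered, so it can be cleanly removed later, can be illustrated with a tiny ledger. To be clear, this is not how NSO tracks service changes internally; the `ServiceLedger` class, its methods and the service key format are hypothetical and exist only to show the bookkeeping.

```python
from collections import defaultdict


class ServiceLedger:
    """Remembers what each service instance configured on each device."""

    def __init__(self):
        # service name -> device -> {"vlan": int, "trunks": [interface, ...]}
        self._footprint = defaultdict(dict)

    def record(self, service, device, vlan, trunks=()):
        """Note that `service` created `vlan` on `device` and added it to `trunks`."""
        self._footprint[service][device] = {"vlan": vlan, "trunks": list(trunks)}

    def removal_plan(self, service):
        """Return {device: [commands]} that undoes everything recorded."""
        plan = {}
        for device, entry in self._footprint.get(service, {}).items():
            commands = []
            for interface in entry["trunks"]:
                # Pull only this VLAN off the trunk; other VLANs stay allowed.
                commands += [
                    f"interface Ethernet{interface}",
                    f"  switchport trunk allowed vlan remove {entry['vlan']}",
                ]
            commands.append(f"no vlan {entry['vlan']}")
            plan[device] = commands
        return plan


ledger = ServiceLedger()
ledger.record("admin/esxi-management", "leaf02-1", 30, ["1/20", "1/21", "1/23"])
ledger.record("admin/esxi-management", "leaf03-1", 30)
plan = ledger.removal_plan("admin/esxi-management")
```

Because the footprint is stored per service rather than reconstructed from memory or tickets, removing the network yields a complete plan for every device it touched, including switches that only carried the VLAN.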

Hank Preston: 21:15 So, now if I go ahead and we’ll say, “Commit this for reals.” All right, so our commit went through, so now if I go back and we just check on our status again. Here’s that LEAF that we’ve been looking at. Remember, the way it was before is we had our VLAN’s. We just pulled off ESXi management, which is VLAN 30, so when I look at those interfaces, in fact VLAN 30 is gone because that was captured. We knew exactly what was there. Now, the final thing I want to show as we go through is, “Okay, Hank, the CLI is great, right? We’re all network engineers. We know how much we like the CLI, but network automation, it’s about APIs, and you’re just showing me a CLI tool.” Now, this is automation, right?

Hank Preston: 21:53 It’s network services. We’ve described what we wanted, all the templates were built and we’ll cover how the templates look in a future video on network services, but what if we wanted to go ahead and actually drive this not through a CLI? We wanted to tie this into some upstream orchestration tool or maybe into a ticketing system so somebody could request a service. One of the nice things that we get out of the Network Service Orchestrator, Cisco NSO, is that we’ve got the CLI, which is great for us network engineers to work with on demand and do the configurations, but all of that is actually exposed as an API as well, and a very easy to use one. So, let’s see what that looks like using Postman, our favorite REST API tool. All right, so I switched over here to Postman and you can see that I’m looking at my NSO local install, and this is just running right on my laptop so we can see localhost, the default credentials are admin/admin.

Hank Preston: 22:42 I’m connected there, so the first thing we want to do is let’s go ahead and look at the VLAN tenant networks that currently exist inside of our admin structure. The URL for our REST API is RESTCONF, right? This is model driven programmability, so RESTCONF is the API we’re using, data, and then VLAN tenant. VLAN tenant equals admin. That’s that admin tenant we’ve been working on with ESXi management. We can say, “You know what? Show me the networks that exist.” So, a really well structured REST API call, and we’ll do our GET request. Our GET request comes back and we can see, “Okay, I’ve got my admin firewall transit, admin main, admin small.” But remember, there’s no ESXi management. We just pulled that off. Now, what if I needed to provision this but I didn’t want to do it through the CLI, I wanted to use the API for it?
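The same GET request can be put together in Python with only the standard library. This is a sketch under several assumptions: the port (8080), the `vlan-tenant:vlan-tenant` module and list names, and the `network` child name are guesses based on how the URL is described in the video, not values confirmed by the source.

```python
import base64
import urllib.request
from urllib.parse import quote

# Assumed address of a local NSO install; adjust host and port as needed.
BASE_URL = "http://localhost:8080/restconf/data"


def tenant_networks_url(tenant: str) -> str:
    """Build the RESTCONF URL for the networks of one tenant.

    The "vlan-tenant:vlan-tenant" path segment is an assumed module:list name.
    """
    return f"{BASE_URL}/vlan-tenant:vlan-tenant={quote(tenant, safe='')}/network"


def build_get(tenant: str, user: str = "admin", password: str = "admin"):
    """Return a prepared (unsent) GET request with basic auth and a JSON Accept header."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        tenant_networks_url(tenant),
        method="GET",
        headers={
            "Accept": "application/yang-data+json",  # RESTCONF JSON media type
            "Authorization": f"Basic {token}",
        },
    )
```

Passing the result to `urllib.request.urlopen(build_get("admin"))` would send it to a running server. Building the request separately from sending it keeps the example usable without a live NSO instance.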

Hank Preston: 23:30 For that, I’m going to go ahead and grab a PUT request. Where GET retrieves information, PUT is going to go ahead and add new information into our system, so our URL is pretty much the same, VLAN tenant, VLAN tenant equals admin, and then network equals ESXi management. That’s the network we want to create, and then the payload in this case is a JSON blob that models out the exact same thing that we saw with the CLI. In fact, it looks exactly like the CLI. What we’ve done is we’ve taken that CLI and it’s rendered now as a JSON payload, so we can see name, ESXi management, VLAN ID 30, network, the network prefix is there, and then all the connections for the switches, so in this case connection switch pair LEAF02, and then the interfaces that need to be set up. So, that’s my payload. I can go ahead and say “Send this away.”
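A rough idea of what such a PUT looks like in Python follows. The field names in the payload (`vlanid`, `connections`, `switch-pair`, and so on), the `vlan-tenant` module prefix, the network name `esxi-management` and the port are all assumptions for illustration; the real names come from the YANG model of the service, which is not shown in this post.

```python
import base64
import json
import urllib.request

BASE_URL = "http://localhost:8080/restconf/data"  # assumed local NSO install


def network_payload(name, vlan_id, prefix, switch_pair, interfaces):
    """Build a JSON-ready dict for one network; all key names are hypothetical."""
    return {
        "vlan-tenant:network": [
            {
                "name": name,
                "vlanid": vlan_id,
                "network": prefix,
                "connections": {
                    "switch-pair": [
                        {
                            "name": switch_pair,
                            "interface": [
                                {"interface": port, "description": desc, "mode": "trunk"}
                                for port, desc in interfaces
                            ],
                        }
                    ]
                },
            }
        ]
    }


def build_put(tenant, payload, user="admin", password="admin"):
    """Return a prepared (unsent) PUT request that creates or replaces the network."""
    name = payload["vlan-tenant:network"][0]["name"]
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{BASE_URL}/vlan-tenant:vlan-tenant={tenant}/network={name}",
        data=json.dumps(payload).encode(),
        method="PUT",
        headers={
            "Content-Type": "application/yang-data+json",
            "Authorization": f"Basic {token}",
        },
    )


payload = network_payload(
    "esxi-management", 30, "10.101.6.0/23", "leaf02",
    [("1/20", "ESXI01"), ("1/21", "ESXI02"), ("1/23", "ESXI03")],
)
request = build_put("admin", payload)
```

The payload mirrors the CLI input one for one: name, VLAN ID, prefix, then the switch pair and its interfaces. Sending it with `urllib.request.urlopen(request)` against a live server should return a 201 when the network is newly created, as seen in the demo.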

Hank Preston: 24:25 All right, we’ve got our 201 created back here, so if I go back to our Get Request and we rerun this one, let’s see if the network is there, and in fact, indeed it is here at the bottom. And, just to show again, it’s in the real network, let’s go check our actual console on that device, LEAF02. Back over here on LEAF02. Remember, the last time we were here, VLAN 30 was gone. I go ahead and look at it again, and lo and behold, VLAN 30 is now back once more. Then, I’ll work out the entire workflow here. Remember, with Postman and REST API’s, I could delete it if I wanted to. So, I could run a delete request to that same URL for network ESXi management, and that would go ahead and pull that off as well. All right, I’ve gone probably as long as I want to go in the first part for this video, but what we focused in on in today’s video really was the experience that network services offer network engineers to actually deliver and configure and operate the configuration across the network.

Hank Preston: 25:23 We even tacked in how with that network service, we can expose it through REST API and through other systems. In part two, we’re going to jump in and we’re going to extend and look under the hood at what it takes to build a network service that looks like this. You’ll get a chance to see some of the Python code that actually generates the final configurations that go through and see the templates, and we’ll talk a bit about how those templates were created as I went in, and maybe some more. Who knows? All right. Hopefully you enjoyed this video on network service introduction, and these are actually the real services that we’re going to use to drive the automation across the DevNet Sandbox network. These are legitimate services and automation that we’re going to use in production to deliver all of those Sandboxes for you. So, you’re getting a peep behind the hood with us. All right, I’ll see you next time. Thanks.