Automating your Network Operations, Part 5 – Just the Facts

Well, I had not intended to take a two-month hiatus, but it was an interesting time of holidays and head injuries for me. The good news is that my head is as good as it ever was! (For better or worse)

We covered a number of topics related to automating your network operations in the first four parts of this blog series. So, what’s new for this year? Well, we’ve pushed configuration data to devices, but what about getting data from devices? It turns out that getting data from the devices is quite useful and often key to sustainable automation. Before I say why, let’s talk about actual deployments for a bit. Many people pushing automation paint a picture of sunshine and lollipops with the wave of their automation magic wand. To demonstrate proof of their capabilities, they show you a playbook that sets your SNMP community strings. While I do not discount the importance of an SNMP community string … it does not a network configuration make.

To push configuration to a device without concern for the current configuration on that network device, you must have everything on that device templated. If you’ve reached the immutable infrastructure panacea, that’s great. The problem is that most operations have not. Their configurations have grown organically over the years and, even if they have subsets of devices that are/can be templated (e.g. access switches), it is far from the bulk of their automation requirements. You most likely have core routers, border routers, firewalls, load balancers, etc. with bespoke and often inconsistent configurations. To come up with a template or model for all of these devices before you automate any of these devices would delay the benefits of automation to the point where you might as well wait for the next major technology refresh (or your next job).

In reality, most organizations identify a set of workflows that are most impactful to their business. Most often, those workflows involve deployment of firewall rules, load-balancer rules, and new compute resources. These workflows can be automated as a “service” irrespective of the underlying infrastructure that supports that service. To do this, however, you often need to know about the current state of the device before you blindly send the configuration to that device.

Let us look at a simple example of this by revisiting the original NTP issue covered in the first blog. The problem was that we were sending configuration to the device before we knew the configuration of the device, which was yielding undesired results (i.e. more NTP servers than we intended).

To see which NTP servers exist on the device, we’ll use the cli_command module to show us the NTP servers in the running configuration:

- hosts: routers
  connection: network_cli
  gather_facts: no
  tasks:
   - name: Get current NTP server
     cli_command:
       command: show run | inc ntp
     register: output

   - debug:
      var: output.stdout_lines

The output from the debug task looks like this:

TASK [debug] ************************************************************
ok: [core] => {
    "output.stdout_lines": [
        "ntp server 1.1.1.1",
        "ntp server 2.2.2.2"
    ]
}

But what do we do with this semi-unstructured data? We cannot use it directly in our playbooks, so we need to parse it first using TextFSM. TextFSM is a Python module that implements a template-based state machine for parsing semi-formatted text. To leverage the TextFSM library, we’ll use the parse_cli_textfsm network CLI filter:

- hosts: routers
  connection: network_cli
  gather_facts: no
  tasks:
   - name: Get current NTP server
     cli_command:
       command: show run | inc ntp
     register: output

   - set_fact:
      actual_ntp_servers: "{{ output.stdout | parse_cli_textfsm('templates/ios-ntp.textfsm') }}"

   - debug:
      var: actual_ntp_servers

The truncated output from that debug task looks like this:

TASK [debug] ************************************************************
ok: [core] => {
    "actual_ntp_servers": [
        {
            "SERVER": "1.1.1.1"
        },
        {
            "SERVER": "2.2.2.2"
        }
    ]
}

We now have a list of dictionaries, each with a single element. This is the TextFSM template that generated it:

Value SERVER (\S+)

Start
  ^ntp server ${SERVER} -> Record

The first section enumerates the values that are picked out of the text and the regular expressions that define them. The second section lists the regular expressions that match the output lines with the previously enumerated values. Now we have something that we can use, but we really just want a list of NTP server addresses. To do that, we use the Ansible map filter to pull out the elements with the key ‘SERVER’:

- hosts: routers
  connection: network_cli
  gather_facts: no
  tasks:
   - name: Get current NTP server
     cli_command:
       command: show run | inc ntp
     register: output

   - set_fact:
      actual_ntp_servers: "{{ output.stdout | parse_cli_textfsm('templates/ios-ntp.textfsm') | map(attribute='SERVER') | list }}"

   - debug:
      var: actual_ntp_servers

Giving us a legit list:

TASK [debug] ************************************************************
ok: [core] => {
    "actual_ntp_servers": [
        "1.1.1.1",
        "2.2.2.2"
    ]
}

This invocation of the map filter (there are many different permutations that perform a surprising, and often confusing, number of different functions) picks out the value in each dictionary whose key is ‘SERVER’.
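If the parse-then-map pipeline feels opaque, here is a rough plain-Python analogy of what the two filters do to the CLI output. It is only a sketch: it uses the standard `re` module rather than TextFSM itself, and the names `raw_output`, `NTP_LINE`, and `parse_ntp` are made up for illustration.

```python
import re

# Raw text as returned by "show run | inc ntp" (values from the debug output above).
raw_output = "ntp server 1.1.1.1\nntp server 2.2.2.2"

# Rough stand-in for the template: "Value SERVER (\S+)" combined with the
# rule "^ntp server ${SERVER} -> Record". This is NOT how TextFSM is
# implemented, only an approximation of its result for this one template.
NTP_LINE = re.compile(r"^ntp server (?P<SERVER>\S+)")


def parse_ntp(text):
    """Return one {'SERVER': ...} record for every line that matches."""
    records = []
    for line in text.splitlines():
        match = NTP_LINE.match(line)
        if match:
            records.append(match.groupdict())
    return records


# Step 1: the parse_cli_textfsm step yields a list of dictionaries.
records = parse_ntp(raw_output)

# Step 2: "map(attribute='SERVER') | list" keeps only the value under that key.
actual_ntp_servers = [record["SERVER"] for record in records]

print(records)
print(actual_ntp_servers)
```

Lines that do not match the pattern are simply skipped, which is also why an unrelated line containing "ntp" would not end up in the list.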

Now we can employ the Ansible difference filter to figure out which NTP servers we want to add and which ones we want to remove. Here is the full playbook for this example. At the top, we pass in the list of NTP servers that we want on the device. As I’ve said previously, you generally want to read in this data through the inventory system, but I place it at the top of the playbook because this is an example.

- hosts: routers
  connection: network_cli
  gather_facts: no
  vars:
   desired_ntp_servers:
    - 1.1.1.1
    - 3.3.3.3

  tasks:
   - name: Get current NTP server
     cli_command:
       command: show run | inc ntp
     register: output

   - set_fact:
      actual_ntp_servers: "{{ output.stdout | parse_cli_textfsm('templates/ios-ntp.textfsm') | map(attribute='SERVER') | list }}"

   - debug:
      var: actual_ntp_servers

   - debug:
      msg: "Add the following: {{ desired_ntp_servers | difference(actual_ntp_servers) | join(',') }}"

   - debug:
      msg: "Remove the following: {{ actual_ntp_servers | difference(desired_ntp_servers) | join(',') }}"

   - name: Add the new servers
     cli_config:
       config: ntp server {{ item }}
     loop: "{{ desired_ntp_servers | difference(actual_ntp_servers) | default([]) }}"

   - name: Remove old servers
     cli_config:
       config: no ntp server {{ item }}
     loop: "{{ actual_ntp_servers | difference(desired_ntp_servers) | default([]) }}"

In the first invocation of the difference filter, we are asking which of the servers in our desired list are not in the device’s actual configuration. We would then add these to the device. The second invocation of the difference filter asks which of the servers actually configured on the device are not in the list of desired servers. We would then remove those.
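The two directions of the difference are easy to mix up, so here is a small Python sketch of the same logic using the lists from this example. The `difference` helper is a hypothetical stand-in written for illustration and is not Ansible's own implementation.

```python
def difference(first, second):
    """Items of `first` that are absent from `second`, in order, without duplicates."""
    exclude = set(second)
    seen = set()
    result = []
    for item in first:
        if item not in exclude and item not in seen:
            seen.add(item)
            result.append(item)
    return result


desired_ntp_servers = ["1.1.1.1", "3.3.3.3"]   # what we want on the device
actual_ntp_servers = ["1.1.1.1", "2.2.2.2"]    # what the device reported

# Desired minus actual: servers that still need to be configured.
to_add = difference(desired_ntp_servers, actual_ntp_servers)

# Actual minus desired: servers that should be removed.
to_remove = difference(actual_ntp_servers, desired_ntp_servers)

print("Add:", to_add)
print("Remove:", to_remove)
```

When the desired and actual lists already match, both results are empty, so the two loops in the playbook have nothing to iterate over and the device is left untouched.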

Here is the full output of the playbook in action:

PLAY [routers] **********************************************************

TASK [Get current NTP server] *******************************************
ok: [core]

TASK [set_fact] *********************************************************
ok: [core]

TASK [debug] ************************************************************
ok: [core] => {
    "actual_ntp_servers": [
        "1.1.1.1",
        "2.2.2.2"
    ]
}

TASK [debug] ************************************************************
ok: [core] => {
    "msg": "Add the following: 3.3.3.3"
}

TASK [debug] ************************************************************
ok: [core] => {
    "msg": "Remove the following: 2.2.2.2"
}

TASK [Add the new servers] **********************************************
changed: [core] => (item=3.3.3.3)

TASK [Remove old servers] ***********************************************
changed: [core] => (item=2.2.2.2)

PLAY RECAP **************************************************************
core :      ok=7 changed=2 unreachable=0 failed=0

This is a simple example, but the same technique can be leveraged to address a myriad of other use cases. I often use this technique to add entries to prefix lists or access lists, testing whether the addition is already a part of an existing entry.
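For the prefix-list case, the "is it already covered?" test could look something like the following Python sketch. It assumes the existing entries have already been parsed into CIDR strings; the function name `already_covered` and all of the prefixes are hypothetical, and it ignores prefix-list details such as ge/le length matching.

```python
import ipaddress


def already_covered(candidate, existing_entries):
    """Return True if `candidate` falls inside any prefix in `existing_entries`."""
    wanted = ipaddress.ip_network(candidate)
    for entry in existing_entries:
        network = ipaddress.ip_network(entry)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        if wanted.version == network.version and wanted.subnet_of(network):
            return True
    return False


# Hypothetical prefixes, as if parsed from the device with a TextFSM template.
existing = ["10.0.0.0/8", "192.168.1.0/24"]

print(already_covered("10.1.2.0/24", existing))    # inside 10.0.0.0/8
print(already_covered("172.16.0.0/12", existing))  # not covered by either entry
```

In a playbook the same decision would gate the task that adds the entry, in the same way the difference filter gates the NTP tasks above.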

Ansible filters are very useful when pulling data from devices, whether it’s CLI output or JSON returned from a REST API call. I highly recommend them. As for TextFSM templates, I use them heavily with CLI as well. The Network to Code folks have lots of templates in their ntc-templates repo. If you cannot find exactly what you need there, you can usually find a template from which to build.

Now you might be asking yourself, “Self, why is he telling me about CLI again after he opined on the virtues of model-driven automation?” Although the model-driven approach is definitely the best way to configure network devices, not all devices support it. Also, there are many occasions where it is useful to parse semi-formatted information, and TextFSM is a good way to do that. The only model-driven approach that we’ve looked at so far is NETCONF, which uses XML encoding for its payload. Dealing with XML in Ansible makes me grumpy; so, in the next blog, I am going to present RESTCONF for both reading and writing data models.
