Network Configuration Assurance With NetBox and Ansible

In a recent webinar we explored how to get started with network automation using NetBox and Ansible, and one of the tasks we automated was backing up network device configuration files. We then took that one step further: we compared the running configuration (i.e. the actual state of the device) with the intended state as defined in NetBox, and reported on any differences between the two. This is a really powerful use case for network automation, and this blog post takes a deeper dive into the solution.

To set the scene, this diagram illustrates where both tools fit into a modern network automation reference architecture. NetBox sits at the centre as the Network Source of Truth (NSoT) and defines the intended state, while Ansible (bottom right) is an automation platform that extracts the intended state from NetBox and uses it to perform its automation tasks.

Note that in the diagram, other tools in the top-left corner perform observability and assurance tasks (comparing intended and actual states), but in this example we are going to use Ansible in that role as well. This is a reference architecture, after all, so some tools can perform multiple roles.

The Intended State – NetBox

We are using containerlab for our network, and have three Arista cEOS switches modelled in NetBox at a site called ContainerLab. Each device has an IPv4 management IP address and three VLANs defined (100, 200 and 300) that are linked to the site.

For illustrative purposes, we have a simple config template (written in the Jinja templating language) for the devices. It is stored in a remote data store (Git repository), which is synchronized with NetBox. When you click on the Render Config tab of each device you can see the rendered configuration, built using the device data from NetBox. This is the intended state of each network device configuration.

NetBox as a Dynamic Inventory for Ansible

By using the NetBox Inventory Plugin for Ansible we can integrate the two tools seamlessly so that Ansible gets all its inventory data from NetBox:

The example code for this is in the accompanying Git repo, but there are two files that we are interested in. The first one is ansible.cfg, which tells Ansible where to look for its inventory data:

# ansible.cfg

[defaults]
inventory = ./netbox_inv.yml

The second one is netbox_inv.yml, which uses the netbox.netbox.nb_inventory plugin. This file is highly configurable, as per the documentation. In this example we map the platform slug from NetBox to ansible_network_os and group our devices by device_roles, sites and platforms:

# netbox_inv.yml

plugin: netbox.netbox.nb_inventory
validate_certs: False

compose:
  ansible_network_os: platform.slug

group_by:
  - device_roles
  - sites
  - platforms

If we run the command ansible-inventory -i netbox_inv.yml --list to list the inventory, we can see what is returned. I’ve limited the output here just to show the contents of the group sites_container_lab which contains our three target devices:

    "sites_container_lab": {
        "hosts": [
            "ceos-sw-1",
            "ceos-sw-2",
            "ceos-sw-3"
        ]
    },
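
If you want to pull a single group out of that output in a script, a minimal Python sketch might look like the following. It assumes only the JSON shape shown above (a top-level key per group, each with a hosts list). The helper name hosts_in_group is hypothetical and is not part of Ansible or NetBox.

```python
import json


def hosts_in_group(inventory_json: str, group: str) -> list[str]:
    """Return the hosts listed under `group` in `ansible-inventory --list` output."""
    inventory = json.loads(inventory_json)
    # A group that is absent, or has no "hosts" key, yields an empty list.
    return inventory.get(group, {}).get("hosts", [])


# Sample trimmed to the single group shown above; real output has many more keys.
sample = """
{
    "sites_container_lab": {
        "hosts": ["ceos-sw-1", "ceos-sw-2", "ceos-sw-3"]
    }
}
"""

print(hosts_in_group(sample, "sites_container_lab"))
```

In practice the JSON string could come from running ansible-inventory -i netbox_inv.yml --list through subprocess.run with capture_output=True and text=True.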

Ansible Playbook Structure

Following best practices that make our code reusable, our Ansible playbooks are split into roles, one for each of the functions that we are automating. Our file structure looks as follows:

(venv) ubuntu@ip-172-31-32-145:~/netbox-learning/netbox-ansible-webinar$ tree roles/
roles/
├── ACTUAL_STATE
│   ├── tasks
│   │   └── main.yml
│   └── vars
│       └── main.yml
├── COMPARE_STATES
│   ├── tasks
│   │   └── main.yml
│   └── vars
│       └── main.yml
└── INTENDED_STATE
    ├── tasks
    │   └── main.yml
    └── vars
        └── main.yml

9 directories, 6 files

Each role has a tasks/main.yml file that contains the tasks to run, and a vars/main.yml file that contains any variables that those tasks reference when they execute. Let’s explore the code for each role:

role: ACTUAL_STATE

The purpose of this role is simply to collect the running configuration (actual state) from each device and store it in a file for use by a later role. The only variable we are using in this role sets the path for the backups directory where we will store the running configuration of each device:

# roles/ACTUAL_STATE/vars/main.yml

---
backup_root: ./backups

The tasks use the eos_command module to execute the show running-config command against each device (task 1), create the main backup folder (task 2), create a folder for each device (task 3), and then write the output from task 1 (i.e. the running config) to a file in the device folder for each device (task 4). This file is called {{ inventory_hostname }}_running.conf:

# roles/ACTUAL_STATE/tasks/main.yml

---
- name: 1 - Run 'show running-config' on remote devices
  eos_command:
    commands: "show running-config"
  register: running_config

- name: 2 - Ensure backup folder is created
  file:
    path: "{{ backup_root }}"
    state: directory
  run_once: yes

- name: 3 - Ensure device folder is created
  file:
    path: "{{ backup_root }}/{{ inventory_hostname }}"
    state: directory

- name: 4 - Write the device configuration to file
  copy:
    content: "{{ running_config.stdout[0] }}"
    dest: "{{ backup_root }}/{{ inventory_hostname }}/{{ inventory_hostname }}_running.conf"

role: INTENDED_STATE

The purpose of this role is to query NetBox (remember, it’s the source of truth) for the intended state of each device and store it in a file to be compared against the actual config by a later role. Starting with the variables, we read the URL of the NetBox instance and the API token used to connect to it from environment variables (see the Git repo for instructions), and we set the directory that we will store the intended configurations in:

# roles/INTENDED_STATE/vars/main.yml

---
netbox_url: "{{ lookup('ansible.builtin.env', 'NETBOX_API') }}"
netbox_token: "{{ lookup('ansible.builtin.env', 'NETBOX_TOKEN') }}"
intended_configs_root: ./intended_configs

The tasks make an API call to retrieve the details for each device from NetBox (task 1), make another API call that uses the device ID (from task 1) to render the intended configuration of each device (task 2), create the main folder (task 3), create a folder for each device (task 4), and then write the output from task 2 (i.e. the intended config) to the device folder for each device (task 5). This file is called {{ inventory_hostname }}_intended.conf:

# roles/INTENDED_STATE/tasks/main.yml

---
- name: 1 - Get device details from NetBox
  uri:
      url: "{{ netbox_url }}api/dcim/devices/?name={{ inventory_hostname }}"
      method: GET
      return_content: yes
      headers:
          accept: "application/json"
          Authorization: "Token {{ netbox_token }}"
  register: device

- name: 2 - Get intended state from NetBox based on device ID from task 1
  uri:
      url: "{{ netbox_url }}api/dcim/devices/{{ device.json.results.0['id'] }}/render-config/"
      method: POST
      return_content: yes
      headers:
          accept: "application/json"
          Authorization: "Token {{ netbox_token }}"
  register: intended_config

- name: 3 - Ensure folder for intended configs exists
  file:
    path: "{{ intended_configs_root }}"
    state: directory
  run_once: yes

- name: 4 - Ensure folder for each device's intended config exists
  file:
    path: "{{ intended_configs_root }}/{{ inventory_hostname }}"
    state: directory

- name: 5 - Copy intended config for each device to folder
  copy:
    content: "{{ intended_config.json.content }}"
    dest: "{{ intended_configs_root }}/{{ inventory_hostname }}/{{ inventory_hostname }}_intended.conf"
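
To make the two API calls easier to reason about outside Ansible, here is a minimal Python sketch of how the URLs and headers are put together. It only builds the requests and does not send them. The host netbox.example.com, the token value and the device ID 42 are made-up placeholders, and the helper names are hypothetical. As in the role, the base URL is assumed to end with a trailing slash, because the path is appended directly to it.

```python
import urllib.parse
import urllib.request


def device_lookup_url(netbox_url: str, hostname: str) -> str:
    """URL for task 1: look the device up by name."""
    # netbox_url is assumed to end with "/", as the role appends "api/..." directly.
    query = urllib.parse.urlencode({"name": hostname})
    return f"{netbox_url}api/dcim/devices/?{query}"


def render_config_url(netbox_url: str, device_id: int) -> str:
    """URL for task 2: render the intended config for one device ID."""
    return f"{netbox_url}api/dcim/devices/{device_id}/render-config/"


def build_request(url: str, token: str, method: str) -> urllib.request.Request:
    """Attach the same headers the role sends with each call."""
    headers = {
        "Accept": "application/json",
        "Authorization": f"Token {token}",
    }
    return urllib.request.Request(url, method=method, headers=headers)


def first_device_id(lookup_response: dict) -> int:
    """Mirror device.json.results.0['id'] from the role."""
    return lookup_response["results"][0]["id"]


# Placeholder values for illustration only.
base = "https://netbox.example.com/"
fake_lookup = {"count": 1, "results": [{"id": 42, "name": "ceos-sw-1"}]}

lookup = build_request(device_lookup_url(base, "ceos-sw-1"), "placeholder-token", "GET")
render = build_request(render_config_url(base, first_device_id(fake_lookup)), "placeholder-token", "POST")
print(lookup.full_url)
print(render.get_method(), render.full_url)
```

Sending either request would be a matter of passing it to urllib.request.urlopen and decoding the JSON body. The role then reads the rendered configuration from the content field of the second response.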

role: COMPARE_STATES

The purpose of our final role is to compare the two files generated for each device by the ACTUAL_STATE and INTENDED_STATE roles and report on any differences. Let’s break it down.

First, we set variables for the paths to the actual and intended config files for each device, and define a regex pattern that excludes the lines we don’t want to check when we compare the files:

# roles/COMPARE_STATES/vars/main.yml

---
actual_conf_dir: "./backups/{{ inventory_hostname }}"
intended_conf_dir: "./intended_configs/{{ inventory_hostname }}"
exclusion_pattern: "(^! Command:)|(^! device:)"

The tasks create a temporary file containing the actual (running) configuration for each device (task 1), remove any lines that match the regex exclusion_pattern from the temporary file (task 2), and compare that file with the intended configuration file (task 3). If there is a difference between the two files, it is displayed in the console output (task 4). Finally, the role cleans up by removing the temporary file (task 5).

# roles/COMPARE_STATES/tasks/main.yml

---
- name: 1 - Create temp file for actual configuration
  ansible.builtin.copy:
    src: "{{ actual_conf_dir }}/{{ inventory_hostname }}_running.conf"
    dest: "/tmp/{{ inventory_hostname }}_temp.conf"
    remote_src: yes
  register: actual_temp_file
  delegate_to: localhost

- name: 2 - Remove lines matching the exclusion pattern from temp actual configuration
  ansible.builtin.lineinfile:
    path: "{{ actual_temp_file.dest }}"
    regexp: "{{ exclusion_pattern }}"
    state: absent
  delegate_to: localhost

- name: 3 - Diff compare temp actual configuration file with intended configuration file
  command: "diff /tmp/{{ inventory_hostname }}_temp.conf {{ intended_conf_dir }}/{{ inventory_hostname }}_intended.conf"
  register: diff_output
  ignore_errors: yes
  changed_when: false
  delegate_to: localhost

- name: 4 - Show delta of intended state vs actual State
  debug:
    msg: "{{ diff_output.stdout_lines }}"
  when: diff_output.stdout != ""
  delegate_to: localhost

- name: 5 - Cleanup temp actual configuration file
  ansible.builtin.file:
    path: "{{ actual_temp_file.dest }}"
    state: absent
  delegate_to: localhost

Putting it All Together

OK, if you’ve made it this far you’re obviously interested in how it all plays out, so let’s take a look! For demo purposes I have deleted VLAN 300 from the running configuration of ceos-sw-3:

ceos-sw-3(config)#no vlan 300 
ceos-sw-3(config)#sh run | sec vlan
vlan 100
   name Data
vlan 200
   name Voice

We have a master playbook called compare_intended_vs_actual.yml which runs the three roles outlined above, in order, and targets the devices at the ContainerLab site:

# compare_intended_vs_actual.yml

---
- name: Compare intended state in NetBox to actual device state and show delta
  hosts: sites_container_lab
  gather_facts: false
  connection: network_cli

  roles:
    - role: ACTUAL_STATE
    - role: INTENDED_STATE
    - role: COMPARE_STATES

We’ll run this with the command ansible-playbook compare_intended_vs_actual.yml. The full output is quite long as there are a lot of tasks being run, so I will limit it to just the interesting parts. The action starts at task 3 in the COMPARE_STATES role, which fails for ceos-sw-3 because diff exits with return code 1 when the two files differ. Since the task sets ignore_errors, the run continues, and the next task displays the part of the configuration that is missing from ceos-sw-3 in the console output:

<output shortened for brevity> 

TASK [COMPARE_STATES : 3 - Diff compare temp actual configuration file with intended configuration file] ****************************************
ok: [ceos-sw-2 -> localhost]
ok: [ceos-sw-1 -> localhost]
fatal: [ceos-sw-3 -> localhost]: FAILED! => changed=false 
  cmd:
  - diff
  - /tmp/ceos-sw-3_temp.conf
  - ./intended_configs/ceos-sw-3/ceos-sw-3_intended.conf
  delta: '0:00:00.003368'
  end: '2024-03-19 14:34:59.768706'
  msg: non-zero return code
  rc: 1
  start: '2024-03-19 14:34:59.765338'
  stderr: ''
  stderr_lines: <omitted>
  stdout: |-
    19a20,22
    > vlan 300
    >    name DMZ
    > !
  stdout_lines: <omitted>
...ignoring

TASK [COMPARE_STATES : 4 - Show delta of intended state vs actual State] ************************************************************************
skipping: [ceos-sw-1]
skipping: [ceos-sw-2]
ok: [ceos-sw-3 -> localhost] => 
  msg:
  - 19a20,22
  - '> vlan 300'
  - '>    name DMZ'
  - '> !'
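
The first line of that delta, 19a20,22, is a change command in diff's default output format: after line 19 of the first file (the temporary actual config), lines 20 to 22 of the second file (the intended config) would need to be added. If you want to turn these headers into something more readable in a report, a small Python sketch could look like this. The function name and the returned dictionary layout are hypothetical choices for illustration.

```python
import re

# Matches change commands such as "19a20,22", "5c5" or "7,9d6".
CHANGE_RE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

ACTIONS = {"a": "add", "c": "change", "d": "delete"}


def parse_change_command(header: str) -> dict:
    """Split a change command into an action and two inclusive line ranges."""
    match = CHANGE_RE.match(header)
    if match is None:
        raise ValueError(f"not a diff change command: {header!r}")
    left_start, left_end, action, right_start, right_end = match.groups()
    return {
        "action": ACTIONS[action],
        # A single number means a one-line range, so the end defaults to the start.
        "actual_lines": (int(left_start), int(left_end or left_start)),
        "intended_lines": (int(right_start), int(right_end or right_start)),
    }


print(parse_change_command("19a20,22"))
```

Here actual_lines refers to the first file passed to diff and intended_lines to the second, matching the argument order used in task 3.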

So you can now clearly see that the actual state of device ceos-sw-3 does not match the intended state as defined by NetBox, and we can now do something about that – ideally using automation tools to push out the intended state to the device!

NetBox and Ansible Resources for Configuration Assurance

So, I hope that this has been a useful overview of how you can build a simple network configuration assurance workflow using NetBox and Ansible. You can try this out yourself as all the code (including the containerlab definition file) is in the accompanying Git Repository.

Also, if you would like to watch the full webinar that this blog post is based on, it is available on-demand. If you are interested in receiving tutorials like this in your inbox, subscribe to our community newsletter.