In a recent webinar we explored how to get started with network automation using NetBox and Ansible, and one of the tasks we automated was backing up network device configuration files. We then took that one step further: we compared the running configuration (i.e. the actual state of the device) with the intended state as defined in NetBox, and reported on any differences between the two. This is a really powerful use case for network automation, and this blog post takes a deeper dive into the solution.
To set the scene, this diagram illustrates where both tools fit into a modern network automation reference architecture. NetBox sits at the centre as the Network Source of Truth (NSoT) and defines the intended state, while Ansible (bottom right) is an automation platform that extracts the intended state from NetBox and uses it to perform its automation tasks.
Note that in the diagram other tools in the top-left corner perform observability and assurance tasks (comparing intended and actual states), but in this example we are going to use Ansible in that role as well. This is a reference architecture after all, so you will find some tools performing multiple roles:
The Intended State – NetBox
We are using containerlab for our network, and have three Arista cEOS switches modelled in NetBox at a site called ContainerLab. Each device has an IPv4 management IP address and three VLANs defined (100, 200 and 300) that are linked to the site:
For illustrative purposes, we have a simple config template (written in the Jinja templating language) for the devices. It is stored in a remote data source (a Git repository) that is synchronized with NetBox. When you click on the Render Config tab of a device you can see the rendered configuration, built using the device data from NetBox, and this is the intended state of that device's configuration:
NetBox as a Dynamic Inventory for Ansible
By using the NetBox Inventory Plugin for Ansible we can integrate the two tools seamlessly so that Ansible gets all its inventory data from NetBox:
The example code for this is in the accompanying Git repo, but there are two files that we are interested in. The first one is ansible.cfg, which tells Ansible where to look for its inventory data:
# ansible.cfg
[defaults]
inventory = ./netbox_inv.yml
The second one is netbox_inv.yml, which uses the netbox.netbox.nb_inventory plugin. This file is highly configurable as per the documentation; in this example we map the slug of the device's platform in NetBox to ansible_network_os and group our devices by device_roles, sites and platforms:
# netbox_inv.yml
plugin: netbox.netbox.nb_inventory
validate_certs: False
compose:
  ansible_network_os: platform.slug
group_by:
  - device_roles
  - sites
  - platforms
If we run the command ansible-inventory -i netbox_inv.yml --list to list the inventory, we can see what is returned. I’ve limited the output here to show just the contents of the group sites_container_lab, which contains our three target devices:
"sites_container_lab": {
    "hosts": [
        "ceos-sw-1",
        "ceos-sw-2",
        "ceos-sw-3"
    ]
},
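If you want to use the same inventory data outside of Ansible, the JSON printed by ansible-inventory is easy to pick apart. Here is a minimal Python sketch; the hosts_in_group helper and the sample JSON are made up for illustration and are not part of the accompanying repo:

```python
import json


def hosts_in_group(inventory_json, group):
    """Return the hosts listed under a group in `ansible-inventory --list` output."""
    inventory = json.loads(inventory_json)
    # Unknown groups, or groups without direct hosts, yield an empty list.
    return inventory.get(group, {}).get("hosts", [])


# Hypothetical sample, trimmed to the single group shown above.
sample = """
{
    "sites_container_lab": {
        "hosts": ["ceos-sw-1", "ceos-sw-2", "ceos-sw-3"]
    }
}
"""

print(hosts_in_group(sample, "sites_container_lab"))
```

In practice the JSON string would come from running the ansible-inventory command shown above (for example via subprocess.run with capture_output=True) rather than from a hard-coded sample.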
Ansible Playbook Structure
Following best practices that make our code reusable, our Ansible playbooks are split into Roles for each of the functions that we are automating, and our file structure looks as follows:
(venv) ubuntu@ip-172-31-32-145:~/netbox-learning/netbox-ansible-webinar$ tree roles/
roles/
├── ACTUAL_STATE
│   ├── tasks
│   │   └── main.yml
│   └── vars
│       └── main.yml
├── COMPARE_STATES
│   ├── tasks
│   │   └── main.yml
│   └── vars
│       └── main.yml
└── INTENDED_STATE
    ├── tasks
    │   └── main.yml
    └── vars
        └── main.yml

9 directories, 6 files
Each role has a tasks/main.yml file that contains the list of tasks, and a vars/main.yml file that contains any variables that those tasks reference when they execute. Let’s explore the code for each Role:
role: ACTUAL_STATE
The purpose of this role is simply to collect the running configuration (actual state) from each device and then store this in a file for use by a later role. The only variable we are using in this role sets the path for the backups directory where we will store the running configuration of each device:
# roles/ACTUAL_STATE/vars/main.yml
---
backup_root: ./backups
The role itself uses the eos_command module to execute the show running-config command against each device (task 1), then creates the main backup folder (task 2), creates a folder for each device (task 3) and finally writes the output from task 1 (i.e. the running config) to a file in the device folder for each device (task 4). This file is called {{ inventory_hostname }}_running.conf:
# roles/ACTUAL_STATE/tasks/main.yml
---
- name: 1 - Run 'show running-config' on remote devices
  eos_command:
    commands: "show running-config"
  register: running_config

- name: 2 - Ensure backup folder is created
  file:
    path: "{{ backup_root }}"
    state: directory
  run_once: yes

- name: 3 - Ensure device folder is created
  file:
    path: "{{ backup_root }}/{{ inventory_hostname }}"
    state: directory

- name: 4 - Write the device configuration to file
  copy:
    content: "{{ running_config.stdout[0] }}"
    dest: "{{ backup_root }}/{{ inventory_hostname }}/{{ inventory_hostname }}_running.conf"
role: INTENDED_STATE
The purpose of this role is to query NetBox (remember, it’s the source of truth) for the intended state of each device and then store this in a file to be compared against the actual config in a later role. Starting with the variables: the URL of the NetBox instance and the API token we use to connect to it are read from environment variables (see the Git repo for instructions), and we also set the directory in which we will store the intended configurations:
# roles/INTENDED_STATE/vars/main.yml
---
netbox_url: "{{ lookup('ansible.builtin.env', 'NETBOX_API') }}"
netbox_token: "{{ lookup('ansible.builtin.env', 'NETBOX_TOKEN') }}"
intended_configs_root: ./intended_configs
The role itself makes an API call to retrieve the details for each device from NetBox (task 1), then makes another API call that uses the device id from task 1 to render the intended configuration of each device (task 2), then creates the main folder (task 3), creates a folder for each device (task 4) and finally writes the output from task 2 (i.e. the intended config) to the device folder for each device (task 5). Note that the URLs are built by appending api/... directly to netbox_url, so the value of NETBOX_API needs to end with a trailing slash. The resulting file is called {{ inventory_hostname }}_intended.conf:
# roles/INTENDED_STATE/tasks/main.yml
---
- name: 1 - Get device details from NetBox
  uri:
    url: "{{ netbox_url }}api/dcim/devices/?name={{ inventory_hostname }}"
    method: GET
    return_content: yes
    headers:
      accept: "application/json"
      Authorization: "Token {{ netbox_token }}"
  register: device

- name: 2- Get intended state from NetBox based on device ID from play 1
  uri:
    url: "{{ netbox_url }}api/dcim/devices/{{ device.json.results.0['id'] }}/render-config/"
    method: POST
    return_content: yes
    headers:
      accept: "application/json"
      Authorization: "Token {{ netbox_token }}"
  register: intended_config

- name: 3 - Ensure folder for intended configs exists
  file:
    path: "{{ intended_configs_root }}"
    state: directory
  run_once: yes

- name: 4 - Ensure folder for each device's intended config exists
  file:
    path: "{{ intended_configs_root }}/{{ inventory_hostname }}"
    state: directory

- name: 5 - Copy intended config for each device to folder
  copy:
    content: "{{ intended_config.json.content }}"
    dest: "{{ intended_configs_root }}/{{ inventory_hostname }}/{{ inventory_hostname }}_intended.conf"
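To make the two API calls a little more concrete, here is a rough Python equivalent that uses only the standard library. Treat it as a sketch rather than code from the repo: the function names are invented for illustration, and like the role it assumes that netbox_url ends with a trailing slash and that the first search result is the device we want.

```python
import json
import urllib.parse
import urllib.request


def netbox_request(url, token, method="GET"):
    """Send one authenticated request to NetBox and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        method=method,
        headers={"Accept": "application/json", "Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_intended_config(netbox_url, token, hostname, send=netbox_request):
    """Look up a device by name, then ask NetBox to render its config."""
    # First call: find the device record so we can read its numeric ID.
    query = urllib.parse.urlencode({"name": hostname})
    devices = send(f"{netbox_url}api/dcim/devices/?{query}", token)
    device_id = devices["results"][0]["id"]

    # Second call: POST to the render-config endpoint for that device.
    rendered = send(
        f"{netbox_url}api/dcim/devices/{device_id}/render-config/",
        token,
        method="POST",
    )
    # The rendered configuration text is returned under the "content" key.
    return rendered["content"]
```

The send parameter is only there so that the lookup logic can be exercised with a stub instead of a live NetBox instance.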
role: COMPARE_STATES
The purpose of our final role is to compare the two files generated for each device by the ACTUAL_STATE and INTENDED_STATE roles and report on any differences. Let’s break it down:
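Firstly, we set variables for the paths to the actual and intended config files for each device, and define a regex pattern that matches the lines we don’t want to check when we compare the files. Here that means the comment lines beginning with "! Command:" and "! device:" in the saved running config, which have no counterpart in the rendered template: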
# roles/COMPARE_STATES/vars/main.yml
---
actual_conf_dir: "./backups/{{ inventory_hostname }}"
intended_conf_dir: "./intended_configs/{{ inventory_hostname }}"
exclusion_pattern: "(^! Command:)|(^! device:)"
The role itself creates a temporary copy of the actual (running) configuration for each device (task 1), removes any lines that match the regex exclusion_pattern from that temporary file (task 2), and compares the result with the intended configuration file (task 3). If there is a difference between the two files it is displayed in the console output (task 4), and finally the role cleans up by removing the temporary file (task 5).
# roles/COMPARE_STATES/tasks/main.yml
---
- name: 1 - Create temp file for actual configuration
  ansible.builtin.copy:
    src: "{{ actual_conf_dir }}/{{ inventory_hostname }}_running.conf"
    dest: "/tmp/{{ inventory_hostname }}_temp.conf"
    remote_src: yes
  register: actual_temp_file
  delegate_to: localhost

- name: 2 - Remove lines matching the exclusion pattern from temp actual configuration
  ansible.builtin.lineinfile:
    path: "{{ actual_temp_file.dest }}"
    regexp: "{{ exclusion_pattern }}"
    state: absent
  delegate_to: localhost

- name: 3 - Diff compare temp actual configuration file with intended configuration file
  command: "diff /tmp/{{ inventory_hostname }}_temp.conf {{ intended_conf_dir }}/{{ inventory_hostname }}_intended.conf"
  register: diff_output
  ignore_errors: yes
  changed_when: false
  delegate_to: localhost

- name: 4 - Show delta of intended state vs actual State
  debug:
    msg: "{{ diff_output.stdout_lines }}"
  when: diff_output.stdout != ""
  delegate_to: localhost

- name: 5 - Cleanup temp actual configuration file
  ansible.builtin.file:
    path: "{{ actual_temp_file.dest }}"
    state: absent
  delegate_to: localhost
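If it helps to see the filter-then-compare logic outside of Ansible, here is a small Python sketch of the same idea. It is not what the role runs: the role shells out to diff, whereas this swaps in Python's difflib, so the output is in unified format rather than the format shown in the playbook output. The function names and the sample configs are made up for illustration.

```python
import difflib
import re

# Same pattern as exclusion_pattern in roles/COMPARE_STATES/vars/main.yml.
EXCLUSION_PATTERN = re.compile(r"(^! Command:)|(^! device:)")


def strip_excluded(config_text, pattern=EXCLUSION_PATTERN):
    """Drop every line that matches the exclusion pattern."""
    return [line for line in config_text.splitlines() if not pattern.search(line)]


def config_delta(actual_text, intended_text):
    """Return unified-diff lines between the filtered actual config and the intended one."""
    actual = strip_excluded(actual_text)
    intended = intended_text.splitlines()
    return list(
        difflib.unified_diff(
            actual, intended, fromfile="actual", tofile="intended", lineterm=""
        )
    )


# Hypothetical, heavily shortened configs for demonstration only.
actual = "! Command: show running-config\nvlan 100\nvlan 200\n"
intended = "vlan 100\nvlan 200\nvlan 300\n"

for line in config_delta(actual, intended):
    print(line)
```

An empty list from config_delta means the device matches its intended state once the excluded lines are ignored.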
Putting it All Together
OK, if you’ve made it this far you’re obviously interested in how it all plays out, so let’s take a look! For demo purposes I have deleted VLAN 300 from the running configuration of ceos-sw-3:
ceos-sw-3(config)#no vlan 300
ceos-sw-3(config)#sh run | sec vlan
vlan 100
   name Data
vlan 200
   name Voice
We have a master playbook called compare_intended_vs_actual.yml, which calls the roles we have outlined above and targets the devices at the ContainerLab site:
# compare_intended_vs_actual.yml
---
- name: Compare intended state in NetBox to actual device state and show delta
  hosts: sites_container_lab
  gather_facts: false
  connection: network_cli
  roles:
    - role: ACTUAL_STATE
    - role: INTENDED_STATE
    - role: COMPARE_STATES
We’ll run this with the command ansible-playbook compare_intended_vs_actual.yml. The full output is quite long as there are a lot of tasks being run, so I will limit the output to just the interesting parts. The action starts at task 3 in the COMPARE_STATES role, which is reported as failed for device ceos-sw-3: diff exits with a non-zero return code when the two files differ, and because ignore_errors is set the failure is ignored and the run continues. The next task then displays the missing part of the configuration on ceos-sw-3 in the console output:
<output shortened for brevity>
TASK [COMPARE_STATES : 3 - Diff compare temp actual configuration file with intended configuration file] ****************************************
ok: [ceos-sw-2 -> localhost]
ok: [ceos-sw-1 -> localhost]
fatal: [ceos-sw-3 -> localhost]: FAILED! => changed=false
  cmd:
  - diff
  - /tmp/ceos-sw-3_temp.conf
  - ./intended_configs/ceos-sw-3/ceos-sw-3_intended.conf
  delta: '0:00:00.003368'
  end: '2024-03-19 14:34:59.768706'
  msg: non-zero return code
  rc: 1
  start: '2024-03-19 14:34:59.765338'
  stderr: ''
  stderr_lines: <omitted>
  stdout: |-
    19a20,22
    > vlan 300
    > name DMZ
    > !
  stdout_lines: <omitted>
...ignoring

TASK [COMPARE_STATES : 4 - Show delta of intended state vs actual State] ************************************************************************
skipping: [ceos-sw-1]
skipping: [ceos-sw-2]
ok: [ceos-sw-3 -> localhost] =>
  msg:
  - 19a20,22
  - '> vlan 300'
  - '> name DMZ'
  - '> !'
So you can now clearly see that the actual state of device ceos-sw-3 does not match the intended state as defined by NetBox, and we can now do something about that – ideally using automation tools to push out the intended state to the device!
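If the 19a20,22 line in the output looks cryptic: in diff's default output format it means that after line 19 of the first file (the filtered actual config), lines 20 to 22 of the second file (the intended config) would need to be added to make the two match. The letter can also be c for a change or d for a delete. Should you want to post-process the delta, for example to raise a ticket, a small parser along these lines could be a starting point. The parse_hunk_header name and the returned dictionary layout are my own invention for illustration:

```python
import re

# Matches headers such as "19a20,22", "5c5" or "7,9d6".
HUNK_HEADER = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")
ACTIONS = {"a": "add", "c": "change", "d": "delete"}


def parse_hunk_header(header):
    """Split a default-format diff header into an action and two line ranges."""
    match = HUNK_HEADER.match(header)
    if match is None:
        raise ValueError(f"not a diff hunk header: {header!r}")
    left_start, left_end, action, right_start, right_end = match.groups()
    return {
        "action": ACTIONS[action],
        # A single line number is treated as a one-line range.
        "left_lines": (int(left_start), int(left_end or left_start)),
        "right_lines": (int(right_start), int(right_end or right_start)),
    }


print(parse_hunk_header("19a20,22"))
```

Lines that are not hunk headers, such as the "> vlan 300" content lines, raise a ValueError, so a caller can use that to tell headers and content apart.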
NetBox and Ansible Resources for Configuration Assurance
So, I hope that this has been a useful overview of how you can build a simple network configuration assurance workflow using NetBox and Ansible. You can try this out yourself as all the code (including the containerlab definition file) is in the accompanying Git Repository.
Also, if you would like to watch the full webinar that this blog post is based on then it is available to watch on-demand, and if you are interested in receiving tutorials like this in your inbox, subscribe to our community newsletter.