Is a Network Source of Truth Essential for Automation?

Move fast and try not to break things

At NetBox Labs we often share stories about how our users are succeeding with NetBox in their network automation initiatives, and at the same time we also talk about how teams often need to start with small automation experiments to get buy-in and then expand their network automation strategies after initial wins. Are these stories contradictory? Does including NetBox from the outset slow down the early wins? Let’s explore a few examples of small automation projects that teams might use to prove the value of network automation safely.

Automated Configuration Backups: A script that periodically logs into network devices and backs up their configurations, ensuring that you always have the latest configuration files saved without manual intervention.

Generating Device Configurations: A simple tool to generate device configurations using Ansible/Nornir, a config template and some variables in a YAML file. This allows engineers to view the “end state” for desired configuration changes and paves the way for future automated configuration deployments.

Generating Monitoring Configurations: Generate configuration files for a network monitoring tool like Nagios or Zabbix based on the current network inventory. This ensures that the monitoring system is always up-to-date with the latest network topology, reducing manual effort and improving monitoring accuracy.

These are all great examples of small automation projects that teams could invest in to show how automation can reduce repetitive tasks and manual errors, and none of them strictly need a source of truth like NetBox to work. After getting some initial wins under their belts, teams are often excited to start expanding these use cases, and with good reason: these fledgling network automation initiatives can often save networking teams a lot of time.
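To make the second project above concrete, here is a minimal sketch of config generation from a template plus per-device variables. It makes some assumptions: it uses Python's built-in `string.Template` rather than Ansible or Nornir, the variables live in a plain list instead of a YAML file, and the hostnames, IPs and template lines are invented for illustration.

```python
from string import Template

# Hypothetical template. A real project would more likely render Jinja2
# templates through Ansible or Nornir.
CONFIG_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface $mgmt_interface\n"
    " ip address $mgmt_ip $mgmt_mask\n"
    " no shutdown\n"
)

# Stand-in for the variables that would normally be loaded from a YAML file.
DEVICES = [
    {"hostname": "edge-sw-01", "mgmt_interface": "Vlan10",
     "mgmt_ip": "10.0.10.11", "mgmt_mask": "255.255.255.0"},
    {"hostname": "edge-sw-02", "mgmt_interface": "Vlan10",
     "mgmt_ip": "10.0.10.12", "mgmt_mask": "255.255.255.0"},
]


def render_configs(devices):
    """Return a mapping of hostname -> rendered configuration text."""
    return {d["hostname"]: CONFIG_TEMPLATE.substitute(d) for d in devices}


if __name__ == "__main__":
    for hostname, config in render_configs(DEVICES).items():
        print(f"! ---- {hostname} ----")
        print(config)
```

Notice that even this tiny script already carries its own list of device names and IPs, which matters in the next section.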

Network Data Sharing and Compatibility

As these initial wins turn into increased focus and investment, a type of sprawl starts to occur. Take the examples above: the automated config backups are probably associated with a device name and IP, the device configs will need various variables including hostnames and IPs, and the automated creation of monitoring configurations needs to associate the types of monitoring to be performed with the device name and its IP address.

Already here we see that these separate tools have started to introduce their own mini data models which overlap, and therefore need to be kept in sync through some form of integration. If these fledgling automation scripts are then to be used together to drive additional workflows, you now also have to start thinking about normalizing that data into a single data model. This challenge is very common, whether in network automation initiatives or in day-to-day network operations.
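Here is a rough sketch of what that overlap can look like. The three record shapes below are hypothetical stand-ins for how a backup script, a config generator and a monitoring generator might each store the same devices; none of them comes from a real tool. The code folds them into one view and flags devices where the tools disagree.

```python
# Hypothetical records, shaped the way each tool might store them.
backup_inventory = {"edge-sw-01": "10.0.10.11", "edge-sw-02": "10.0.10.12"}
config_vars = {
    "edge-sw-01": {"mgmt_ip": "10.0.10.11"},
    "edge-sw-02": {"mgmt_ip": "10.0.10.22"},  # typo introduced by hand
}
monitoring_hosts = [{"host_name": "edge-sw-01", "address": "10.0.10.11"}]


def normalize(backup, config, monitoring):
    """Fold each tool's view into one mapping: hostname -> {tool: ip}."""
    merged = {}
    for name, ip in backup.items():
        merged.setdefault(name, {})["backup"] = ip
    for name, variables in config.items():
        merged.setdefault(name, {})["config"] = variables["mgmt_ip"]
    for host in monitoring:
        merged.setdefault(host["host_name"], {})["monitoring"] = host["address"]
    return merged


def find_drift(merged, tools=("backup", "config", "monitoring")):
    """Return devices that a tool is missing or whose IPs disagree."""
    drift = {}
    for name, views in merged.items():
        missing = [tool for tool in tools if tool not in views]
        if missing or len(set(views.values())) > 1:
            drift[name] = {"views": views, "missing": missing}
    return drift


if __name__ == "__main__":
    merged = normalize(backup_inventory, config_vars, monitoring_hosts)
    for name, detail in find_drift(merged).items():
        print(name, detail)
```

The `normalize` function is the interesting part: it is a small, hand-written integration, and every new tool or new field means more code like it.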

Network Automation Growing Pains

As the number of scripts, tools or software systems starts to grow, any workflows that need to interact with them need to speak whatever language, or data model, each tool they touch speaks. The maximum number of integrations required to connect all tools grows quadratically with the number of tools involved, which is just math talk for “avoid if possible”. If you’re just integrating 2 tools, that’s just one integration you need to worry about, but once you get to 5 tools, you now have 10 integrations, and it just gets worse from there.
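The arithmetic behind those numbers is the count of distinct pairs among n tools, n(n-1)/2. A quick sketch, with tool names made up purely for illustration:

```python
from itertools import combinations


def max_pairwise_integrations(tool_count):
    """Maximum number of point-to-point integrations among tool_count tools."""
    return tool_count * (tool_count - 1) // 2


# Hypothetical tool names, used only to enumerate the pairs.
tools = ["backups", "config-gen", "monitoring", "ticketing", "ipam"]
pairs = list(combinations(tools, 2))

if __name__ == "__main__":
    for n in (2, 5, 10):
        print(f"{n} tools -> up to {max_pairwise_integrations(n)} integrations")
    print(f"Pairs for {len(tools)} tools: {len(pairs)}")
```

Two tools give 1 pair and five give 10, matching the figures above; ten tools would give 45.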

Without a plan for integrating network automation initiatives teams can become overwhelmed by interdependencies

That ain’t good, as you need to maintain all these integrations, and now teams need to start communicating with each other about how their tools will change over time so that the system doesn’t break through divergence. And this assumes that we just maintain a one to one compatibility matrix between tools where each tool supports one version of the other. If we want to support multiple versions of the integrated tools to avoid backwards compatibility issues, the integration overhead becomes unmanageable even quicker.

This leaves organizations with a difficult decision to make: prohibitive maintenance costs or disjointed data and frustrating error-prone workflows. Most choose the latter, which is often referred to as “Swivel Chair Operations.”

Reducing Swivel Chair Operations

Even if you haven’t heard the term before, you’re probably familiar with the idea of swivel chair operations, in which network engineers and operations teams must switch between different tools and systems to complete their tasks. Instead of having an integrated, seamless workflow, teams must physically or digitally “swivel” from one system to another, entering and re-entering data, translating formats, and ensuring consistency across disparate platforms.

This approach is labor-intensive, error-prone, and inefficient: everything we seek to reduce in our network automation efforts, and this is exactly where NetBox fits in. By acting as a central data model through which many networking tools can integrate, NetBox reduces the integration burden significantly.

Introducing NetBox as an “integration hub” can significantly reduce the integration overhead
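One way to picture the hub pattern: if every tool only translates to and from one shared model, five tools need five integrations rather than up to ten pairwise ones. The sketch below illustrates the idea only. The `Device` class and the three adapter functions are hypothetical, and nothing here calls the NetBox API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical shared data model held by the hub."""
    name: str
    primary_ip: str


# One adapter per tool: each translates the hub's model into that tool's format.
def to_backup_target(device):
    return f"{device.name},{device.primary_ip}"


def to_template_vars(device):
    return {"hostname": device.name, "mgmt_ip": device.primary_ip}


def to_monitoring_host(device):
    return {"host_name": device.name, "address": device.primary_ip}


ADAPTERS = {
    "backup": to_backup_target,
    "config": to_template_vars,
    "monitoring": to_monitoring_host,
}


def export_all(devices):
    """Render the hub's device list once per tool, using that tool's adapter."""
    return {tool: [adapter(d) for d in devices] for tool, adapter in ADAPTERS.items()}


if __name__ == "__main__":
    hub_devices = [Device("edge-sw-01", "10.0.10.11")]
    for tool, records in export_all(hub_devices).items():
        print(tool, records)
```

Adding a fourth tool means writing one more adapter, and the existing three do not change.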

In February I wrote about the NetBox Labs Big Tent philosophy.

“Core to this philosophy is our focus on accessibility and extensibility, through our open source approach, robust APIs, and our plugin framework. Strategically, NetBox sits at the center of modern network automation architectures, and we are in the business of giving network operators options by working well with other companies in the space and not choosing favorites. Simply put, you are best positioned to decide which combinations of tooling will work for your business- it’s our job to make sure you can use that hard-won knowledge when using NetBox Cloud.”

That’s because at NetBox Labs we’re focussing heavily on reducing the integration costs for your network initiatives. Just this year we’ve announced integrations with Forward Networks, IPFabric, Kentik, Lightyear.ai, Netpicker, ServiceNow, SlurpIT, Splunk, and SuzieQ on top of the integrations that NetBox already enjoyed with Ansible, Nornir, Terraform, Zabbix, StackStorm, Salt, Itential, Icinga, and many many more. This is the power of the NetBox community; it creates a self-reinforcing cycle of integrations so that Googling “NetBox” and the name of any networking tool almost always yields results. Try it!

Outside of our ever expanding list of partner integrations we’ve also kicked off programs to help reinforce this cycle further. Our $100k NetBox Community Fund set aside $20k for our Plugin Bounty Program to support community members who implement plugins from our community driven Plugin Ideas Board, expanding the NetBox data model, integrating with other tools, and adding additional functionality. Our Plugin Certification Program works to make sure that end users can discover and trust NetBox integrations, and Diode, which lowers the barriers for integrating with NetBox, has just entered Private Preview.

NetBox as a Single Pane of Glass

Not a month goes by without me seeing some poor community member being corrected for asking a perfectly reasonable question. It normally goes something like this:

Q: Hey! I’d like to import a ton of data into NetBox from my company’s network. Do you have any suggestions for how I could achieve this?

A: NetBox should only hold authoritative data that has been vetted by a human. You’re doing it wrong!

We appreciate all the wonderful community members who participate in our public forums, but I’d like to use this opportunity to dispel this myth: Using NetBox as a single pane of glass view (sometimes called “Observed State Mode”) of the network that is automatically updated by other systems is a perfectly valid use case that we often see deployed successfully by companies of all sizes. So why the strong opinions?

It seems that the confusion here has spread because of a misunderstanding of how usage of NetBox changes as companies evolve their approaches. Some companies jump straight to intent based networking where NetBox holds the intended state of the network and is used to drive change in the network. Some companies, for the reasons I mentioned above, start off by using NetBox as a common integration point to create a holistic view of what many different systems believe the network is while reducing their integration overheads. Often, companies that start using NetBox as a Single Pane of Glass view transition to using it as the intended state of the network. These approaches are complementary, not contradictory.

NetBox as the intended state of the network

So what causes companies that didn’t go straight to using NetBox as the intended state of the network to later decide to make that transition? Let’s look at a few examples.

As we saw in our recent blog about how NetBox fits into the Design stage of the network lifecycle, teams looking to improve their throughput of successful changes to the network often seek to “shift left” their design into NetBox. In this scenario, if NetBox is constantly and automatically updated to match the data in the network and other systems, it becomes impossible to do any design in NetBox, because the current state is always changing.

Imagine trying to map out various scenarios in NetBox to explore design options. This requires that we’re able to iterate on a model of the network which isn’t tracking the actual network as it evolves. Some might say “well if the current state of the network is always changing I want to know that as I’m designing” and while that seems reasonable at first glance, imagine if your High Level and Low Level Design Documents were constantly changing while you were working on designing a new network upgrade. It would be chaos.

Following that example a little further, let’s now imagine that you’ve finished your design and you want to pass it to your Build teams. This is a common pattern, as I covered in the recent Build and Deploy stage blog, in which organizations want to use the updated state in NetBox to generate instructions like cabling plans. These plans require that we understand both the desired state and the current state of the network, so if NetBox is always updated to the current state of the network, we lose this ability.
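A small sketch shows why both states are needed. It assumes cables can be represented as pairs of made-up `device:interface` endpoint strings. The data is invented and does not reflect how NetBox itself models cables.

```python
def cable(a, b):
    """Normalize a cable so that (A, B) and (B, A) compare equal."""
    return tuple(sorted((a, b)))


# Hypothetical current (as-built) and desired (as-designed) cabling.
current = {
    cable("sw1:eth1", "rtr1:ge0"),
    cable("sw1:eth2", "srv1:nic0"),
}
desired = {
    cable("sw1:eth1", "rtr1:ge0"),
    cable("sw1:eth2", "srv2:nic0"),
    cable("sw2:eth1", "rtr1:ge1"),
}


def cabling_plan(current, desired):
    """Work out which cables to add and which to remove."""
    return {
        "connect": sorted(desired - current),
        "disconnect": sorted(current - desired),
    }


if __name__ == "__main__":
    plan = cabling_plan(current, desired)
    for action, cables in plan.items():
        for a, b in cables:
            print(f"{action}: {a} <-> {b}")
```

If `desired` were continually overwritten with `current`, both lists would always come out empty and there would be no plan to hand over.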

Following the example further still, now imagine that we want to deploy updated device configurations to the network, which is covered in some detail in the recent Operate stage blog. From which source of data are we generating these updated configs if NetBox always matches the current state of the network?

Staying with the Operate stage of the network lifecycle, imagine you get an alert from your monitoring. You likely want to know both how the network should be configured and also how the network actually is configured. In Observed State Mode, where NetBox is automatically synced to the network, where would you go to understand what the network should be doing? A written High Level Design Document? Is that likely to be up-to-date?
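As a rough illustration of that “should be” versus “actually is” comparison, the sketch below checks one device’s interfaces against an intended-state record. The device name, interface names and attributes are hypothetical; in practice the intended side might come from a source of truth and the observed side from the network or a discovery tool.

```python
# Hypothetical intended state (what the design says).
intended = {
    "edge-sw-01": {
        "Gi0/1": {"enabled": True, "vlan": 10},
        "Gi0/2": {"enabled": True, "vlan": 20},
    }
}

# Hypothetical observed state (what was actually found on the device).
observed = {
    "edge-sw-01": {
        "Gi0/1": {"enabled": True, "vlan": 10},
        "Gi0/2": {"enabled": False, "vlan": 20},
    }
}


def interface_drift(intended, observed, device):
    """List (interface, attribute, intended_value, observed_value) mismatches."""
    findings = []
    want = intended.get(device, {})
    have = observed.get(device, {})
    for ifname, attrs in want.items():
        seen = have.get(ifname, {})
        for key, value in attrs.items():
            if seen.get(key) != value:
                findings.append((ifname, key, value, seen.get(key)))
    return findings


if __name__ == "__main__":
    for ifname, key, want, have in interface_drift(intended, observed, "edge-sw-01"):
        print(f"{ifname}: {key} should be {want!r} but is {have!r}")
```

The comparison only works because `intended` is kept separately from `observed`; with a single, always-synced record there is nothing to compare against.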

Summary

In this blog we’ve seen that you can definitely start your network automation initiatives without a Source of Truth like NetBox, and in fact this is a great way to get moving quickly to show early results so you can get buy-in for further investments. We saw how, if left unaddressed, burgeoning automation initiatives can find themselves facing a ton of maintenance work to keep their various systems integrated with each other, and how NetBox can reduce this burden and, with it, Swivel Chair Operations.

Lastly we looked at how using NetBox as a Single Pane of Glass for disparate systems to reduce swivel chair operations is a perfectly valid pattern that we see many companies adopt as a stepping stone to intent based network automation, and looked at some of the reasons why organizations make this step.

---

If you’re interested in more content like this check out our blog, our Network Automation Heroes playlist on YouTube, and subscribe to our newsletter to make sure you catch all the written deep dives to come.