
Wikimedia Foundation: From Fragmented Documentation to a Unified Source of Truth

Company name
Wikimedia Foundation
Industry
Technology
Products used
NetBox
“We have colleagues across teams who contribute to and benefit from our NetBox implementation. Team members interested in networking can help out even when the core network team isn’t available.”
Arzhel Younsi
Staff network SRE

Overview

The Wikimedia Foundation, the nonprofit organization behind Wikipedia and other knowledge projects, faced significant challenges in managing their complex infrastructure spanning multiple data centers worldwide. With thousands of servers, hundreds of network devices, and virtual machines across global locations, the Foundation needed a solution that could provide visibility and control over their diverse technology landscape.

The solution came through a dedicated team within Wikimedia’s Site Reliability Engineering (SRE) department who implemented NetBox as their core infrastructure documentation and automation platform. What began as a replacement for disconnected systems evolved into a sophisticated automation foundation that now serves as the unified system of record for Wikimedia’s entire infrastructure.

Background

Wikimedia operates a global network infrastructure that includes data centers in the US, caching sites across multiple continents, and networking equipment from various vendors. Their environment runs primarily on Debian Linux with virtualization platforms including Ganeti and Kubernetes.

Before NetBox, Wikimedia relied on disconnected systems including RackTables for data center management and flat DNS files in Git repositories for IP address management. This fragmented approach created information silos, making it difficult to maintain accurate documentation and automate network operations.

Arzhel Younsi, a member of Wikimedia’s SRE Infrastructure Foundations team, recalls the pre-NetBox environment: “We were managing IP addresses with flat DNS files in Git repositories, with all the limitations you can imagine – inability to visualize IP space, frequent typos, and forgotten records when servers were decommissioned.”

The NetBox Journey: Building a Better Foundation

Wikimedia’s journey with NetBox began in 2017 when they started evaluating it as a replacement for their existing systems. The team was drawn to NetBox’s comprehensive capabilities, clean user interface, and most importantly, its open-source nature, which aligned perfectly with Wikimedia’s commitment to open-source solutions.

“It seemed to check all the boxes,” explains Younsi. “The UI was nice, and the data model was structured yet flexible. It included an API, which was a huge requirement for moving forward with automation.”

Cathal Mooney, who joined the Foundation later, had already experienced NetBox’s benefits in previous roles: “It seemed like a great tool – streets ahead of any IPAM system or data center inventory system I’d used before, and it combined both functionalities. I was instantly sold.”

The Automation Revolution: Beyond Documentation

While NetBox initially served as a documentation platform, the real transformation came as Wikimedia integrated it into their automation workflow. Riccardo Coccioli, Staff Site Reliability Engineer at Wikimedia, led the development of “Homer,” an open-source network configuration manager that pulls data from NetBox and uses it to generate configurations for Juniper devices.

“We needed something that could get data from NetBox and actually configure our Juniper devices,” explains Coccioli. “With NetBox as our source of truth, we could drive our infrastructure configuration from a central, reliable database.”

Homer represented a significant shift in how Wikimedia approached network management. Instead of manually configuring devices or pulling data from devices into documentation, they were now pushing configurations from NetBox outward to the network.

This shift to NetBox-driven automation was transformative, as Mooney explains: “We flipped the automation so that NetBox drives the configuration on our devices. Now if you want to change an IP address on an interface, you change it in NetBox and run the automation.”
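The idea of a source of truth driving device configuration can be illustrated with a minimal sketch. It is not Homer's implementation: the record layout (`name`, `description`, `addresses`) and the function `render_interface_config` are hypothetical, and the records are hard-coded here where a real workflow would fetch them from the NetBox API. The output uses Junos-style `set` statements.

```python
import ipaddress


def render_interface_config(interfaces):
    """Render Junos-style 'set' lines from interface records.

    The record shape is an assumption made for this sketch; it is not
    the schema returned by NetBox or consumed by Homer.
    """
    lines = []
    for intf in sorted(interfaces, key=lambda i: i["name"]):
        name = intf["name"]
        if intf.get("description"):
            lines.append(f'set interfaces {name} description "{intf["description"]}"')
        for addr in intf.get("addresses", []):
            # ip_interface validates the address and keeps the prefix length.
            ip = ipaddress.ip_interface(addr)
            family = "inet" if ip.version == 4 else "inet6"
            lines.append(f"set interfaces {name} unit 0 family {family} address {ip}")
    return lines


# Hypothetical records standing in for data held in the source of truth.
records = [
    {
        "name": "xe-0/0/1",
        "description": "server-a",
        "addresses": ["192.0.2.1/31", "2001:db8::1/64"],
    },
]

for line in render_interface_config(records):
    print(line)
```

Changing an address then means editing the record and re-rendering, rather than editing the device by hand.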

Unleashing Operational Excellence: The Benefits

Wikimedia’s NetBox implementation has delivered substantial benefits across multiple dimensions of their operations.

A Single Source of Truth

Perhaps the most significant impact has been establishing NetBox as the definitive source of truth for the infrastructure. The team established a clear principle from the beginning: only add data to NetBox that would drive infrastructure configuration and that they could keep accurate and up to date.

“One of the driving principles was deciding what data to move to NetBox,” Coccioli shares. “NetBox is really powerful and can store a lot of data types, but we only wanted to include data that would actually serve as a source of truth and drive our infrastructure.”

Enhanced Reliability Through Validation

The team leverages NetBox’s validation capabilities to ensure data integrity, initially through reports and later through custom validators as the platform evolved.

“When we migrated the data, we created reports to identify invalid entries,” notes Younsi. “We later replaced these with custom validators to prevent invalid data from being entered in the first place.”
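The difference between the two approaches can be sketched in plain Python. This is a standalone illustration, not NetBox's report or custom validator API, and the two rules (lowercase hostnames, active devices must have a rack) are hypothetical examples rather than Wikimedia's actual rules. The same check is used once to audit existing data and once to reject bad data on entry.

```python
import re

# Hypothetical naming rule: lowercase letters, digits, and hyphens only.
HOSTNAME_RE = re.compile(r"^[a-z0-9][a-z0-9-]*$")


def validate_device(device):
    """Return a list of rule violations for one device record."""
    errors = []
    if not HOSTNAME_RE.match(device.get("name", "")):
        errors.append("name must be a lowercase hostname")
    if device.get("status") == "active" and not device.get("rack"):
        errors.append("active devices must be assigned to a rack")
    return errors


def report_invalid(devices):
    """Report style: scan data that already exists and list what is wrong."""
    findings = {}
    for device in devices:
        errors = validate_device(device)
        if errors:
            findings[device.get("name", "<unnamed>")] = errors
    return findings


def save_device(store, device):
    """Validator style: refuse to store a record that breaks a rule."""
    errors = validate_device(device)
    if errors:
        raise ValueError("; ".join(errors))
    store.append(device)


# Hypothetical migrated data: one record breaks the naming rule.
migrated = [
    {"name": "SW1", "status": "active", "rack": "A1"},
    {"name": "sw2", "status": "active", "rack": "A2"},
]
print(report_invalid(migrated))
```

A report finds problems after the fact, which suits a migration; a validator keeps new problems from being introduced once the data is clean.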

This validation approach has significantly reduced errors and increased confidence in the data, creating a virtuous cycle where NetBox becomes increasingly trusted and valuable.

Empowering Other Teams

The implementation has extended benefits far beyond the network team, enabling self-service capabilities for data center operations and other infrastructure teams.

“We wanted to make it easy for our data center teams to request servers and for the technicians to connect them without requiring manual switch port configuration,” explains Mooney. “NetBox has helped us create a streamlined handoff process for other SRE teams.”

Through NetBox, these once-manual processes have been streamlined and automated, allowing a small team to effectively manage a substantial infrastructure footprint.

Operational Efficiency

The automation enabled by NetBox has transformed daily operations, significantly reducing manual effort and minimizing errors.

“In the past, people had to manually configure switch ports when connecting new devices,” Younsi reflects. “Being a small team, we always had to carefully prioritize automation efforts, but NetBox has made this much more manageable.”

Now, with NetBox-driven automation, these routine tasks are handled efficiently and consistently, freeing the team to focus on more strategic initiatives.

Improved Cross-Team Collaboration

NetBox has improved collaboration across Wikimedia’s technology teams, serving as an integration point for multiple systems including Puppet, DNS management, and monitoring tools.

“Everyone is looking for automation,” Younsi notes, “and we have colleagues across teams who contribute to and benefit from our NetBox implementation. Team members interested in networking can help out even when the core network team isn’t available.”

This collaborative environment extends to troubleshooting and problem resolution, with Mooney observing: “This is the place with the least knee-jerk ‘blame the network’ reaction I’ve experienced. When someone suggests there might be a network issue, it’s usually because they’ve already eliminated other possibilities.”

The Road Ahead: Future Plans

Wikimedia continues to enhance and expand their NetBox implementation, with several initiatives in progress or planned:

Staying Current with NetBox Evolution

The team looks forward to upgrading to newer NetBox versions to take advantage of enhanced features like L2 circuit functionality.

Expanding the Automation Scope

They’re exploring gNMI (gRPC Network Management Interface) for more granular network device configuration, moving beyond full-configuration generation to atomic, targeted changes.

“We’re looking at gNMI to interact more directly and specifically with device configurations,” explains Coccioli.
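The contrast between full-configuration generation and targeted changes can be sketched as a diff over path/value pairs. This is an illustration only: it does not use a gNMI client library, the function `plan_changes` is hypothetical, and the OpenConfig-style paths are made up for the example. The two results loosely mirror the update and delete lists that a gNMI Set request carries.

```python
def plan_changes(current, desired):
    """Compute the minimal set of changes between two flat path/value maps.

    Returns (updates, deletes): paths whose value must be set, and paths
    that exist on the device but are absent from the desired state.
    """
    updates = {
        path: value
        for path, value in desired.items()
        if current.get(path) != value
    }
    deletes = sorted(path for path in current if path not in desired)
    return updates, deletes


# Hypothetical device state and desired state from the source of truth.
current = {
    "interfaces/interface[name=xe-0/0/1]/config/description": "server-a",
    "interfaces/interface[name=xe-0/0/1]/config/mtu": 1500,
    "interfaces/interface[name=xe-0/0/2]/config/description": "old-host",
}
desired = {
    "interfaces/interface[name=xe-0/0/1]/config/description": "server-a",
    "interfaces/interface[name=xe-0/0/1]/config/mtu": 9000,
}

updates, deletes = plan_changes(current, desired)
print(updates)
print(deletes)
```

Only the paths that differ are touched, which is the appeal of targeted changes over regenerating and pushing a whole configuration.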

Deepening Infrastructure Integration

The team plans to drive more infrastructure elements from NetBox, particularly server network configurations that are currently managed through other systems.

“Our next stage is to drive server network configurations from NetBox,” Coccioli shares. “Currently this is managed by Puppet, and while we export some data from NetBox to Puppet, we want to make NetBox the primary driver for all network configurations.”

Conclusion

Wikimedia’s NetBox implementation demonstrates how a small, dedicated team can leverage the platform to transform infrastructure management for a global organization. By establishing NetBox as their authoritative source of truth and building automation around it, they’ve significantly enhanced operational efficiency, improved data quality, and empowered teams across the organization.

As Mooney summarizes the transformation: “If data in NetBox is incorrect, the network will be configured incorrectly and things won’t work. Similarly, if someone makes manual changes directly on a device, those changes will be overwritten when automation runs. NetBox enables this powerful automation journey.”

NetBox began as a replacement for legacy systems and has become the cornerstone of Wikimedia’s infrastructure management strategy. It has proven its value by improving visibility, increasing operational efficiency, and providing a foundation for ongoing automation initiatives that keep Wikipedia and other Wikimedia projects running smoothly for users worldwide.