Enterprise

Monitoring

The nbe-operator exposes Prometheus metrics and health check endpoints for observability.

Operator Metrics

The operator serves metrics in Prometheus format on port 8080 at the /metrics path.

Prometheus Annotations

When metrics.enabled is true (the default), the following Prometheus scrape annotations are added to the operator pod:

prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"

Key	Type	Default	Description
`metrics.enabled`	bool	`true`	Enable Prometheus annotations on the operator pod
`metrics.podAnnotations`	bool	`true`	Add standard `prometheus.io/*` annotations
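The prometheus.io/* annotations above are a convention, not something Prometheus acts on by itself: a scrape job has to read them through relabeling. The following is a minimal sketch of such a job for a self-managed Prometheus. The job name is hypothetical, and the fragment assumes Prometheus runs in-cluster with permission to list pods; adapt it to your own configuration.

```yaml
# prometheus.yml fragment (illustrative; job name is a placeholder)
scrape_configs:
  - job_name: "annotated-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use the prometheus.io/path annotation as the metrics path
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: '(.+)'
      # Rewrite the target address to use the prometheus.io/port annotation
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: '$1:$2'
        target_label: __address__
```

Many community Helm charts for Prometheus ship an equivalent job by default, in which case no extra configuration is needed.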

ServiceMonitor

For clusters using the Prometheus Operator, enable the ServiceMonitor in your Helm values:

serviceMonitor:
  enabled: true
  interval: "30s"
  scrapeTimeout: "10s"

Full ServiceMonitor configuration:

Key	Type	Default	Description
`serviceMonitor.enabled`	bool	`false`	Create a ServiceMonitor resource
`serviceMonitor.namespace`	string	Release namespace	Target namespace
`serviceMonitor.labels`	object	`{}`	Labels for ServiceMonitor selection
`serviceMonitor.interval`	string	`30s`	Scrape interval
`serviceMonitor.scrapeTimeout`	string	`10s`	Scrape timeout
`serviceMonitor.scheme`	string	`http`	HTTP scheme
`serviceMonitor.honorLabels`	bool	`true`	Honor labels from metrics
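A common reason a ServiceMonitor is never scraped is that the Prometheus resource does not select it: the Prometheus Operator only picks up ServiceMonitors that match its serviceMonitorSelector and serviceMonitorNamespaceSelector. The sketch below uses only the keys from the table above. The namespace and the label value are hypothetical and must be replaced with whatever your Prometheus instance selects on.

```yaml
serviceMonitor:
  enabled: true
  # Hypothetical namespace; defaults to the release namespace when omitted
  namespace: "monitoring"
  labels:
    # Hypothetical label; must match your Prometheus serviceMonitorSelector
    release: "kube-prometheus-stack"
  interval: "30s"
  scrapeTimeout: "10s"
  scheme: "http"
  honorLabels: true
```

Keep scrapeTimeout at or below interval; Prometheus rejects a scrape timeout that is longer than the scrape interval.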

NetBox Application Metrics

Enable the NetBox /metrics endpoint separately via the NetBoxEnterprise spec:

netboxEnterprise:
  spec:
    netbox:
      config:
        metricsEnabled: true

The operator aggregates metrics from NetBox and Diode deployments and exposes them at its own /metrics endpoint, so Prometheus only needs to scrape the operator.

Note: Deployment metrics aggregation (collecting metrics from NetBox and Diode pods) is a new feature. Contact NetBox Labs support for early access.

Diode Component Metrics

Enable per-component metrics with telemetry configuration:

netboxEnterprise:
  spec:
    diode:
      config:
        ingester:
          telemetryConfig:
            metricsEnabled: true
        reconciler:
          telemetryConfig:
            metricsEnabled: true
        auth:
          telemetryConfig:
            metricsEnabled: true

Supported exporters: prometheus, otlp, console, none.

Health Check Endpoints

The operator exposes two health endpoints:

Endpoint	Port	Purpose
`/healthz`	8081	Liveness probe — is the operator process alive?
`/readyz`	8081	Readiness probe — is the operator ready to serve?

These are configured as Kubernetes liveness and readiness probes on the operator pod.
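For orientation, probes against these endpoints take roughly the following shape in a container spec. This is an illustrative sketch built from the paths and port in the table above, not a copy of the chart's rendered manifest; timing fields such as initialDelaySeconds and periodSeconds are omitted because their actual values are not documented here.

```yaml
# Illustrative container probe settings (standard Kubernetes httpGet probes)
livenessProbe:
  httpGet:
    path: /healthz
    port: 8081
readinessProbe:
  httpGet:
    path: /readyz
    port: 8081
```

A failing liveness probe causes the kubelet to restart the container, while a failing readiness probe only marks the pod as not ready.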

Operator Log Levels

Adjust operator verbosity for debugging:

operator:
  logging:
    level: "debug"     # Or: "info", "info,kube=warn", "operator=debug,info"
    format: "json"     # Or: auto, compact, pretty, gcp, aws, otlp

Format	Use Case
`auto`	Detects environment (JSON in Kubernetes, compact locally)
`json`	Structured logging for log aggregation (Elasticsearch, Loki)
`compact`	Single-line human-readable for local development
`pretty`	Multi-line verbose for debugging
`gcp`	Google Cloud Logging format (auto-ingested in GKE)
`aws`	CloudWatch-optimized JSON (auto-ingested in EKS)
`otlp`	OpenTelemetry export for Azure Monitor, Jaeger, etc.
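As one possible combination of the level and format options above, the sketch below targets a cluster that ships logs to an aggregator. It assumes the comma-separated directives behave as their form suggests (a default level plus per-target overrides written as target=level); confirm this against the operator's own reference before relying on it.

```yaml
operator:
  logging:
    # Assumed meaning: default level info, with the "kube" target limited to warn
    level: "info,kube=warn"
    # Structured output for log aggregation (Elasticsearch, Loki)
    format: "json"
```

When debugging, switching level to "operator=debug,info" would presumably raise verbosity for the operator's own target while leaving everything else at info.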

Next Steps

Troubleshooting — Using metrics and logs to diagnose issues
Helm Values Reference — Full operator configuration
