Monitoring

The nbe-operator exposes Prometheus metrics and health check endpoints for observability.

Operator Metrics

The operator exposes metrics at :8080/metrics in Prometheus format.

Prometheus Annotations

When metrics.enabled: true (the default), Prometheus scrape annotations are added to the operator pod:

prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"

| Key | Type | Default | Description |
|---|---|---|---|
| metrics.enabled | bool | true | Enable Prometheus annotations on the operator pod |
| metrics.podAnnotations | bool | true | Add standard prometheus.io/* annotations |
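
The prometheus.io/* annotations are a convention, not something Prometheus acts on by itself; a scrape job has to be configured to honor them. The fragment below is a minimal sketch of such a job for a plain Prometheus server using Kubernetes pod discovery. The job name is illustrative, and the relabeling rules follow the commonly used annotation pattern rather than anything specific to the nbe-operator.

```yaml
scrape_configs:
  - job_name: kubernetes-pods            # illustrative name
    kubernetes_sd_configs:
      - role: pod                        # discover every pod Prometheus is allowed to see
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use the prometheus.io/path annotation as the metrics path
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: '(.+)'
      # Rewrite the scrape address to use the prometheus.io/port annotation
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: '$1:$2'
        target_label: __address__
```

With a job like this in place, the operator pod should be picked up at port 8080 and path /metrics without listing it as a static target.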

ServiceMonitor

For clusters using the Prometheus Operator, create a ServiceMonitor:

serviceMonitor:
  enabled: true
  interval: "30s"
  scrapeTimeout: "10s"

Full ServiceMonitor configuration:

| Key | Type | Default | Description |
|---|---|---|---|
| serviceMonitor.enabled | bool | false | Create a ServiceMonitor resource |
| serviceMonitor.namespace | string | Release namespace | Target namespace |
| serviceMonitor.labels | object | {} | Labels for ServiceMonitor selection |
| serviceMonitor.interval | string | 30s | Scrape interval |
| serviceMonitor.scrapeTimeout | string | 10s | Scrape timeout |
| serviceMonitor.scheme | string | http | HTTP scheme |
| serviceMonitor.honorLabels | bool | true | Honor labels from metrics |
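
A Prometheus Operator installation usually picks up only the ServiceMonitors whose labels match its serviceMonitorSelector, which is what serviceMonitor.labels is for. The values below are a sketch that sets every key from the table. The namespace and the release label are hypothetical, so substitute whatever your Prometheus instance actually selects on.

```yaml
serviceMonitor:
  enabled: true
  namespace: monitoring              # hypothetical; defaults to the release namespace
  labels:
    release: kube-prometheus-stack   # hypothetical; must match your serviceMonitorSelector
  interval: "30s"
  scrapeTimeout: "10s"               # keep this shorter than the interval
  scheme: http
  honorLabels: true
```

If the target does not appear in Prometheus after enabling this, a label mismatch between the ServiceMonitor and the selector is a likely first thing to check.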

NetBox Application Metrics

Enable the NetBox /metrics endpoint separately via the NetBoxEnterprise spec:

netboxEnterprise:
  spec:
    netbox:
      config:
        metricsEnabled: true

The operator aggregates metrics from NetBox and Diode deployments and exposes them at its own /metrics endpoint, so Prometheus only needs to scrape the operator.

Note: Deployment metrics aggregation (collecting metrics from NetBox and Diode pods) is a new feature. Contact NetBox Labs support for early access.

Diode Component Metrics

Enable per-component metrics with telemetry configuration:

netboxEnterprise:
  spec:
    diode:
      config:
        ingester:
          telemetryConfig:
            metricsEnabled: true
        reconciler:
          telemetryConfig:
            metricsEnabled: true
        auth:
          telemetryConfig:
            metricsEnabled: true

Supported exporters: prometheus, otlp, console, none.

Health Check Endpoints

The operator exposes two health endpoints:

| Endpoint | Port | Purpose |
|---|---|---|
| /healthz | 8081 | Liveness probe: is the operator process alive? |
| /readyz | 8081 | Readiness probe: is the operator ready to serve? |

These are configured as Kubernetes liveness and readiness probes on the operator pod.
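
For orientation, probes against these endpoints would look roughly like the container spec fragment below. This is a sketch of the standard Kubernetes probe syntax, not a copy of the operator's actual manifest. Only the paths and the port come from the table above, and the timing values are illustrative assumptions.

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8081
  periodSeconds: 20        # illustrative
readinessProbe:
  httpGet:
    path: /readyz
    port: 8081
  periodSeconds: 10        # illustrative
```

A failing liveness probe causes the kubelet to restart the container, while a failing readiness probe only marks the pod as not ready.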

Operator Log Levels

Adjust operator verbosity for debugging:

operator:
  logging:
    level: "debug"   # Or: "info", "info,kube=warn", "operator=debug,info"
    format: "json"   # Or: auto, compact, pretty, gcp, aws, otlp

| Format | Use Case |
|---|---|
| auto | Detects environment (JSON in Kubernetes, compact locally) |
| json | Structured logging for log aggregation (Elasticsearch, Loki) |
| compact | Single-line human-readable for local development |
| pretty | Multi-line verbose for debugging |
| gcp | Google Cloud Logging format (auto-ingested in GKE) |
| aws | CloudWatch-optimized JSON (auto-ingested in EKS) |
| otlp | OpenTelemetry export for Azure Monitor, Jaeger, etc. |
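
As one way to combine these options, the values below sketch a quieter setup for a cluster that ships logs to an aggregator. The reading of the level string is an assumption based on its shape: a bare level appears to set the default, and a target=level pair appears to override it for that component.

```yaml
operator:
  logging:
    level: "info,kube=warn"   # assumed meaning: info by default, warn for the "kube" target
    format: "json"            # structured output for Elasticsearch or Loki
```

Switching level to "operator=debug,info" while investigating an issue would, under the same assumption, raise verbosity for the operator's own target only.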
