NetBox Enterprise Helm - Advanced Configuration Examples

Beta Notice: These Helm charts are currently in beta. While stable for testing and development environments, please thoroughly test in your specific environment before production deployment. For the most up-to-date information, please refer to the main documentation.

This guide provides advanced configuration examples for NetBox Enterprise Helm deployments, including high availability, resource optimization, security hardening, and production-ready configurations.

Need the basics first? See Installation Guide for standard installation steps, or Prerequisites for system requirements.

High Availability Setup

Multi-Replica Configuration

Configure NetBox Enterprise for high availability with multiple replicas:

# values-ha.yaml
replicaCount: 3

netbox:
  replicas: 3
  resources:
    requests:
      memory: "2Gi"
      cpu: "1000m"
    limits:
      memory: "4Gi"
      cpu: "2000m"

worker:
  replicas: 2
  resources:
    requests:
      memory: "1Gi"
      cpu: "500m"
    limits:
      memory: "2Gi"
      cpu: "1000m"

# Use external managed databases for HA
postgresql:
  enabled: false # Use external managed database

redis:
  enabled: false # Use external managed Redis

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  hosts:
    - host: netbox.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: netbox-tls
      hosts:
        - netbox.example.com

Pod Disruption Budget

Ensure service availability during cluster maintenance:

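# Pod disruption budget
# Set either minAvailable or maxUnavailable, not both: Kubernetes rejects a
# PodDisruptionBudget that specifies both fields.
podDisruptionBudget:
  enabled: true
  minAvailable: 1
  # maxUnavailable: 50%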
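
For orientation, the values above would typically translate into a standard PodDisruptionBudget object like the sketch below. The object name is hypothetical, and the selector assumes the app: netbox-enterprise label used elsewhere in this guide; the object the chart actually renders may differ.

```yaml
# pdb.yaml (illustrative sketch, not taken from the chart)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: netbox-enterprise-pdb   # hypothetical name
spec:
  minAvailable: 1               # use this or maxUnavailable, never both
  selector:
    matchLabels:
      app: netbox-enterprise    # assumed pod label
```

With three healthy replicas and minAvailable: 1, a node drain may evict up to two pods at a time. kubectl get pdb shows the current figure in the ALLOWED DISRUPTIONS column.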

Node Affinity and Anti-Affinity

Distribute pods across nodes for better availability:

# values-affinity.yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - netbox-enterprise
          topologyKey: kubernetes.io/hostname

  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: node-role.kubernetes.io/worker
              operator: In
              values:
                - "true"
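
Anti-affinity on kubernetes.io/hostname spreads pods across nodes but not across availability zones. A possible complement is a topology spread constraint, sketched below as a raw pod spec fragment. Whether the chart exposes a value for this is not stated in this guide, so treat the placement of the fragment as an assumption.

```yaml
# Pod spec fragment (not chart values): spread replicas across zones.
topologySpreadConstraints:
  - maxSkew: 1                                # zones may differ by at most one pod
    topologyKey: topology.kubernetes.io/zone  # well-known zone label on nodes
    whenUnsatisfiable: ScheduleAnyway         # soft rule, like "preferred" affinity
    labelSelector:
      matchLabels:
        app: netbox-enterprise                # assumed pod label
```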

Resource Optimization

Production Resource Limits

Example resource requests and limits for each component; scale them to your deployment size:

# values-optimized.yaml
resources:
  netbox:
    requests:
      memory: "1Gi"
      cpu: "500m"
    limits:
      memory: "2Gi"
      cpu: "1000m"

  worker:
    requests:
      memory: "512Mi"
      cpu: "250m"
    limits:
      memory: "1Gi"
      cpu: "500m"

  postgresql:
    requests:
      memory: "512Mi"
      cpu: "250m"
    limits:
      memory: "1Gi"
      cpu: "500m"

  redis:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "512Mi"
      cpu: "200m"

Auto-scaling Configuration

Configure Horizontal Pod Autoscaler:

# values-autoscaling.yaml
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

# Vertical Pod Autoscaler
# Note: avoid running VPA in "Auto" mode together with an HPA that scales on
# CPU or memory for the same workload; the two controllers can work against
# each other.
verticalPodAutoscaler:
  enabled: true
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: netbox
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2
          memory: 4Gi
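
As a point of reference, the autoscaling values above correspond roughly to the following autoscaling/v2 HorizontalPodAutoscaler. This is a sketch: the target Deployment name netbox-enterprise is taken from the verification commands later in this guide, and the object the chart actually creates may be named or shaped differently.

```yaml
# hpa.yaml (illustrative sketch, not taken from the chart)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: netbox-enterprise          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: netbox-enterprise        # assumed Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the CPU request
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80   # percent of the memory request
```

Utilization targets are measured against container requests, not limits, so the pods need CPU and memory requests for these metrics to work.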

Resource Quotas

Set resource quotas for the namespace:

# resource-quota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: netbox-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "10"
    persistentvolumeclaims: "4"
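
Once a quota restricts requests and limits, Kubernetes rejects any pod in the namespace whose containers omit them. A LimitRange can supply defaults for such containers. The sketch below uses a hypothetical name and illustrative default values that are not taken from this guide.

```yaml
# limit-range.yaml (illustrative sketch; values are assumptions)
apiVersion: v1
kind: LimitRange
metadata:
  name: netbox-default-limits   # hypothetical name
spec:
  limits:
    - type: Container
      defaultRequest:           # applied when a container sets no requests
        cpu: 250m
        memory: 256Mi
      default:                  # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
```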

Security Hardening

Security Context

Configure security contexts for enhanced security:

# values-security.yaml
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 3000
  fsGroup: 2000
  seccompProfile:
    type: RuntimeDefault

containerSecurityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 1000
  capabilities:
    drop:
      - ALL
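
With readOnlyRootFilesystem: true, any path the application writes to at runtime needs its own writable volume. The pod spec fragment below sketches the usual pattern with an emptyDir mounted at /tmp. Which paths NetBox actually writes to, and whether the chart offers values for extra volumes, are not covered in this guide, so both are assumptions here.

```yaml
# Pod spec fragment (not chart values): writable scratch space
# for a container whose root filesystem is read-only.
containers:
  - name: netbox
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
      - name: tmp
        mountPath: /tmp        # assumed writable path
volumes:
  - name: tmp
    emptyDir: {}               # node-local scratch space, removed with the pod
```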

Network Policies

Implement network policies for traffic control:

# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: netbox-enterprise-network-policy
spec:
  podSelector:
    matchLabels:
      app: netbox-enterprise
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: nginx-ingress
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgresql
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
    - to: []
      ports:
        - protocol: TCP
          port: 443
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
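
A podSelector on its own only matches pods in the policy's own namespace. If the ingress controller runs in a separate namespace, which is common, the ingress rule also needs a namespaceSelector. The sketch below assumes an ingress-nginx installation in a namespace named ingress-nginx with its default app.kubernetes.io/name label; adjust both to your cluster.

```yaml
# network-policy-ingress.yaml (illustrative sketch)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: netbox-allow-ingress-controller   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: netbox-enterprise
  policyTypes:
    - Ingress
  ingress:
    - from:
        # namespaceSelector and podSelector in the same list item are ANDed:
        # only matching pods inside the matching namespace are allowed.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # assumed namespace
          podSelector:
            matchLabels:
              app.kubernetes.io/name: ingress-nginx        # assumed pod label
      ports:
        - protocol: TCP
          port: 8080
```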

Pod Security Standards

Configure Pod Security Standards:

# values-pod-security.yaml
podSecurityStandards:
  enforce: "restricted"
  audit: "restricted"
  warn: "restricted"

serviceAccount:
  create: true
  automountServiceAccountToken: false
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/netbox-service-role
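
In Kubernetes itself, Pod Security Standards are enforced per namespace through labels read by the built-in Pod Security admission controller. If you manage the namespace outside the chart, the equivalent configuration looks like the sketch below. The namespace name netbox is an assumption.

```yaml
# namespace.yaml (illustrative sketch)
apiVersion: v1
kind: Namespace
metadata:
  name: netbox   # hypothetical namespace name
  labels:
    pod-security.kubernetes.io/enforce: restricted   # reject violating pods
    pod-security.kubernetes.io/audit: restricted     # record violations in audit logs
    pod-security.kubernetes.io/warn: restricted      # warn the client on violations
```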

Performance Tuning

Database Connection Pooling

Configure database connection pooling:

# values-performance.yaml
netbox:
  config:
    DATABASE:
      CONN_MAX_AGE: 300
      CONN_MAX_CONNECTIONS: 100
      CONN_MIN_CONNECTIONS: 10

    REDIS:
      CONN_MAX_CONNECTIONS: 50
      CONN_CONNECTION_TIMEOUT: 30

Caching Configuration

Optimize caching settings:

# values-caching.yaml
netbox:
  config:
    CACHING:
      REDIS:
        CACHE_DATABASE: 1
        CACHE_DEFAULT_TIMEOUT: 300
        CACHE_KEY_PREFIX: "netbox"
        CACHE_VERSION: 1

    SESSION_CACHE_ALIAS: "default"
    SESSION_COOKIE_AGE: 1209600

Media and Static File Optimization

Configure media and static file handling:

# values-media.yaml
persistence:
  media:
    enabled: true
    storageClass: "fast-ssd"
    size: 50Gi
    accessMode: ReadWriteMany

  static:
    enabled: true
    storageClass: "fast-ssd"
    size: 10Gi
    accessMode: ReadWriteMany

# CDN configuration
cdn:
  enabled: true
  domain: "cdn.example.com"
  staticUrl: "https://cdn.example.com/static/"
  mediaUrl: "https://cdn.example.com/media/"
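
The media settings above amount to a PersistentVolumeClaim similar to the sketch below. The claim name is hypothetical. Note that ReadWriteMany only binds if the storage class is backed by shared storage (for example NFS or CephFS); many block-storage classes support ReadWriteOnce only, in which case the claim stays Pending.

```yaml
# media-pvc.yaml (illustrative sketch, not taken from the chart)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: netbox-media          # hypothetical name
spec:
  accessModes:
    - ReadWriteMany           # required when several replicas share the volume
  storageClassName: fast-ssd  # must support ReadWriteMany
  resources:
    requests:
      storage: 50Gi
```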

Monitoring and Observability

Prometheus Metrics

Enable Prometheus metrics collection:

# values-monitoring.yaml
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    interval: 30s
    scrapeTimeout: 10s
    namespace: monitoring

  prometheusRule:
    enabled: true
    rules:
      - alert: NetBoxHighCPU
        # rate() of CPU seconds yields cores in use, so this fires above 0.8 cores
        expr: rate(container_cpu_usage_seconds_total{container="netbox"}[5m]) > 0.8
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "NetBox high CPU usage"
          description: "NetBox pod {{ $labels.pod }} is using more than 0.8 CPU cores"
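
The rate of container_cpu_usage_seconds_total is measured in CPU cores, not as a percentage. To alert on usage relative to the container's CPU limit, one option is to divide by the limit reported by kube-state-metrics. The standalone PrometheusRule below is a sketch that assumes the Prometheus Operator and kube-state-metrics are installed; the rule and alert names are hypothetical.

```yaml
# netbox-cpu-rule.yaml (illustrative sketch)
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: netbox-cpu-alerts        # hypothetical name
  namespace: monitoring
spec:
  groups:
    - name: netbox.rules
      rules:
        - alert: NetBoxHighCPURelativeToLimit
          # Cores in use divided by the CPU limit, per pod.
          expr: |
            sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{container="netbox"}[5m]))
              /
            sum by (namespace, pod) (kube_pod_container_resource_limits{container="netbox", resource="cpu"})
              > 0.8
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: "NetBox CPU above 80% of its limit"
            description: "Pod {{ $labels.pod }} is using more than 80% of its CPU limit"
```

Depending on how Prometheus is configured, the rule object may also need labels that match the operator's ruleSelector before it is loaded.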

Logging Configuration

Configure structured logging:

# values-logging.yaml
logging:
  level: INFO
  format: json

# Log aggregation
fluentd:
  enabled: true
  output:
    elasticsearch:
      host: elasticsearch.logging.svc.cluster.local
      port: 9200
      index: netbox-logs

Health Checks

Configure comprehensive health checks:

# values-health.yaml
healthcheck:
  enabled: true
  livenessProbe:
    initialDelaySeconds: 60
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 3
    successThreshold: 1

  readinessProbe:
    initialDelaySeconds: 30
    periodSeconds: 5
    timeoutSeconds: 3
    failureThreshold: 3
    successThreshold: 1

  startupProbe:
    initialDelaySeconds: 10
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 30
    successThreshold: 1
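
The timing values above carry no probe handler, which the chart presumably supplies. To show how the numbers interact, here is a container spec fragment that uses a plain TCP check on port 8080 (the port used elsewhere in this guide). The handler type is an assumption; the chart may well use an HTTP check instead.

```yaml
# Container spec fragment (illustrative sketch; handler type is assumed)
startupProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 30   # allows up to 30 x 10 s = 300 s for startup
livenessProbe:
  tcpSocket:
    port: 8080
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3    # restart after 3 x 10 s of consecutive failures
```

Kubernetes holds back liveness and readiness checks until the startup probe has succeeded, so a slow first start is governed by the startup probe's budget rather than by the liveness probe's initial delay.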

Backup and Disaster Recovery

Automated Backup Configuration

Configure automated backups:

# values-backup.yaml
backup:
  enabled: true
  schedule: "0 2 * * *" # Daily at 2 AM
  retention: 30 # Keep 30 days of backups

  storage:
    type: s3
    bucket: netbox-backups
    region: us-east-1
    # Placeholder AWS example credentials; never commit real keys to a values file
    accessKeyId: AKIAIOSFODNN7EXAMPLE
    secretAccessKey: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

  postgres:
    enabled: true
    databases:
      - netbox_db
      - diode_db
      - hydra_db

  redis:
    enabled: true

  media:
    enabled: true
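
Rather than placing S3 credentials in a values file, a common pattern is to keep them in a Kubernetes Secret that is created out of band. The sketch below shows such a Secret with a hypothetical name and key names. How, or whether, the chart can reference an existing Secret for backups is not described in this guide, so check the chart's documentation before relying on it.

```yaml
# backup-s3-secret.yaml (illustrative sketch; names are assumptions)
apiVersion: v1
kind: Secret
metadata:
  name: netbox-backup-s3                           # hypothetical name
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "<access-key-id>"             # placeholder, fill in out of band
  AWS_SECRET_ACCESS_KEY: "<secret-access-key>"     # placeholder, fill in out of band
```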

Disaster Recovery

Configure disaster recovery procedures:

# values-dr.yaml
disasterRecovery:
  enabled: true
  replicationFactor: 3

  # Cross-region replication
  replication:
    enabled: true
    regions:
      - us-east-1
      - us-west-2

  # Backup verification
  verification:
    enabled: true
    schedule: "0 6 * * 0" # Weekly on Sunday at 6 AM

Multi-Environment Configuration

Development Environment

# values-dev.yaml
replicaCount: 1

resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 250m
    memory: 512Mi

postgresql:
  enabled: true
  persistence:
    enabled: false

redis:
  enabled: true
  persistence:
    enabled: false

ingress:
  enabled: true
  hosts:
    - host: netbox-dev.example.com
      paths:
        - path: /
          pathType: Prefix

Staging Environment

# values-staging.yaml
replicaCount: 2

resources:
  limits:
    cpu: 1000m
    memory: 2Gi
  requests:
    cpu: 500m
    memory: 1Gi

postgresql:
  enabled: false
  # Use managed database

redis:
  enabled: false
  # Use managed Redis

ingress:
  enabled: true
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-staging"
  hosts:
    - host: netbox-staging.example.com
      paths:
        - path: /
          pathType: Prefix

Production Environment

# values-prod.yaml
replicaCount: 3

resources:
  limits:
    cpu: 2000m
    memory: 4Gi
  requests:
    cpu: 1000m
    memory: 2Gi

postgresql:
  enabled: false
  # Use managed database with HA

redis:
  enabled: false
  # Use managed Redis with HA

ingress:
  enabled: true
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    # ingress-nginx rate limiting uses the limit-* annotations (per client IP)
    nginx.ingress.kubernetes.io/limit-connections: "100"
    nginx.ingress.kubernetes.io/limit-rps: "10"
  hosts:
    - host: netbox.example.com
      paths:
        - path: /
          pathType: Prefix

Deployment Validation

Pre-deployment Checks

#!/bin/bash
# pre-deployment-check.sh

echo "Running pre-deployment checks..."

# Check cluster resources
kubectl top nodes
kubectl describe nodes | grep -A 5 "Allocated resources"

# Check storage classes
kubectl get storageclass

# Check secrets
kubectl get secrets

# Validate Helm chart
helm lint ./netbox-enterprise-helm

# Dry run deployment
helm install netbox-enterprise \
  oci://registry.replicated.com/netbox-enterprise/beta/netbox-enterprise \
  --version 1.11.4 \
  --values values-prod.yaml \
  --dry-run --debug

Post-deployment Verification

#!/bin/bash
# post-deployment-verify.sh

echo "Running post-deployment verification..."

# Check pod status
kubectl get pods -l app=netbox-enterprise

# Check services
kubectl get svc -l app=netbox-enterprise

# Check ingress
kubectl get ingress -l app=netbox-enterprise

# Health check (curl is non-interactive, so -it is not needed)
kubectl exec deployment/netbox-enterprise -- \
  curl -f http://your-netbox-instance:8080/api/

# Database connectivity (opens an interactive database shell; exit it to continue)
kubectl exec -it deployment/netbox-enterprise -- \
  python manage.py dbshell

Next Steps