# Kubernetes Deployment

Deploy Nemesis to a lightweight Kubernetes cluster using either k3d (k3s-in-Docker) or native k3s, with Dapr operator-managed sidecars and KEDA event-driven autoscaling.

> **Note**
> Docker Compose remains the primary development environment. Kubernetes deployment is additive and intended for production-like environments and autoscaling testing. See the quickstart guide for Docker Compose setup.

System requirements are the same as for Docker Compose: 4 cores, 12+ GB RAM, 100 GB disk.
## Quick Start (k3d)

k3d runs k3s inside Docker containers, making it ideal for local development and testing.

### Prerequisites

| Tool | Install |
|---|---|
| k3d | `curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh \| bash` |
| kubectl | See docs |
| Helm | `curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 \| bash` |
| Docker | Must be running |

Dapr and KEDA are installed automatically via Helm by the setup script.
### Setup

```bash
# 1. Create cluster with Traefik, Dapr, and KEDA
./k8s/scripts/setup-cluster-k3d.sh

# 2. Deploy using pre-built images from ghcr.io
./k8s/scripts/deploy.sh install

# 3. Verify everything is running
./k8s/scripts/verify.sh
```

Access Nemesis at https://localhost:7443 (default user: `n` / password: `n`).
### Teardown

```bash
# Delete cluster (preserves registry for faster rebuilds)
./k8s/scripts/teardown-cluster-k3d.sh

# Delete cluster AND registry
./k8s/scripts/teardown-cluster-k3d.sh --registry
```
## Quick Start (k3s)

k3s runs natively on the host, making it suited for VMs, bare-metal servers, and production-like environments where Docker is unavailable or undesired.

### Prerequisites

| Tool | Install |
|---|---|
| kubectl | See docs |
| Helm | `curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 \| bash` |
| curl | System package manager |

Docker is not required; k3s uses containerd directly.

Dapr and KEDA are installed automatically via Helm by the setup script.
### Setup

```bash
# 1. Install k3s and configure Traefik, Dapr, and KEDA
./k8s/scripts/setup-cluster-k3s.sh

# 2. Deploy using pre-built images from ghcr.io
./k8s/scripts/deploy.sh install

# 3. Verify everything is running
./k8s/scripts/verify.sh
```

Access Nemesis at https://localhost:7443 (default user: `n` / password: `n`).
### Teardown

```bash
# Remove everything, including k3s itself
./k8s/scripts/teardown-cluster-k3s.sh

# Remove only Nemesis and the Helm releases, keep k3s running
./k8s/scripts/teardown-cluster-k3s.sh --keep-k3s
```
## Building Locally

The `--build` flag auto-detects the cluster type and uses the appropriate method:

- **k3d**: builds images and pushes them to the k3d local registry
- **k3s**: builds images and loads them into k3s containerd via `k3s ctr images import`

Both require Docker to build images.

```bash
# Deploy using locally built images (auto-detects k3d or k3s)
./k8s/scripts/deploy.sh install --build

# Or build images separately:
./k8s/scripts/build-and-push-k3d.sh   # k3d: push to local registry
./k8s/scripts/build-and-load-k3s.sh   # k3s: load into containerd

# Build specific services only
./k8s/scripts/build-and-push-k3d.sh web-api frontend
./k8s/scripts/build-and-load-k3s.sh web-api frontend
```
> **Note**
> k3s `--build` requires Docker on the same host to build images. The images are exported via `docker save` and imported into k3s containerd. The pods use `imagePullPolicy: Never` to ensure they run the locally loaded images.
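As a rough sketch of the resulting pod spec under `--build` on k3s (the image name and tag here are illustrative assumptions, not the chart's actual values):

```yaml
# Hypothetical container spec for a locally loaded image on k3s.
# The key point is imagePullPolicy: Never, which forces the kubelet
# to use the image already present in containerd instead of pulling.
containers:
  - name: web-api
    image: nemesis/web-api:local   # assumed tag, loaded via `k3s ctr images import`
    imagePullPolicy: Never
```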
## Deploy Script

```text
./k8s/scripts/deploy.sh <action> [options]

Actions:
  install        Install or upgrade Nemesis
  uninstall      Remove Nemesis
  status         Show deployment status

Options:
  --build        Build images locally before deploying (k3d or k3s)
  --monitoring   Enable monitoring (deferred)
  --dry-run      Render templates without deploying
  --values FILE  Additional Helm values file
  --set KEY=VAL  Override a specific value
```
### Examples

```bash
# Deploy with ghcr.io images
./k8s/scripts/deploy.sh install

# Deploy with locally built images (k3d or k3s)
./k8s/scripts/deploy.sh install --build

# Override credentials
./k8s/scripts/deploy.sh install \
  --set credentials.postgres.password=StrongPass \
  --set credentials.rabbitmq.password=StrongPass \
  --set credentials.s3.secretKey=StrongPass

# Disable autoscaling
./k8s/scripts/deploy.sh install --set autoscaling.enabled=false

# Preview rendered templates
./k8s/scripts/deploy.sh install --dry-run

# Custom values file
./k8s/scripts/deploy.sh install --values my-values.yaml
```
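For the `--values` case, a custom values file could look like the following sketch. The key paths mirror the `--set` examples above; verify the exact nesting against `k8s/helm/nemesis/values.yaml` before relying on it:

```yaml
# my-values.yaml — hedged sketch of a custom overrides file.
# Key paths are taken from the --set examples; confirm against values.yaml.
credentials:
  postgres:
    password: StrongPass
  rabbitmq:
    password: StrongPass
  s3:
    secretKey: StrongPass

autoscaling:
  enabled: true
```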
## Architecture

### Key Differences from Docker Compose

| Aspect | Docker Compose | Kubernetes |
|---|---|---|
| Dapr sidecars | Manual `-dapr` containers | Operator-injected via pod annotations |
| Dapr control plane | Standalone placement + scheduler | Helm-installed Dapr operator |
| Secrets | `.env` file + `secretstores.local.env` | K8s Secrets + `secretstores.kubernetes` |
| Reverse proxy | Traefik container with Docker provider | Traefik with IngressRoute CRDs |
| Autoscaling | Manual `docker compose up --scale` | KEDA ScaledObjects on RabbitMQ queue depth |
| Connection pooling | Per-service pools only | PgBouncer (transaction mode) between services and PostgreSQL |
| Service discovery | Docker DNS | K8s Service DNS |
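Operator-injected sidecars work through standard Dapr pod annotations rather than explicit `-dapr` containers. A minimal sketch, where the app ID and port are assumptions rather than the chart's exact values:

```yaml
# Pod template metadata as the Dapr sidecar injector expects it.
# dapr.io/enabled triggers injection; app-id and app-port are illustrative.
template:
  metadata:
    annotations:
      dapr.io/enabled: "true"
      dapr.io/app-id: "file-enrichment"
      dapr.io/app-port: "8000"
```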
### What the Setup Scripts Install

Both `setup-cluster-k3d.sh` and `setup-cluster-k3s.sh` install the same Helm components:

- **Traefik** (Helm chart v34.3.0): reverse proxy with TLS termination
- **Dapr** (Helm chart v1.16.9): sidecar injection, pub/sub, workflows, secrets
- **KEDA** (Helm chart v2.16.1): event-driven autoscaling from RabbitMQ queue depth

All versions are pinned for reproducibility.
### k3d vs k3s

| Aspect | k3d | k3s |
|---|---|---|
| Runtime | k3s inside Docker containers | Native on host |
| Docker required | Yes | No (uses containerd) |
| Local image builds | Via k3d local registry | Via `k3s ctr images import` |
| Traefik service type | NodePort (mapped via k3d port) | LoadBalancer (built-in ServiceLB/Klipper) |
| Default HTTPS port | 7443 | 7443 |
| Best for | Local dev, CI | VMs, bare metal, production-like |
## KEDA Autoscaling

KEDA monitors RabbitMQ queue depth and CPU utilization to scale services automatically:

| Service | Trigger | Threshold | Min | Max | Cooldown |
|---|---|---|---|---|---|
| file-enrichment | Queue: `files-new_file` | 10 messages | 1 | 5 | 60s |
| document-conversion | Queue: `files-document_conversion_input` | 5 messages | 1 | 5 | 60s |
| titus-scanner | Queue: `titus-titus_input` | 10 messages | 1 | 5 | 60s |
| dotnet-service | Queue: `dotnet-dotnet_input` | 5 messages | 1 | 3 | 60s |
| gotenberg | CPU utilization | 70% | 1 | 3 | 120s |

All thresholds are configurable in `values.yaml` under `autoscaling`.
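As an illustration, the file-enrichment row above could correspond to a KEDA `ScaledObject` along these lines. This is a hedged sketch; the actual templates live under `templates/keda/`, and the `TriggerAuthentication` name here is assumed:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: file-enrichment
  namespace: nemesis
spec:
  scaleTargetRef:
    name: file-enrichment         # Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 5
  cooldownPeriod: 60              # seconds before scaling back down
  triggers:
    - type: rabbitmq
      metadata:
        queueName: files-new_file
        mode: QueueLength         # scale on backlog depth
        value: "10"               # target messages per replica
      authenticationRef:
        name: rabbitmq-trigger-auth   # assumed name
```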
> **Tip**
> Queue names are created by Dapr as `{consumerID}-{topic}`. Verify the actual queue names in the RabbitMQ management UI after first deployment and update `values.yaml` if they differ. Gotenberg uses CPU-based scaling (not queue-based) because it serves synchronous HTTP requests rather than consuming from a queue.
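Since the naming convention is plain string concatenation, an expected queue name can be derived before checking the management UI:

```shell
# Dapr creates RabbitMQ queues named {consumerID}-{topic}.
consumer_id="files"     # the service's Dapr consumer ID
topic="new_file"        # the pub/sub topic
queue="${consumer_id}-${topic}"
echo "$queue"           # files-new_file
```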
## PgBouncer Connection Pooling

PgBouncer sits between all services (including Dapr sidecars) and PostgreSQL, multiplexing hundreds of client connections onto a small pool of real database connections. This prevents connection exhaustion during KEDA autoscaling, when many pod replicas open connections simultaneously.

- **Pool mode**: transaction; connections return to the pool after each transaction
- **Max client connections**: 500 (configurable in `values.yaml`)
- **Default pool size**: 20 real PostgreSQL connections

All services connect to `pgbouncer:5432` instead of `postgres:5432`. The postgres service remains unchanged and is accessed directly only by PgBouncer.
## Helm Chart Structure

```text
k8s/helm/nemesis/
├── Chart.yaml
├── values.yaml            # All configuration
├── values-dev.yaml        # Local registry overrides (k3d)
├── values-dev-k3s.yaml    # Local image overrides (k3s)
├── files/                 # Static files (SQL, configs)
└── templates/
    ├── _helpers.tpl
    ├── namespace.yaml
    ├── secrets.yaml
    ├── configmap-*.yaml
    ├── dapr/              # Dapr CRDs (secretstore, statestore, pubsub, configs)
    ├── infra/             # PostgreSQL, RabbitMQ, SeaweedFS, Hasura
    ├── apps/              # Application deployments + services
    ├── ingress/           # Traefik IngressRoute + middleware
    ├── keda/              # KEDA ScaledObjects + TriggerAuthentication
    └── tests/             # Helm test pod
```
## Operations

### Check Status

```bash
kubectl get pods -n nemesis
kubectl get svc -n nemesis
kubectl get components.dapr.io -n nemesis
kubectl get scaledobject -n nemesis

# Or use the deploy script
./k8s/scripts/deploy.sh status
```

### View Logs

```bash
kubectl logs -f deployment/web-api -n nemesis
kubectl logs -f deployment/file-enrichment -n nemesis
kubectl logs -f deployment/file-enrichment -c daprd -n nemesis   # Dapr sidecar
```

### Run Helm Tests

```bash
helm test nemesis -n nemesis
```
## Configuration

All configuration is in `k8s/helm/nemesis/values.yaml`. Key sections:

| Section | Description |
|---|---|
| `credentials.*` | PostgreSQL, RabbitMQ, S3 (SeaweedFS), Hasura passwords |
| `nemesis.*` | URL, log level, expiration defaults |
| `autoscaling.*` | KEDA scaling thresholds and limits |
| `fileEnrichment.*` | File enrichment replicas, resources, env vars |
| `postgres.*` | Database image, storage, max connections |
| `pgbouncer.*` | Connection pooling image, pool size, max client connections |
| `rabbitmq.*` | Message queue image, storage |
| `seaweedfs.*` | Object storage (SeaweedFS) image, buckets, storage |
## PostgreSQL Connection Tuning

All services connect through PgBouncer in transaction pooling mode, which multiplexes many client connections onto a small pool of real PostgreSQL connections. This prevents connection exhaustion during autoscaling.

Key settings (configurable in `values.yaml` under `pgbouncer`):

| Setting | Default | Description |
|---|---|---|
| `maxClientConn` | 2000 | Max simultaneous client connections PgBouncer accepts |
| `defaultPoolSize` | 60 | Real PostgreSQL connections per database |
| `minPoolSize` | 20 | Minimum idle PostgreSQL connections maintained |
| `reservePoolSize` | 15 | Extra connections used when the pool is exhausted |
| `poolMode` | transaction | Return connections to the pool after each transaction |

Per-service pool settings are still configurable via environment variables:

- `DB_POOL_MAX_SIZE` (default: 20): maximum connections per pod (to PgBouncer)
- `DB_POOL_MIN_SIZE` (default: 2): minimum idle connections per pod (to PgBouncer)

With KEDA scaling to 5+ replicas across multiple services, PgBouncer keeps real PostgreSQL connections at roughly 60-75, well within the 300 `max_connections` limit.
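Put together in `values.yaml`, the tuning knobs above might look like the following sketch. The `pgbouncer` key names follow the table; the nesting of per-service env vars under `fileEnrichment.env` is an assumption to verify against the chart:

```yaml
# Hedged sketch of the relevant values.yaml sections.
pgbouncer:
  poolMode: transaction
  maxClientConn: 2000
  defaultPoolSize: 60
  minPoolSize: 20
  reservePoolSize: 15

# Per-pod client pools, assumed to be set as env vars on the service:
fileEnrichment:
  env:
    DB_POOL_MAX_SIZE: "20"
    DB_POOL_MIN_SIZE: "2"
```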
## Deferred Features

The following will be added in future updates:

- Monitoring stack (Grafana, Prometheus, Jaeger, Loki): toggle `monitoring.enabled`
- LLM stack (LiteLLM, Phoenix, Agents): toggle `llm.enabled`
- Jupyter: toggle `jupyter.enabled`

The `values.yaml` toggles exist, but no templates are generated yet.
## Troubleshooting

### Pods stuck in CrashLoopBackOff

Check infrastructure first; app pods depend on PostgreSQL, RabbitMQ, and SeaweedFS:

```bash
kubectl logs deployment/postgres -n nemesis
kubectl logs statefulset/rabbitmq -n nemesis
kubectl logs statefulset/seaweedfs -n nemesis
```
### Dapr sidecar not injecting

Verify the namespace label:

```bash
kubectl get ns nemesis --show-labels
```

The labels should include `dapr.io/inject=true`.
### KEDA not scaling

Verify the queue names match what Dapr created:

```bash
kubectl port-forward svc/rabbitmq 15672:15672 -n nemesis
# Open http://localhost:15672 and check queue names
```

Update the `autoscaling.*` section of `values.yaml` if the names differ.
### Connection pool exhaustion

If you see `FATAL: sorry, too many clients already` in the PostgreSQL logs:

- Check PgBouncer stats:
  `kubectl exec deployment/pgbouncer -n nemesis -- env PGPASSWORD=Qwerty12345 psql -U nemesis -h 127.0.0.1 -p 5432 pgbouncer -c "SHOW POOLS;"`
- Check PostgreSQL connections:
  `kubectl exec deployment/postgres -n nemesis -- psql -U nemesis -d enrichment -c "SELECT count(*) FROM pg_stat_activity;"`
- Check per-service pool stats: port-forward to file-enrichment and hit `/system/pool-stats`
- Tune PgBouncer: increase `pgbouncer.defaultPoolSize` in `values.yaml` (adds more real PostgreSQL connections)
- Tune per-pod pools: reduce the `DB_POOL_MAX_SIZE` env var to lower client connections to PgBouncer