Graceful deploys on Kubernetes

runApp runs a fixed shutdown sequence on SIGTERM: flip /ready to 500, wait for the load balancer to drain, refuse new connections, finish in-flight requests, run plugin shutdown hooks in reverse boot order, exit. A hard timeout backstops a hung process.

This recipe shows the moving parts and the matching Kubernetes YAML.

What runApp does

Inside runApp(app, { shutdown, health }):

| Step | Sequence on SIGTERM |
| --- | --- |
| 1 | /ready flips to 500 (Lightship). Load balancer removes the pod from rotation. |
| 2 | Wait drainDelay ms (default 10_000). Gives the LB time to actually catch up. |
| 3 | Stop accepting new connections (http-terminator). |
| 4 | Drain in-flight requests, bounded by drainTimeout (default 30_000). |
| 5 | Run plugin + provider shutdown hooks in reverse boot order (DBs, Redis, queues). |
| 6 | Lightship's operational server shuts down. |
| 7 | process.exit(0), or SIGKILL if hardTimeout (default 45_000) is exceeded. |
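
The steps above can be modelled as a plain async function. This is an illustrative sketch, not runApp's actual implementation: `Hook`, `ShutdownDeps`, `shutdownSequence`, and every injected callback are hypothetical names, and the hard timeout from step 7 is left to the caller. Continuing past a failing hook is a choice made for this sketch, not documented behaviour.

```typescript
// Illustrative model of the SIGTERM sequence. All names are hypothetical;
// runApp's real internals are not shown in this recipe.

interface Hook {
  name: string;
  stop: () => Promise<void> | void;
}

interface ShutdownDeps {
  markNotReady: () => void;                            // step 1: /ready starts failing
  sleep: (ms: number) => Promise<void>;                // step 2: wait for the LB
  stopAccepting: () => Promise<void>;                  // step 3: refuse new connections
  drainInFlight: (timeoutMs: number) => Promise<void>; // step 4: finish open requests
  hooks: Hook[];                                       // step 5: listed in boot order
  stopHealthServer: () => Promise<void>;               // step 6
}

interface ShutdownConfig {
  drainDelay: number;   // ms
  drainTimeout: number; // ms
}

async function shutdownSequence(
  deps: ShutdownDeps,
  cfg: ShutdownConfig,
): Promise<string[]> {
  const log: string[] = [];

  deps.markNotReady();
  log.push("not-ready");

  await deps.sleep(cfg.drainDelay);
  log.push(`waited ${cfg.drainDelay}ms`);

  await deps.stopAccepting();
  log.push("stopped-accepting");

  await deps.drainInFlight(cfg.drainTimeout);
  log.push("drained");

  // Reverse boot order: whatever started last stops first.
  for (const hook of [...deps.hooks].reverse()) {
    try {
      await hook.stop();
      log.push(`stopped ${hook.name}`);
    } catch {
      // Sketch-level choice: one failing hook does not block the rest.
      log.push(`failed ${hook.name}`);
    }
  }

  await deps.stopHealthServer();
  log.push("health-server-stopped");

  return log; // step 7 (exiting the process) is left to the caller
}
```

Injecting the callbacks keeps the ordering testable with fakes; a real caller would wire them to the HTTP server, the health server, and the registered plugins.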

App side

ts
import { runApp, defineCheck } from "@nwire/http";
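import { app } from "./app.ts";
// Assumed: your own Prisma and Redis clients, exported from a local module.
import { prisma, redis } from "./clients.ts";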

await runApp(app, {
  shutdown: {
    drainDelay:   10_000,   // LB grace
    drainTimeout: 30_000,   // in-flight grace
    hardTimeout:  45_000,   // SIGKILL after this
  },
  health: {
    port:          9_400,
    livenessPath:  "/live",
    readinessPath: "/ready",
    checks: [
      defineCheck("db",    () => prisma.$queryRaw`SELECT 1`),
      defineCheck("redis", () => redis.ping()),
    ],
  },
});

The boot banner confirms what's running:

▸ nwire app "my-app" ready in 23ms
  http      http://localhost:3000
  docs      http://localhost:3000/docs
  openapi   http://localhost:3000/openapi.json
  inspect   http://localhost:3000/_nwire/manifest
  health    http://localhost:9400/live · /ready

Cluster side

The lifecycle assumes a few things about how the cluster is configured. Match these in your Deployment and Service.

Deployment

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
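  # apps/v1 requires a selector that matches the pod template labels.
  # The Service below selects on the same label.
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec: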
      # Give the pod enough time to drain. The countdown starts before the
      # preStop hook runs, so this must exceed the preStop sleep plus
      # drainDelay + drainTimeout from the app's runApp() config.
      terminationGracePeriodSeconds: 60

      containers:
        - name: app
          image: my-org/my-app:latest
          ports:
            - name: http
              containerPort: 3000
            - name: health
              containerPort: 9400

          # Lifecycle hook keeps the pod alive long enough for the
          # endpoints controller to remove it from Service routing.
          # Same number as drainDelay (10s).
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "10"]

          livenessProbe:
            httpGet:
              path: /live
              port: health
            periodSeconds: 10
            failureThreshold: 3

          readinessProbe:
            httpGet:
              path: /ready
              port: health
            periodSeconds: 5
            failureThreshold: 2

          # Don't kill on startup if it's slow.
          startupProbe:
            httpGet:
              path: /live
              port: health
            periodSeconds: 5
            failureThreshold: 30

Service

The Service targetPort points at the app port, not the health port. The kubelet sends health probes straight to the pod's health port, bypassing the Service.

yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
    - name: http
      port: 80
      targetPort: http

How the timeouts compose

terminationGracePeriodSeconds on the pod is the upper bound. Its countdown starts when the pod begins terminating, before the preStop hook runs, so the preStop sleep counts against it. Make sure it's bigger than the sum:

terminationGracePeriodSeconds  ≥  preStop sleep + drainDelay + drainTimeout + small buffer

For the values above (10s + 10s + 30s = 50s), 60s leaves a 10s safety margin before kubelet sends SIGKILL.

hardTimeout (default 45s) is the app's upper bound: if the process is still alive at that point, the app force-exits itself. Assuming the hardTimeout clock starts at SIGTERM, keep preStop sleep + hardTimeout below terminationGracePeriodSeconds so the app always wins the race against kubelet's SIGKILL.

| Profile | drainDelay | drainTimeout | hardTimeout | terminationGracePeriodSeconds |
| --- | --- | --- | --- | --- |
| Internal API, low traffic | 2s | 10s | 15s | 20s |
| Public API, moderate | 10s | 30s | 45s | 60s |
| Long-running ops (uploads) | 15s | 120s | 150s | 180s |

Anything longer and you probably want a queue worker instead of an HTTP handler.
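
A small check like the following could catch a mismatched budget before it ships. It is a sketch under stated assumptions: `TimeoutBudget` and `checkBudget` are hypothetical names (not part of Nwire), the preStop sleep is taken to count against terminationGracePeriodSeconds, and hardTimeout is assumed to start counting at SIGTERM.

```typescript
// Hypothetical helper: validates that the app and cluster timeouts compose.
interface TimeoutBudget {
  preStopSeconds: number;                // sleep in the preStop hook
  drainDelayMs: number;                  // runApp shutdown.drainDelay
  drainTimeoutMs: number;                // runApp shutdown.drainTimeout
  hardTimeoutMs: number;                 // runApp shutdown.hardTimeout
  terminationGracePeriodSeconds: number; // pod spec
}

function checkBudget(b: TimeoutBudget): string[] {
  const problems: string[] = [];
  const preStopMs = b.preStopSeconds * 1000;
  const graceMs = b.terminationGracePeriodSeconds * 1000;

  // The app needs time left for shutdown hooks after draining.
  if (b.hardTimeoutMs <= b.drainDelayMs + b.drainTimeoutMs) {
    problems.push("hardTimeout leaves no room after drainDelay + drainTimeout");
  }

  // Assumption: hardTimeout starts at SIGTERM, which arrives after preStop.
  if (preStopMs + b.hardTimeoutMs >= graceMs) {
    problems.push("kubelet may send SIGKILL before hardTimeout fires");
  }

  return problems;
}

// The "Public API, moderate" profile, with preStop equal to drainDelay.
const publicApi = checkBudget({
  preStopSeconds: 10,
  drainDelayMs: 10_000,
  drainTimeoutMs: 30_000,
  hardTimeoutMs: 45_000,
  terminationGracePeriodSeconds: 60,
});
// publicApi is empty: 10s + 45s = 55s stays under the 60s grace period.
```

Running the same check against the other two profiles, with the preStop sleep set equal to drainDelay, also returns no problems.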

Probes — liveness vs readiness

  • Liveness = "is the process alive?" Cheap; fails only when the runtime is broken. Fails → kubelet restarts the container.
  • Readiness = "should new traffic come here?" Includes DB/Redis checks. Fails → the pod is removed from the Service's endpoints and stops receiving traffic; it stays alive.

Nwire's defineCheck runs on the readiness path. Don't put expensive checks (slow DB queries, external API pings) on liveness: a slow dependency would then fail the liveness probe and cause restart loops.
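
The split between the two probes can be sketched as follows. This is a simplified model, not how Nwire or Lightship implement their endpoints: `Check`, `ProbeResult`, `liveness`, and `readiness` are hypothetical names used only for illustration.

```typescript
// Hypothetical model of the two probe handlers.
interface Check {
  name: string;
  run: () => Promise<unknown>;
}

interface ProbeResult {
  status: 200 | 500;
  failed: string[];
}

// Liveness touches no dependencies: if this code runs, the process is alive.
function liveness(): ProbeResult {
  return { status: 200, failed: [] };
}

// Readiness fails during shutdown or when any dependency check rejects.
async function readiness(
  checks: Check[],
  shuttingDown: boolean,
): Promise<ProbeResult> {
  if (shuttingDown) {
    return { status: 500, failed: ["shutting-down"] };
  }

  // Run all checks concurrently and record which ones rejected.
  const outcomes = await Promise.all(
    checks.map(async (check) => {
      try {
        await check.run();
        return { name: check.name, ok: true };
      } catch {
        return { name: check.name, ok: false };
      }
    }),
  );

  const failed = outcomes.filter((o) => !o.ok).map((o) => o.name);
  return { status: failed.length === 0 ? 200 : 500, failed };
}
```

In this model a Redis outage makes `readiness` return 500 while `liveness` keeps returning 200, so the pod leaves rotation without being restarted.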

Recipe: blue/green-style deploy

For a blue/green-style cutover instead of a pod-by-pod rollout, keep the RollingUpdate strategy and surge to 100%:

yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 100%
    maxUnavailable: 0

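This starts a full set of new pods alongside the old ones. Because maxUnavailable is 0, an old pod is removed only once a new pod has passed readiness, so combined with Nwire's /ready gating, traffic only shifts to pods that are actually ready (DB connected, JWKS fetched, etc.). The cluster needs spare capacity for twice the replica count while the rollout is in progress.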
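
To see what a given strategy costs in capacity, the bounds can be worked out by hand. The sketch below assumes Kubernetes' documented rounding (percentages round up for maxSurge and down for maxUnavailable); `resolveCount` and `rolloutBounds` are hypothetical helper names.

```typescript
// Hypothetical helpers: pod-count bounds during a RollingUpdate.
type CountOrPercent = number | string; // e.g. 1 or "100%"

function resolveCount(
  value: CountOrPercent,
  replicas: number,
  roundUp: boolean,
): number {
  if (typeof value === "number") return value;
  const percent = Number(value.replace("%", ""));
  const exact = (percent * replicas) / 100;
  return roundUp ? Math.ceil(exact) : Math.floor(exact);
}

function rolloutBounds(
  replicas: number,
  maxSurge: CountOrPercent,
  maxUnavailable: CountOrPercent,
): { peakPods: number; minAvailable: number } {
  return {
    // Most pods that may exist at once (old + new).
    peakPods: replicas + resolveCount(maxSurge, replicas, true),
    // Fewest pods guaranteed available during the rollout.
    minAvailable: replicas - resolveCount(maxUnavailable, replicas, false),
  };
}

// Blue/green-style settings from this recipe, with 3 replicas.
const blueGreen = rolloutBounds(3, "100%", 0);
// blueGreen: { peakPods: 6, minAvailable: 3 }

// The rolling settings from the Deployment earlier in the recipe.
const rolling = rolloutBounds(3, 1, 0);
// rolling: { peakPods: 4, minAvailable: 3 }
```

Both strategies keep all three replicas available; the blue/green-style variant trades two extra pods of peak capacity for starting every new pod at once.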

MIT licensed.