DevOps &
Deployment Engineering

Writing code is 20% of a backend engineer's job. Getting it running reliably in production — containerized, orchestrated, deployed automatically, and recoverable from failure — is the other 80%. This section teaches you how production Spring Boot applications are actually shipped: Docker, Kubernetes, CI/CD pipelines, Nginx, and cloud deployment strategies used at real companies.

Why DevOps Matters for Backend Engineers

A backend engineer who can only write code but cannot deploy, monitor, or debug production systems is only half an engineer. Modern teams expect engineers who own their services end-to-end: build, ship, run, observe, fix. This "you build it, you run it" culture requires DevOps fluency.

The Modern Software Delivery Pipeline
Code (git push) → CI (build + test) → Docker (image build + push) → CD (deploy to K8s) → Monitor (Prometheus + alerts)
Goal: code merged → running in production in under 15 minutes, automatically
What "DevOps" Actually Means in Practice

DevOps is not a job title — it's a philosophy that developers are responsible for their code in production. In practice this means: writing Dockerfiles, defining Kubernetes manifests, writing CI/CD pipelines, setting up alerts, and being on-call for services you wrote. You don't need to be a dedicated infrastructure engineer, but you must understand how your application runs, scales, and fails in production.

Docker: Containerizing Spring Boot

Docker packages your application and all its dependencies into a single portable image. The same image runs on your laptop, in CI, and in production — eliminating "works on my machine" problems and ensuring environment consistency.

Production-Grade Dockerfile

Dockerfile
# Stage 1: Build — use full JDK to compile
FROM eclipse-temurin:21-jdk-alpine AS builder
WORKDIR /app

# Copy dependency descriptors first (Docker layer cache optimization)
# If pom.xml unchanged, this layer is cached — no re-download
COPY mvnw pom.xml ./
COPY .mvn .mvn
RUN ./mvnw dependency:go-offline -q

# Now copy source and build
COPY src ./src
RUN ./mvnw package -DskipTests -q

# Extract layers for optimal Docker caching with Spring Boot layered JAR
RUN java -Djarmode=layertools -jar target/*.jar extract

# ─────────────────────────────────────────────────────────────
# Stage 2: Runtime — use lightweight JRE only (no compiler)
FROM eclipse-temurin:21-jre-alpine AS runtime

# Security: don't run as root
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app

# Copy Spring Boot layers in order of change frequency
# (dependencies change rarely → cached; application changes most → last)
COPY --from=builder /app/dependencies/ ./
COPY --from=builder /app/spring-boot-loader/ ./
COPY --from=builder /app/snapshot-dependencies/ ./
COPY --from=builder /app/application/ ./

# JVM flags for containers:
# -XX:+UseContainerSupport — respect container CPU/memory limits (default Java 11+)
# -XX:MaxRAMPercentage=75.0 — use 75% of container memory for heap
# -XX:+ExitOnOutOfMemoryError — crash fast on OOM (let K8s restart it)
ENV JAVA_OPTS="-XX:+UseContainerSupport \
               -XX:MaxRAMPercentage=75.0 \
               -XX:+ExitOnOutOfMemoryError \
               -Djava.security.egd=file:/dev/./urandom"

USER appuser
EXPOSE 8080

# Exec the JVM from the shell wrapper so SIGTERM/SIGINT reach the Java process directly
ENTRYPOINT ["sh", "-c", "exec java $JAVA_OPTS org.springframework.boot.loader.launch.JarLauncher"]
The Single-Stage Dockerfile Mistake

A single-stage Dockerfile that copies the fat JAR directly is the most common production mistake: it includes the full JDK (600MB+), runs as root, and doesn't use layer caching. The result is a 700MB image that takes 4 minutes to build. The multi-stage approach above produces a ~180MB image that builds in 45 seconds for typical changes because the dependency layer is cached.

Docker Image Best Practices

Shell
# Build and tag
docker build -t myapp:1.2.3 -t myapp:latest .

# Check image size (should be ~150-200MB for a typical Spring Boot app)
docker images myapp

# Inspect layers — find what's making the image large
docker history myapp:latest

# Run locally with environment variables
docker run -p 8080:8080 \
  -e SPRING_PROFILES_ACTIVE=local \
  -e DB_URL=jdbc:postgresql://host.docker.internal:5432/mydb \
  -e DB_PASSWORD=secret \
  myapp:latest

# Scan for security vulnerabilities
docker scout cves myapp:latest

# .dockerignore — critical for build speed and security
# Exclude these from build context:
cat > .dockerignore << 'EOF'
target/
*.log
.git/
.env
*.env
node_modules/
.idea/
*.iml
EOF

Docker Compose: Local Development Stack

Docker Compose defines and runs multi-container applications with a single file. For local development, it replaces the need to manually start PostgreSQL, Redis, Kafka, and your application separately — one command starts the entire stack.

YAML (docker-compose.yml)
version: '3.9'

services:
  app:
    build:
      context: .
      target: runtime      # Use the runtime stage
    ports:
      - "8080:8080"
    environment:
      SPRING_PROFILES_ACTIVE: docker
      SPRING_DATASOURCE_URL: jdbc:postgresql://postgres:5432/appdb
      SPRING_DATASOURCE_USERNAME: appuser
      SPRING_DATASOURCE_PASSWORD: ${DB_PASSWORD:-devpassword}
      SPRING_DATA_REDIS_HOST: redis
      SPRING_KAFKA_BOOTSTRAP_SERVERS: kafka:9092
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_started
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:8080/actuator/health"]
      interval: 15s
      timeout: 5s
      retries: 5
      start_period: 30s

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: appdb
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD: ${DB_PASSWORD:-devpassword}
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./db/init.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U appuser -d appdb"]
      interval: 5s
      timeout: 3s
      retries: 10

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    ports:
      - "9092:9092"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka:9093
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      CLUSTER_ID: "MkU3OEVBNTcwNTJENDM2Qg"

  # Database admin UI — accessible at http://localhost:5050
  pgadmin:
    image: dpage/pgadmin4:latest
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@admin.com
      PGADMIN_DEFAULT_PASSWORD: admin
    ports:
      - "5050:80"
    profiles:
      - tools   # Only starts with: docker compose --profile tools up

volumes:
  postgres_data:
Shell
# Start everything
docker compose up -d

# Start with rebuild
docker compose up -d --build

# Watch logs from app only
docker compose logs -f app

# Restart just the app (after code change)
docker compose restart app

# Stop and remove everything including volumes (clean slate)
docker compose down -v

# Scale the app service to 3 replicas (drop the fixed "8080:8080" host port mapping
# first and put a load balancer like nginx in front; see the sketch below)
docker compose up -d --scale app=3
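
For the scaled case, here is a minimal sketch of the load-balancer piece. The nginx service and the ./nginx/nginx.conf path are assumptions, not part of the stack above: that config would proxy to http://app:8080, and Docker's internal DNS resolves the service name to the replica IPs.

YAML (docker-compose.yml fragment)
  nginx:
    image: nginx:1.25-alpine
    ports:
      - "8080:8080"            # nginx now owns the host port; drop the app service's own mapping
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro   # assumed config proxying to http://app:8080
    depends_on:
      - app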

CI/CD with GitHub Actions

A CI/CD pipeline automates the path from code commit to production deployment. Every push triggers tests, every merge to main builds and pushes a Docker image, and every tag deploys to production, removing most of the human error from deployment.

YAML (.github/workflows/ci-cd.yml)
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  # ── Job 1: Test ──────────────────────────────────────────────────────────
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_DB: testdb
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 5s
          --health-timeout 3s
          --health-retries 10
      redis:
        image: redis:7-alpine
        ports:
          - 6379:6379

    steps:
      - uses: actions/checkout@v4

      - name: Set up JDK 21
        uses: actions/setup-java@v4
        with:
          java-version: '21'
          distribution: 'temurin'
          cache: maven

      - name: Run tests
        run: ./mvnw test -Dspring.profiles.active=ci
        env:
          SPRING_DATASOURCE_URL: jdbc:postgresql://localhost:5432/testdb
          SPRING_DATASOURCE_USERNAME: test
          SPRING_DATASOURCE_PASSWORD: test

      - name: Upload test results
        uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: test-results
          path: target/surefire-reports/

      - name: Code coverage
        run: ./mvnw jacoco:report
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: target/site/jacoco/jacoco.xml

  # ── Job 2: Build & Push Docker Image ────────────────────────────────────
  build:
    runs-on: ubuntu-latest
    needs: test
    if: github.ref == 'refs/heads/main'
    permissions:
      contents: read
      packages: write

    steps:
      - uses: actions/checkout@v4

      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata (tags, labels)
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix=sha-,format=long
            type=raw,value=latest

      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha    # GitHub Actions cache for Docker layers
          cache-to: type=gha,mode=max

  # ── Job 3: Deploy to Staging ─────────────────────────────────────────────
  deploy-staging:
    runs-on: ubuntu-latest
    needs: build
    environment: staging

    steps:
      - uses: actions/checkout@v4

      # Assumes cluster credentials were configured in a prior step (e.g. azure/k8s-set-context)
      - name: Deploy to Kubernetes (staging)
        uses: azure/k8s-deploy@v4
        with:
          namespace: staging
          manifests: k8s/staging/
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:sha-${{ github.sha }}
Secrets Management in CI/CD

Never put secrets in your workflow YAML files or environment blocks as plain text. GitHub Actions Secrets (encrypted at rest, masked in logs) cover what the pipeline itself needs, such as registry credentials and deploy tokens. For production, use a dedicated secrets manager: AWS Secrets Manager, HashiCorp Vault, or Google Secret Manager. Inject secrets into pods as environment variables from Kubernetes Secrets (which themselves should be backed by an external secrets operator). The golden rule: no secret should ever appear in a git repository, log file, or build artifact.
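
As a minimal sketch of the injection side (the Secret name matches the one referenced in the Deployment manifest later in this section; the value is a placeholder), a Kubernetes Secret and the two common ways to consume it look like this:

YAML
# Illustrative only: in a real cluster this object should be created or synced by an
# external secrets operator, never committed to git with real values
apiVersion: v1
kind: Secret
metadata:
  name: order-service-secrets
  namespace: production
type: Opaque
stringData:                      # stringData avoids manual base64 encoding
  db-password: "placeholder-set-at-deploy-time"
---
# In the container spec, either reference a single key (secretKeyRef, as the Deployment
# below does) or load every key as an environment variable:
# envFrom:
#   - secretRef:
#       name: order-service-secrets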

Kubernetes: Running Spring Boot at Scale

Kubernetes (K8s) is the de facto standard for container orchestration in production. It handles scheduling, scaling, self-healing, rolling deployments, and service discovery. Understanding core K8s concepts is now a baseline expectation for senior backend engineers.

Kubernetes Core Concepts
Pod: Smallest unit. One or more containers sharing network + storage. Ephemeral — when it dies, it's gone.
Deployment: Manages a ReplicaSet. Declares desired state (e.g. 3 replicas). Handles rolling updates and rollback.
Service: Stable network identity for pods. ClusterIP (internal), NodePort, or LoadBalancer (external cloud LB).
ConfigMap: Non-sensitive config (application.yml values). Injected as env vars or mounted as files (see the sketch after this list).
Secret: Sensitive config (passwords, tokens). Only base64-encoded, so it should be backed by an external secrets manager.
Ingress: HTTP(S) routing rules. Routes traffic to Services based on host/path. Handles TLS termination.
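
To make the ConfigMap concept concrete, here is a minimal sketch (the name and keys are illustrative), consumed from the container spec via envFrom; the sensitive counterpart was shown in the Secret example above.

YAML
apiVersion: v1
kind: ConfigMap
metadata:
  name: order-service-config
  namespace: production
data:
  SPRING_PROFILES_ACTIVE: "production"
  LOGGING_LEVEL_ROOT: "INFO"
---
# Consumed in the container spec with:
# envFrom:
#   - configMapRef:
#       name: order-service-config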

Complete Kubernetes Manifests for Spring Boot

YAML (k8s/deployment.yml)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  namespace: production
  labels:
    app: order-service
    version: "1.2.3"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # Allow 1 extra pod during update
      maxUnavailable: 0    # Never reduce below desired count (zero-downtime)
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: ghcr.io/myorg/order-service:sha-abc123
          ports:
            - containerPort: 8080

          # Resource requests and limits — CRITICAL for stability
          resources:
            requests:
              cpu: "250m"       # 0.25 CPU cores guaranteed
              memory: "512Mi"   # 512MB guaranteed
            limits:
              cpu: "1000m"      # Can burst to 1 CPU
              memory: "1Gi"     # Hard cap — OOM killed above this

          env:
            - name: SPRING_PROFILES_ACTIVE
              value: "production"
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: order-service-secrets
                  key: db-password
            - name: JAVA_OPTS
              value: "-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0"

          # Liveness probe — is the app alive? K8s restarts if this fails
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30   # Give JVM time to start
            periodSeconds: 10
            failureThreshold: 3

          # Readiness probe — is the app ready for traffic? Remove from LB if fails
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3

          # Startup probe — prevents liveness from killing slow-starting pods
          startupProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            failureThreshold: 30     # 30 × 10s = 5 minutes max startup time
            periodSeconds: 10

      # Spread pods across availability zones for resilience
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: order-service

---
apiVersion: v1
kind: Service
metadata:
  name: order-service
  namespace: production
spec:
  selector:
    app: order-service
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: order-service
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - api.myapp.com
      secretName: api-tls-secret
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /api/v1/orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80

Spring Boot Actuator Health for K8s Probes

YAML (application.yml)
management:
  endpoint:
    health:
      probes:
        enabled: true          # Enables /actuator/health/liveness and /readiness
      show-details: always
      group:
        liveness:
          include: livenessState    # JVM alive?
        readiness:
          include: readinessState, db, redis   # All dependencies ready?
  endpoints:
    web:
      exposure:
        include: health, info, prometheus, metrics
Missing Resource Limits = Cluster Instability

A pod without CPU/memory limits can consume unlimited resources, starving other pods on the same node. One runaway Spring Boot app with a memory leak can crash every other service on the node. Always set both requests (what K8s schedules against) and limits (the hard cap). Set the memory limit to roughly 130% of typical usage, and set -XX:MaxRAMPercentage so heap plus non-heap memory stays under the pod limit: with a 1Gi limit and MaxRAMPercentage=75, the heap caps at about 768MB, leaving roughly 256MB for metaspace, threads, and direct buffers.

Nginx as Reverse Proxy

Nginx sits in front of your Spring Boot application in production, handling TLS termination, load balancing, static file serving, rate limiting, and compression. Spring Boot speaks plain HTTP; Nginx handles the messy internet-facing concerns.

Nginx Config
# /etc/nginx/sites-available/myapp.conf

# Rate limiting zone — shared across all worker processes
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/m;

upstream spring_boot {
    server 127.0.0.1:8080;
    server 127.0.0.1:8081;    # Second instance for load balancing
    keepalive 32;              # Persistent connections to upstream
}

# HTTP → HTTPS redirect
server {
    listen 80;
    server_name api.myapp.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name api.myapp.com;

    # TLS configuration (managed by certbot/Let's Encrypt)
    ssl_certificate /etc/letsencrypt/live/api.myapp.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.myapp.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_session_cache shared:SSL:10m;

    # Security headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "DENY" always;

    # Gzip compression — critical for API responses
    gzip on;
    gzip_types application/json application/javascript text/css;
    gzip_min_length 1024;

    # Proxy to Spring Boot
    location /api/ {
        # Apply rate limiting
        limit_req zone=api_limit burst=20 nodelay;

        proxy_pass http://spring_boot;
        proxy_http_version 1.1;
        proxy_set_header Connection "";           # Enable keepalive
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
        proxy_send_timeout 30s;

        # Buffer settings for upstream responses
        proxy_buffer_size 16k;
        proxy_buffers 4 32k;
    }

    # Health check endpoint — not rate limited, not logged
    location /actuator/health {
        proxy_pass http://spring_boot;
        access_log off;
    }

    # Block access to sensitive actuator endpoints from outside
    location /actuator {
        deny all;
        return 403;
    }
}
X-Forwarded-For and Spring Boot

When Nginx proxies requests, the client IP seen by Spring Boot becomes 127.0.0.1 (the Nginx server). To get the real client IP, Spring Boot must be configured to trust the proxy: set server.forward-headers-strategy=native in application.yml. Then request.getRemoteAddr() and any IP-based logic (rate limiting, audit logging, geo checks) will see the real client IP from the X-Forwarded-For header. Without this, IP-based rate limiting is meaningless because all traffic appears to come from the same proxy address.

Environment Management & Configuration

Production, staging, development, and CI all need the same code to run differently. Spring Boot's profiles combined with externalized configuration make this clean — but most teams get it wrong, leading to "it works in dev but fails in prod."

YAML
# application.yml — base config for all environments
spring:
  application:
    name: order-service
  datasource:
    url: ${DB_URL}              # Always from environment variable in production
    username: ${DB_USERNAME}
    password: ${DB_PASSWORD}
  jpa:
    open-in-view: false         # Never true in production
    hibernate:
      ddl-auto: validate        # Production: validate only, never auto-create

server:
  port: 8080
  forward-headers-strategy: native   # Trust X-Forwarded-For from proxy

---
# application-local.yml — developer convenience
spring:
  config:
    activate:
      on-profile: local
  datasource:
    url: jdbc:postgresql://localhost:5432/orderdb
    username: postgres
    password: postgres
  jpa:
    hibernate:
      ddl-auto: create-drop     # Recreate schema on each local restart
    show-sql: true
  devtools:
    restart:
      enabled: true

---
# application-production.yml — production hardening
spring:
  config:
    activate:
      on-profile: production
  jpa:
    hibernate:
      ddl-auto: validate
  datasource:
    hikari:
      maximum-pool-size: 20
      connection-timeout: 5000
      validation-timeout: 3000

server:
  compression:
    enabled: true
    mime-types: application/json,application/javascript,text/css

management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus  # Minimal exposure in production
Java
// Binding configuration properties to a class (type-safe, validated)
@ConfigurationProperties(prefix = "app.payment")
@Validated
public record PaymentConfig(
    @NotBlank String apiKey,
    @NotBlank String apiUrl,
    @Positive int timeoutSeconds,
    @Min(1) @Max(10) int maxRetries
) {}

// Register it
@SpringBootApplication
@ConfigurationPropertiesScan
public class Application {}

// Use it
@Service
@RequiredArgsConstructor
public class PaymentService {
    private final PaymentConfig config;

    public PaymentResult charge(Money amount) {
        // config.apiKey(), config.timeoutSeconds() etc.
    }
}

// application.yml binding
app:
  payment:
    api-key: ${PAYMENT_API_KEY}
    api-url: https://api.stripe.com
    timeout-seconds: 10
    max-retries: 3

Deployment Strategies

How you deploy matters as much as what you deploy. The wrong strategy causes downtime, data corruption, or silent bugs in production. Every senior engineer must understand these patterns.

Deployment Strategy Comparison
Rolling Update (K8s default): Replace pods one at a time. Old and new versions run simultaneously during rollout. ✓ Zero downtime. ✓ Simple. ✗ API must be backward-compatible during the transition.
Blue/Green (instant switch): Two identical environments. "Blue" is live; deploy to "Green," then switch the load balancer. ✓ Instant rollback (switch back). ✗ 2× infrastructure cost. ✗ DB migrations must support both versions simultaneously.
Canary (traffic splitting): Route 5% of traffic to the new version. Watch metrics. Gradually increase to 100% or roll back. ✓ Lowest blast radius. ✗ Complex routing (requires a service mesh or weighted ingress; see the sketch after this list). ✗ Slow rollout.
Recreate (stop then start): Stop all old pods, then start new ones. ✓ Simple. ✓ No version overlap. ✗ Downtime during deployment. Only appropriate for non-production or batch jobs.
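
A minimal canary sketch using ingress-nginx's canary annotations, assuming a second Service named order-service-canary that selects the canary Deployment (both names are illustrative):

YAML
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: order-service-canary
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "5"   # send ~5% of traffic to the canary
spec:
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /api/v1/orders
            pathType: Prefix
            backend:
              service:
                name: order-service-canary
                port:
                  number: 80

Raising canary-weight in steps (5 → 25 → 50 → 100) while comparing error rate and p99 latency against the stable version gives the gradual rollout described above; deleting the canary Ingress and Deployment is the rollback.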

Database Migration with Flyway in CI/CD

YAML (application.yml)
# Flyway runs automatically on startup — migrations apply before the app accepts traffic
# This is the correct order: migrate DB → start app → K8s readiness probe passes → traffic routed

# application.yml
spring:
  flyway:
    enabled: true
    locations: classpath:db/migration
    baseline-on-migrate: false   # Only true for first migration on existing DB
    out-of-order: false          # Strictly ordered in production

# src/main/resources/db/migration/
#   V1__initial_schema.sql
#   V2__add_orders_table.sql
#   V3__add_product_stock_column.sql  ← always additive!

# Safe migration practices:
# 1. NEVER rename a column in one step (add new + backfill + drop old = 3 deployments)
# 2. NEVER add NOT NULL without a default (crashes on existing rows)
# 3. NEVER drop a column before removing all code references (two deployments)
# 4. Always add indexes CONCURRENTLY in PostgreSQL to avoid table locks
#    (CREATE INDEX CONCURRENTLY cannot run inside a transaction, so mark that
#    migration as non-transactional)
# Example safe NOT NULL column addition:
# V10__add_status_column.sql:
#   ALTER TABLE orders ADD COLUMN status VARCHAR(50) DEFAULT 'PENDING';
#   UPDATE orders SET status = 'PENDING' WHERE status IS NULL;
#   ALTER TABLE orders ALTER COLUMN status SET NOT NULL;

Linux Essentials for Backend Engineers

When a production service fails at 2am, you'll be SSHed into a Linux server, not in your IDE. These commands are the minimum toolkit every backend engineer must know fluently.

Shell
# ── Process & JVM ──────────────────────────────────────────────────────────
ps aux | grep java                    # Find the Java process
jps -lv                               # List all JVM processes with flags
jstack $(pgrep -f app.jar)            # Thread dump — detect deadlocks/hangs
jcmd $(pgrep -f app.jar) GC.heap_info # Heap usage summary (jmap -heap was removed in modern JDKs)
kill -3 $(pgrep -f app.jar)           # Send SIGQUIT = thread dump to stdout

# ── Network & Ports ────────────────────────────────────────────────────────
ss -tlnp | grep 8080                  # What's listening on 8080?
curl -v http://localhost:8080/actuator/health   # Test health endpoint
curl -w "\n%{time_total}s\n" http://localhost:8080/api/v1/ping  # Request timing
netstat -s | grep -i retransmit       # Check for TCP retransmissions

# ── Log Analysis ───────────────────────────────────────────────────────────
tail -f /var/log/app/application.log  # Follow live logs
grep "ERROR" application.log | tail -50           # Last 50 errors
grep "2025-01-15 14:" application.log | grep -v INFO  # Errors in a time window
journalctl -u order-service -f --since "1 hour ago"  # systemd service logs

# ── System Resources ───────────────────────────────────────────────────────
top -p $(pgrep -f app.jar)            # CPU/memory for specific process
vmstat 1 5                            # System-wide memory/IO stats (5 samples)
iostat -x 1                           # Disk I/O — detect DB I/O bottleneck
free -h                               # Memory usage
df -h                                 # Disk space (OOM logs fill disks)
lsof -p $(pgrep -f app.jar) | wc -l  # Open file descriptors (watch for leaks)

# ── Kubernetes ────────────────────────────────────────────────────────────
kubectl get pods -n production                       # Pod status
kubectl logs -f order-service-abc123 -n production   # Follow pod logs
kubectl describe pod order-service-abc123 -n production  # Events, probe failures
kubectl exec -it order-service-abc123 -n production -- sh   # Shell into pod
kubectl rollout history deployment/order-service -n production
kubectl rollout undo deployment/order-service -n production  # Rollback!

Production Deployment Pitfalls

1. Missing Graceful Shutdown

Without graceful shutdown, the JVM exits as soon as Kubernetes sends SIGTERM, dropping in-flight requests. Configure server.shutdown=graceful and spring.lifecycle.timeout-per-shutdown-phase=30s. Spring Boot will then stop accepting new requests, wait for active ones to complete (up to the timeout), and exit cleanly.
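
A minimal sketch of the settings named above, plus an optional preStop delay in the Deployment so the pod is removed from Service endpoints before shutdown begins (the 5-second value is an assumption to tune per environment):

YAML
# application.yml
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s

# k8s/deployment.yml: optional preStop delay inside the container spec
#   lifecycle:
#     preStop:
#       exec:
#         command: ["sh", "-c", "sleep 5"]   # give endpoint removal time to propagate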

2. JVM Ignoring Container Limits

Java 8 before update 191 doesn't respect cgroup memory limits: it reads the host's total RAM, defaults the max heap to 25% of that, and the pod gets OOM-killed as soon as the heap outgrows the container limit. Always use Java 11+ with -XX:+UseContainerSupport (on by default) and -XX:MaxRAMPercentage=75.0.

3. Broken Readiness Probe

If your readiness probe is too aggressive (fails fast), pods are removed from the load balancer during slow startup. If it's too lenient (always passes), traffic reaches pods that aren't ready. Configure initialDelaySeconds to be longer than your slowest startup, and use the Spring Boot Actuator readiness endpoint which checks actual downstream dependencies.

4. Destructive Flyway Migration

A migration that renames a column while the old code is still running causes 500 errors for every in-flight request. Always: (1) add new column, (2) deploy code that writes to both, (3) backfill, (4) deploy code that reads from new only, (5) drop old column. This takes 3 deployments but never causes downtime.

5. Connection Pool Exhaustion on Deploy

During a rolling update or a blue/green switch, old and new pods connect to the database simultaneously. With a pool size of 20 per pod and 3 replicas, a blue/green deploy (or a rolling update with maxSurge=100%) needs up to 120 DB connections (6 pods × 20); even with maxSurge=1 you need 80 (4 pods × 20). Size your DB connection pool accounting for the maximum number of pods during deployment, not just steady state (see the sizing sketch below).
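
A small sizing sketch (numbers are illustrative): keep the maximum number of pods during a rollout multiplied by the per-pod pool size comfortably under the database's max_connections.

YAML
spring:
  datasource:
    hikari:
      # 3 replicas + maxSurge=1 → up to 4 pods during a rolling update
      # 4 pods × 20 connections = 80, which must fit under Postgres max_connections
      # (default 100) with headroom for migrations, admin tools, and other services
      maximum-pool-size: 20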

6. Secrets in Docker Image Layers

If you COPY a file containing secrets during the Dockerfile build and then DELETE it in a later layer, the secret is still accessible in the earlier layer's history. Secrets must NEVER be baked into images. Use --secret mount in BuildKit, or inject at runtime via environment variables or K8s Secrets.

Interview Questions

Q: What is the difference between a Docker image and a container? What is a layer?
An image is a read-only template — a series of filesystem layers stacked on top of each other. Each instruction in a Dockerfile creates a new layer. A container is a running instance of an image — it adds a thin writable layer on top of the image layers. Multiple containers can share the same image layers (saving disk space). Layers are cached: if the base layer hasn't changed, Docker reuses it. This is why you copy dependency descriptors before source code in a Dockerfile — dependencies change rarely, source code changes often. When source code changes, only the layers from that point onward need rebuilding.
Q: Why use a multi-stage Dockerfile for Spring Boot?
Multi-stage builds solve two problems: (1) Image size — the build stage needs the full JDK (400MB+), but the runtime stage only needs the JRE (~190MB). Without multi-stage, you ship the JDK to production unnecessarily. (2) Security surface — the build stage contains Maven, the full JDK, and build tools. None of these should be in a production image as they expand the attack surface. Multi-stage copies only the compiled artifacts from builder to runtime, producing a minimal, hardened image. The Spring Boot layered JAR also lets you optimize cache — dependencies (rarely changing) are in an early layer, application code (frequently changing) is in the last layer. Only changed layers are pushed to the registry.
Q: How does -XX:MaxRAMPercentage differ from -Xmx, and which should you use in containers?
-Xmx sets a fixed heap size (e.g., -Xmx512m). In a container, this is fragile: if you resize the container, you must also update the JVM flag. -XX:MaxRAMPercentage sets the heap as a percentage of the container's available memory. With -XX:MaxRAMPercentage=75.0, a 1GB container gets a 750MB heap; a 2GB container gets 1.5GB — automatically, without changing the flag. In Kubernetes where pod resource limits vary by environment, use MaxRAMPercentage. Leave 25% for metaspace, code cache, direct buffers, and OS overhead. Never set MaxRAMPercentage above 80% — pods get OOM-killed when total JVM memory (heap + non-heap) exceeds the container limit, not just when the heap fills.
Q: What happens when a Kubernetes pod's liveness probe fails?
When a liveness probe fails failureThreshold consecutive times, kubelet restarts the container (not the pod). The restart count increments. After several rapid restarts, K8s enters CrashLoopBackOff — exponentially increasing the delay between restarts (10s, 20s, 40s, up to 5 minutes). This is a safety mechanism to avoid thrashing. The readiness probe is different: failing readiness removes the pod from the Service's endpoint list (no traffic), but does NOT restart it. Use liveness for "is the app dead and needs a restart?" (JVM deadlock, OOM). Use readiness for "is the app temporarily unable to serve traffic?" (warming up, upstream dependency down). The startup probe prevents liveness from killing a slow-starting pod — it gives the app time to start before liveness takes over.
Q: What are Kubernetes resource requests vs limits and why does the difference matter?
Requests are what K8s uses for scheduling — it finds a node with at least that much available and places the pod there. Limits are the runtime cap. A pod exceeding memory limit is OOM-killed. A pod exceeding CPU limit is throttled (not killed). A pod can burst above its request up to its limit while the node has spare capacity. Setting requests = limits (Guaranteed QoS) gives predictable performance but wastes resources. Setting requests lower than limits (Burstable QoS) allows bursting but can cause noisy-neighbor problems. Most production apps use Burstable: set requests at typical usage (say 256MB memory), limits at maximum tolerable (say 1GB). Without resource limits, one misbehaving pod can consume all node resources and starve others.
Q: How do rolling deployments in Kubernetes achieve zero downtime?
Kubernetes rolling update works as follows: (1) Start a new pod with the updated image; (2) Wait for its readiness probe to pass; (3) Remove one old pod from the Service endpoints (so it stops receiving traffic); (4) Terminate the old pod; (5) Repeat. With maxUnavailable=0 and maxSurge=1, you always have at least the desired number of pods serving traffic, plus one new one warming up. Zero downtime requires: (a) your readiness probe actually checks real readiness (DB connectivity, cache warmup); (b) your application handles SIGTERM gracefully (finishes in-flight requests before exiting); (c) the new and old API versions are compatible during the transition window; (d) the preStop hook gives time for endpoints to drain before shutdown begins.
Q: What is the difference between CI and CD?
Continuous Integration (CI): automatically build and test every code change. Every PR triggers: compile, unit tests, integration tests, code coverage, static analysis. Goal: detect broken code within minutes of committing. Continuous Delivery (CD): automatically prepare a production-ready artifact for every merge to main. Every merge to main triggers: build Docker image, push to registry, deploy to staging, run smoke tests. The artifact is always deployable. Continuous Deployment: automatically deploy to production after passing all quality gates — no human approval. Continuous Delivery requires a human approval step for production. Most mature teams practice Continuous Delivery: staging deploys are fully automatic, production deploys are one-click from a pre-validated artifact.
Q: How do you handle database migrations safely in a rolling deployment?
During a rolling deployment, old and new code run simultaneously. This means DB schema must be compatible with both versions at the same time. Rules: (1) Additive changes only in a single deployment — add columns, add tables, add indexes. Never drop or rename. (2) Backward-compatible — the old code must work with the new schema. A new NOT NULL column needs a default so old code's inserts don't fail. (3) For breaking changes, use the expand-contract pattern: deploy 1 adds the new column and writes to both old+new; deploy 2 backfills and migrates reads to new column; deploy 3 removes old column. (4) Run migrations before deploying new pods — K8s Init Containers work well here. (5) Never use ddl-auto=update in production — it's unpredictable. Use Flyway or Liquibase with version-controlled migration scripts.
Q: How would you implement a canary deployment for a Spring Boot service?
Canary deployment routes a percentage of production traffic to a new version. Implementation options: (1) Kubernetes with weighted services — deploy canary as a separate Deployment with fewer replicas; traffic split proportionally by pod count (1 canary pod + 9 stable = 10% canary traffic). Simple but coarse-grained. (2) Ingress weight annotation — nginx-ingress supports canary: annotations to route a percentage to a canary service independently of replica count. (3) Service mesh (Istio/Linkerd) — fine-grained traffic splitting (5% by weight), header-based routing (internal users get new version), circuit breaking per version, per-version metrics. Process: deploy canary → monitor error rate and latency for new version → if metrics are good, increment traffic → eventually roll old version down. Key metric: compare p99 latency and error rate between canary and stable. Rollback = delete canary deployment.

Section 11 Complete

You can now containerize Spring Boot applications properly, orchestrate them on Kubernetes with health probes and resource limits, build automated CI/CD pipelines, configure Nginx as a reverse proxy, manage configuration across environments, and avoid the most dangerous production deployment mistakes.