Advanced Spring Boot
Architecture

CRUD apps are easy. Production systems are not. This section teaches the architectural patterns used in real-world, high-scale backend systems — from the evolution of a monolith to microservices, from layered architecture to hexagonal, and from synchronous calls to event-driven design.

Layered Architecture — Where Everyone Starts

Most Spring Boot applications start with a layered (n-tier) architecture: Controller → Service → Repository. This is the default, and for small to medium applications it works well. Understanding its limits is how you know when to evolve.

Classic Layered Architecture
Presentation Layer — @RestController
↕ DTOs only
Business Layer — @Service
↕ Domain objects
Persistence Layer — @Repository
↕ JPA Entities
Database — PostgreSQL / MongoDB

The Rules of Layered Architecture

📦
Downward dependency

Each layer only depends on the layer directly below it. Controllers know about Services; Services know about Repositories. Never the reverse.

🔄
DTO boundaries

Never expose JPA entities from controllers. Map to DTOs at the service boundary. Entities are persistence details — they should not leak into the API contract.

🧠
Business logic in Service

Controllers only route and validate. Repositories only query. All business logic lives in the Service. Fat controllers and smart repositories are anti-patterns.

⚠️
The main limitation

As complexity grows, Services accumulate hundreds of methods. Business logic is scattered. Testing requires wiring the full stack. The architecture doesn't enforce boundaries.

When layered architecture breaks down: You'll know it's time to evolve when your OrderService has 50 methods, your UserService imports OrderService which imports UserService (circular dependencies), your tests spin up half the application context, and no one can explain what a "Service" is actually responsible for.
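To make the boundaries concrete, here is a minimal, self-contained sketch of the three layers in plain Java. In-memory stand-ins replace the Spring annotations, and all class and method names are illustrative, not from any real codebase:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Persistence layer — knows only about storage (in-memory stand-in for @Repository)
class UserRepository {
    private final Map<Long, String> emailsById = new HashMap<>();
    Optional<String> findEmailById(long id) { return Optional.ofNullable(emailsById.get(id)); }
    void save(long id, String email) { emailsById.put(id, email); }
}

// DTO — the only thing the presentation layer ever sees
record UserDto(long id, String maskedEmail) {}

// Business layer — owns the logic (the masking rule) and maps entity data to a DTO
class UserService {
    private final UserRepository repo;
    UserService(UserRepository repo) { this.repo = repo; }

    UserDto getUser(long id) {
        String email = repo.findEmailById(id)
            .orElseThrow(() -> new IllegalArgumentException("No user " + id));
        // Business rule lives here, not in the controller or repository
        String masked = email.replaceAll("(?<=.).(?=[^@]*@)", "*");
        return new UserDto(id, masked);
    }
}

// Presentation layer — routes and delegates only (stand-in for @RestController)
class UserController {
    private final UserService service;
    UserController(UserService service) { this.service = service; }
    UserDto get(long id) { return service.getUser(id); }
}

public class LayeredDemo {
    public static void main(String[] args) {
        UserRepository repo = new UserRepository();
        repo.save(1L, "alice@example.com");
        UserController controller = new UserController(new UserService(repo));
        System.out.println(controller.get(1L).maskedEmail());
    }
}
```

Note how each class depends only on the layer directly below it, and the entity's raw email never crosses the controller boundary.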

Clean Architecture

Clean Architecture (Robert C. Martin, 2017) solves the core problem of layered architecture: the business logic depends on infrastructure details. In Clean Architecture, the dependency rule is inverted — infrastructure details depend on the business core, never the other way around.

Clean Architecture — Dependency Rule (outermost → innermost)
Frameworks & Drivers (Spring, JPA, HTTP, DB)
→ Interface Adapters (Controllers, Repos, Gateways)
→ Use Cases (Application Services)
→ Domain Entities
Dependencies point inward — toward the domain.

Package Structure in Spring Boot

Clean Architecture project layout
src/main/java/com/company/app/
├── domain/                          ← Core business objects (no Spring deps)
│   ├── model/
│   │   ├── Order.java               (pure Java, not a JPA entity)
│   │   ├── OrderItem.java
│   │   └── Money.java               (value object)
│   ├── repository/
│   │   └── OrderRepository.java     (interface — no implementation here)
│   └── service/
│       └── OrderPricingService.java (domain service — pure business logic)
│
├── application/                     ← Use cases / orchestration
│   ├── usecase/
│   │   ├── CreateOrderUseCase.java
│   │   ├── CancelOrderUseCase.java
│   │   └── GetOrderUseCase.java
│   └── port/
│       ├── in/                      (input ports — what the app can do)
│       │   └── CreateOrderCommand.java
│       └── out/                     (output ports — what the app needs)
│           ├── SaveOrderPort.java
│           └── LoadOrderPort.java
│
├── adapter/                         ← Infrastructure details
│   ├── in/
│   │   └── web/
│   │       ├── OrderController.java     (@RestController)
│   │       └── OrderRequest.java        (web DTO)
│   └── out/
│       ├── persistence/
│       │   ├── OrderJpaEntity.java      (@Entity — JPA detail)
│       │   ├── OrderJpaRepository.java  (Spring Data)
│       │   └── OrderPersistenceAdapter.java (implements SaveOrderPort)
│       └── messaging/
│           └── KafkaOrderEventAdapter.java
Use case with port interfaces (Java)
// Output port — defined in application layer, implemented in adapter layer
public interface SaveOrderPort {
    Order save(Order order);
}

// Input port (use case interface)
public interface CreateOrderUseCase {
    Order createOrder(CreateOrderCommand command);
}

// Use case implementation — depends only on port interfaces, not Spring JPA
@Service
@RequiredArgsConstructor
@Transactional
public class CreateOrderService implements CreateOrderUseCase {

    private final LoadProductPort loadProduct;    // implemented by JPA adapter
    private final SaveOrderPort saveOrder;         // implemented by JPA adapter
    private final SendOrderConfirmationPort notify; // implemented by email adapter

    @Override
    public Order createOrder(CreateOrderCommand cmd) {
        // Pure business logic — no Spring, no JPA, no HTTP concerns
        List<OrderItem> items = cmd.getItems().stream()
            .map(i -> {
                Product product = loadProduct.loadById(i.getProductId())
                    .orElseThrow(() -> new ProductNotFoundException(i.getProductId()));
                return new OrderItem(product, i.getQuantity());
            })
            .toList();

        Order order = Order.create(cmd.getCustomerId(), items);
        Order saved = saveOrder.save(order);
        notify.sendConfirmation(saved);
        return saved;
    }
}

// Adapter — bridges Spring Data JPA to the domain port
@Component
@RequiredArgsConstructor
public class OrderPersistenceAdapter implements SaveOrderPort, LoadOrderPort {

    private final OrderJpaRepository jpaRepo;
    private final OrderMapper mapper;

    @Override
    public Order save(Order order) {
        OrderJpaEntity entity = mapper.toEntity(order);
        return mapper.toDomain(jpaRepo.save(entity));
    }

    @Override
    public Optional<Order> loadById(OrderId id) {
        // LoadOrderPort's contract, assuming OrderId wraps the String primary key
        return jpaRepo.findById(id.value()).map(mapper::toDomain);
    }
}
When to use Clean Architecture: Complex domains with rich business rules. Long-lived systems that need to change infrastructure (swap databases, add new channels). Teams that want testable business logic without Spring context. For simple CRUD APIs, the overhead is not worth it — layered architecture is fine.

Hexagonal Architecture (Ports & Adapters)

Hexagonal Architecture (Alistair Cockburn, 2005) is the pattern that Clean Architecture builds upon. The core idea: your application has a hexagonal core with ports (interfaces) on all sides. Adapters plug into these ports. The core doesn't know or care what's on the other end — HTTP, Kafka, CLI, or a test.

Hexagonal Architecture
Driving Adapters (left side): REST Controller, Kafka Consumer, CLI / Test
  → Input Ports →
Application Core: Use Cases, Domain Model, Domain Services
  → Output Ports →
Driven Adapters (right side): JPA Adapter, Kafka Producer, Email Adapter

The key insight: driving adapters (left) call the application core via input ports. Driven adapters (right) are called by the application core via output ports. The core never imports Spring, JPA, or any infrastructure library directly — this is what makes it testable with plain Java unit tests.
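That testability claim is easy to demonstrate: the sketch below wires a use case to a stub output port in plain Java — no Spring context, no database, no mocking library. The interfaces here are simplified stand-ins for illustration, not the earlier listings:

```java
import java.util.ArrayList;
import java.util.List;

// Output port — what the core needs, defined by the core (simplified stand-in)
interface PersistOrderPort { String persist(String customerId, List<String> items); }

// Application core — pure Java, no Spring, no JPA
class PlaceOrderUseCase {
    private final PersistOrderPort persistOrder;
    PlaceOrderUseCase(PersistOrderPort persistOrder) { this.persistOrder = persistOrder; }

    String placeOrder(String customerId, List<String> items) {
        if (items.isEmpty()) throw new IllegalArgumentException("Order must have items");
        return persistOrder.persist(customerId, items);
    }
}

// Test double — plugs into the port exactly like the JPA adapter would
class InMemoryPersistOrderPort implements PersistOrderPort {
    final List<String> saved = new ArrayList<>();
    public String persist(String customerId, List<String> items) {
        saved.add(customerId);
        return "order-" + saved.size();
    }
}

public class HexagonalTestDemo {
    public static void main(String[] args) {
        InMemoryPersistOrderPort port = new InMemoryPersistOrderPort();
        PlaceOrderUseCase useCase = new PlaceOrderUseCase(port);
        String id = useCase.placeOrder("cust-1", List.of("book"));
        System.out.println(id + " saved=" + port.saved);
    }
}
```

The core exercised here has zero framework imports — swap the stub for a JPA adapter in production and the use case is unchanged.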

Domain-Driven Design Fundamentals

DDD is a philosophy for tackling complex business domains. It gives you a vocabulary and set of patterns for modelling software that reflects real business reality, rather than technical convenience.

Core DDD Building Blocks

🏗
Entity

An object with a distinct identity that persists over time. Order(id=123) is the same order whether its status is PENDING or SHIPPED. Identity, not attributes, defines it.

💎
Value Object

Defined entirely by its attributes. No identity. Immutable. Money(100, "USD") is equal to any other Money(100, "USD"). Examples: Address, Email, Coordinates, Money.

🌳
Aggregate

A cluster of entities and value objects with one root. You only access the cluster through the root. Order is the root; OrderItems are inside. No one holds a direct reference to OrderItem.

📣
Domain Event

Something that happened in the domain that other parts might care about. OrderPlaced, PaymentReceived, UserRegistered. Immutable, past-tense, carry all relevant data.

🏭
Repository

An abstraction for persisting and retrieving aggregates. Looks like an in-memory collection. The domain knows the interface; the infrastructure provides the implementation.

🗺
Bounded Context

A linguistic boundary where a model is consistent. "Product" in the Catalog context has different attributes than "Product" in the Inventory context. Each context has its own model.

Aggregate root with domain events (Java)
// Value Object — immutable, no identity
public record Money(BigDecimal amount, String currency) {
    public Money {
        if (amount.compareTo(BigDecimal.ZERO) < 0)
            throw new IllegalArgumentException("Amount cannot be negative");
    }
    public Money add(Money other) {
        if (!this.currency.equals(other.currency))
            throw new IllegalArgumentException("Currency mismatch");
        return new Money(this.amount.add(other.amount), this.currency);
    }
    public Money multiply(int factor) {
        return new Money(this.amount.multiply(BigDecimal.valueOf(factor)), this.currency);
    }
}

// Aggregate Root — controls access to its children
public class Order {
    private final OrderId id;
    private final CustomerId customerId;
    private OrderStatus status;
    private final List<OrderItem> items = new ArrayList<>();
    private final List<DomainEvent> domainEvents = new ArrayList<>();

    // Factory method — enforces invariants
    public static Order create(CustomerId customerId, List<OrderItem> items) {
        if (items.isEmpty()) throw new IllegalArgumentException("Order must have items");
        Order order = new Order(OrderId.generate(), customerId, OrderStatus.PENDING);
        items.forEach(order.items::add);
        // Register domain event — published AFTER the transaction commits
        order.domainEvents.add(new OrderCreatedEvent(order.id, customerId, items));
        return order;
    }

    public void confirm() {
        if (this.status != OrderStatus.PENDING)
            throw new IllegalStateException("Only PENDING orders can be confirmed");
        this.status = OrderStatus.CONFIRMED;
        domainEvents.add(new OrderConfirmedEvent(this.id));
    }

    public void cancel(String reason) {
        if (this.status == OrderStatus.SHIPPED)
            throw new IllegalStateException("Cannot cancel a shipped order");
        this.status = OrderStatus.CANCELLED;
        domainEvents.add(new OrderCancelledEvent(this.id, reason));
    }

    public Money totalAmount() {
        return items.stream()
            .map(item -> item.unitPrice().multiply(item.quantity()))
            .reduce(new Money(BigDecimal.ZERO, "USD"), Money::add);
    }

    // Drained by the application service after save and published manually
    // (or extend Spring Data's AbstractAggregateRoot to automate publication)
    public List<DomainEvent> popDomainEvents() {
        List<DomainEvent> events = List.copyOf(domainEvents);
        domainEvents.clear();
        return events;
    }
}
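The popDomainEvents() contract — drain once, publish once — can be exercised in plain Java. This sketch uses simplified stand-in types (a DomainEvent record, a Consumer as the publisher) rather than the full aggregate above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Simplified stand-ins for the aggregate's event machinery
record DomainEvent(String type, String aggregateId) {}

class Aggregate {
    private final String id;
    private final List<DomainEvent> domainEvents = new ArrayList<>();

    Aggregate(String id) {
        this.id = id;
        domainEvents.add(new DomainEvent("Created", id));
    }

    void confirm() { domainEvents.add(new DomainEvent("Confirmed", id)); }

    // Drain-once semantics: returns a snapshot and clears the buffer
    List<DomainEvent> popDomainEvents() {
        List<DomainEvent> events = List.copyOf(domainEvents);
        domainEvents.clear();
        return events;
    }
}

public class DomainEventDemo {
    public static void main(String[] args) {
        List<String> published = new ArrayList<>();
        Consumer<DomainEvent> publisher = e -> published.add(e.type());  // stand-in for ApplicationEventPublisher

        Aggregate order = new Aggregate("o-1");
        order.confirm();
        // repository.save(order) would happen here; then drain exactly once
        order.popDomainEvents().forEach(publisher);
        order.popDomainEvents().forEach(publisher);  // second pop is empty — no duplicates

        System.out.println(published);
    }
}
```

The second pop publishing nothing is the point: events raised during a use case are delivered exactly once per save, even if the draining code runs again.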

CQRS — Command Query Responsibility Segregation

CQRS separates read operations (Queries) from write operations (Commands). The insight: the model that's optimal for writing data is rarely optimal for reading it. Queries need flat, denormalized projections. Commands need to enforce invariants on aggregates. Trying to do both with one model creates compromise that serves neither well.

CQRS Write vs Read Side
✍ Write Side (Commands)
CreateOrderCommand
↓ CommandHandler
↓ Load Aggregate
↓ Enforce invariants
↓ Save Aggregate
↓ Publish DomainEvent
Normalized DB, full aggregate, transactional
📖 Read Side (Queries)
GetOrderSummaryQuery
↓ QueryHandler
↓ Optimized read model
↓ Return flat DTO
Denormalized views, projections, cached, no transactions
CQRS with separate read/write models (Java)
// ── WRITE SIDE ──
// Command — immutable, describes intent
public record CreateOrderCommand(
    CustomerId customerId,
    List<CreateOrderCommand.Item> items
) {
    public record Item(ProductId productId, int quantity) {}
}

@Service
@RequiredArgsConstructor
public class CreateOrderCommandHandler {
    private final OrderRepository orderRepo;

    @Transactional
    public OrderId handle(CreateOrderCommand cmd) {
        // Load, validate, persist aggregate
        Order order = Order.create(cmd.customerId(), buildItems(cmd));
        orderRepo.save(order);
        return order.getId();
    }
}

// ── READ SIDE ──
// Flat projection — optimized for display
public record OrderSummaryDto(
    String orderId,
    String customerEmail,
    String status,
    BigDecimal totalAmount,
    int itemCount,
    LocalDateTime createdAt
) {}

// Read model — can use native SQL, views, or a separate read DB.
// A native query can't populate a record directly, so Spring Data maps the
// aliased columns onto an interface projection; a web-layer mapper can then
// convert it to OrderSummaryDto.
public interface OrderSummaryProjection {
    String getOrderId();
    String getCustomerEmail();
    String getStatus();
    BigDecimal getTotalAmount();
    int getItemCount();
    LocalDateTime getCreatedAt();
}

@Repository
public interface OrderSummaryRepository extends JpaRepository<OrderJpaEntity, String> {
    @Query(value = """
        SELECT o.id AS orderId, u.email AS customerEmail, o.status AS status,
               SUM(oi.unit_price * oi.quantity) AS totalAmount,
               COUNT(oi.id) AS itemCount,
               o.created_at AS createdAt
        FROM orders o
        JOIN users u ON o.customer_id = u.id
        JOIN order_items oi ON o.id = oi.order_id
        WHERE o.customer_id = :customerId
        GROUP BY o.id, u.email, o.status, o.created_at
        ORDER BY o.created_at DESC
        """, nativeQuery = true)
    List<OrderSummaryProjection> findSummariesByCustomer(String customerId);
}
CQRS doesn't require separate databases. The simplest form is just separate read and write service methods that use different query strategies. The more advanced form uses a separate read database (replicated or event-sourced). Start simple — split read/write services and interfaces. Add separate read models only when you have real query performance problems.
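The "start simple" form — separate read and write services over the same store — can be sketched in a few lines of plain Java. The names and the in-memory store are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Write side — enforces invariants, stores normalized data
class OrderCommandService {
    final Map<String, List<String>> itemsByOrder = new HashMap<>();

    void createOrder(String orderId, List<String> items) {
        if (items.isEmpty()) throw new IllegalArgumentException("Order must have items");
        itemsByOrder.put(orderId, List.copyOf(items));
    }
}

// Read side — returns flat DTOs, no domain object in sight
record OrderSummary(String orderId, int itemCount) {}

class OrderQueryService {
    private final OrderCommandService writeSide;  // same store in the simple form
    OrderQueryService(OrderCommandService writeSide) { this.writeSide = writeSide; }

    List<OrderSummary> summaries() {
        List<OrderSummary> out = new ArrayList<>();
        writeSide.itemsByOrder.forEach((id, items) -> out.add(new OrderSummary(id, items.size())));
        return out;
    }
}

public class CqrsLiteDemo {
    public static void main(String[] args) {
        OrderCommandService commands = new OrderCommandService();
        OrderQueryService queries = new OrderQueryService(commands);
        commands.createOrder("o-1", List.of("book", "pen"));
        System.out.println(queries.summaries());
    }
}
```

The split is purely at the service interface level here — yet it already keeps query shapes out of the write model, and gives you a seam where a separate read database could be introduced later.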

Event-Driven Architecture

In a request-driven system, services call each other directly. In an event-driven system, services publish events that other services react to. The publisher doesn't know or care who's listening. This creates loose coupling at the cost of eventual consistency.

Spring Events — In-Process

Spring Application Events (Java)
// Domain event
public record OrderPlacedEvent(String orderId, String customerId,
                                BigDecimal total, Instant occurredAt) {}

// Publisher — in the command handler or aggregate
@Service
@RequiredArgsConstructor
public class OrderService {
    private final ApplicationEventPublisher eventPublisher;

    @Transactional
    public Order placeOrder(PlaceOrderRequest req) {
        Order order = processOrder(req);
        // Published immediately; handling is deferred until after commit
        // when listeners use @TransactionalEventListener
        eventPublisher.publishEvent(
            new OrderPlacedEvent(order.getId(), order.getCustomerId(),
                                 order.getTotal(), Instant.now()));
        return order;
    }
}

// Listener — @TransactionalEventListener fires AFTER the publishing transaction commits
@Component
public class OrderEventHandlers {

    @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT)
    @Async  // don't slow down the order transaction (requires @EnableAsync on a config class)
    public void onOrderPlaced(OrderPlacedEvent event) {
        // Now it's safe to send email — order is definitely in the DB
        emailService.sendOrderConfirmation(event.customerId(), event.orderId());
    }

    @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT)
    @Async
    public void updateAnalytics(OrderPlacedEvent event) {
        analyticsService.recordSale(event.total());
    }
}
Use @TransactionalEventListener, NOT @EventListener for domain events. Regular @EventListener fires during the transaction — if the email service fails, it rolls back your order. @TransactionalEventListener(AFTER_COMMIT) fires only after your transaction succeeds. If the email fails, the order is safe. This is the correct pattern.

Transactional Outbox Pattern

In-process events don't survive application crashes between commit and publish. For guaranteed cross-service delivery, use the Transactional Outbox: write events to a database table in the same transaction as your domain write, then a separate process reads and publishes them to Kafka/RabbitMQ.

Outbox pattern implementation (Java)
// Outbox event entity
@Entity
@Table(name = "outbox_events")
public class OutboxEvent {
    @Id private String id = UUID.randomUUID().toString();
    private String aggregateType;   // "ORDER"
    private String aggregateId;     // orderId
    private String eventType;       // "OrderPlacedEvent"
    private String payload;         // JSON
    private boolean published = false;
    private Instant createdAt = Instant.now();
}

// Write to outbox IN THE SAME TRANSACTION as the domain write
@Transactional
public Order placeOrder(PlaceOrderRequest req) {
    Order order = processOrder(req);
    orderRepo.save(order);

    // Atomically write the outbox event — same transaction, same DB
    outboxRepo.save(new OutboxEvent("ORDER", order.getId(),
        "OrderPlacedEvent", toJson(order)));

    return order;  // transaction commits both order AND outbox entry
}

// Separate polling job — reads unprocessed outbox events and publishes to Kafka
@Scheduled(fixedDelay = 1000)
@Transactional
public void processOutbox() {
    List<OutboxEvent> pending = outboxRepo.findByPublishedFalse();
    for (OutboxEvent event : pending) {
        kafkaTemplate.send(event.getEventType(), event.getPayload());
        event.setPublished(true);
    }
    // Even if Kafka publish fails, outbox row is not marked published
    // — will be retried on next poll
}
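Because the poller gives at-least-once delivery, consumers must be idempotent — the same outbox event can arrive twice after a retry. A minimal sketch of the dedup idea in plain Java (a real implementation would persist processed ids in a table with a unique constraint rather than an in-memory set):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Idempotent consumer — records processed event ids and skips redeliveries
class IdempotentConsumer {
    private final Set<String> processedIds = new HashSet<>();  // in production: a DB table + unique constraint
    final List<String> sideEffects = new ArrayList<>();

    void onEvent(String eventId, String payload) {
        if (!processedIds.add(eventId)) return;  // duplicate — already handled, do nothing
        sideEffects.add("shipped:" + payload);   // the real side effect runs at most once per event id
    }
}

public class IdempotencyDemo {
    public static void main(String[] args) {
        IdempotentConsumer consumer = new IdempotentConsumer();
        consumer.onEvent("evt-1", "order-42");
        consumer.onEvent("evt-1", "order-42");  // redelivery after a publisher retry
        System.out.println(consumer.sideEffects);
    }
}
```

At-least-once delivery plus idempotent consumers is the standard pairing: the producer never loses an event, and the consumer never acts on one twice.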

Monolith → Microservices Evolution

The most dangerous architecture mistake is starting with microservices. The second most dangerous is staying with a monolith too long. Understanding the right moment and the right path to decompose is a senior engineering skill.

The Strangler Fig Pattern

Strangler Fig — Incremental Migration
Phase 1: Put a proxy/gateway in front of the monolith
Phase 2: Extract a bounded context to a new service → route its traffic through the proxy
Phase 3: Gradually migrate more contexts → old monolith "strangles"
Phase 4: When monolith is empty → decommission
Key principle: monolith and new services run in parallel. No big-bang rewrite.
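Phases 1–2 amount to routing rules at the proxy. A hypothetical Spring Cloud Gateway sketch of that state — the service and host names are illustrative, and the catch-all monolith route is listed last so the extracted route matches first:

```yaml
spring:
  cloud:
    gateway:
      routes:
        - id: orders-extracted          # Phase 2: the extracted bounded context
          uri: lb://order-service
          predicates:
            - Path=/api/orders/**
        - id: monolith-catch-all        # everything else still hits the monolith
          uri: http://legacy-monolith:8080
          predicates:
            - Path=/**
```

Each subsequent phase is just another specific route added above the catch-all; decommissioning the monolith means deleting the last route.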

Decomposition Signals

Signal Team size exceeds the 2-pizza rule

When more than 8–10 engineers work on the same codebase, coordination overhead dominates. Different teams need to own different services with independent deployment. Microservices enable team autonomy first, technical scaling second.

Signal Scaling requirements diverge

Your catalog needs 10 replicas during browsing peaks; your checkout needs 50 replicas during sales; your admin needs 1 replica always. Scaling the whole monolith for one component is wasteful. Extract the high-scale component first.

Signal Deployment coupling causes outages

A bug in the notification module takes down the checkout flow because they deploy together. Each bounded context should be deployable without affecting others.

Anti-Signal Starting a new project

Starting with microservices means distributed transactions, service discovery, inter-service auth, and network debugging before you've validated the product. Build a modular monolith first. Extract services when you have real evidence of need. Amazon, Netflix, and Uber all started as monoliths.

API Gateway & Service Discovery

When you have multiple services, clients shouldn't talk to each one directly. The API Gateway is the single entry point — it handles routing, authentication, rate limiting, and protocol translation.

Spring Cloud Gateway configuration (YAML)
# pom.xml dependency: spring-cloud-starter-gateway

# application.yml — declarative routing
spring:
  cloud:
    gateway:
      routes:
        - id: order-service
          uri: lb://order-service    # lb:// = load-balanced via Eureka
          predicates:
            - Path=/api/orders/**
          filters:
            - RewritePath=/api/orders/(?<path>.*), /$\{path}
            - name: CircuitBreaker
              args:
                name: orderService
                fallbackUri: forward:/fallback/orders
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 100
                redis-rate-limiter.burstCapacity: 200

        - id: product-service
          uri: lb://product-service
          predicates:
            - Path=/api/products/**

      default-filters:
        - AddResponseHeader=X-Gateway-Version, 1.0
        - name: Retry
          args:
            retries: 3
            statuses: BAD_GATEWAY, SERVICE_UNAVAILABLE

Service Discovery with Eureka

Eureka Server setup (Java + YAML)
// pom.xml: spring-cloud-starter-netflix-eureka-server

@SpringBootApplication
@EnableEurekaServer
public class ServiceRegistryApplication {
    public static void main(String[] args) {
        SpringApplication.run(ServiceRegistryApplication.class, args);
    }
}

# application.yml
server:
  port: 8761
eureka:
  client:
    register-with-eureka: false  # don't register with itself
    fetch-registry: false
Service auto-registration (YAML)
# pom.xml: spring-cloud-starter-netflix-eureka-client

# application.yml
spring:
  application:
    name: order-service   # this is the service name in Eureka

eureka:
  client:
    serviceUrl:
      defaultZone: http://localhost:8761/eureka/
  instance:
    prefer-ip-address: true
    instance-id: ${spring.application.name}:${random.value}

# That's it — @SpringBootApplication auto-discovers the client starter.
# The service registers itself on startup and sends heartbeats every 30s.
Load-balanced RestClient (Java)
// Spring Cloud LoadBalancer automatically resolves service name to actual IP
@Configuration
public class ServiceClientConfig {

    @Bean
    @LoadBalanced  // enables service-name resolution
    public RestClient.Builder restClientBuilder() {
        return RestClient.builder();
    }

    @Bean
    public RestClient productServiceClient(RestClient.Builder builder) {
        return builder
            .baseUrl("http://product-service")  // Eureka resolves this
            .build();
    }
}

// Usage in service — no hardcoded URLs
@Service
@RequiredArgsConstructor
public class OrderEnrichmentService {
    private final RestClient productServiceClient;

    public ProductDto getProduct(String productId) {
        return productServiceClient.get()
            .uri("/api/products/{id}", productId)
            .retrieve()
            .body(ProductDto.class);
    }
}

Resilience — Circuit Breakers & Retry

In a distributed system, services fail. A slow downstream service can exhaust your thread pool and take down your entire application through cascading failures. The circuit breaker pattern prevents this.

Circuit Breaker State Machine
CLOSED — requests flow normally; failure rate tracked
  failure rate > 50% → OPEN
OPEN — all requests fail fast; fallback response returned
  after wait duration → HALF-OPEN
HALF-OPEN — probe requests let through; recovery measured
  probes succeed → CLOSED; probes fail → back to OPEN
Resilience4j — Circuit Breaker + Retry + Timeout (YAML + Java)
# pom.xml: spring-boot-starter-aop + resilience4j-spring-boot3

# application.yml
resilience4j:
  circuitbreaker:
    instances:
      inventoryService:
        sliding-window-size: 10
        failure-rate-threshold: 50        # open after 50% failures
        wait-duration-in-open-state: 10s
        permitted-number-of-calls-in-half-open-state: 3
        slow-call-duration-threshold: 2s  # calls > 2s count as failures
        slow-call-rate-threshold: 80

  retry:
    instances:
      inventoryService:
        max-attempts: 3
        wait-duration: 500ms
        retry-exceptions:
          - java.net.ConnectException
          - java.util.concurrent.TimeoutException

  timelimiter:
    instances:
      inventoryService:
        timeout-duration: 3s

---
@Service
public class InventoryService {

    @CircuitBreaker(name = "inventoryService", fallbackMethod = "getStockFallback")
    @Retry(name = "inventoryService")
    @TimeLimiter(name = "inventoryService")
    public CompletableFuture<Integer> getStockLevel(String productId) {
        return CompletableFuture.supplyAsync(() ->
            inventoryClient.getStock(productId));
    }

    // Fallback — called when circuit is OPEN or all retries exhausted
    public CompletableFuture<Integer> getStockFallback(String productId, Exception ex) {
        log.warn("Inventory service unavailable for {}: {}", productId, ex.getMessage());
        return CompletableFuture.completedFuture(-1);  // -1 = unknown stock
    }
}

// Monitoring — Resilience4j exposes Actuator endpoints
// GET /actuator/circuitbreakers  → state of all circuit breakers
// GET /actuator/retryevents      → retry statistics

Bulkhead Pattern

Thread pool bulkhead — isolate failure domains (YAML + Java)
# Without a bulkhead: one slow service exhausts the shared thread pool.
# With a bulkhead: each service gets its own isolated thread pool.

# application.yml — Bulkhead.Type.THREADPOOL is configured under thread-pool-bulkhead
resilience4j:
  thread-pool-bulkhead:
    instances:
      paymentService:
        core-thread-pool-size: 10   # at most 10 parallel calls
        max-thread-pool-size: 10
        queue-capacity: 20          # fail fast once the queue is full

---
@CircuitBreaker(name = "paymentService")
@Bulkhead(name = "paymentService", type = Bulkhead.Type.THREADPOOL)
public CompletableFuture<PaymentResult> processPayment(PaymentRequest req) {
    return CompletableFuture.supplyAsync(() -> paymentGateway.charge(req));
    // Runs in the paymentService thread pool — can't starve other services
}

Distributed Tracing

In a microservices system, a single user request may pass through 5–10 services. When it's slow or failing, which service is the problem? Distributed tracing answers this by propagating a correlation ID (trace ID) through every service call and collecting timing data at each hop.

Micrometer Tracing + Zipkin (YAML + Java)
# pom.xml dependencies:
#   spring-boot-starter-actuator
#   micrometer-tracing-bridge-otel (OpenTelemetry bridge)
#   opentelemetry-exporter-zipkin

# application.yml
management:
  tracing:
    sampling:
      probability: 1.0  # trace 100% of requests (use 0.1 in production)
  zipkin:
    tracing:
      endpoint: http://zipkin:9411/api/v2/spans

// Spring Boot 3 auto-configures tracing — no code changes needed.
// Every incoming HTTP request gets a traceId + spanId added to MDC.
// RestClient / WebClient automatically propagate the trace headers.

// Manual span creation for important operations
@Service
@RequiredArgsConstructor
public class InventoryCheckService {
    private final Tracer tracer;

    public boolean checkAvailability(String productId) {
        Span span = tracer.nextSpan().name("inventory.check").start();
        try (Tracer.SpanInScope ws = tracer.withSpan(span)) {
            span.tag("product.id", productId);
            boolean available = inventoryRepo.isAvailable(productId);
            span.tag("available", String.valueOf(available));
            return available;
        } catch (Exception e) {
            span.error(e);
            throw e;
        } finally {
            span.end();
        }
    }
}
Correlation ID in logs: Micrometer Tracing automatically puts traceId and spanId into MDC (Mapped Diagnostic Context), so your logs include them automatically. In Logback/Log4j2 pattern: %X{traceId}. This means you can search all logs from a single user request across all services using the trace ID. This is non-negotiable in microservices production operations.
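A hedged example of such a log pattern, assuming Logback with Micrometer Tracing on the classpath (the :- default in %X keeps lines clean when no trace is active):

```xml
<!-- logback-spring.xml — include trace context in every log line -->
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level [%X{traceId:-},%X{spanId:-}] %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>
```

With this in place, grepping logs for one traceId reconstructs a single request's path across every service.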

Saga Pattern — Distributed Transactions

In a monolith, a database transaction is atomic. In microservices, a business operation may span multiple services and databases — there is no distributed transaction that spans them all (two-phase commit is available but slow and fragile). The Saga pattern manages consistency through a sequence of local transactions with compensating rollbacks.

Saga — Order Placement Flow
1. Create Order
Order Service
2. Reserve Stock
Inventory Service
3. Charge Payment
Payment Service
4. Confirm Order
Order Service
If Payment fails → Compensating transactions:
Cancel Order
Order Service
Release Stock
Inventory Service
Payment already failed — no action
Choreography vs Orchestration Sagas: In choreography, each service listens for events and reacts autonomously — no central coordinator. Easy to start but hard to understand the full flow. In orchestration, a central Saga Orchestrator sends commands and handles the compensating flow — easier to understand, debug, and monitor. Use Axon Framework, Temporal, or Conductor for production saga orchestration.
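The compensating flow of an orchestrated saga can be sketched generically in plain Java — run each step's local transaction, and on failure undo the completed steps in reverse order. All names here are illustrative; a production system would use one of the frameworks above:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// One saga step: a local transaction plus its compensating action
record SagaStep(String name, Runnable action, Runnable compensation) {}

class SagaOrchestrator {
    final List<String> log = new ArrayList<>();

    boolean execute(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action().run();
                log.add("done:" + step.name());
                completed.push(step);
            } catch (RuntimeException e) {
                log.add("failed:" + step.name());
                // Undo completed steps in reverse order (LIFO)
                while (!completed.isEmpty()) {
                    SagaStep done = completed.pop();
                    done.compensation().run();
                    log.add("compensated:" + done.name());
                }
                return false;
            }
        }
        return true;
    }
}

public class SagaDemo {
    public static void main(String[] args) {
        SagaOrchestrator saga = new SagaOrchestrator();
        saga.execute(List.of(
            new SagaStep("createOrder", () -> {}, () -> {}),
            new SagaStep("reserveStock", () -> {}, () -> {}),
            new SagaStep("chargePayment",
                () -> { throw new RuntimeException("card declined"); },
                () -> {})
        ));
        System.out.println(saga.log);
    }
}
```

Running this mirrors the diagram above: createOrder and reserveStock complete, chargePayment fails, and the orchestrator compensates reserveStock then createOrder — the "Cancel Order / Release Stock" path.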
Knowledge Check
Your e-commerce platform has OrderService and InventoryService. When an order is placed, you need to reserve inventory. The reservation involves a database write in InventoryService. If it fails, the order must be rolled back. How should you handle this in a microservices architecture?
Use a distributed two-phase commit (2PC) transaction spanning both services
Call InventoryService synchronously and wrap the entire operation in @Transactional
Implement a Saga: OrderService creates a PENDING order, publishes OrderCreatedEvent, InventoryService reserves stock and emits StockReservedEvent OR StockReservationFailedEvent, OrderService listens and confirms or cancels accordingly
Use a shared database between services so one transaction can span both
Correct. There is no distributed transaction that works reliably across microservices. 2PC has serious availability problems (both services must be up) and creates long-lived locks. Sharing a database couples services at the infrastructure level — the whole point of microservices is independent data ownership. The Saga pattern uses a sequence of local transactions with event-driven coordination. Each step is a local transaction; failures trigger compensating actions (release stock, cancel order). The trade-off is eventual consistency: there's a brief window where the order exists but stock isn't reserved yet.

Interview Questions

Q: What is the Dependency Rule in Clean Architecture and why does it matter?
The Dependency Rule states that source code dependencies must point only inward — toward higher-level policies (domain logic). Nothing in the inner layers can know anything about the outer layers. Concretely: your domain entities don't know about JPA. Your use cases don't know about Spring. Your business logic doesn't import javax.persistence. Why it matters: the domain can be tested with plain Java unit tests, without spinning up any infrastructure. The infrastructure can be replaced (swap PostgreSQL for MongoDB, REST for gRPC) without changing any business logic. Most applications break this rule by having JPA annotations in domain objects — the domain then depends on the persistence framework, not the other way around.
Q: What is CQRS and when would you use it?
CQRS separates the model for reading data (queries) from the model for writing data (commands). The motivation: the write model needs to enforce domain invariants — it loads aggregates, validates business rules, persists changes. The read model needs to return flat, denormalized views quickly — complex JOINs, cached projections, maybe a different database. Trying to serve both with the same model creates compromises. Use CQRS when: your read patterns are significantly different from your write patterns; you need to scale reads and writes independently; read query complexity is making your domain model messy. Don't use CQRS for simple CRUD — the complexity overhead isn't worth it until you have real pain points.
Q: What is the Transactional Outbox Pattern and what problem does it solve?
The problem: you need to atomically update your database AND publish a message to Kafka. If you save to DB, then crash before publishing, the message is lost. If you publish first, then fail saving to DB, you have an event with no corresponding data. The Outbox pattern: in the same database transaction as your domain write, also insert an event row into an outbox table. A separate polling process (or CDC like Debezium) reads new outbox rows and publishes them to Kafka, marking them as published. Because the domain write and the outbox write are in the same transaction, they either both succeed or both fail. The async publisher handles the Kafka delivery reliably with retries. This gives you exactly-once database semantics with at-least-once message delivery.
Q: What are the operational costs of microservices that are often underestimated?
Teams underestimate: (1) Network failures become a normal failure mode — every service call can fail, timeout, or return stale data; (2) Distributed tracing and correlation logging are required to debug anything; (3) Service contracts become API contracts — breaking changes require versioning and coordination; (4) Testing is harder — you need contract tests, integration environments with all services, and chaos engineering; (5) Data consistency is eventual — no distributed transactions, requiring sagas, outbox patterns, and idempotent consumers; (6) Deployment complexity — orchestration (Kubernetes), service discovery, health checks, rolling deployments; (7) Security — every service-to-service call needs authentication; (8) Developer experience degrades — running 10 services locally is painful. These costs are justified when the team size and deployment frequency demand them. Not before.
Q: How does the Strangler Fig pattern work for migrating a monolith?
Named after the strangler fig tree that grows around a host tree and eventually replaces it. Steps: (1) Put an API gateway or proxy in front of the monolith — all traffic passes through it; (2) Identify a bounded context with clear inputs and outputs — often a subdomain with high scale requirements or a team wanting autonomy; (3) Build the new microservice in isolation, test it; (4) Route that context's traffic through the proxy to the new service — monolith still running; (5) After stabilization, delete the corresponding code from the monolith; (6) Repeat for the next context. The monolith gradually shrinks until nothing is left. This avoids big-bang rewrites (which almost always fail), keeps production running throughout, and allows rollback at any stage.
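Step 4 is often implemented with gateway routing rules. A hedged sketch using Spring Cloud Gateway configuration (the service names, hosts, and the `/api/orders/**` path are hypothetical; the idea is that one extracted bounded context is routed to the new service while a catch-all route keeps everything else on the monolith):

```yaml
spring:
  cloud:
    gateway:
      routes:
        # The extracted bounded context (orders) now routes to the
        # new microservice; everything else still hits the monolith.
        - id: orders-service
          uri: http://orders-service:8080
          predicates:
            - Path=/api/orders/**
        - id: monolith-fallback
          uri: http://legacy-monolith:8080
          predicates:
            - Path=/**
```

Rolling back is a one-line change: point the `orders-service` route back at the monolith.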
Q: Explain how a circuit breaker prevents cascading failures.
Cascading failure scenario: Service A calls Service B. Service B is slow (1s response instead of 100ms). Service A's thread pool fills up waiting. Service A becomes slow. Service C (calling A) also fills up. The entire system degrades from one slow dependency. Circuit breaker solution: A circuit breaker wraps calls to Service B and tracks the failure/slow call rate. When failures exceed the threshold (e.g., 50%), the circuit OPENS — subsequent calls fail immediately with a fallback response (cached data, empty list, error message). Service A's threads are freed up. After a wait period, the circuit goes HALF-OPEN — a few probe requests go through. If they succeed, the circuit CLOSES and normal operation resumes. The key insight: failing fast (immediately returning a fallback) is better than timing out slowly. It keeps your own service healthy even when dependencies fail.
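The state machine is small enough to sketch in plain Java. This is an illustrative toy, not production code — in a Spring Boot app you would use Resilience4j's `CircuitBreaker` rather than hand-rolling one — but it makes the CLOSED → OPEN → HALF-OPEN transitions concrete:

```java
import java.util.function.Supplier;

// Minimal circuit breaker: CLOSED -> OPEN after repeated failures,
// OPEN -> HALF_OPEN after a wait period, HALF_OPEN -> CLOSED on a
// successful probe (or back to OPEN if the probe fails).
class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;
    private final int failureThreshold;
    private final long waitMillis;

    CircuitBreaker(int failureThreshold, long waitMillis) {
        this.failureThreshold = failureThreshold;
        this.waitMillis = waitMillis;
    }

    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt >= waitMillis) {
                state = State.HALF_OPEN;   // allow a probe request through
            } else {
                return fallback.get();     // fail fast: no thread blocks waiting
            }
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;
            state = State.CLOSED;          // probe succeeded (or normal operation)
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }

    State state() { return state; }
}
```

Real implementations track a sliding window of failure *rate* (and slow-call rate) rather than a simple consecutive-failure count, but the fail-fast behavior while OPEN is the part that protects the caller's thread pool.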
Q: What is the Bulkhead pattern and how does it complement circuit breakers?
Named after the bulkheads in ships that compartmentalize the hull — a leak in one compartment doesn't sink the ship. In software: a bulkhead isolates failure domains by giving each downstream dependency its own resource pool (thread pool or semaphore). Without bulkhead: if Payment Service is slow, it can consume all 200 threads in your shared pool — Inventory Service, Product Service, and everything else start failing too, even if they're perfectly healthy. With bulkhead: Payment Service gets 20 threads, Inventory gets 20 threads, Product gets 20 threads. Payment being slow only affects payment calls — the rest of the system continues normally. Circuit breakers and bulkheads are complementary: circuit breakers stop calling failing services, bulkheads ensure one service's failure doesn't consume resources needed by other services.
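A semaphore bulkhead fits in a few lines of plain Java (again an illustrative sketch — Resilience4j ships a `Bulkhead` abstraction with both semaphore and thread-pool variants): each downstream dependency gets its own instance with its own permit count, so a slow dependency can exhaust only its own permits.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Semaphore bulkhead: caps concurrent calls to one downstream dependency.
// Create one instance per dependency (payment, inventory, product, ...).
class Bulkhead {
    private final Semaphore permits;

    Bulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    <T> T call(Supplier<T> downstream, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {
            return rejected.get();   // pool exhausted: reject instead of queueing
        }
        try {
            return downstream.get();
        } finally {
            permits.release();       // always free the permit, even on failure
        }
    }
}
```

Rejecting immediately when the pool is exhausted is deliberate: queueing would reintroduce exactly the unbounded waiting that lets one slow dependency back up the whole service.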
🏛️

Section 07 Complete

You now understand how to think architecturally — from layered to clean to hexagonal, from monolith to microservices, from synchronous to event-driven. More importantly, you understand the tradeoffs and costs of each approach, which is what separates an engineer from someone who blindly follows trends.