Forem: Nuno Silva

The Pokémon Pattern - Gotta catch 'em all

Nuno Silva — Fri, 20 Mar 2026 17:14:21 +0000

Picture this. It is late on a Friday afternoon. You are integrating an external API — a third-party service your application depends on. You know the call might fail, but you are not sure exactly what exceptions the client library throws, and digging through their documentation is a rabbit hole you don't have time for right now.

So you take the shortcut that lives in every codebase. You wrap the call in a try, write catch (Exception e), log the error, return false, and move on. The PR is approved. The app doesn't crash. Everyone goes home.

What nobody realises — not you, not the reviewer, not the team — is that you just introduced a silent killer into the codebase.

Not because the pattern is lazy. But because it is plausible. It looks like defensive programming. It feels like resilience. It will pass code review, pass QA, and pass every test you throw at it — right up until the moment a half-executed database transaction quietly corrupts your data, and nobody can figure out why because the logs just say "Something went wrong".

This is the Pokémon Pattern. catch (Exception e). Gotta catch 'em all.

The Illusion of Resilience

The core mistake of the Pokémon Pattern is that it treats two fundamentally different categories of problems as if they were the same thing.

The first category is expected business failures. The user typed the wrong password. The account has insufficient funds. The item is out of stock. These are not bugs. They are normal, anticipated branches of your application's logic — outcomes the business has already thought about and has rules for handling.

The second category is system panics. The database connection died mid-transaction. A third-party API returned malformed JSON. A NullPointerException was thrown halfway through processing an order. These are not business outcomes. They are the application telling you something has gone structurally wrong.

When we write catch (Exception e), we throw a blanket over both. Here is what that blanket looks like in practice:

try {
    checkoutService.processOrder(order);
    return true;
} catch (Exception e) {
    logger.error("Something went wrong", e);
    return false;
}

This looks safe. But consider what actually happens when processOrder throws a NullPointerException halfway through — say, after deducting inventory but before recording the transaction.

If this method is annotated with @Transactional, the behaviour is particularly insidious. By swallowing the exception, we signal to the framework that the method completed successfully. Spring sees no exception, so it commits. The partial state — inventory reduced, transaction unrecorded — is now permanently written to the database. There is no rollback. There is no error. There is just quietly corrupted data, and a log file that says "Something went wrong".

We didn't engineer resilience. We engineered a zombie application. It is dead, but it is still walking around.
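The commit-on-swallow behaviour described above can be sketched in plain Java. This is a minimal model, not Spring's implementation: FakeTransaction, CheckoutDemo, and their methods are hypothetical names, and the only rule modelled is "commit unless an exception escapes the transactional method".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical stand-in for a transactional proxy: commit unless an exception escapes.
class FakeTransaction {
    final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    void write(String change) {
        pending.add(change);
    }

    <T> T run(Supplier<T> work) {
        try {
            T result = work.get();
            committed.addAll(pending);   // no exception reached us: commit
            return result;
        } finally {
            pending.clear();             // on an exception, pending writes are discarded
        }
    }
}

class CheckoutDemo {
    // Fails halfway: inventory is deducted, then a NullPointerException is thrown.
    static void processOrder(FakeTransaction tx) {
        tx.write("inventory -1");
        Object payment = null;
        Objects.requireNonNull(payment, "payment");   // throws NullPointerException
        tx.write("transaction recorded");
    }

    // The Pokémon Pattern inside the transactional method.
    static boolean swallowing(FakeTransaction tx) {
        return tx.run(() -> {
            try {
                processOrder(tx);
                return true;
            } catch (Exception e) {
                return false;            // swallowed: the transaction never sees it
            }
        });
    }

    // The same work with the exception left to propagate.
    static boolean propagating(FakeTransaction tx) {
        return tx.run(() -> {
            processOrder(tx);
            return true;
        });
    }
}
```

With swallowing, the half-finished "inventory -1" write ends up committed and the caller only sees false. With propagating, the NullPointerException escapes and nothing is committed.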

Better Logs Don't Fix Bad Architecture

When developers realise their logs are full of useless generic errors, the instinct is usually to write better messages inside the catch block. Add more context. Log the order ID. Log the user. Make the string more descriptive.

But a better string does not stop data corruption. It just makes the corruption easier to read about after the fact.

The real problem is not the log message. It is that the catch block is in the wrong place, doing the wrong job, for the wrong reason. No amount of string interpolation fixes that.

There is a better architecture — and it starts with drawing a hard line between a failure and a panic.

Handle Failures, Let Panics Crash

1. Return Failures — Don't Throw Them

An out-of-stock item is not an exceptional circumstance. It is a standard, predictable branch of business logic. Using the throw keyword to handle it is reaching for the wrong tool.

When we throw a business exception, we create an invisible GOTO statement inside our own codebase. The method signature promises nothing about what might happen. Callers have to guess — or read the implementation — or hope the documentation is accurate.

The fix is to make the failure explicit in the method signature using a Result type. Java doesn't have one natively, but a custom wrapper or sealed interfaces achieve the same effect — the compiler forces the caller to handle the failure rather than ignore it:

// Before: the signature lies — it secretly throws ItemUnavailableException
public OrderConfirmation submitOrder(User user, Cart cart) { ... }

// After: the signature is honest — the compiler forces the caller to handle it
public Result<OrderConfirmation, OrderError> submitOrder(User user, Cart cart) { ... }

When failures are returned as values rather than thrown as exceptions, they become part of the contract. The caller cannot ignore them. The try/catch block disappears from the domain logic entirely — not because we removed it, but because there is nothing left to catch.
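A minimal sketch of what such a Result type could look like, assuming Java 17 for sealed interfaces and records. Result, Ok, Err, OrderError's constants, and the stock parameter are all hypothetical names chosen for illustration; the article does not prescribe a specific implementation.

```java
// Hypothetical Result type: a value is either Ok or Err, and nothing else.
sealed interface Result<T, E> permits Ok, Err {}

record Ok<T, E>(T value) implements Result<T, E> {}

record Err<T, E>(E error) implements Result<T, E> {}

enum OrderError { ITEM_UNAVAILABLE, INSUFFICIENT_FUNDS }

record OrderConfirmation(String orderId) {}

class OrderService {
    // "stock" stands in for a real inventory lookup.
    static Result<OrderConfirmation, OrderError> submitOrder(String item, int stock) {
        if (stock <= 0) {
            return new Err<>(OrderError.ITEM_UNAVAILABLE);   // expected failure, returned as a value
        }
        return new Ok<>(new OrderConfirmation("order-for-" + item));
    }

    // The caller has to unwrap the Result before it can reach the confirmation.
    static String describe(Result<OrderConfirmation, OrderError> result) {
        if (result instanceof Ok<OrderConfirmation, OrderError> ok) {
            return "confirmed " + ok.value().orderId();
        }
        if (result instanceof Err<OrderConfirmation, OrderError> err) {
            return "rejected: " + err.error();
        }
        throw new IllegalStateException("unreachable: Result is sealed");
    }
}
```

The instanceof form is used so the sketch compiles on Java 17. On Java 21 and later, a pattern-matching switch over the sealed type lets the compiler check that both cases are handled, which removes the need for the final throw.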

2. Let Panics Crash

If the database goes offline, or a variable is unexpectedly null, checkoutService has no idea how to recover from that — and it should not try. Attempting to catch and absorb a panic does not resolve it. It just allows the application to execute more code on top of a broken foundation.

Let the thread crash. Let the panic bubble up immediately, before it has a chance to touch another line of business logic. A fast, loud, localised failure is always preferable to a slow, silent, system-wide one.

3. Catch Panics at the Boundary — Nowhere Else

Letting panics bubble up does not mean users see raw stack traces. It means we catch them in exactly one place: the outer edge of the application.

// The only place catch (Exception e) belongs: the absolute boundary.
// Declared in a class annotated with @RestControllerAdvice so it covers every controller.
@ExceptionHandler(Exception.class)
public ResponseEntity<ErrorResponse> handleGlobalPanic(Exception e) {
    pagerDutyService.triggerAlarm(e);
    return new ResponseEntity<>(new ErrorResponse("An unexpected error occurred"), HttpStatus.INTERNAL_SERVER_ERROR);
}

This boundary catcher does three things well. It catches every unhandled panic in one predictable location. It alerts the on-call engineer immediately, with the full stack trace intact. And it returns a 500 to the caller — which is the honest and correct response. Something did go wrong on the server, and the caller deserves to know that. What we avoid is returning a cheerful 200 with a hidden error payload, which would be the HTTP equivalent of the Pokémon Pattern itself.

The catch (Exception e) pattern is not banned. It is relocated — from scattered throughout the domain logic to a single, honest, explicitly-purposed boundary.
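The same idea can be shown without any framework. The sketch below is a plain-Java model of a single boundary, with Boundary, Response, and the alerts list as hypothetical stand-ins for the controller advice and the paging service.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

record Response(int status, String body) {}

// Hypothetical application boundary: the single place that catches Exception.
class Boundary {
    final List<Throwable> alerts = new ArrayList<>();   // stand-in for paging the on-call engineer

    Response handle(Supplier<String> domainLogic) {
        try {
            return new Response(200, domainLogic.get());
        } catch (Exception e) {
            alerts.add(e);                               // the exception object is kept whole, stack trace included
            return new Response(500, "An unexpected error occurred");
        }
    }
}
```

The domain logic passed to handle contains no try/catch of its own. A panic anywhere inside it surfaces here once, is recorded once, and reaches the caller as an honest 500.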

A try/catch block is not a band-aid. It is a highly specific control flow tool — and like any tool, its value depends entirely on using it in the right place, for the right job.

When we catch generic exceptions to keep the application alive, we are making a trade we rarely intend: the illusion of uptime in exchange for the integrity of our data. We are hiding the exact stack traces we will desperately need when something goes wrong. We are teaching the system to lie about its own health.

Exceptions are for exceptional circumstances. Business rules are for business logic. Drawing a hard line between the two is not a theoretical nicety — it is what makes the difference between a system that fails loudly and honestly, and one that silently corrupts your database at 3am on a Saturday.

Precision Data Access in Spring Data JPA: A Guide to Projections

Nuno Silva — Fri, 20 Feb 2026 12:22:25 +0000

As an application matures, its domain model inevitably grows heavier. What started as a simple Order entity evolves into a dense, interconnected graph of LineItem, CustomerProfile, PaymentHistory, and ShippingManifest objects. That complexity is necessary — your core business logic genuinely needs it. But it creates a hidden tax on every read operation in your system.

The problem isn't the entity itself. The problem is using the same heavy entity fetch for every use case, regardless of what the caller actually needs.

Consider an Order entity with a dozen relationships. An invoice generation process needs all of it: the full entity graph, all lazy-loaded associations, the complete picture. A status monitoring job sitting next to it needs two fields: orderId and status. If both use findById() or findAll(), the monitoring job is doing the exact same work as the invoice process — hydrating a full entity graph, triggering Hibernate's dirty-tracking machinery, and risking N+1 fetches on relationships it never touches.

Spring Data JPA Projections solve this directly. They let you define exactly what data a caller needs and have the repository return precisely that — nothing more. This guide covers the projection types available in Spring Data JPA, when each is the right fit, and where each one breaks down.

The Problem Projections Solve

Before looking at the solutions, it's worth being precise about what's actually expensive when you fetch a full entity unnecessarily.

When Hibernate loads a managed entity, it does more than execute a SELECT. It:

Registers the entity in the first-level cache, holding a reference for the duration of the Session
Takes a state snapshot for dirty checking, so it can detect changes and generate targeted UPDATEs on flush
Initialises proxy objects for every lazy relationship declared on the entity, even ones the caller will never touch

That machinery is essential when you're going to modify the entity and persist changes. When you're only reading two fields and discarding the result, you're paying for infrastructure you don't use.

There's also the SQL itself. A standard findAll() on a complex entity selects every mapped column. The difference between a full entity fetch and a projection is not just what arrives in your JVM — it's what travels across the wire on every single row:

// Full entity fetch
DB ──► id, status, created_at, updated_at,
       customer_id, billing_addr, shipping_addr,
       currency, discount, tax_rate, subtotal,
       total, notes, internal_ref, ...           ──► JVM
       (28 columns the caller will never read)

// Projection fetch (OrderStatusSummary)
DB ──► id, status                                ──► JVM

Projections fix both problems. They scope the SQL to the columns you actually need, and because the result isn't a managed entity, Hibernate skips the lifecycle overhead entirely.

The Version Decision: One Rule

Before covering the projection types, here's the rule stated plainly so you can skip to what applies to your stack:

Java 16+: Use Records. They're stable, concise, and compiler-enforced.
Java 11 or below: Use class-based DTOs, with Lombok's @Value if it's on your classpath.
Java 14–15: Records exist behind --enable-preview, but preview features carry compatibility risk. Treat your stack as Java 11 for production purposes.

Both approaches use the same JPQL constructor expression syntax and generate identical SQL. The difference is purely in how much boilerplate you write to define the projection type.

Projection Types at a Glance

Type	Java Version	Boilerplate	Type Safety	Best For
Interface projection	Any	None	Compile-time	Simple root-entity field subsets
Class-based DTO	Java 8+	High (or Lombok)	Runtime (JPQL string)	Multi-table joins on Java 11 or below
Record projection	Java 16+ (stable)	None	Runtime (JPQL string)	Multi-table joins on Java 16+
Dynamic projection	Any	None	Runtime	Consolidating multiple fetch shapes into one repository method

1. Interface Projections

The simplest form of projection is an interface that declares getter methods for the fields you want. Spring Data JPA generates a proxy at runtime that maps the query result to your interface.

Suppose your status monitoring job only needs the order ID and its current status:

public interface OrderStatusSummary {
    Long getId();
    String getStatus();
}

You use it directly as a return type in your repository:

public interface OrderRepository extends JpaRepository<Order, Long> {

    List<OrderStatusSummary> findByStatus(String status);
}

Spring inspects the interface at startup, derives the required fields from the getter names, and generates a SQL query scoped to those columns:

-- What Spring actually generates — not SELECT *
SELECT o.id, o.status
FROM orders o
WHERE o.status = ?

No joins fired speculatively. No unmapped columns transferred. No entity lifecycle initialised. For a monitoring job running against a table with millions of rows, the difference in data transferred and query execution time is measurable.

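Note: The SQL Spring generates is derived directly from your getter names. With a derived query method, a getter that doesn't match a mapped property on the entity typically fails fast when the query is built. The silent failure shows up with an explicit @Query: if a selected expression has no alias, or its alias doesn't match the getter name, the projection returns null for that field rather than throwing an error. Any time a projection returns unexpected nulls, enable SQL logging and verify the generated query; the mismatch is usually obvious from the column list.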

Where Interface Projections Break Down

Interface projections work cleanly when the fields you need map directly to columns on the root entity. They get dangerous when you need data from related entities.

You can traverse relationships using nested interfaces:

public interface OrderStatusSummary {
    Long getId();
    String getStatus();
    CustomerSummary getCustomer();  // nested projection

    interface CustomerSummary {
        String getName();
    }
}

This looks clean, but it hides a serious trap. When you back this with a derived query method like findByStatus(), Spring Data does not generate a join. Instead, it fetches the root projection and then issues a separate SELECT for every single row's related entity to populate the nested proxy — the exact N+1 problem this approach was supposed to avoid.

If you need a nested interface projection, you must back it with an explicit @EntityGraph or a @Query with a JOIN. The derived query and the entity graph version are not equivalent:

// ❌ Generates N+1 silently.
// Spring fetches orders, then fires one SELECT per row to load the customer.
public interface OrderRepository extends JpaRepository<Order, Long> {
    List<OrderStatusSummary> findByStatus(String status);
}

// ✅ Forces a join — one query, no surprises.
public interface OrderRepository extends JpaRepository<Order, Long> {

    @EntityGraph(attributePaths = {"customer"})
    List<OrderStatusSummary> findByStatus(String status);
}

Enable SQL logging (spring.jpa.show-sql=true) and verify the generated output any time you introduce a nested interface projection. If you see repeated identical SELECTs, the join isn't being applied.

The Open Projection Trap

Interface projections support SpEL expressions via @Value, which lets you compute derived fields from the entity:

public interface OrderStatusSummary {
    @Value("#{target.customer.firstName + ' ' + target.customer.lastName}")
    String getCustomerFullName();
}

This looks convenient but largely defeats the purpose of using a projection. Spring cannot tell which columns a SpEL expression will need, so it loads the full root entity and evaluates the expression against it, initialising every lazy relationship the expression touches (customer, in this example). You get none of the column scoping or lifecycle overhead savings that make projections valuable.

If you need a computed field, derive it in SQL instead using a dedicated DTO or Record with an explicit constructor expression:

// A dedicated projection for this specific read shape
public record OrderCustomerSummary(Long id, String customerFullName) {}

@Query("""
    SELECT new com.yourapp.dto.OrderCustomerSummary(
        o.id,
        CONCAT(c.firstName, ' ', c.lastName)
    )
    FROM Order o
    JOIN o.customer c
    WHERE o.status = :status
""")
List<OrderCustomerSummary> findWithCustomerName(@Param("status") String status);

The database computes the concatenation, only the two resulting values cross the wire, and there's no entity graph loaded anywhere.

There's also a subtler issue: interface projections are backed by a dynamic proxy, which means every field access goes through a method dispatch rather than a direct field read. For most use cases this cost is negligible. For a batch job processing millions of rows in a tight loop, it's worth being aware of.

2. Class-Based DTO Projections

Before Java Records existed, the standard approach was a plain class with a constructor matching the fields you wanted to project. This is the right choice for applications still running on Java 11 or below, which includes many Spring Boot 2.7.x codebases, and it remains fully supported across all modern Spring Boot releases if Records aren't an option.

The pattern relies on JPQL constructor expressions. You write a regular class with a matching constructor, and JPQL maps the query result directly into it:

public class OrderStatusSummary {

    private final Long id;
    private final String status;

    // Constructor must match the field order in the JPQL SELECT clause exactly
    public OrderStatusSummary(Long id, String status) {
        this.id = id;
        this.status = status;
    }

    public Long getId() { return id; }
    public String getStatus() { return status; }
}

The repository uses a @Query with a constructor expression:

public interface OrderRepository extends JpaRepository<Order, Long> {

    @Query("SELECT new com.yourapp.dto.OrderStatusSummary(o.id, o.status) FROM Order o WHERE o.status = :status")
    List<OrderStatusSummary> findByStatus(@Param("status") String status);
}

The generated SQL is scoped to the columns you declare, with no entity lifecycle overhead:

SELECT o.id, o.status
FROM orders o
WHERE o.status = ?

The result is a plain Java object with no Hibernate proxy, no dirty tracking, and no connection to the Session. It behaves identically to a Record projection in terms of what Hibernate does — the only difference is the boilerplate you write to define it.

The Boilerplate Problem at Scale

The weakness of class-based DTOs becomes apparent when your domain has many different projection shapes. Each one requires a separate class with a constructor, getters, and — if you need equality or debugging — equals(), hashCode(), and toString(). On Java 11 and older, that's a meaningful amount of code to maintain.

The common mitigation before Records was Lombok:

@Value  // generates constructor, getters, equals, hashCode, toString — immutable by default
public class OrderStatusSummary {
    Long id;
    String status;
}

@Value gives you a functionally immutable class with zero hand-written boilerplate, and it works on Java 8+. If your team is already using Lombok and you're not yet on Java 16, this is the practical equivalent of a Record projection.

One important caveat: class-based DTOs require a fully-qualified class name in the JPQL constructor expression. If you rename or move the class, the @Query annotation won't fail at compile time — it will fail at runtime when the JPQL is parsed. This is a known fragility of the constructor expression approach, and it applies equally to Records.

3. Java Record Projections

If you're on Java 11 or below, skip this section — the class-based DTO approach above is the direct equivalent.

Record projections are the modern replacement for class-based DTOs. The projection shape is declared as a Record, which gives you an immutable data carrier with a canonical constructor, equals(), hashCode(), and toString() generated by the compiler — no Lombok required:

public record OrderStatusSummary(Long id, String status) {}

The repository usage is identical to the class-based approach:

public interface OrderRepository extends JpaRepository<Order, Long> {

    @Query("SELECT new com.yourapp.dto.OrderStatusSummary(o.id, o.status) FROM Order o WHERE o.status = :status")
    List<OrderStatusSummary> findByStatus(@Param("status") String status);
}

The generated SQL and Hibernate behaviour are the same. The advantage is purely in the declaration: a one-line Record replaces a full DTO class, and the compiler enforces immutability rather than relying on convention.

Records also compose cleanly with Java Streams. Because a Record is a transparent data carrier with value-based equality, you can group, deduplicate, and compare projection results without implementing equals() yourself — something class-based DTOs require explicit attention to get right.
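A short sketch of that composition, using the OrderStatusSummary record from above. SummaryStats and its two methods are hypothetical helpers; the rows would normally come from the repository rather than being built by hand.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

record OrderStatusSummary(Long id, String status) {}

class SummaryStats {
    // distinct() relies on equals()/hashCode(), which the record generates from its components.
    static List<OrderStatusSummary> deduplicate(List<OrderStatusSummary> rows) {
        return rows.stream().distinct().toList();
    }

    // Groups by the status component and counts the rows in each group.
    static Map<String, Long> countByStatus(List<OrderStatusSummary> rows) {
        return rows.stream()
            .collect(Collectors.groupingBy(OrderStatusSummary::status, Collectors.counting()));
    }
}
```

Two rows with the same id and status collapse into one under deduplicate without any hand-written equality code. A class-based DTO without equals() and hashCode() would keep both, because it falls back to identity comparison.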

4. Dynamic Projections

If you have a single entity accessed by many different callers, each needing a different slice of data, you end up with either a proliferation of repository methods or the temptation to return the full entity everywhere and let each caller ignore what it doesn't need. Dynamic Projections offer a third option: one repository method that accepts the desired return type as a parameter.

public interface OrderRepository extends JpaRepository<Order, Long> {

    <T> Optional<T> findById(Long id, Class<T> type);
}

Each caller passes the projection type it needs:

// Invoice process: needs the full managed entity
Order fullOrder = orderRepository.findById(orderId, Order.class).orElseThrow();

// Status monitor: needs only the lightweight summary
OrderStatusSummary summary = orderRepository.findById(orderId, OrderStatusSummary.class).orElseThrow();

Spring inspects the Class<T> argument at runtime and generates the appropriate query — full entity fetch for Order.class, scoped column fetch for a projection interface or Record.

Where Dynamic Projections Break Down

The tradeoff is type safety. Because the return type is generic, the compiler cannot verify at build time that a given Class<T> argument is a valid projection for this entity. Passing an incompatible type compiles fine and fails at runtime. In a large codebase with many callers, that's a meaningful operational risk.

Dynamic Projections are a reasonable fit when you have a small, stable set of well-known projection types and the convenience of a single method is genuinely valuable. When the set of projection types is large or evolving, explicit repository methods with named return types are safer — the compiler enforces correctness, and the method signatures serve as documentation.
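The "compiles fine, fails at runtime" trade-off can be modelled with an in-memory stand-in. This is not how Spring Data resolves projections internally; InMemoryOrderRepository, the shapes map, and the Unrelated record are hypothetical, and the real failure mode in Spring depends on the type passed in.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

record Order(Long id, String status, String notes) {}

record OrderStatusSummary(Long id, String status) {}

record Unrelated(String value) {}

// Hypothetical in-memory stand-in for a repository with a dynamic projection method.
class InMemoryOrderRepository {
    private final Map<Long, Order> rows = Map.of(1L, new Order(1L, "OPEN", "gift wrap"));

    // The shapes this repository knows how to produce. Anything else is only rejected at runtime.
    private final Map<Class<?>, Function<Order, Object>> shapes = Map.of(
        Order.class, o -> o,
        OrderStatusSummary.class, o -> new OrderStatusSummary(o.id(), o.status()));

    <T> Optional<T> findById(Long id, Class<T> type) {
        Function<Order, Object> mapper = shapes.get(type);
        if (mapper == null) {
            throw new IllegalArgumentException("Not a projection of Order: " + type.getName());
        }
        return Optional.ofNullable(rows.get(id)).map(mapper).map(type::cast);
    }
}
```

A call such as findById(1L, Unrelated.class) type-checks, because the signature accepts any Class<T>. The mistake only surfaces when that line executes, which is the risk to weigh against the convenience of a single method.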

What Projections Are Not For

Projections are a read-only tool. They give you a scoped view of data for retrieval; they have no path back to the persistence context for writes.

If your caller needs to load an entity, modify it, and save it, use a standard entity fetch — that's exactly what Hibernate's dirty checking and transaction management are built for. The overhead that projections eliminate is only overhead when you're not using it. For writes, you need the full entity lifecycle.

The mental model: use projections for the same category of operations where you'd otherwise reach for a native SQL query returning a DTO. When you only need data, with no state changes and no lifecycle, projections let you stay in the JPA abstraction while still being precise about what you ask the database for.

Summary

Projections don't replace entities — they complement them by giving you a precise, read-only view of your data without loading what you don't need.

Use interface projections when the fields you need map directly to the root entity and you want minimal boilerplate. Always back nested interface projections with @EntityGraph or an explicit @Query join — derived queries will silently generate N+1.
Never use SpEL @Value expressions in interface projections. They force a full entity load and eliminate every performance benefit projections provide. Push computed fields into SQL instead.
Use class-based DTO projections (with Lombok's @Value if available) on Java 11 or below. This is the workhorse for Spring Boot 2.7.x applications.
Use Record projections on Java 16+. Same SQL, same behaviour — less boilerplate, compiler-enforced immutability.
Use dynamic projections when consolidating multiple fetch patterns behind a single repository method, with the understanding that type safety is enforced at runtime, not compile time.
Don't use projections for writes. Any operation that modifies state and persists it should use the full managed entity.

Apply This Today

Open your APM tool — Datadog, New Relic, or whatever you're running in production — and filter for your highest-frequency read queries. Alternatively, turn on spring.jpa.show-sql=true locally and hit your most heavily used GET endpoints.

For each one, ask: is the repository method returning a full entity, and is the caller actually using all of it? If the answer is no, you have a projection candidate. Pick the heaviest offender, replace the entity return type with a scoped interface or DTO projection, and measure the query execution time and memory allocation before and after. The delta is usually immediate and significant.

The N+1 Problem in Spring Data JPA: A Practical Guide

Nuno Silva — Thu, 19 Feb 2026 17:00:27 +0000

Spring Data JPA solves a real problem. It lets you model your domain as an object graph and persist it to a relational store without hand-writing every SQL statement. For writes, this is largely a good trade. For reads, it can quietly destroy your application's performance in ways that are nearly invisible until you're already in production.

This guide explains why, with a specific focus on the N+1 query problem—the most common and costly consequence of naive JPA usage—and walks through the practical fixes available in the Spring ecosystem.

The Impedance Mismatch

The core tension between Hibernate and your database comes down to how each paradigm navigates data.

Object-oriented code thinks in graphs. An Order holds a reference to a Customer, which holds a collection of Address objects. You traverse the graph by following pointers: order.getCustomer().getBillingAddress().

Relational databases think in sets. Data lives in flat, normalized tables. You retrieve related data by joining those sets—a fundamentally different operation, executed in a single pass by the query planner.

Hibernate's job is to bridge these paradigms. The problem is that this mapping is lossy. When you write code that ignores the underlying execution model, Hibernate generates SQL that is technically correct but operationally catastrophic.

The N+1 Problem

Consider a realistic scenario: you're building an internal admin endpoint that returns a list of open support tickets, along with the name of the assigned agent and their current workload (total tickets assigned to them).

Your entities look like this:

@Entity
public class Ticket {
    @Id
    private Long id;
    private String subject;
    private String status;

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "agent_id")
    private Agent assignedAgent;
}

@Entity
public class Agent {
    @Id
    private Long id;
    private String name;

    @OneToMany(mappedBy = "assignedAgent")
    private List<Ticket> tickets;
}

And your service layer looks like this:

public List<TicketSummaryDTO> getOpenTicketSummaries() {
    List<Ticket> tickets = ticketRepository.findByStatus("OPEN");

    return tickets.stream()
        .map(ticket -> {
            Agent agent = ticket.getAssignedAgent();  // lazy load #1

            return new TicketSummaryDTO(
                ticket.getId(),
                ticket.getSubject(),
                agent.getName(),
                agent.getTickets().size()  // lazy load #2
            );
        })
        .toList();
}

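This code is readable and structurally sensible. But with assignedAgent mapped as FetchType.LAZY (set explicitly above, because @ManyToOne defaults to EAGER) and the tickets collection lazy by default (as every @OneToMany is), here is what Hibernate actually executes against the database, assuming 200 open tickets assigned across 40 agents: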

-- [Query 1] The initial fetch: the "1" in N+1.
-- Returns 200 rows. Hibernate now holds 200 managed Ticket entities,
-- each with an uninitialized assignedAgent proxy.
SELECT * FROM tickets WHERE status = 'OPEN';


-- [Queries 2–41] Hibernate fires one query per distinct agent to initialize its proxy.
-- This happens inside the stream(), the moment agent.getName() is called.
SELECT * FROM agents WHERE id = 12;  -- ticket 1  → initializes agent 12
SELECT * FROM agents WHERE id = 47;  -- ticket 2  → initializes agent 47
                                     -- ticket 3  → agent 12 again: no query.
                                     -- Within one Session, every ticket pointing at agent 12
                                     -- shares the same proxy instance, so it is
                                     -- initialized only once.
-- ... repeated for all 40 unique agents


-- [Queries 42–81] For each unique agent encountered, Hibernate loads their entire
-- ticket collection to satisfy the .size() call, transferring records you will
-- immediately discard, just to get a count.
-- Hibernate's @LazyCollection(LazyCollectionOption.EXTRA) can replace this with a
-- clean SELECT COUNT(...), but writing the native query (Solution #3) is the
-- superior architectural choice when you're building a DTO anyway.
SELECT * FROM tickets WHERE agent_id = 12;  -- loads every ticket for agent 12
SELECT * FROM tickets WHERE agent_id = 47;  -- loads every ticket for agent 47
-- ... repeated for all 40 unique agents


-- Grand total: 1 + 40 + 40 = 81 queries when 200 tickets share 40 agents.
-- In the worst case (every ticket assigned to a different agent): 1 + 200 + 200 = 401 queries.

Why This Hurts More Than You Expect

The database itself is rarely the bottleneck here. A primary-key lookup with a good index is sub-millisecond. The damage comes from the network round-trip on each query.

Assume a conservative 1ms round-trip between your app server and database, realistic for services in the same VPC:

Scenario	Queries	Network Overhead
Naive lazy loading, every ticket on a different agent (worst case)	~401	~401ms
Naive lazy loading, 200 tickets sharing 40 agents	~81	~81ms
Single optimized query	1	~1ms

In the worst case, that 400ms is pure blocking wait: your application thread is parked while TCP packets traverse the wire. Even with only 40 distinct agents, roughly 80ms of every request is spent waiting on round-trips that a single query would avoid. At low traffic, this is survivable. Under load, with dozens of concurrent requests hitting the same endpoint, you exhaust your thread pool and your HikariCP connection pool simultaneously. What looked like a 400ms endpoint becomes a 4,000ms one under modest concurrency.
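The arithmetic behind those numbers is simple enough to write down. The sketch below assumes each request holds one pooled connection for the whole round-trip window and uses a pool of 10, which is HikariCP's default maximumPoolSize; RoundTripCost and its methods are hypothetical helpers, and the result is an upper bound, not a benchmark.

```java
class RoundTripCost {
    // Blocking time spent on network round-trips alone, in milliseconds.
    static double networkOverheadMs(int queries, double roundTripMs) {
        return queries * roundTripMs;
    }

    // Upper bound on requests per second when every request holds one pooled
    // connection for the whole overhead window.
    static double maxRequestsPerSecond(int poolSize, double overheadMs) {
        return poolSize * 1000.0 / overheadMs;
    }
}
```

With 401 queries at 1ms each, a pool of 10 connections tops out at roughly 25 requests per second. With a single query, the same pool could in principle serve 10,000. Anything above that ceiling queues for a connection, which is how a 400ms endpoint turns into a multi-second one.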

When Lazy Loading Is the Right Default

The N+1 example above involves bulk reads, which makes it tempting to conclude that lazy loading is always wrong. It isn't. Lazy loading was designed for a different scenario: a single-entity fetch with conditional business logic.

Consider a ticket detail endpoint that checks whether the assigned agent is overloaded — but only if the ticket is high priority:

@Transactional(readOnly = true)
public TicketDetailDTO getTicketDetail(Long ticketId) {
    Ticket ticket = ticketRepository.findById(ticketId)
        .orElseThrow(() -> new EntityNotFoundException("Ticket not found"));

    if (ticket.getPriority() == Priority.HIGH && ticket.getStatus() != Status.RESOLVED) {
        // Only reached for high-priority tickets
        Agent agent = ticket.getAssignedAgent();   // returns a proxy; 1 PK lookup on first use
        int workload = agent.getTickets().size();  // 1 collection query by agent_id

        if (workload > 20) {
            escalationService.flag(ticket);
        }
    }

    return mapToDTO(ticket);
}

For any ticket that isn't high priority, this executes exactly one query. The lazy loads inside the if block never fire. For high-priority tickets, it executes three queries total: the ticket and the agent, which are direct primary-key lookups, and the agent's ticket collection, which is a lookup on tickets.agent_id. With that foreign key indexed, each costs roughly 1ms.

This is not the N+1 problem. N+1 occurs when the same lazy load fires repeatedly inside a loop over a collection. Here, each query fires at most once per request. Three indexed lookups at ~3ms total is not a performance issue; it's the system working correctly. You don't need a JOIN FETCH here. A heavy join on every single request would be a pessimisation, not an optimisation.
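The "at most once per request, and only when reached" behaviour can be modelled with a small lazy wrapper that counts loads. Lazy and TicketDetailDemo are hypothetical names, and the loaders return canned values instead of hitting a database; the point is only the query count on each branch.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical lazy reference: runs its loader (one simulated query) on first access only.
class Lazy<T> {
    private final Supplier<T> loader;
    private final AtomicInteger queryCounter;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader, AtomicInteger queryCounter) {
        this.loader = loader;
        this.queryCounter = queryCounter;
    }

    T get() {
        if (!loaded) {
            queryCounter.incrementAndGet();   // first access: one query
            value = loader.get();
            loaded = true;
        }
        return value;                         // later accesses: no query
    }
}

class TicketDetailDemo {
    // Returns the number of simulated queries for one ticket detail request.
    static int queriesFor(boolean highPriority) {
        AtomicInteger queries = new AtomicInteger();
        queries.incrementAndGet();                                   // the ticket itself
        Lazy<String> agent = new Lazy<>(() -> "agent-12", queries);  // assigned agent
        Lazy<Integer> workload = new Lazy<>(() -> 25, queries);      // size of the agent's ticket collection

        if (highPriority) {
            agent.get();
            boolean overloaded = workload.get() > 20;
            agent.get();                                             // already loaded: no extra query
        }
        return queries.get();
    }
}
```

A low-priority request costs one query and a high-priority one costs three, no matter how often the loaded values are read afterwards. The count never scales with the size of a result set, which is what separates this from N+1.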

Why `FetchType.EAGER` Is Still the Wrong Reflex

A developer unfamiliar with the problem might look at the two lazy loads and reach for FetchType.EAGER to eliminate them:

@ManyToOne(fetch = FetchType.EAGER)  // ❌
@JoinColumn(name = "agent_id")
private Agent assignedAgent;

This is a global change to the entity. It doesn't just affect this endpoint: every call to findById, findAll, findByStatus, and every other repository method that returns a Ticket now loads the agent as well, either through an extra join or through follow-up SELECTs (query methods such as findAll typically issue one extra SELECT per distinct agent, which reintroduces N+1). The agent is fetched regardless of whether the caller needs it. You've optimised for the rarest branch and penalised everything else.

The correct mental model is this: FetchType.LAZY is the right default for single-entity conditional access. JOIN FETCH and batch loading are the right tools for collections and loops. Let the lazy loads fire when they're cheap and conditional; reach for explicit fetch strategies only when you know you're operating at scale.

One boundary condition worth knowing: lazy loads require an active Hibernate Session. If a detached entity is passed across a layer boundary and a proxy is accessed outside the original @Transactional context, Hibernate will throw a LazyInitializationException. That exception is not a signal to add EAGER — it's a signal that your transaction boundary is in the wrong place.

Solutions, In Order of Preference

1. JPQL JOIN FETCH

If you know at query time that you'll need the relationship, tell Hibernate to fetch it in the initial query. The cleanest way to do this in Spring Data JPA is a @Query annotation with JOIN FETCH:

@Repository
public interface TicketRepository extends JpaRepository<Ticket, Long> {

    @Query("SELECT t FROM Ticket t JOIN FETCH t.assignedAgent WHERE t.status = :status")
    // Status is the enum the entity already uses (see Status.RESOLVED above)
    List<Ticket> findByStatusWithAgent(@Param("status") Status status);
}

This produces a single SQL join:

SELECT t.*, a.*
FROM tickets t
INNER JOIN agents a ON t.agent_id = a.id
WHERE t.status = 'OPEN';

One round-trip. The tradeoff is result set size: a join duplicates the agent's columns across every ticket row they're assigned to. For OLTP workloads with reasonable cardinality, this is almost always the right trade.

Caveat: If you JOIN FETCH two List-typed collections in the same query (e.g., two different @OneToMany associations), Hibernate will throw a MultipleBagFetchException. A single-valued association such as assignedAgent doesn't count towards this. You can work around it by converting List to Set on your collections, but be aware this changes equality semantics and still produces a Cartesian product in the result set. When you need multiple collections, batch loading is usually the better fit.

2. `@BatchSize` and `@EntityGraph`

When joins create unacceptable result set inflation, batch loading is a better fit. Hibernate's @BatchSize annotation tells it to replace individual SELECT ... WHERE id = ? queries with SELECT ... WHERE id IN (?, ?, ...) batches:

@Entity
public class Agent {
    @Id
    private Long id;

    @OneToMany(mappedBy = "assignedAgent")
    @BatchSize(size = 50)
    private List<Ticket> tickets;
}

Instead of N queries, Hibernate issues roughly ceil(N / batchSize) queries; the exact count depends on the Hibernate version and its batch fetch style. For 40 agents with a batch size of 50, that's ideally 1 query instead of 40.
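The batch arithmetic can be sketched in a few lines. This is an idealised estimate only: the class and method names are hypothetical, and the number of statements Hibernate really issues may be higher depending on its version and batch fetch style.

```java
final class BatchQueryCount {

    /** Idealised number of IN-list queries needed to load collections for `parents` owners. */
    static int queries(int parents, int batchSize) {
        if (parents < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("parents must be >= 0 and batchSize > 0");
        }
        // Integer ceiling division: ceil(parents / batchSize)
        return (parents + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        System.out.println(queries(40, 50));   // 40 agents fit in a single batch
        System.out.println(queries(120, 50));  // 120 agents need three batches
    }
}
```

Larger batch sizes mean fewer round-trips but longer IN lists, so the value is usually tuned against the typical page size of the endpoint.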

Alternatively, @EntityGraph lets you declare fetch behavior at the query site without modifying the entity itself—useful when different callers need different fetch strategies on the same entity:

@Repository
public interface TicketRepository extends JpaRepository<Ticket, Long> {

    @EntityGraph(attributePaths = {"assignedAgent"})
    List<Ticket> findByStatus(Status status);  // Status enum, matching the entity field
}

This generates a left outer join under the hood, similar to JOIN FETCH, but without requiring a custom JPQL query.

Note: For our specific DTO example, this only solves the first N+1 (Tickets → Agents). Calling .size() on the agent's tickets will still trigger lazy queries unless you also include "assignedAgent.tickets" in the graph — which risks a Cartesian product and likely defeats the purpose.

3. Write the Query Yourself

For read-heavy endpoints that aggregate data or return partial projections, bypass Hibernate's entity model entirely. You're building a response DTO—you don't need dirty tracking, optimistic locking, or a managed entity lifecycle. You're paying for all of that overhead and throwing it away.

Spring Data JPA supports projecting directly into a DTO via constructor expressions in JPQL:

@Query("""
    SELECT new com.yourapp.dto.TicketSummaryDTO(
        t.id,
        t.subject,
        a.name,
        COUNT(all_t.id)
    )
    FROM Ticket t
    JOIN t.assignedAgent a
    LEFT JOIN Ticket all_t ON all_t.assignedAgent = a
    WHERE t.status = :status
    GROUP BY t.id, t.subject, a.id, a.name
""")
List<TicketSummaryDTO> findOpenTicketSummaries(@Param("status") Status status);

Or drop to JdbcTemplate entirely for anything complex enough that JPQL becomes harder to read than SQL:

@Repository
public class TicketQueryRepository {

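    private final JdbcTemplate jdbc;

    // Constructor injection; without it the final field is never initialised
    public TicketQueryRepository(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }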

    public List<TicketSummaryDTO> findOpenTicketSummaries() {
        String sql = """
            SELECT
                t.id,
                t.subject,
                a.name            AS agent_name,
                COUNT(all_t.id)   AS agent_workload
            FROM tickets t
            JOIN agents a ON t.agent_id = a.id
            LEFT JOIN tickets all_t ON all_t.agent_id = a.id
            WHERE t.status = 'OPEN'
            GROUP BY t.id, t.subject, a.id, a.name
        """;

        return jdbc.query(sql, (rs, rowNum) -> new TicketSummaryDTO(
            rs.getLong("id"),
            rs.getString("subject"),
            rs.getString("agent_name"),
            rs.getLong("agent_workload")  // COUNT maps to Long in the JPQL version; keep the types aligned
        ));
    }
}

This pushes aggregation into the database where it belongs, transfers only the columns you need, and involves zero Hibernate machinery.

Diagnosing Your Own Codebase

Enable Hibernate's SQL logging and exercise your endpoints:

# application.properties
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.format_sql=true
logging.level.org.hibernate.SQL=DEBUG
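# Shows bind parameters (Hibernate 6). Comments must sit on their own line in .properties files.
# On Hibernate 5, use org.hibernate.type.descriptor.sql.BasicBinder instead.
logging.level.org.hibernate.orm.jdbc.bind=TRACE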

Look for sequences of structurally identical queries differing only in a bind parameter value. That's the N+1 pattern. In a busy system it's immediately obvious—you'll see the same SELECT repeated dozens or hundreds of times in a single request trace.
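Spotting the repetition by eye works, but the same check can be automated. The sketch below assumes you have already captured the SQL statements logged for one request as a list of strings (how you capture them is up to you); the class name and threshold are hypothetical. Because Hibernate logs bind parameters as ?, structurally identical queries come out as identical strings once whitespace is normalised.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class NPlusOneDetector {

    /** Returns statements that occur at least `threshold` times in one request's SQL log. */
    static Map<String, Long> repeatedStatements(List<String> loggedSql, long threshold) {
        Map<String, Long> counts = loggedSql.stream()
                .map(sql -> sql.trim().replaceAll("\\s+", " "))   // normalise formatting
                .collect(Collectors.groupingBy(
                        sql -> sql, LinkedHashMap::new, Collectors.counting()));
        counts.values().removeIf(count -> count < threshold);     // keep only the suspects
        return counts;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "select t.* from tickets t where t.status=?",
                "select a.* from agents a where a.id=?",
                "select a.* from agents a where a.id=?",
                "select a.* from agents a where a.id=?");
        // Prints each suspicious statement with its repeat count
        repeatedStatements(log, 3).forEach((sql, n) -> System.out.println(n + "x  " + sql));
    }
}
```

A check like this can run inside an integration test so that a new N+1 fails the build instead of surfacing in production.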

For production diagnostics, pg_stat_statements (Postgres) or the slow query log (MySQL) will surface high-call-count queries that look cheap individually but dominate aggregate database load. A query that takes 0.5ms but executes 50,000 times per minute is a far bigger problem than a slow query that runs once.

When to Use JPA vs. When to Write SQL

The useful mental model isn't "Hibernate is bad"—it's that Hibernate has a domain where it excels and a domain where it actively works against you.

Use JPA for transactional writes. Loading an entity, applying business logic, and persisting changes is exactly what Hibernate was designed for. It handles dirty checking, optimistic locking via @Version, and transaction demarcation cleanly.

Use SQL for reads that aggregate, project, or span multiple tables. The object-graph abstraction is a poor fit for set-oriented retrieval. Forcing it results in either N+1 queries or increasingly complex fetch annotations that are just obfuscated SQL with more failure modes.

This maps to a pattern the CQRS literature has formalized: your read model and your write model have different requirements. You don't need to adopt full CQRS to internalize that lesson. Even within a standard Spring layered application, being intentional about when you lean on JPA and when you reach for JdbcTemplate will significantly improve both performance and maintainability.

Summary

The N+1 problem isn't a Hibernate bug—it's a consequence of using an abstraction without understanding its execution model.

Audit read-heavy endpoints with SQL logging before they reach production; treat N+1 as a build-breaking issue, not something to revisit later.
Use JOIN FETCH or @EntityGraph when you know a relationship will be traversed at query time.
Use @BatchSize when joins produce excessive result set inflation or when fetching multiple collections.
Use JPQL constructor expressions or JdbcTemplate for aggregations and reporting queries—don't hydrate entities you're going to immediately project away.
In production, watch query count alongside query duration; a fast query executed 500 times per request is still a catastrophic query.

The next layer of this problem is understanding what your database actually does with the SQL Hibernate generates. EXPLAIN ANALYZE in Postgres—or EXPLAIN FORMAT=JSON in MySQL—will show you whether your joins are using indexes, what the estimated vs. actual row counts look like, and where the query planner is making bad decisions. That's where the real tuning happens.

The 1:1 Myth: Why Your CPU Can Handle 400 Threads on 4 Cores

Nuno Silva — Fri, 13 Feb 2026 18:19:29 +0000

Why This Article Exists

If you're a backend engineer working with Java, Python, or any other runtime built on traditional OS threads, you've likely encountered the advice to keep thread pool sizes conservative, often close to your CPU core count.

This advice appears in Stack Overflow answers and some documentation. It sounds reasonable. But it's based on a fundamental misunderstanding of how CPUs and threads actually work.

The confusion stems from vocabulary: The word "thread" refers to two completely different things—a hardware thread (a physical execution unit in your CPU) and a software thread (a data structure in your operating system). Engineers often conflate these, leading to catastrophically undersized thread pools.

This article will dismantle the 1:1 Myth—the belief that you need one software thread per hardware thread—and show you why your 4-core CPU can comfortably handle 400 threads without breaking a sweat.

We'll cover the mechanics, the math, and the real-world constraints. By the end, you'll understand why most production systems are running at 10% capacity while paying for 100%.

The Experiment

Open your terminal right now. Type top or htop.

Look at the task count (in top, press H to count individual threads). Even on a modest laptop, you'll often see 2,000+ threads, nearly all of them asleep and waiting for something to do.

Now look at your core count. Maybe it's 8. Maybe it's 16.

If the "1 thread per core" rule were gospel, your computer should have exploded during boot. Yet here we are.

Now check your production infrastructure. How many threads is your API server running? If you're like most backend teams, you've capped your thread pool to match your core count—8 threads for an 8-core container.

You are likely running at 10% capacity while paying for 100%.

The Parking Lot Fallacy

There's a widespread fear in backend engineering: the fear of Oversubscription.

We look at our infrastructure and mentally map it to a parking lot. 8 cores = 8 parking spaces. Creating more than 8 threads feels dangerous—like a traffic jam waiting to happen. Context switching. Thrashing. Performance degradation.

So we cap our pools. We feel "safe."

This safety is an illusion. And it's expensive.

The fundamental error is treating software threads like physical objects that occupy space. Your CPU is not a parking lot with limited spots.

Your CPU is a high-speed revolving door.

Part I: The Foundation

Decoupling the Worker from the Work

To fix your throughput, you must understand the distinction between two fundamentally different concepts that share the word "thread":

1. The Hardware Thread (The Worker)

This is physical silicon. Whether it's a core or a hyper-thread (SMT), a hardware thread is an execution unit—the actual circuitry that runs instructions.

It is finite. Governed by the laws of physics. If you have 8 hardware threads, at most 8 software threads can be executing at any given instant. No more.

2. The Software Thread (The Work)

A software thread is not a physical thing. In Linux, it's a task_struct. In the JVM, it's a wrapper around an OS kernel thread. It consists of:

Stack Memory (~1MB) for function call frames and local variables
Instruction Pointer (current position in the code)
Register State (CPU's working data—intermediate calculations, pointers, flags)

Creating a software thread does not occupy a core. It creates a candidate for execution—a piece of work that wants to use a core.

Part II: The Illusion

How 4 Cores Run 100 Threads

They don't. They take turns.

The OS Scheduler is the traffic cop. It uses Time Slicing:

Thread A runs on Core 1 for a short time slice (typically a few milliseconds on Linux)
The scheduler pauses Thread A and saves its state to RAM (Context Switch)
Thread B loads onto Core 1 and runs
Repeat, thousands of times per second

To the human eye, Threads A and B appear to run simultaneously. To the CPU, they are strictly sequential.

Visualising Time Slicing on a Single Core:

Time →
┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
│  A  │  B  │  C  │  A  │  D  │  B  │  A  │  C  │  ...
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘
 2ms   2ms   2ms   2ms   2ms   2ms   2ms   2ms

Each thread runs for a time slice of a few milliseconds (ms), then pauses (context switch).
The CPU rotates through all READY threads, creating the illusion
that all 4 threads are running "at the same time."

This is why your laptop with 8 cores can juggle 2,000 threads without breaking a sweat.
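The turn-taking can be simulated in a few lines. This is a deliberately simplified round-robin model of a single core, not how the Linux scheduler actually picks threads (it weighs priorities and accumulated runtime); the class name and the idea of counting work in whole time slices are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RoundRobinCore {

    /** Given the time slices each thread still needs, returns the order in which they occupy the core. */
    static List<String> timeline(Map<String, Integer> slicesNeeded) {
        Map<String, Integer> remaining = new LinkedHashMap<>();
        Deque<String> readyQueue = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : slicesNeeded.entrySet()) {
            if (e.getValue() > 0) {                      // threads with no work never become READY
                remaining.put(e.getKey(), e.getValue());
                readyQueue.addLast(e.getKey());
            }
        }
        List<String> timeline = new ArrayList<>();
        while (!readyQueue.isEmpty()) {
            String thread = readyQueue.pollFirst();      // scheduler picks the next READY thread
            timeline.add(thread);                        // it runs for one time slice
            int left = remaining.merge(thread, -1, Integer::sum);
            if (left > 0) {
                readyQueue.addLast(thread);              // context switch: back of the queue
            }
        }
        return timeline;
    }

    public static void main(String[] args) {
        Map<String, Integer> work = new LinkedHashMap<>();
        work.put("A", 3);
        work.put("B", 2);
        work.put("C", 1);
        System.out.println(timeline(work));
    }
}
```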

The Context Switching Tax

"But wait," you ask, "isn't context switching expensive? Shouldn't I minimise threads to avoid that overhead?"

If your threads were encoding video, mining cryptocurrency, or running scientific simulations—yes, context switching would hurt.

But your threads are probably waiting for databases and APIs.

Part III: The Key Insight

Your Threads Are Not Working—They're Waiting

This is the single most important concept in thread pool sizing.

In 99% of business applications (REST APIs, microservices, web backends), threads spend the vast majority of their lifetime in one state:

BLOCKED (Waiting for I/O)

The Thread Lifecycle

A thread exists in one of three states:

RUNNING: Actively using the CPU
READY: Waiting for the CPU to be free
BLOCKED: Waiting for I/O (Database, Network, File System)

Visualising a Typical Web Request Thread:

HTTP Request Arrives
        ↓
    [RUNNING] ──→ Parse JSON, Route Request (1ms)
        ↓
    [BLOCKED] ──→ Database Query (98ms) ← CPU is FREE
        ↓
    [RUNNING] ──→ Serialize Response (1ms)
        ↓
    Response Sent

Key Insight: During the BLOCKED phase, this thread is 
"off the silicon"—it's in RAM, consuming ZERO CPU cycles.
The CPU is completely free to work on other threads.

The 98/2 Rule

Consider a typical HTTP request in a Spring Boot API:

Total Response Time: 100ms

2ms: CPU work (parsing JSON, routing, business logic, serialization)
98ms: Waiting for the database query to return

During that 98ms, the thread is in the BLOCKED state. It is off the silicon. It resides in memory, but it consumes zero CPU cycles.

If you follow the 1:1 rule (8 threads for 8 cores) and all 8 threads hit the database simultaneously—which happens constantly—your CPU sits idle.

You have 0% utilisation because all your workers are standing around waiting for the database.

Meanwhile, there are 100 requests queued up that could be parsed, routed, and submitted to the database right now—if only you had threads available.

You are paying for a Ferrari and leaving it in the driveway because you're afraid to scratch the paint.
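You can watch this effect without a database. The sketch below simulates requests that only wait (Thread.sleep stands in for the database call, which is an assumption of this example) and runs the same batch on a narrow pool and a wide pool. The class name and numbers are illustrative; exact timings will vary by machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class BlockedThreadsDemo {

    /** Runs `tasks` simulated requests (each blocks for waitMs) on a fixed pool; returns elapsed ms. */
    static long runBatch(int poolSize, int tasks, long waitMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        long start = System.nanoTime();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(() -> {
                    try {
                        Thread.sleep(waitMs);            // BLOCKED: no CPU used while waiting
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get();                                 // wait for every simulated request
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("4 threads:  " + runBatch(4, 40, 50) + " ms");
        System.out.println("40 threads: " + runBatch(40, 40, 50) + " ms");
    }
}
```

With 4 threads the 40 requests are served in ten sequential rounds; with 40 threads they all wait at once, and the CPU is barely touched in either case.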

Part IV: The Math

The Blocking Coefficient

To maximize throughput, you must oversubscribe. You need enough threads to ensure that every time one thread blocks, another is ready to jump onto the CPU.

We can derive the optimal pool size using a heuristic based on Little's Law:

N_threads = N_cpu × (1 + Wait_Time / Compute_Time)

This is a heuristic, not a law. It assumes stable workload characteristics and minimal contention.

Key Definitions:

Compute_Time: Actual CPU work (parsing, logic, serialization)
Wait_Time: Time spent blocked and off the CPU, including:
- Database queries
- Network latency (external APIs, microservice calls)
- Disk I/O (file reads/writes)
- Lock contention (waiting for synchronized blocks)

The ratio Wait_Time / Compute_Time is your Blocking Coefficient—the multiplier that tells you how many threads you need to keep your CPUs saturated.

Let's apply this to our web API scenario:

Given:

N_cpu = 4 cores
Wait Time = 98ms (database)
Compute Time = 2ms (actual CPU work)

Calculate the Ratio:

Wait_Time / Compute_Time = 98 / 2 = 49

Optimal Thread Pool Size:

N_threads = 4 × (1 + 49) = 4 × 50 = 200

You need 200 software threads to keep 4 hardware threads fully utilised.

If you cap your pool at 4 threads, you are limiting throughput to roughly 1/50th of what the CPU could deliver.
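The heuristic above translates directly into code. This is a minimal sketch with a hypothetical class name; the wait and compute times are inputs you would measure yourself, and the result is a starting point for load testing rather than a final answer.

```java
final class PoolSizing {

    /** N_threads = N_cpu x (1 + Wait_Time / Compute_Time), rounded up. */
    static int optimalThreads(int cores, double waitMs, double computeMs) {
        if (cores <= 0 || waitMs < 0 || computeMs <= 0) {
            throw new IllegalArgumentException("cores and computeMs must be positive, waitMs non-negative");
        }
        double blockingCoefficient = waitMs / computeMs;
        return (int) Math.ceil(cores * (1 + blockingCoefficient));
    }

    public static void main(String[] args) {
        // The article's scenario: 4 cores, 98ms waiting, 2ms computing
        System.out.println(optimalThreads(4, 98, 2));
        // A purely CPU-bound task (no waiting) collapses to one thread per core
        System.out.println(optimalThreads(4, 0, 2));
    }
}
```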

A Concrete Example

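Take the 100ms request from our scenario. One thread completes about 10 such requests per second, so the optimal pool (200 threads) can sustain roughly 2,000 requests per second, which is also the most 4 cores can deliver at 2ms of CPU per request.

With the 1:1 mapping (4 threads), you'd be limited to approximately 40 requests per second. Not because your CPU is slow, but because you're refusing to use it.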

Part V: The Real Limits

This Doesn't Mean Threads Are Free

You cannot spawn infinite threads. You are bounded by three constraints:

1. Memory Constraints

Each Java platform thread reserves stack space (about 1MB by default on 64-bit Linux, tunable with -Xss), so in the worst case:

200 threads ≈ 200MB of RAM (manageable)
1,000 threads ≈ 1GB of RAM (still fine)
10,000 threads ≈ 10GB of RAM (problematic)

Stack memory is the primary constraint in traditional threading models. This is why Virtual Threads (Project Loom) were invented—they use growable stacks with much smaller footprints.

2. The Thrashing Point

If your workload suddenly shifts and all 200 threads become CPU-bound simultaneously (e.g., they stop waiting and start doing heavy computation), the OS will choke on context switching.

The scheduler will spend more time swapping threads than actually running them. This is thrashing, and it kills performance.

Technical note: Thrashing also occurs when threads do very brief work between blocks. If a thread wakes up, does 1 microsecond of work, then blocks again, the context switch overhead (saving/loading state) exceeds the actual execution time. The CPU spends more time managing threads than running them.

Cache Pollution: Context switching isn't just about saving registers—it destroys the L1/L2 CPU cache. When Thread B loads onto a core, it has to fetch its data from RAM (slow, ~100ns) because Thread A filled the cache with its own data. This cache pollution is the hidden tax of oversubscription. With excessive context switching, your CPU can spend more time waiting for RAM than executing instructions.

3. Downstream Bottlenecks (The Real Limit)

Increasing your thread pool size does not magically increase system capacity. You're often bounded by downstream constraints:

What threads don't fix:

Database connection pool limits
External API rate limits
Lock contention
Network bandwidth
Downstream service capacity

What oversized pools can cause:

Database connection exhaustion
Cascading failures in microservices
Amplified lock contention
Queueing in unexpected places

Critical coordination points:

Your thread pool must align with your DB connection pool
HTTP client pools must be sized appropriately
Rate limiters and circuit breakers should be in place
Downstream services need capacity for your load

The formula gives you the thread count needed to saturate your CPU. But production systems are rarely CPU-bound—they're usually constrained by databases, downstream APIs, or other shared resources.

Before increasing threads, verify your bottleneck is actually CPU starvation.

The Safeguard: Proper Workload Classification

The formula works because of the assumption that threads are I/O-bound. If that assumption breaks, the formula breaks.

CPU-Bound Workload (video encoding, cryptography, scientific computing):

Threads ≈ Cores
Maybe cores × 1.5 if you want some overlap during cache misses

I/O-Bound Workload (web APIs, database-backed services, microservices):

Threads = Cores × (1 + Wait/Compute)
Often 10x-50x the core count

Mixed Workload:

Measure your actual wait/compute ratio
Test empirically
Monitor CPU utilisation and response times

Part VI: Practical Takeaways

How to Right-Size Your Thread Pool

1. Profile your application
   - Measure actual CPU time vs. wait time for typical requests
   - Use APM tools (New Relic, Datadog) or profilers (JFR, async-profiler)
2. Apply the formula
   - N_threads = N_cpu × (1 + Wait / Compute)
   - Start conservative, then increase
3. Load test and monitor
   - Watch CPU utilisation (should be 70-90% under load)
   - Watch response times (should remain stable as load increases)
   - Watch thread pool queue depth (should stay near zero)
4. Iterate
   - If CPU is maxed but latency is good: You're optimal
   - If CPU is low and latency is increasing: Not enough threads or downstream bottleneck
   - If CPU is oscillating wildly: Possible thrashing (too many threads for the workload)
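If you don't have an APM tool at hand, the JVM can give you a rough wait/compute ratio on its own. The sketch below uses the standard ThreadMXBean to compare CPU time with wall-clock time for one task running on the current thread. The class and helper names are hypothetical, and per-thread CPU timing must be supported and enabled on your JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class BlockingCoefficient {

    /** Runs the task on the calling thread and returns {wallNanos, cpuNanos}. */
    static long[] measure(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            throw new UnsupportedOperationException("per-thread CPU time is not available on this JVM");
        }
        long cpuStart = bean.getCurrentThreadCpuTime();
        long wallStart = System.nanoTime();
        task.run();
        long wallNanos = System.nanoTime() - wallStart;
        long cpuNanos = bean.getCurrentThreadCpuTime() - cpuStart;
        return new long[] { wallNanos, cpuNanos };
    }

    /** Wait_Time / Compute_Time, treating everything that was not CPU time as waiting. */
    static double ratio(long wallNanos, long cpuNanos) {
        long waitNanos = Math.max(0, wallNanos - cpuNanos);
        return (double) waitNanos / Math.max(1, cpuNanos);
    }

    /** Stand-in for a blocking call such as a database query. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long[] m = measure(() -> sleepQuietly(100));
        System.out.printf("wall=%dms cpu=%dms ratio=%.1f%n",
                m[0] / 1_000_000, m[1] / 1_000_000, ratio(m[0], m[1]));
    }
}
```

Average the ratio over many real requests before feeding it into the formula; a single measurement is noisy.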

Monitoring Signals

What to watch:

CPU utilisation: Should be high (70-90%) under load if properly sized
Thread pool queue depth: Should stay near zero; growth indicates undersized pool or downstream bottleneck
Response time percentiles (p50, p95, p99): Should remain stable as load increases
Context switch rate: Dramatic increases may indicate thrashing
GC pauses (JVM): Excessive pauses may indicate memory pressure from too many threads
Database wait times: High waits suggest downstream, not thread pool, is the bottleneck

Symptom diagnosis:

Low CPU + rising latency → Pool too small OR downstream bottleneck (check DB connection pool, external API limits)
High CPU + unstable latency → Possible thrashing or CPU-bound workload with too many threads
High CPU + stable latency → You're optimal
Queue depth growing → Undersized pool or downstream can't keep up

Common Thread Pool Sizes for I/O-Bound Services

CPU Cores	Typical Wait/Compute Ratio	Optimal Threads
4	10:1 (DB-backed API)	40-50
4	50:1 (High-latency external APIs)	200+
8	20:1 (Microservice)	160-180
16	10:1 (Standard web app)	160-200

Note: These numbers assume a traditional blocking I/O model with OS threads (Java platform threads, Python threads, etc.). With Virtual Threads (Java 21+), the memory-based limits largely disappear: you can run 100k+ virtual threads per JVM, and the usual guidance is to create one virtual thread per task rather than pool them. Downstream limits such as connection pools still apply.

Platform Caveats

Python and the GIL

Python's Global Interpreter Lock (GIL) prevents true parallel execution of Python bytecode across threads.

Implications:

CPU-bound Python threads do not execute in parallel
I/O-bound Python threads still benefit from concurrency (I/O operations release the GIL)
Thread pool sizing for CPU-bound Python work doesn't follow the same rules as JVM or Go
Consider multiprocessing (separate processes) for CPU-bound parallelism

What About Reactive/Async?

Reactive frameworks (WebFlux, Vert.x, Node.js) take a different approach: they use event loops with a small thread pool (often matching cores) and non-blocking I/O.

Instead of blocking threads during waits, they register callbacks and release the thread immediately. This achieves high concurrency with minimal threads.

Trade-off: Significantly more complex programming model. You give up the straightforward imperative style for callback hell or coroutine complexity.

With Virtual Threads (Java 21+), you get the throughput of async with the simplicity of blocking code. Virtual threads are so cheap (100k+ per JVM) that you can write natural, sequential code while achieving the concurrency of reactive frameworks.

Conclusion

Stop treating your CPU core count as a hard limit for your thread pool.

It's a baseline, not a ceiling.

The "safety" of 1:1 thread-to-core mapping is an illusion that leaves your infrastructure dramatically underutilised.

The Rules

For CPU-Bound tasks: Threads ≈ Cores
For I/O-Bound tasks: Trust the math. Oversubscribe aggressively.

Your CPU is designed to juggle. It's built for time-slicing. It wants to handle hundreds of threads.

Just make sure the rest of your system can keep up.

Breaking the Sequential Ceiling: High-Performance Concurrency in Java 8 Enterprise Systems

Nuno Silva — Wed, 11 Feb 2026 19:03:12 +0000

Modern applications call five, ten, even twenty downstream services per request. Virtual threads (Java 21) and reactive frameworks solve this elegantly — but in 2026, a significant portion of enterprise Java still runs on Java 8 and Spring Boot 2.7. Whether it's regulatory constraints, vendor dependencies, or the sheer inertia of large codebases, upgrading the JVM isn't always an option — and these teams still need practical solutions.

This article shows how to achieve real concurrency gains in legacy Java using the Bulkhead Pattern, explicit thread pool isolation, and CompletableFuture. We'll walk through the theory, then validate it with Project IronThread — a proof-of-concept that achieves a 41% latency reduction by parallelizing previously sequential service calls.

The Sequential Tax

A typical dashboard endpoint might aggregate data from three services:

Service	Latency
User Service	200 ms
Order Service	500 ms
Recommendations Service	1 000 ms

Called sequentially, these produce a hard ceiling of 1 700 ms. Run them in parallel and the total drops to the duration of the slowest call — 1 000 ms:

Strategy	Execution	Total Latency
Sequential	User → Orders → Recs	200 + 500 + 1 000 = 1 700 ms
Parallel	User, Orders, Recs (concurrent)	max(200, 500, 1 000) = 1 000 ms

Under load, this ceiling becomes a wall. With Tomcat's default 200 worker threads and each request taking 1.7 seconds, you can only handle ~117 requests per second before exhausting the thread pool.

The root issue: unnecessary serialization. Three independent network calls are forced to wait on each other.

Reducing latency isn't just about UX — it directly affects scalability. Shorter request durations release threads sooner, increasing effective throughput and reducing queueing delays under sustained load.

The `ForkJoinPool` Trap

The obvious first move is CompletableFuture:

CompletableFuture<String> userF = CompletableFuture.supplyAsync(
    () -> callUserService());

Without an explicit executor, this defaults to ForkJoinPool.commonPool() — a shared pool designed for CPU-bound fork/join tasks, not blocking I/O.

Why This Breaks Down

Shared global resource. The common pool is shared across the entire JVM. One endpoint flooding it with I/O starves everything else.
Sizing mismatch. Default pool size is availableProcessors() - 1. On an 8-core machine, that's 7 threads for the whole application. Ten concurrent dashboard requests create 30 blocking operations against a 7-thread pool.
No isolation. A single misbehaving endpoint degrades the entire system.

ForkJoinPool does provide ManagedBlocker to mitigate blocking scenarios, but it's rarely used in enterprise applications and doesn't address workload isolation.
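To see which pool your futures actually land on, it helps to print the thread names. The sketch below is plain Java 8 with a hypothetical class name and a hand-rolled thread factory standing in for a properly configured executor; the default thread name you see depends on the JVM and the number of available processors.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicInteger;

final class WhichPoolDemo {

    /** Thread used when no executor is passed (usually a common-pool worker). */
    static String threadUsedByDefault() {
        return CompletableFuture.supplyAsync(() -> Thread.currentThread().getName()).join();
    }

    /** Thread used when a dedicated, named pool is passed explicitly. */
    static String threadUsedWithExecutor() {
        AtomicInteger counter = new AtomicInteger();
        ExecutorService ioPool = Executors.newFixedThreadPool(4, runnable -> {
            Thread t = new Thread(runnable, "IO-Pool-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        });
        try {
            return CompletableFuture
                    .supplyAsync(() -> Thread.currentThread().getName(), ioPool)
                    .join();
        } finally {
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        System.out.println("default:  " + threadUsedByDefault());
        System.out.println("explicit: " + threadUsedWithExecutor());
    }
}
```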

Virtual threads in Java 21 eliminate this class of problem entirely. For Java 8, the answer is explicit thread pool isolation.

The Bulkhead Pattern

Named after the watertight compartments in a ship's hull, the Bulkhead Pattern dedicates separate thread pools to distinct workload types. Each pool is tuned to its workload characteristics, and failures in one pool can't cascade to others.

Spring's ThreadPoolTaskExecutor provides a clean implementation:

@Bean("ioTaskExecutor")
public Executor ioTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(20);
    executor.setMaxPoolSize(50);
    executor.setQueueCapacity(100);
    executor.setThreadNamePrefix("IO-Pool-");
    executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
    executor.initialize();
    return executor;
}

Parameter	Purpose	Sizing Guidance
Core Pool Size	Threads kept alive indefinitely.	For I/O-bound work: `cores × (1 + wait_time / compute_time)`
Max Pool Size	Burst capacity when core threads are busy and the queue is full.	2–3× core size is a reasonable starting point.
Queue Capacity	Buffers tasks before spawning additional threads.	Deep queues smooth transient spikes but increase tail latency. In latency-sensitive systems, prefer bounded queues with an explicit rejection policy.
Rejection Policy	Defines what happens when both the pool and queue are full.	`CallerRunsPolicy` applies back-pressure by running the task on the submitting thread. `AbortPolicy` (the default) throws an exception. Choose based on whether you prefer degraded latency or fast failure.
Thread Name Prefix	Makes thread dumps self-documenting.	Always set this — you'll thank yourself during production debugging.

Tip — Observability: ThreadPoolTaskExecutor exposes its active count, queue size, and pool size at runtime. In production, wire these metrics to Micrometer / Spring Boot Actuator to detect saturation before it becomes a problem.
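The interaction between core size, queue capacity, and max size surprises many people: extra threads are created only after the queue is full. The sketch below shows this with a plain java.util.concurrent.ThreadPoolExecutor, which is what ThreadPoolTaskExecutor delegates to. The tiny sizes (core 2, max 4, queue 2) and the class name are assumptions chosen to keep the behaviour easy to see; getPoolSize() and getQueue().size() are the same kind of readings you would export as metrics.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthDemo {

    /** Submits `tasks` blocked tasks (at most 6) and returns {poolSize, queuedTasks}. */
    static int[] snapshotAfter(int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 30, TimeUnit.SECONDS, new ArrayBlockingQueue<>(2));
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await();                 // simulate a call that has not returned yet
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return new int[] { pool.getPoolSize(), pool.getQueue().size() };
        } finally {
            release.countDown();                         // let every task finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int tasks : new int[] { 2, 4, 6 }) {
            int[] s = snapshotAfter(tasks);
            System.out.println(tasks + " tasks -> threads=" + s[0] + ", queued=" + s[1]);
        }
    }
}
```

A seventh task would be handed to the rejection policy, which is why a very deep queue can hide saturation: the pool never grows past its core size until the queue has filled.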

Orchestrating Parallel Calls with `CompletableFuture`

With isolated pools in place, orchestration is straightforward. For two futures, thenCombine works well:

CompletableFuture<String> userF = CompletableFuture.supplyAsync(
    () -> callUserService(), ioTaskExecutor);

CompletableFuture<String> ordersF = CompletableFuture.supplyAsync(
    () -> callOrderService(), ioTaskExecutor);

CompletableFuture<DashboardData> result = userF.thenCombine(ordersF,
    (user, orders) -> new DashboardData(user, orders));

For three or more independent futures, CompletableFuture.allOf() combined with thenApply() is cleaner — we'll see this in the case study below.

The execution model:

supplyAsync() submits tasks to ioTaskExecutor and returns immediately. The calling thread does not block.
Worker threads execute the service calls in parallel.
Non-async continuations like thenCombine() run on whichever thread completes the last required stage, or on the calling thread if both stages have already completed when the continuation is registered. Either way, no additional thread is spawned.

Handling Partial Failure

Distributed systems fail routinely. The key is failing gracefully:

CompletableFuture<String> recsF = CompletableFuture.supplyAsync(
        () -> callRecommendationsService(), ioTaskExecutor)
    .exceptionally(ex -> "Recommendations Unavailable");

.exceptionally() transforms a failure into degraded success. The user still gets their profile and orders — just without recommendations. No exceptions propagate, no cascading failures.

Case Study: Project IronThread

Project IronThread applies these principles to a dashboard aggregation service. Three mock services simulate realistic downstream behaviour:

Service	Latency	Failure Rate
User Service	200 ms	0%
Order Service	500 ms	0%
Recommendations	1 000 ms	20%

The Executor Configuration

The teaching example above used corePoolSize=20 for a production system handling multiple workload types. IronThread uses a smaller pool — it's a single-service proof-of-concept:

@Bean("ironThreadExecutor")
public Executor taskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(10);
    executor.setMaxPoolSize(20);
    executor.setQueueCapacity(500);
    executor.setThreadNamePrefix("IronThread-");
    executor.initialize();
    return executor;
}

The Async Pipeline

public CompletableFuture<DashboardResult> getDashboardAsync() {
    long start = System.currentTimeMillis();

    CompletableFuture<String> userF = CompletableFuture.supplyAsync(
            downstreamService::getUserDetails, ironThreadExecutor);

    CompletableFuture<String> ordersF = CompletableFuture.supplyAsync(
            downstreamService::getOrders, ironThreadExecutor);

    CompletableFuture<String> recsF = CompletableFuture.supplyAsync(
                    downstreamService::getRecommendations, ironThreadExecutor)
            .exceptionally(ex -> "Recs:Fallback");

    return CompletableFuture.allOf(userF, ordersF, recsF)
            .thenApply(voidResult -> {
                long duration = System.currentTimeMillis() - start;
                return DashboardResult.builder()
                        .userDetails(userF.join())
                        .orders(ordersF.join())
                        .recommendations(recsF.join())
                        .executionTime(duration)
                        .threadName(Thread.currentThread().getName())
                        .build();
            });
}

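All three calls are submitted to the ironThreadExecutor immediately. .exceptionally() on the recommendations future provides graceful degradation. CompletableFuture.allOf() completes only after all three futures have completed, so by the time .thenApply() runs, the join() calls simply retrieve already-available results without blocking. Note that only the recommendations future has a fallback: if the user or order call fails, allOf() completes exceptionally, .thenApply() is skipped, and the failure reaches the caller.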
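The difference between a call with a fallback and a call without one is easy to check in a stripped-down version of the pipeline. This is a sketch under stated assumptions: AggregationSketch, the aggregate helper, and the string results stand in for the article's DashboardResult and downstreamService, and the suppliers return instantly instead of simulating latency.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical demo class: same shape as getDashboardAsync(), reduced to strings.
class AggregationSketch {

    static CompletableFuture<String> aggregate(Supplier<String> user, Supplier<String> orders,
                                               Supplier<String> recs, ExecutorService pool) {
        CompletableFuture<String> userF = CompletableFuture.supplyAsync(user, pool);
        CompletableFuture<String> ordersF = CompletableFuture.supplyAsync(orders, pool);
        // Only the optional data gets a fallback.
        CompletableFuture<String> recsF = CompletableFuture.supplyAsync(recs, pool)
                .exceptionally(ex -> "Recs:Fallback");

        return CompletableFuture.allOf(userF, ordersF, recsF)
                .thenApply(ignored -> userF.join() + " | " + ordersF.join() + " | " + recsF.join());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            // Optional call fails: the result is degraded but still delivered.
            System.out.println(aggregate(() -> "User:OK", () -> "Orders:OK",
                    () -> { throw new IllegalStateException("recs down"); }, pool).join());
            // prints "User:OK | Orders:OK | Recs:Fallback"

            // Required call fails: the whole aggregate completes exceptionally.
            CompletableFuture<String> failed = aggregate(() -> "User:OK",
                    () -> { throw new IllegalStateException("orders down"); },
                    () -> "Recs:OK", pool);
            String outcome = failed.handle((value, error) -> {
                if (error == null) {
                    return value;
                }
                Throwable root = error.getCause() != null ? error.getCause() : error;
                return "failed: " + root.getMessage();
            }).join();
            System.out.println(outcome);
        } finally {
            pool.shutdown();
        }
    }
}
```

Which calls get a fallback is a business decision. A dashboard without recommendations is still useful, but a dashboard without the user is not, so letting that failure surface is the honest outcome.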

What About Timeouts?

One thing this implementation doesn't handle is timeouts. If a downstream service hangs for 30 seconds, its worker thread is held for the full 30 seconds, and if the service never responds, the thread is held indefinitely. That is the same starvation problem we're trying to avoid, just in a different pool.

Java 9 introduced orTimeout() and completeOnTimeout(), but on Java 8 you'd need a ScheduledExecutorService that completes the future exceptionally after a deadline:

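// Single shared timer thread. It only completes futures; it never runs the downstream calls.
private static final ScheduledExecutorService scheduler =
    Executors.newScheduledThreadPool(1);

// Fails the given future with a TimeoutException if it is still pending after the deadline.
public static <T> CompletableFuture<T> withTimeout(
        CompletableFuture<T> future, long timeout, TimeUnit unit) {
    scheduler.schedule(() ->
        future.completeExceptionally(new TimeoutException("Timed out")),
        timeout, unit);
    return future;
}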

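Note that this simplified version still fires the scheduled task even if the future completes normally (the completeExceptionally call simply returns false on an already-completed future). Production code would typically use whenComplete() to cancel the scheduled task. Also note that completing the future only unblocks the caller. It does not interrupt the worker thread that is still waiting on the downstream call, so the HTTP or database client needs its own connect and read timeouts to actually free that thread.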
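A variant that cancels the timer could look like the sketch below. It is one possible shape, not IronThread's code: the TimeoutSketch class name, the daemon timer thread, and the message format are all assumptions. It sticks to Java 8 APIs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class: Java 8 timeout helper that cleans up its timer.
class TimeoutSketch {

    // Daemon thread so the timer never keeps the JVM alive on shutdown.
    private static final ScheduledExecutorService SCHEDULER =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "timeout-scheduler");
                t.setDaemon(true);
                return t;
            });

    static <T> CompletableFuture<T> withTimeout(
            CompletableFuture<T> future, long timeout, TimeUnit unit) {
        ScheduledFuture<?> timer = SCHEDULER.schedule(
                () -> future.completeExceptionally(
                        new TimeoutException("Timed out after " + timeout + " " + unit)),
                timeout, unit);
        // Whichever way the future finishes, the pending timer is no longer needed.
        future.whenComplete((result, error) -> timer.cancel(false));
        return future;
    }

    public static void main(String[] args) {
        // A future nobody completes stands in for a hung downstream call.
        CompletableFuture<String> hung = new CompletableFuture<>();
        String outcome = withTimeout(hung, 100, TimeUnit.MILLISECONDS)
                .exceptionally(ex -> "Recs:Fallback")
                .join();
        System.out.println(outcome); // prints "Recs:Fallback"
    }
}
```

Chaining .exceptionally() after withTimeout() means a slow service and a failing service degrade the same way, which fits how the recommendations call is already treated.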

This is left out of IronThread's demo code for simplicity, but in production systems, timeout handling is essential.

Benchmark Results

Disclaimer: These measurements are illustrative, not a formal benchmark. They measure single-request latency, not throughput under concurrent load. Runs were executed after JVM warm-up with mocked downstream latency. The goal is to demonstrate the architectural impact of parallelization.

Environment: Apple MacBook Pro M3 Pro (11-core CPU, 18 GB Unified Memory)

===================================================================
Run # | Strategy   | Time (ms)  | Thread Name          | Status    
===================================================================
1     | Blocking   | 1708       | main                 | Success   
2     | Blocking   | 1704       | main                 | Success   
3     | Blocking   | 1712       | main                 | Success   
4     | Blocking   | 1709       | main                 | Success   
5     | Blocking   | 1707       | main                 | Success   
-------------------------------------------------------------------
1     | Async      | 1006       | IronThread-6         | Success   
2     | Async      | 1003       | IronThread-9         | Success   
3     | Async      | 1008       | IronThread-12        | Partial   
4     | Async      | 1004       | IronThread-15        | Success   
5     | Async      | 1009       | IronThread-18        | Success   
===================================================================

Key observations:

41% latency reduction — the direct result of parallelizing three independent calls. Async averages ~1 006 ms (bounded by the slowest call) vs. blocking's ~1 708 ms (sum of all calls). This isn't a novel optimisation; it's the expected outcome once you remove unnecessary sequential execution.
Pool isolation verified. Every async run executes on IronThread-* workers — not on the common pool or Tomcat threads.
Graceful degradation works. Run 3 shows a partial failure — recommendations failed, but the dashboard still loaded with user and order data intact.
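The 41% figure can be reproduced from the table. The sketch below only redoes that arithmetic on the ten measured values; LatencyMath and its helpers are hypothetical names.

```java
import java.util.Locale;
import java.util.stream.IntStream;

// Hypothetical demo class: recomputes the averages and the reduction from the table.
class LatencyMath {

    static double mean(int... samplesMs) {
        return IntStream.of(samplesMs).average().orElse(Double.NaN);
    }

    static double reductionPercent(double beforeMs, double afterMs) {
        return (beforeMs - afterMs) / beforeMs * 100.0;
    }

    public static void main(String[] args) {
        double blocking = mean(1708, 1704, 1712, 1709, 1707);
        double async = mean(1006, 1003, 1008, 1004, 1009);
        System.out.println(String.format(Locale.ROOT,
                "blocking=%.1f ms, async=%.1f ms, reduction=%.1f%%",
                blocking, async, reductionPercent(blocking, async)));
        // prints "blocking=1708.0 ms, async=1006.0 ms, reduction=41.1%"
    }
}
```

The mocked latencies alone predict almost the same number: 200 + 500 + 1 000 = 1 700 ms sequentially against 1 000 ms for the slowest call, a reduction of about 41.2%. The measured runs add only a few milliseconds of overhead on top.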

A natural next step would be to validate this under concurrent load — simulating 50+ simultaneous requests with a tool like JMeter or wrk to measure throughput, queue saturation, and tail latency behaviour.

Key Takeaways

The 41% latency improvement is the natural result of parallelizing independent calls that were previously sequential. It comes from three deliberate decisions:

Explicit thread pool isolation — avoid ForkJoinPool.commonPool() for blocking I/O.
Parallel execution — use supplyAsync() to start independent calls concurrently, and CompletableFuture.allOf() to combine them once they have all completed.
Graceful degradation — use .exceptionally() to contain failures without cascading.

No Java 21 required. No reactive framework. Just understanding when threads block and respecting pool boundaries.

Independent network calls that can happen in parallel should happen in parallel. The Bulkhead Pattern ensures that doing so doesn't create new failure modes.

Source Code: github.com/nunosilva-dev/iron-thread
