Forem: sly-the-fox

The 5th Agent Orchestration Pattern: Market-Based Task Allocation

sly-the-fox — Wed, 01 Apr 2026 20:16:30 +0000

Most conversations about agent orchestration patterns settle on the same four: pipeline, supervisor, router, blackboard. Each solves coordination differently. Pipelines chain steps linearly. Supervisors centralize control. Routers classify and dispatch. Blackboards let agents coordinate through shared state without direct communication.

These four cover a lot of ground. But there is a fifth pattern that comes from an older field, and it solves a problem the other four handle poorly: dynamic allocation across heterogeneous agents when cost matters.

The Pattern: Auction-Based Task Allocation

Instead of a supervisor deciding which agent handles a task, you let agents bid on it.

The mechanism works like this. A task enters the system. It gets broadcast to all available agents. Each agent evaluates the task against its own capabilities, current load, and estimated cost, then submits a bid. An auction engine evaluates the bids and assigns the task to the winner.

This is not a theoretical concept. The Consensus-Based Auction Algorithm (CBAA) and its extension, the Consensus-Based Bundle Algorithm (CBBA), have been used in multi-robot task allocation for over a decade. The research originated at MIT, primarily for coordinating robot fleets where each unit has different sensors, battery levels, and proximity to targets.

The core idea transfers directly to LLM agent systems. Replace "battery level" with "context window utilization." Replace "sensor capability" with "tool access and specialization." Replace "proximity" with "relevant cached context." The allocation logic is the same.

How It Works in Practice

A market-based orchestration layer has four phases:

1. Broadcast. The system publishes an available task with metadata (type, estimated complexity, required capabilities, deadline).

2. Bid. Each agent that can handle the task submits a bid. The bid includes a capability score (how well the agent matches the task requirements), a load factor (how busy the agent currently is), and a cost estimate (tokens, time, or whatever resource you are optimizing for).

3. Auction. The auction engine scores bids using a weighted function. A simple multiplicative version: score = capability * (1 - load) / cost. The code below uses a weighted sum instead, which lets you tune each factor's influence independently. More sophisticated versions factor in deadline urgency, agent track record, and bundle efficiency (assigning related tasks to the same agent to reduce context switching).

4. Assignment. The winning agent gets the task. Other agents free their reserved capacity.

class TaskAuction:
    def __init__(self, agents, weights=None):
        self.agents = agents
        self.weights = weights or {
            "capability": 0.5,
            "availability": 0.3,
            "cost": 0.2
        }

    def broadcast(self, task):
        bids = []
        for agent in self.agents:
            if agent.can_handle(task):
                bids.append({
                    "agent": agent,
                    "capability": agent.assess_capability(task),
                    "load": agent.current_load(),
                    "cost": agent.estimate_cost(task)
                })
        return bids

    def select_winner(self, bids):
        # Assumes capability, load, and cost are all normalized to [0, 1];
        # a raw token count as cost would swamp the other two terms.
        def score(bid):
            return (
                self.weights["capability"] * bid["capability"]
                + self.weights["availability"] * (1 - bid["load"])
                - self.weights["cost"] * bid["cost"]
            )

        # Highest score wins; no bids means no winner
        return max(bids, key=score)["agent"] if bids else None

The interesting property is adaptiveness. When the agent pool changes (new agents come online, existing ones hit capacity), the allocation adjusts automatically. No one rewrites routing rules. The market handles it.
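To make the adaptiveness concrete, here is a minimal sketch using the simple multiplicative score from the auction phase. The `Bid` dataclass, the agent names, and every number are hypothetical. The only point is that the winner changes when load changes, without anyone editing a routing rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bid:
    agent: str         # hypothetical agent name
    capability: float  # 0..1, fit for the task
    load: float        # 0..1, fraction of capacity in use
    cost: float        # > 0, in whatever unit you optimize for


def score(bid: Bid) -> float:
    # Simple multiplicative form: capability * (1 - load) / cost
    return bid.capability * (1.0 - bid.load) / bid.cost


def pick_winner(bids: List[Bid]) -> Optional[str]:
    # Highest score wins; an empty auction has no winner
    return max(bids, key=score).agent if bids else None


# Quiet period: the specialist is idle and wins despite its higher cost
quiet = [Bid("specialist", 0.9, 0.1, 2.5), Bid("generalist", 0.6, 0.1, 2.0)]
# Busy period: the same specialist is near capacity, so the generalist wins
busy = [Bid("specialist", 0.9, 0.9, 2.5), Bid("generalist", 0.6, 0.1, 2.0)]

print(pick_winner(quiet))  # specialist
print(pick_winner(busy))   # generalist
```

Nothing about the two agents' definitions changed between the two auctions. Only the reported load did.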

When This Pattern Fits

Market-based orchestration works best in specific conditions.

Heterogeneous agent pools. If your agents have meaningfully different capabilities, costs, or specializations, the bidding mechanism surfaces the best match dynamically. A supervisor could do this too, but the supervisor needs to know about every agent's current state. The market distributes that knowledge.

Variable workloads. When task volume and type shift unpredictably, static routing rules break. A market adjusts. During a spike of code review tasks, agents with code analysis capabilities naturally absorb more work because their capability scores are higher for those tasks.

Cost optimization matters. If you are paying per token and different models have different price points, the auction mechanism can factor cost directly into allocation. A GPT-4 class agent might bid high capability but high cost. A smaller model bids lower capability but lower cost. The auction weights decide the tradeoff based on task priority.

Adaptive fallback. A recent paper on the Agent Exchange concept describes a hybrid approach: use auction-based allocation when competition exists (multiple agents can handle the task), fall back to direct assignment when only one agent qualifies. This avoids the overhead of running auctions for tasks with obvious owners.
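The fallback rule is easy to sketch. This is my own illustration of the idea, not the paper's implementation: `allocate`, `can_handle`, and `run_auction` are hypothetical names, and the auction itself is passed in as a plain function.

```python
from typing import Callable, Optional, Sequence


def allocate(
    task: str,
    agents: Sequence[str],
    can_handle: Callable[[str, str], bool],
    run_auction: Callable[[str, Sequence[str]], str],
) -> Optional[str]:
    """Return the agent that should take the task, or None if nobody qualifies."""
    qualified = [a for a in agents if can_handle(a, task)]
    if not qualified:
        return None                       # nothing to allocate; caller decides
    if len(qualified) == 1:
        return qualified[0]               # direct assignment, no auction overhead
    return run_auction(task, qualified)   # real competition: run the auction


# Hypothetical skill table and a trivial stand-in auction (alphabetical winner)
SKILLS = {"coder": {"code_review"}, "reviewer": {"code_review"}, "writer": {"docs"}}
handles = lambda agent, task: task in SKILLS[agent]
auction = lambda task, candidates: sorted(candidates)[0]

print(allocate("docs", list(SKILLS), handles, auction))         # writer
print(allocate("code_review", list(SKILLS), handles, auction))  # coder
```

The auction only runs on the second call, where two agents actually compete for the task.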

When This Pattern Does Not Fit

The auction mechanism adds latency. Every task needs a broadcast, a bid collection window, and a scoring pass before work begins. For latency-sensitive workflows where you need a response in milliseconds, this overhead is disqualifying.

Small agent counts make the mechanism pointless. If you have three agents with non-overlapping specializations, a simple router is faster and equally effective. The market shines when there are many potential handlers and the "best" choice depends on runtime conditions.

Deterministic routing requirements also rule it out. If compliance or auditability demands that task type X always goes to agent Y, a market that might route it to agent Z based on load is the wrong pattern. Regulated industries often need predictable routing over optimal routing.

The Bigger Picture

What makes the market pattern interesting is that it distributes the coordination intelligence. In a supervisor pattern, the supervisor carries the full cognitive load of understanding every agent's state and capability. In a market pattern, each agent carries only the knowledge of its own state. The auction engine is stateless. It just scores bids.

This mirrors a pattern from distributed systems design. The more you centralize decision-making, the more fragile the center becomes. Markets distribute decisions to the edges, where the relevant information already lives.

The four established patterns (pipeline, supervisor, router, blackboard) handle most production use cases. But as agent pools grow larger and more heterogeneous, and as cost optimization becomes a real constraint rather than a nice-to-have, market-based allocation becomes worth considering. The robotics community figured this out years ago. The LLM agent community is starting to catch up.

I write about agent architecture and systems design on Substack. Building Sigil, a Python framework for structured LLM workflows.

Got My 39-Agent System Audited Live. Here's What the Maturity Scorecard Revealed.

sly-the-fox — Fri, 27 Mar 2026 15:30:00 +0000

Last night at AI Tinkerers, someone audited my multi-agent system in front of the room. Not a demo. Not a presentation. An actual architectural assessment using knowledge-graph analysis, scored against established maturity frameworks.

The system has 39 specialized agents across five categories, defined governance protocols, six workflow types with 8-17 steps each, and a dedicated evolution loop for continuous improvement. I've been building it for months.

The Audit

Marcus Waldman ran his iConsult tool against the full architecture. The tool maps agent definitions, workflow structures, and coordination patterns into a knowledge graph, then scores them against patterns from Arsanjani and Bustos's work on agentic architectural patterns.

Here's what came back:

Category	Rating
Coordination & Planning	Established
Explainability & Compliance	Emerging
Robustness & Fault Tolerance	Not Started
Human-Agent Interaction	Emerging
Agent-Level Capabilities	Not Started
System-Level Infrastructure	Not Started
Continuous Improvement	Emerging

Failure chain coverage: 20%. One of five steps in the automated recovery chain existed. The rest were missing entirely.

A system with 39 agents, a three-tier supervisor hierarchy, and dedicated auditor and sentinel agents scored "Not Started" on robustness. That is the gap most teams aren't talking about.

Arsanjani's 6 Levels of Agent Maturity

Ali Arsanjani (Google Cloud) published a maturity model that maps where agent systems actually fall on a capability spectrum. Most of us think we're higher than we are.

Level 0: No Agents. Traditional software. No autonomous components.

Level 1: Single Agent with Tools. One LLM with function calling. This is where most "agentic" products actually live. The agent can use tools but has no planning, no memory beyond the conversation, and no coordination with other agents.

Level 2: Multi-Agent Coordination. Multiple agents with defined roles and handoff patterns. Supervisor or router dispatches work. This is where the orchestration problem starts to bite.

Level 3: Autonomous Planning. Agents can decompose tasks, create plans, and execute them with minimal human oversight. The system handles multi-step workflows without constant prompting.

Level 4: Adaptive Systems. Agents learn from outcomes, adjust strategies, and improve over time. Self-evaluation loops. Performance metrics that feed back into behavior.

Level 5: Bureaucracy of Agents. Dedicated oversight agents. Auditors. Inspectors. Governance structures that exist specifically to monitor and evaluate other agents. This is the level that sounds like overkill until you realize it's the only way to maintain reliability at scale.

My system has governance agents. It has an auditor, a sentinel, an evaluator, and a coherence checker. On paper, it touches Level 5. In practice, the audit showed the governance layer is partially built but the infrastructure underneath it (automated recovery, dynamic registry, event bus) doesn't exist yet.

You can have the org chart without the plumbing. The maturity model measures the plumbing.

Why Majority Voting Fails

There's a related finding from the AgentAuditor paper (USC, February 2026) that connects directly to this maturity problem.

The standard approach to multi-agent reliability is majority voting. Run the same task through multiple agents, take the consensus answer. Sounds reasonable. It's also broken.

The problem is correlated bias. When agents share the same training data and similar reasoning patterns, they don't produce independent votes. They converge on the same wrong answer. Majority voting fails for the same reason groupthink fails in organizations. More voices don't help when they all share the same blind spots.
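A quick back-of-the-envelope calculation shows why independence is the whole game. The 70% per-agent accuracy is an assumed number for illustration, not a figure from the paper.

```python
from math import comb


def majority_accuracy(n: int, p: float) -> float:
    """Probability that a strict majority of n independent voters is correct,
    when each voter is correct with probability p (n assumed odd)."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


p = 0.7                                 # assumed per-agent accuracy
independent = majority_accuracy(3, p)   # three genuinely independent agents
fully_correlated = p                    # three agents that always agree act as one voter

print(round(independent, 3), fully_correlated)  # 0.784 0.7
```

Three independent agents lift accuracy from 0.70 to about 0.78. Three perfectly correlated agents buy nothing: you pay three times for the same vote. Real systems sit somewhere between those two extremes.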

AgentAuditor's approach was to map reasoning trees and search for path divergences instead of counting votes. The result: 5% accuracy improvement over majority voting. Not because the individual agents were better, but because the auditing structure was better.

This is exactly the gap the audit exposed in my system. I have a sentinel and an auditor, but they're watching for rule violations, not reasoning divergences. The governance layer checks process. It doesn't check whether agents are converging on the same blind spot. That's a different kind of auditing entirely.

The lesson: you don't fix reliability by adding more agents. You fix it by adding structural auditing that can identify where reasoning paths diverge. It's a coordination architecture problem, not a scaling problem.

The Numbers Behind the Hype

Gartner reported a 1,445% surge in multi-agent inquiries. At the same time, it projects that over 40% of agentic AI projects will be cancelled by the end of 2027. Only about 130 out of thousands of vendors in the space are building real multi-agent capabilities.

Deloitte estimates the market at $8.5B in 2026, growing to $35-45B by 2030. But those numbers assume proper orchestration. Without it, you get the 40% cancellation rate.

The demand-reality gap isn't about model capability. GPT-4, Claude, and Gemini can all handle complex reasoning. The bottleneck is orchestration maturity. How do you coordinate agents? How do you detect failures? How do you recover? How do you know your system is actually working as designed?

Most teams skip these questions because they're not as exciting as adding another agent.

Self-Assessment

If you're building a multi-agent system, here are the questions worth asking:

What level are you actually at? Not what your architecture diagram suggests. What does the running system demonstrate?
Can your system detect its own failures? Not log them. Detect them in real time and route them to recovery logic.
How do you audit agent behavior? If the answer is "we read the logs," you're at Level 1 maturity for observability regardless of how many agents you have.
What happens when an agent produces wrong output? Does the system catch it? Or does it propagate through the pipeline?
Is your governance layer structural or decorative? Having an "auditor agent" in the config is different from having an auditor agent that actually interrupts workflows when quality drops.

I had to answer these questions publicly last night. That's the value of external assessment. Your own evaluation will always be generous.

What I'm Doing About It

The audit produced a concrete implementation plan. Phase 1 is the robustness gap: circuit breakers, retry policies, health checks, and a failure chain that actually covers all five steps. The coordination score was reasonable because the supervisor architecture and workflow definitions are solid. But coordination without robustness is a system that works until it doesn't, and when it fails, there's nothing to catch it.
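For readers who haven't built one, here is a minimal sketch of a circuit breaker around an agent call. It is an illustration of the general technique, not the code from my implementation plan; the class name, thresholds, and injectable clock are all assumptions.

```python
import time


class CircuitOpen(Exception):
    """Raised when calls are being rejected without reaching the agent."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after   # seconds to wait before a trial call
        self.clock = clock               # injectable so tests can control time
        self.failures = 0
        self.opened_at = None            # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpen("agent is failing; call skipped")
            self.opened_at = None        # cool-down elapsed: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()   # trip (or re-trip) the breaker
            raise
        self.failures = 0                # any success closes the circuit again
        return result
```

Wrapping each agent invocation in `breaker.call(agent.execute, task)` means a failing agent stops receiving work after a few errors instead of dragging every workflow through the same timeout.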

The maturity model isn't a checklist to complete. It's a map for knowing where you actually are and what to build next. The frameworks exist. The assessment tools are getting better. The question is whether you're willing to run the audit.

I build Sigil, an open-source symbolic computation framework, and write about systems architecture on Substack.

MCP's Topology Is Changing Under Your Feet

sly-the-fox — Wed, 25 Mar 2026 19:51:05 +0000

Most developers think about MCP as a flat topology. One host application in the center, MCP servers radiating outward like spokes on a wheel. Claude Desktop talks to a filesystem server, a database server, a GitHub server. Clean and simple.

That model was roughly accurate in 2024. It's already wrong.

The Flat Model Breaks

The hub-and-spoke model assumes the host application handles everything: authentication, authorization, audit logging, rate limiting, credential management. For a developer running three MCP servers on their laptop, that works fine. The host is Claude Desktop or Cursor, and the blast radius of a misconfiguration is one person's machine.

Scale that to a team of fifty engineers, each connecting to shared MCP servers that access production databases, internal APIs, and cloud infrastructure. Now every host application is independently managing credentials. There's no centralized audit trail. No unified access policy. No way to revoke a compromised server's access across all clients simultaneously.

This is the same problem that drove the API gateway evolution in the early 2010s. Individual services talking directly to clients created an unmanageable security surface. The solution was inserting a gateway layer that centralized cross-cutting concerns.

MCP is following the same path.

The Gateway Tier

In enterprise deployments, a gateway tier has inserted itself between host applications and MCP servers. The topology is no longer flat. It's two-tier: host to gateway to servers.

Three products illustrate this pattern:

Stacklok separates identity, policy, and observability into discrete layers. Their architecture treats the gateway as the security boundary rather than the individual server. Authentication happens once at the gateway. Policy enforcement is centralized. Every tool invocation is logged with full context: who called what, when, with what parameters, and what the response contained.

MintMCP Gateway provides unified authentication across multiple MCP servers. Instead of each server managing its own credentials, the gateway handles OAuth flows, API key rotation, and session management centrally. The MCP servers behind it become stateless translation layers.

Traefik Hub implements what they call a "Triple Gate" pattern: defense-in-depth where traffic passes through authentication, authorization, and content inspection before reaching the MCP server. Each gate operates independently, so a failure in one doesn't compromise the others.

The pattern across all three is the same. Move cross-cutting concerns up a layer. Let MCP servers focus on protocol translation. Let the gateway handle trust.

What This Looks Like Architecturally

In the flat model:

Host App → MCP Server A (with creds for Service A)
         → MCP Server B (with creds for Service B)
         → MCP Server C (with creds for Service C)

Each server manages its own credentials. Each server is its own security boundary. The host trusts all of them equally.

In the gateway model:

Host App → Gateway (auth, audit, rate limiting, policy)
              → MCP Server A (stateless, no creds)
              → MCP Server B (stateless, no creds)
              → MCP Server C (stateless, no creds)

Credentials live in the gateway. Servers are stateless translators. The gateway enforces isolation between servers. A compromised server can't access another server's resources because it never had the credentials in the first place.

This also changes how you handle the cross-server exfiltration problem. In the flat model, if a malicious MCP server instructs the model to pass data from a trusted server through its own tools, there's nothing stopping it. Both servers are equally trusted by the host. In the gateway model, the gateway can enforce that data from Server A never flows through Server B's tools. The trust boundary is at the gateway, not at the host's goodwill.
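Here is a rough sketch of what that flow rule could look like at the gateway. It is not any vendor's API: `Gateway`, `check_call`, and the server names are hypothetical, and it assumes the gateway can already tell which servers' responses fed into a call's arguments, which is the hard part in practice.

```python
class FlowViolation(Exception):
    """Raised when data would cross a boundary the policy forbids."""


class Gateway:
    def __init__(self, blocked_flows):
        # blocked_flows: iterable of (origin_server, destination_server) pairs
        self.blocked_flows = set(blocked_flows)

    def check_call(self, destination, argument_origins):
        """Reject a tool call if any argument originated on a blocked server.

        argument_origins: names of the servers whose responses contributed
        to the arguments of this call (provenance tracking is assumed).
        """
        for origin in argument_origins:
            if (origin, destination) in self.blocked_flows:
                raise FlowViolation(
                    f"data from {origin} may not reach {destination}"
                )
        return True


# Hypothetical policy: nothing read from the database server may be sent
# through the third-party server's tools
gateway = Gateway(blocked_flows=[("database", "third_party")])
print(gateway.check_call("github", ["database"]))  # True
```

The host never sees this decision. The policy lives in one place, and a new MCP server inherits it the moment it is registered behind the gateway.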

Why This Mirrors API Gateway History

If you were building APIs in 2012, you went through this exact transition. Services exposed endpoints directly. Clients called them. Authentication was per-service. Logging was per-service. Rate limiting was per-service.

Then Kong, Apigee, and AWS API Gateway appeared. Cross-cutting concerns moved to a centralized layer. Services got simpler. Security got more manageable. Observability became possible at a system level instead of requiring instrumentation in every individual service.

MCP is replaying this cycle in compressed time. The flat model worked when MCP was a developer tool. As it becomes infrastructure, the gateway tier isn't optional. It's the natural architectural response to the same pressures that created API gateways a decade ago.

What Changes for You

If you're building MCP integrations today, the practical implications are:

Design your MCP servers to be stateless. Don't bake credentials into the server. Accept them from the environment or a gateway. This makes your server portable across deployment topologies.

Separate protocol translation from business logic. Your MCP server should translate between the MCP protocol and your existing service APIs. Business logic stays in the services. The thinner the MCP layer, the easier it is to slot a gateway in front of it.

Anticipate centralized auth. If you're building auth into individual MCP servers right now, you'll probably rip it out later when a gateway takes over that responsibility. Build it if you need it today, but build it as a pluggable layer you can remove.

Instrument observability now, not later. Gateway-level logging captures every tool invocation across all servers. If you're not logging at the MCP layer today, you have no visibility into what your agents are actually doing with the tools you gave them.

The MCP specification's 2026 roadmap includes middleware standardization, which will formalize how gateways interact with the protocol. The ad-hoc gateway tier is becoming a first-class architectural pattern.

The Bigger Picture

MCP's topology evolution tells you something about where AI infrastructure is heading. The initial wave of any protocol is flat and simple. The production wave adds layers. The mature wave standardizes those layers.

We're at the transition between wave one and wave two. The teams building MCP servers as thin, stateless translators with gateway-ready architecture will have an easier time in wave three. The teams baking everything into monolithic servers are building for a topology that's already changing under them.

The protocol won the standardization battle. The architecture battle is just starting.

I'm building with MCP daily as part of Mega-OS (39-agent operating system). This post is part of a series on MCP's actual architecture vs. its perceived architecture. The full analysis, including security model gaps and the context window problem, is on The Alignment Layer.

Your Multi-Agent System Has a Routing Problem

sly-the-fox — Mon, 23 Mar 2026 02:07:22 +0000

Five agents. Twenty possible one-way connections. Ten agents? Ninety. The math is simple and the consequences are brutal.

Most multi-agent systems start with a reasonable architecture. Two or three agents with clear responsibilities. The orchestrator routes work. Everything makes sense. Then you add a fourth agent. A fifth. A specialized summarizer. A governance layer. Suddenly every agent can reach every other agent, and nobody's drawn a map of which paths should actually exist.

This is the N-squared coordination problem. And it's the architectural debt that kills multi-agent systems before they ever reach production.
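The count is just n * (n - 1): each of n agents can call each of the other n - 1. A tiny sketch, with a hypothetical function name:

```python
def directed_paths(n: int) -> int:
    # Every agent can call every other agent: n * (n - 1) one-way paths
    return n * (n - 1)


for n in (3, 5, 10, 39):
    print(n, directed_paths(n))
# 3 6
# 5 20
# 10 90
# 39 1482
```

At 39 agents that is 1,482 possible paths, which is far more than anyone can hold in their head or draw on a diagram.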

The group chat anti-pattern

The default in most agent frameworks is full connectivity. Any agent can call any tool, read any state, trigger any other agent. It feels flexible. It's actually fragile.

When Agent A can invoke Agent B, C, D, and E directly, you've created implicit dependencies that aren't visible in your architecture diagram (assuming you have one). When something breaks, the failure could have originated from any of those paths. Debugging becomes combinatorial.

The parallel in traditional software engineering is obvious. We stopped building monoliths where every module calls every other module. We drew service boundaries. We defined interfaces. We made coupling explicit and limited.

Multi-agent systems need the same treatment, but most builders skip it because the framework doesn't enforce it.

Trust boundaries as architecture

A trust boundary is a line you draw between agents that limits what they can access and who they can reach. It's not about security in the traditional sense (though it helps). It's about making the system legible.

Here's what this looks like in practice:

# Without trust boundaries: any agent reaches anything
class AgentOrchestrator:
    def route(self, task):
        # Pick the "best" agent and let it loose
        agent = self.select_agent(task)
        return agent.execute(task, context=self.full_context)

# With trust boundaries: explicit routing and scoped access
class BoundaryViolation(Exception):
    """Raised when an agent tries a path its boundary does not allow."""

class BoundedOrchestrator:
    def __init__(self, agents):
        # agents: mapping of agent name to agent object (assumed to expose execute())
        self.agents = agents
        self.boundaries = {
            "summarizer": {
                "can_read": ["documents", "notes"],
                "can_reach": ["editor"],
                "cannot_reach": ["database_writer", "auth_manager"]
            },
            "database_writer": {
                "can_read": ["validated_records"],
                "can_reach": ["auditor"],
                "cannot_reach": ["summarizer", "external_api"]
            }
        }

    def route(self, source_agent, target_agent, task):
        rules = self.boundaries.get(source_agent)
        # Deny by default: unknown sources, explicit denials, and targets
        # missing from the allow-list are all rejected
        if (
            rules is None
            or target_agent in rules.get("cannot_reach", [])
            or target_agent not in rules.get("can_reach", [])
        ):
            raise BoundaryViolation(
                f"{source_agent} cannot reach {target_agent}"
            )
        # Scoped context: only what this agent is allowed to see
        # (scoped_context is assumed to be implemented elsewhere on this class)
        context = self.scoped_context(source_agent, rules["can_read"])
        return self.agents[target_agent].execute(task, context=context)

The difference isn't complexity. It's clarity. The bounded version makes every routing decision explicit. When something breaks, you know exactly which paths were available and which one failed.

Three patterns that work

After building systems with 30+ agents, three routing patterns consistently hold up:

1. Hub-and-spoke

A central router handles all inter-agent communication. Agents never talk to each other directly. This is the simplest model and works well up to about 15 agents. The router becomes a bottleneck at scale, but the traceability is excellent.

2. Hierarchical routing

Agents are organized into groups (governance, technical, knowledge) with a group coordinator. Agents within a group can communicate freely, but cross-group communication goes through the coordinators. This scales better and naturally creates bounded contexts.

3. Pipeline with side channels

Work flows through a defined sequence (plan, execute, review, document), but specific agents can reach specific others outside the pipeline for scoped queries. The pipeline is the primary path; side channels are explicit exceptions with documented justification.

The worst pattern is the implicit mesh, where any agent can invoke any other agent through shared state or direct calls. It works until it doesn't, and when it breaks, the failure surface is every connection in the system.
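As one example, hierarchical routing reduces to a small path function. This is a sketch under assumed names: the `GROUPS` table, the agent names, and the `<group>-coordinator` convention are hypothetical.

```python
# Hypothetical group membership; a real system would load this from config
GROUPS = {
    "governance": {"auditor", "sentinel"},
    "technical": {"coder", "reviewer"},
    "knowledge": {"summarizer", "librarian"},
}


def group_of(agent: str) -> str:
    for group, members in GROUPS.items():
        if agent in members:
            return group
    raise KeyError(f"{agent} is not assigned to any group")


def route_path(source: str, target: str) -> list:
    """Return the hops a message takes under hierarchical routing."""
    src, dst = group_of(source), group_of(target)
    if src == dst:
        return [source, target]   # same group: direct communication
    # Cross-group traffic always passes through both coordinators
    return [source, f"{src}-coordinator", f"{dst}-coordinator", target]


print(route_path("coder", "reviewer"))
# ['coder', 'reviewer']
print(route_path("coder", "auditor"))
# ['coder', 'technical-coordinator', 'governance-coordinator', 'auditor']
```

Every cross-group message now has exactly one legal path, so a failure between groups can only have travelled through the two coordinators.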

Scoped context matters as much as scoped access

Trust boundaries aren't just about who can call whom. They're about what each agent can see. A summarizer working on meeting notes doesn't need access to your financial records. A code reviewer doesn't need to see customer PII.

When you scope the context each agent receives, two things improve immediately:

Agents perform better. Less noise in the context means more focused output. An agent that receives only the documents it needs produces better summaries than one drowning in the full system state.
Failures are contained. If an agent hallucinates or makes a bad decision, the blast radius is limited to what it could access. A summarizer with access to everything can corrupt everything. A summarizer scoped to documents can only affect documents.
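Scoping the context can be as simple as filtering a dictionary before the agent ever sees it. A minimal standalone sketch, where the function name, the section names, and the data are all made up for illustration:

```python
def scoped_context(full_context: dict, can_read: list) -> dict:
    """Return only the context sections an agent is allowed to read."""
    return {key: full_context[key] for key in can_read if key in full_context}


# Hypothetical full system state
full_context = {
    "documents": ["meeting-2026-03-20.md"],
    "notes": ["follow up with vendor"],
    "financial_records": ["q1-ledger.csv"],
}

summarizer_view = scoped_context(full_context, ["documents", "notes"])
print(sorted(summarizer_view))  # ['documents', 'notes']
```

The summarizer cannot leak or corrupt the financial records because they were never in its context to begin with.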

The routing question

The next time you add an agent to your system, ask three questions before writing any code:

Who can this agent reach? List the specific agents it's allowed to invoke or send data to.
What can this agent see? Define the scoped context it receives, not the full system state.
Who can reach this agent? Inbound access matters as much as outbound. An agent that any other agent can trigger is an implicit dependency for the entire system.

If you can't answer these questions, the agent doesn't have an architecture. It has a hope.
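The third question is the one people skip, and it is the easiest to automate. If you already keep an outbound map, inverting it gives you the inbound view. A small sketch with hypothetical agent names:

```python
# Hypothetical outbound map: who each agent is allowed to reach
CAN_REACH = {
    "summarizer": ["editor"],
    "database_writer": ["auditor"],
    "editor": ["auditor"],
}


def inbound_map(can_reach: dict) -> dict:
    """Invert an outbound map: for each agent, who is allowed to reach it."""
    inbound = {agent: set() for agent in can_reach}
    for source, targets in can_reach.items():
        for target in targets:
            inbound.setdefault(target, set()).add(source)
    return inbound


inbound = inbound_map(CAN_REACH)
print(sorted(inbound["auditor"]))  # ['database_writer', 'editor']
```

An agent whose inbound set keeps growing is turning into an implicit dependency, and this view shows it before the architecture diagram does.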

The pattern is consistent across every well-designed system, from microservices to operating systems to multi-agent AI. Constraints don't limit capability. They make capability legible. And legibility is what lets you debug, scale, and trust the system you're building.

Building trust infrastructure for AI agents. Follow for weekly patterns on agent architecture, governance, and the systems underneath.

Try Sigil: github.com/chaddhq/sigil | Subscribe: The Alignment Layer

Your Multi-Agent System Is a Black Box You Built Yourself

sly-the-fox — Sat, 21 Mar 2026 22:45:46 +0000

Everyone building multi-agent systems is focused on making agents smarter. Nobody talks about what happens when your agents are smart enough but your state files are three days stale.

I run 39 agents daily. The system that breaks isn't the one with dumb agents. It's the one where nobody can tell what the agents were looking at when they made their decisions. You built the agents, you defined their roles, you wired the routing. But when the system produces a result, can you trace the reasoning chain? Can you tell what Agent 3 decided, what context it received, what it chose to ignore?

Probably not. And that invisible middle is where your worst bugs live.

Logging is not observability

The first instinct is to add logging. Log every agent invocation, every tool call, every response. Some frameworks do this by default. You end up with thousands of lines per task, and the signal-to-noise ratio approaches zero.

Logging tells you what happened. Observability tells you why it happened. The difference matters because in a multi-agent system, the "what" is usually obvious. Agent A called Agent B. Agent B produced a summary. Agent C made a decision based on that summary. The "why" is where things get interesting.

Why did Agent B summarize the document that way? What context did it receive? Was there information it should have seen but didn't? When Agent C made its decision, was it responding to the actual document or to Agent B's interpretation of the document?

These questions can't be answered with log lines.

The confident wrong answer

The worst failure mode in a multi-agent system isn't a crash. Crashes are loud. You notice them. You fix them. The worst failure mode is the confident wrong answer.

Agent A retrieves the right documents. Agent B summarizes them but subtly mischaracterizes one key point. Agent C makes a decision based on that summary. Agent D formats the output beautifully. The final result looks correct, reads professionally, and is wrong in a way that nobody catches until a human notices the downstream damage days later.

This failure mode exists because each agent in the chain operated correctly within its own scope. Agent B didn't fail. It summarized. The summarization just lost a critical nuance. And since nobody is watching the intermediate representations, the error propagates silently through the system.

# What most systems track
{
    "agent": "summarizer",
    "input_tokens": 4200,
    "output_tokens": 380,
    "latency_ms": 1240,
    "status": "success"
}

# What observability actually requires
{
    "agent": "summarizer",
    "task_id": "review-q1-financials",
    "input_context": {
        "documents": ["q1-report.pdf", "budget-variance.csv"],
        "scoped_to": ["financial_data"],
        "excluded": ["employee_records"]
    },
    "reasoning_trace": {
        "key_points_extracted": 7,
        "points_included_in_summary": 5,
        "points_omitted": [
            "Q1 variance exceeded threshold by 12%",
            "Vendor contract renewal pending"
        ],
        "omission_reason": "below relevance threshold (0.6)"
    },
    "downstream_consumers": ["decision_agent", "audit_trail"],
    "confidence": 0.82
}

The first record tells you the agent ran. The second tells you what it thought it was doing. That difference is the entire gap between debugging and guessing.

Three layers of agent observability

I've been running a 39-agent system for a few months now. Three observability layers consistently matter:

1. Context tracing

For every agent invocation, capture what context the agent received, not just what it produced. This includes scoped documents, upstream agent outputs, and any system state it had access to. When something goes wrong, the first question is always "what did this agent actually see?" Without context tracing, you're reconstructing the answer from logs and hope.

2. Decision boundaries

Agents make decisions. Summarizers decide what to include and what to omit. Routers decide which agent handles a task. Reviewers decide whether work passes or fails. For each decision point, capture the inputs to the decision, the decision itself, and the threshold or reasoning that produced it. This turns opaque agent behavior into auditable decision records.

3. Propagation tracking

When Agent B's output becomes Agent C's input, track that lineage explicitly. Not just "B ran before C," but "C's context included B's output, specifically these fields." When a confident wrong answer emerges at the end of a chain, propagation tracking lets you walk backward through the chain and find exactly where the signal degraded.
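
Walking backward through a chain can be sketched as a breadth-first traversal over recorded lineage. The `Invocation` structure and the run ids below are assumptions for illustration; the point is that each invocation stores which upstream outputs, and which fields of them, it consumed.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Invocation:
    id: str
    agent: str
    # Upstream invocation id -> output fields that entered this agent's context
    consumed: dict = field(default_factory=dict)


def walk_back(invocations: dict, start_id: str) -> list:
    """Return (agent, invocation id) pairs from start_id back to the roots."""
    order, seen, pending = [], {start_id}, deque([start_id])
    while pending:
        current = invocations[pending.popleft()]
        order.append((current.agent, current.id))
        for upstream_id in current.consumed:
            if upstream_id not in seen:
                seen.add(upstream_id)
                pending.append(upstream_id)
    return order


runs = {
    "run-1": Invocation("run-1", "extractor"),
    "run-2": Invocation("run-2", "summarizer", {"run-1": ["key_points"]}),
    "run-3": Invocation("run-3", "decision_agent",
                        {"run-2": ["summary", "confidence"]}),
}
path = walk_back(runs, "run-3")
```

Starting from the invocation that produced the wrong answer, `path` lists the nearest suspects first, and `consumed` tells you which fields to inspect at each hop.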

Implementation without overhead

The practical concern is always performance. Adding observability shouldn't double your latency or token costs. Three approaches that keep overhead minimal:

Structured metadata, not full traces. You don't need to capture every token. Capture the decision-relevant metadata: what context was scoped, what was included vs. excluded, what threshold was applied. This is typically 5-10% of the full trace size.

Sampling for healthy paths. Trace 100% of failures and anomalies. Sample 10-20% of successful paths. You'll catch degradation patterns without drowning in data.

Async emission. Don't block agent execution to write observability data. Emit events asynchronously to a separate store. The agent keeps working. The trace data arrives slightly behind, which is fine because you're not reading it in real time anyway. You're reading it when something goes wrong.
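
The sampling and async emission ideas above can be sketched together with the standard library. `should_trace` and `AsyncEmitter` are hypothetical names, the 15% default sits in the 10-20% range mentioned above, and the list passed as `store` stands in for a real trace store.

```python
import queue
import random
import threading


def should_trace(status: str, anomaly: bool = False,
                 sample_rate: float = 0.15, rng=random) -> bool:
    """Trace every failure or anomaly; sample a fraction of healthy runs."""
    if status != "success" or anomaly:
        return True
    return rng.random() < sample_rate


class AsyncEmitter:
    """Hands events to a background thread so agents never wait on the store."""

    def __init__(self, store: list, maxsize: int = 1000):
        self._queue = queue.Queue(maxsize=maxsize)
        self._store = store      # stand-in for a real trace store
        self.dropped = 0         # events discarded because the queue was full
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def emit(self, event: dict) -> None:
        """Non-blocking: drop the event rather than stall the agent."""
        try:
            self._queue.put_nowait(event)
        except queue.Full:
            self.dropped += 1

    def _drain(self) -> None:
        while True:
            event = self._queue.get()
            if event is None:    # sentinel sent by close()
                return
            self._store.append(event)

    def close(self) -> None:
        """Flush queued events, then stop the worker."""
        self._queue.put(None)
        self._worker.join()


store = []
emitter = AsyncEmitter(store)
for status in ("success", "error", "success"):
    # sample_rate=0.0 keeps only failures, which makes this demo deterministic
    if should_trace(status, sample_rate=0.0):
        emitter.emit({"agent": "summarizer", "status": status})
emitter.close()
print(store)
```

Dropping events when the queue is full protects agent latency at the cost of completeness. If losing a failure trace is unacceptable, a separate unbounded queue for failures is one option.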

The observability question

Before you add another agent to your system, try answering these questions about the agents you already have:

When Agent B summarizes a document, can you see what it omitted and why?
When the final output is wrong, can you trace backward to the specific agent that introduced the error?
Can you tell the difference between an agent that failed and an agent that succeeded at the wrong task?

If you can't answer these, you're operating a black box. The fact that you built the box doesn't mean you can see inside it.

The pattern holds across every complex system. Capability without observability is a liability. If you can't watch your agents think, you're just waiting for the confident wrong answer to find its way to production.

I build and operate multi-agent systems daily. Writing about what breaks and what works at The Alignment Layer.

Sigil (cryptographic audit trails for AI agents): github.com/sly-the-fox/sigil

Your Multi-Agent System Has an Identity Problem

sly-the-fox — Fri, 20 Mar 2026 00:19:04 +0000

You know what happened. You don't know who did it.

That's the state of most multi-agent systems right now. Five agents process the same request. Three of them write to shared state. Something goes wrong. The logs say "task completed successfully." Every agent claims innocence. Nobody can prove anything.

This isn't a hypothetical. It's the default.

The anonymous agent problem

Most agent frameworks treat agents as interchangeable workers. Agent A calls a tool. Agent B reads the result. The orchestrator routes work based on capability, not identity. Logs capture timestamps and tool calls, but authorship is an afterthought.

This works fine when you have two agents running a simple pipeline. It falls apart the moment you have five or more agents with overlapping access to shared resources. The question stops being "what happened" and becomes "who did this, and can they prove it?"

Traditional logging doesn't answer that question. Logs are mutable. They're written by the system itself. If an agent can write to state, it can (in theory) write to logs. There's no separation between the actor and the record of the action.

Identity as infrastructure

In distributed systems, identity isn't a feature. It's infrastructure. You can't build authorization without it. You can't build audit trails without it. You can't build trust without it.

The same principle applies to multi-agent systems, but most builders skip it entirely. They jump straight to permissions ("Agent A can read, Agent B can write") without first establishing a verifiable identity layer underneath.

Here's what that looks like in practice:

from sigil import Notary

# Each agent gets its own signing identity
notary = Notary(agent_id="data-processor")

# Every action produces a signed attestation
attestation = notary.attest(
    action="write",
    target="shared_state.json",
    detail={"field": "revenue", "old": 0, "new": 15000}
)

# The attestation is cryptographically bound to this specific agent
print(attestation.agent_id)    # "data-processor"
print(attestation.signature)   # Ed25519 signature
print(attestation.chain_hash)  # Links to this agent's previous action

Three things happen here that don't happen with standard logging:

The action is signed. The attestation is cryptographically bound to the agent that created it. You can verify authorship independently.
The chain is hash-linked. Each attestation references the previous one for that agent. You can detect gaps, deletions, or insertions.
Identity is the primitive. The agent doesn't just perform an action. It claims the action under its own verifiable identity.

Why this matters now

The EU AI Act includes record-keeping and traceability requirements for high-risk AI systems. SOC 2 auditors are starting to ask how AI-generated changes are attributed. Regulated industries need to demonstrate that automated actions can be traced to specific system components.

"The AI did it" isn't going to satisfy an auditor. "Agent data-processor signed this action at 14:32:07, and here's the cryptographic proof" might.

But compliance isn't even the most practical reason. The most practical reason is debugging. When three agents touch the same data and the result is wrong, verifiable identity tells you exactly where to look. No guessing. No "it must have been the summarizer." Proof.

The identity stack

When you start thinking about agent identity as infrastructure, a natural stack emerges:

Identity — every agent has a unique, verifiable identity (cryptographic key pair)
Attribution — every action is signed by the agent that performed it
Continuity — each agent maintains a hash chain linking all its actions in sequence
Verification — any observer can verify an attestation without trusting the system that created it

Most systems have none of these layers. Some have basic attribution through logging. Almost none have continuity or independent verification.

The pattern is the same one that made version control, TLS certificates, and blockchain valuable. Not the hype around them, but the underlying principle: when you can verify authorship without trusting the author, you can build systems that scale trust.

Getting started

If you want to add identity to your agent system, the simplest starting point is Sigil (pip install sigil-notary). It gives each agent a signing identity, hash-chains their actions, and produces attestations you can verify offline.

But the principle matters more than the tool. Whatever you build, start with identity. Not permissions. Not logging. Identity. Because everything else in the trust stack depends on knowing, with certainty, who did what.

Building trust infrastructure for AI agents. Follow for weekly patterns on agent architecture, governance, and the systems underneath.

Try Sigil: github.com/chaddhq/sigil | Subscribe: The Alignment Layer

Your AI Agents Need an Accountability Layer

sly-the-fox — Thu, 19 Mar 2026 02:44:25 +0000

You shipped a multi-agent system. Agents route tasks, process data, produce outputs. It works. Stakeholders are happy.

Then someone from compliance shows up.

"Which agent made this decision? What data did it have access to? Can you prove nothing was modified after the fact?"

You check your logs. They're there. But they're mutable. Any process with write access could have altered them. You have observation, not evidence. That distinction is about to matter more than most teams realize.

The Accountability Gap

Most agent systems have extensive logging. Print statements, structured JSON, maybe a centralized log aggregator. That covers observability.

Accountability is a different question. Can you prove what happened? Can you demonstrate that the record is complete, unaltered, and attributable to a specific actor?

Traditional logging fails this test for three reasons:

Mutability. Logs can be edited, truncated, or deleted. If an agent (or a compromised process) modifies a log entry, there's no cryptographic evidence of the change.
Attribution. Log entries say "Agent X did Y" but there's no signature proving Agent X actually produced that entry. Any process writing to the same log can impersonate any agent.
Ordering. Timestamps can be spoofed. Without a hash chain, there's no way to prove event A happened before event B, or that no events were inserted between them.

What Accountability Actually Requires

A proper accountability layer needs three properties:

Identity. Each agent has a cryptographic key pair. When it acts, it signs the record with its private key. The signature is verifiable by anyone with the public key. Nobody can impersonate the agent without its private key.

Integrity. Each record includes a hash of the previous record. This creates a chain where altering any entry invalidates every subsequent hash. Tampering becomes detectable, not just unlikely.

Sequence. The hash chain provides a provable ordering. Event 47 references event 46's hash. You can reconstruct the entire sequence and verify nothing was inserted, deleted, or reordered.

Here's what that looks like in practice:

import hashlib
import json
from datetime import datetime, timezone

def create_attestation(agent_id: str, action: str, data: dict, prev_hash: str) -> dict:
    """Create a hash-chained attestation record."""
    record = {
        "agent_id": agent_id,
        "action": action,
        "data": data,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,
    }
    # Hash the canonical JSON representation
    content = json.dumps(record, sort_keys=True)
    record["hash"] = hashlib.sha256(content.encode()).hexdigest()
    return record

# Build a chain
chain = []
prev = "genesis"

chain.append(create_attestation("researcher", "fetch_data", {"sources": 3}, prev))
prev = chain[-1]["hash"]

chain.append(create_attestation("analyst", "process", {"rows": 1420}, prev))
prev = chain[-1]["hash"]

chain.append(create_attestation("writer", "generate_report", {"words": 2100}, prev))

Each record in this chain references the hash of the previous one. Alter any field in any record, and the chain breaks from that point forward. This is the same principle behind blockchain and git, applied to agent operations.
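
Detecting the break takes a verifier that recomputes every hash and checks every link. The sketch below assumes the record layout used above (a `prev_hash` field, a `hash` over the canonical JSON of all other fields, and `"genesis"` as the first link); `record_hash`, `verify_chain`, and `make_record` are illustrative names.

```python
import hashlib
import json
from typing import Optional


def record_hash(record: dict) -> str:
    """Hash the canonical JSON of every field except the stored hash."""
    body = {k: v for k, v in record.items() if k != "hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def verify_chain(chain: list, genesis: str = "genesis") -> Optional[int]:
    """Return the index of the first invalid record, or None if intact."""
    prev = genesis
    for index, record in enumerate(chain):
        link_ok = record.get("prev_hash") == prev
        hash_ok = record.get("hash") == record_hash(record)
        if not (link_ok and hash_ok):
            return index
        prev = record["hash"]
    return None


def make_record(agent_id: str, action: str, prev_hash: str) -> dict:
    """Small stand-in for create_attestation, kept minimal for the demo."""
    record = {"agent_id": agent_id, "action": action, "prev_hash": prev_hash}
    record["hash"] = record_hash(record)
    return record


chain = [make_record("researcher", "fetch_data", "genesis")]
chain.append(make_record("analyst", "process", chain[-1]["hash"]))

intact = verify_chain(chain)          # chain as written
chain[0]["action"] = "delete_data"    # tamper with the first record
broken_at = verify_chain(chain)       # index where verification now fails
```

A hash chain alone only catches edits made by someone who does not also recompute the later hashes. An attacker with full write access could rebuild the whole chain, which is the gap that signatures and external anchoring are meant to close.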

The Compliance Pressure Is Real

The EU AI Act requires audit trails for high-risk AI systems. SOC 2 auditors are starting to ask about AI governance practices. Even if your system isn't in a regulated industry today, the direction is clear: organizations deploying AI agents will need to demonstrate accountability, not just observability.

The teams building this infrastructure now will be ahead when the requirements formalize. The teams retrofitting it later will be doing it under pressure, with production systems that were never designed for it.

Adding Signatures

Hash chains prove integrity and ordering. Cryptographic signatures prove identity. Combining both:

import json

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def sign_attestation(record: dict, private_key: Ed25519PrivateKey) -> bytes:
    """Sign an attestation with the agent's private key."""
    content = json.dumps(record, sort_keys=True).encode()
    return private_key.sign(content)

Each agent holds its own key pair. The signature on each record is proof that the specific agent produced it. You can verify the signature with the agent's public key without needing access to the private key.

This moves you from "the log says Agent X did this" to "Agent X cryptographically signed this record, and the hash chain proves it wasn't altered afterward."

Where This Fits in Your Stack

An accountability layer sits between your agents and your log storage. Agents produce attestations instead of (or in addition to) log entries. The attestation chain is append-only and verifiable.

Agent → Attestation (sign + hash-chain) → Storage
                                            ↓
                              Verification (any time, by anyone)

This doesn't replace your existing logging. It transforms observations into evidence.

Getting Started

If you're building multi-agent systems and want to add cryptographic accountability:

Assign each agent an Ed25519 key pair at initialization.
Sign every significant action (task completion, data access, routing decisions).
Hash-chain the records so ordering and completeness are verifiable.
Store the chain immutably (append-only file, database that allows inserts but not updates or deletes, or dedicated service).

Sigil implements this pattern as a Python library with MCP integration. pip install sigil-notary if you want to skip the build-from-scratch step.

The audit trail your compliance team will eventually ask for is cheaper to build now than to retrofit later. The agents doing the work should be producing the evidence that they did it correctly.

Your Multi-Agent System Has a Memory Problem

sly-the-fox — Mon, 16 Mar 2026 22:11:47 +0000

Every multi-agent system I've seen break in production broke for the same reason: agents couldn't remember what already happened.

Not hallucination. Not bad prompts. Memory. The system had no way to persist decisions, track what was active, or share context between agents that needed to coordinate.

I run 39 agents in a production operating system. Solving the memory problem was harder than designing the agents themselves.

The three kinds of agent memory

Agent memory isn't one thing. It's at least three distinct problems, and most systems only solve the first.

1. Session memory (what happened in this conversation)

This is what you get for free with most LLM APIs. The context window. It works until the conversation ends, then it's gone.

# This is NOT memory. This is a conversation buffer.
messages = [
    {"role": "user", "content": "Deploy the new config"},
    {"role": "assistant", "content": "Config deployed to staging."}
]
# Session ends. Everything above disappears.

2. Shared state (what's true right now across all agents)

This is where most systems fall apart. When Agent A makes a decision, Agent B needs to know about it before it acts. Not eventually. Before.

# Shared state: a set of canonical files every agent reads
SHARED_STATE = {
    "active/now.md": "Current tasks, owners, and status",
    "active/priorities.md": "Ranked priority list",
    "active/blockers.md": "What's stuck and why",
    "core/history/decisions.md": "Decisions with rationale"
}

def before_action(agent, task):
    """Every agent reads shared state before acting."""
    context = {}
    for path in SHARED_STATE:
        # Read the canonical file from disk on every call
        with open(path, encoding="utf-8") as f:
            context[path] = f.read()
    return context

The key insight: shared state must be file-based (or database-backed), not held in any single agent's context. Agents come and go. The state persists.

3. Institutional memory (what happened over time, why decisions were made)

This is the layer nobody builds until they need it. Two weeks into running a multi-agent system, someone asks "why is the deployment config set to X?" No agent remembers. The decision happened in a conversation that's long gone.

# A decision record with rationale
DECISION = {
    "id": "DEC-042",
    "date": "2026-03-15",
    "decision": "Use file-based shared state instead of Redis",
    "rationale": "Agents run in ephemeral sessions. Files persist across "
                 "sessions without requiring a running service.",
    "decided_by": "architect",
    "alternatives_considered": ["Redis", "SQLite", "env vars"]
}

If you don't record the "why," you'll revisit the same decisions repeatedly. I've watched agents re-debate resolved questions because the rationale wasn't captured anywhere they could read.
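
A decision log only helps if agents can query it. Here is a rough sketch that stores records as JSON lines and searches them by keyword. The JSONL format and the names `append_decision` and `find_decisions` are assumptions for illustration; the same idea works with a markdown file and a text search.

```python
import json
import tempfile
from pathlib import Path


def append_decision(log_path: Path, decision: dict) -> None:
    """Append one decision record as a JSON line."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(decision, sort_keys=True) + "\n")


def find_decisions(log_path: Path, keyword: str) -> list:
    """Return decisions whose text or rationale mentions the keyword."""
    if not log_path.exists():
        return []
    needle = keyword.lower()
    matches = []
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            haystack = (record.get("decision", "") + " "
                        + record.get("rationale", "")).lower()
            if needle in haystack:
                matches.append(record)
    return matches


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "decisions.jsonl"
    append_decision(log, {
        "id": "DEC-042",
        "decision": "Use file-based shared state instead of Redis",
        "rationale": "Files persist across sessions without requiring "
                     "a running service.",
    })
    hits = find_decisions(log, "redis")
    misses = find_decisions(log, "kafka")
```

An agent about to reopen a question can call `find_decisions` first and read the rationale instead of re-debating it.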

The shared state pattern

Here's the pattern that works in my system. It's simple, and that's the point.

Rule 1: Canonical files are the source of truth.

Not agent memory. Not conversation history. Files on disk. Every agent reads them. A small set of agents are authorized to write them.

Rule 2: Read before you act.

Before any agent takes action, it reads relevant shared state. This is non-negotiable. An agent that acts without reading current state will contradict decisions made since its last session.

Rule 3: Write-back is immediate.

When an agent completes a task, it updates shared state in the same operation. Not "later." Not "at the end of the session." Immediately. Stale state causes cascading errors.

def complete_task(agent, task):
    """Task completion always includes state update."""
    # execute, update_file, mark_complete, and record_decision are
    # placeholders for your own task runner and state helpers.

    # 1. Do the work
    result = execute(task)

    # 2. Update shared state (same operation)
    update_file("active/now.md", mark_complete(task))
    update_file("core/history/decisions.md", record_decision(task, result))

    # 3. Never separate these steps

Rule 4: Ownership is explicit.

Not every agent can write to every file. The PM agent owns active/now.md. The architect owns core/history/decisions.md. Contention on shared files goes to a governor.
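
Explicit ownership can be enforced at the write path. This sketch assumes a simple path-to-owner map; `OWNERS`, `OwnershipError`, and `write_shared` are hypothetical names, and it rejects unauthorized writes outright where a fuller system might route them to the governor instead.

```python
import tempfile
from pathlib import Path

# Hypothetical ownership map; paths and owners follow the examples above
OWNERS = {
    "active/now.md": "pm",
    "core/history/decisions.md": "architect",
}


class OwnershipError(PermissionError):
    """Raised when an agent writes to a file it does not own."""


def write_shared(root: Path, agent: str, rel_path: str, content: str) -> None:
    """Write shared state only if the agent is the registered owner."""
    owner = OWNERS.get(rel_path)
    if owner != agent:
        raise OwnershipError(
            f"{agent!r} may not write {rel_path!r} (owner: {owner!r})"
        )
    target = root / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_shared(root, "pm", "active/now.md", "- deploy config: done\n")
    try:
        write_shared(root, "writer", "active/now.md", "overwrite")
        rejected = False
    except OwnershipError:
        rejected = True
    saved = (root / "active/now.md").read_text(encoding="utf-8")
```

Files missing from the map have no owner, so every write to them is rejected. Failing closed like this is a deliberate choice for shared state.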

What this looks like in practice

My system's shared state layer is a directory called active/ with seven files:

File	Purpose	Owner
`now.md`	Current tasks and status	PM
`priorities.md`	Ranked priorities	Strategist
`blockers.md`	What's stuck	PM
`risks.md`	Active risk register	Sentinel
`improvements.md`	Proposed system changes	Improver
`inbox.md`	Unrouted incoming items	Router
`coherence-metrics.md`	System health signals	Evaluator

Every agent reads the files relevant to its work before acting. Six agents write to them (the PM owns two). The governor resolves conflicts.

Is it elegant? No. It's a directory of markdown files. But it's been running for two weeks without a single state conflict, and every agent sees the same picture of reality.

Start here

If your multi-agent system has more than two agents:

Create one shared file that tracks "what's active right now"
Make every agent read it before acting
Make the responsible agent update it after completing work
Log decisions with rationale, not just outcomes

You can get sophisticated later. Databases, event streams, vector stores. But the pattern underneath is always the same: agents need a shared, persistent, authoritative picture of the world. Without it, you're running parallel amnesiacs.

Building Sigil for cryptographic audit trails, because shared state is only trustworthy if you can prove it wasn't tampered with. More on agent architecture: The Alignment Layer.

I Built Cryptographic Audit Trails for AI Agents. Here Is Why.

sly-the-fox — Tue, 10 Mar 2026 00:35:58 +0000

The Problem No One Is Solving Well

Here is a scenario that is becoming common. You deploy an AI agent that processes customer requests, accesses a database, calls external APIs, and takes actions on behalf of your users. It runs for a week. Then something goes wrong. A customer complains about an unauthorized change. Your team asks the obvious question: what did the agent actually do?

You check the logs. They are text files, maybe JSON lines in a database. They say the agent did X, Y, and Z. But those logs are mutable. Anyone with write access could have modified them. The agent itself could have modified them. There is no cryptographic proof that the log is accurate.

This is the state of agent accountability in 2026. Agents are gaining access to production databases, financial systems, and customer data. The best most teams have for auditing is print() statements and hope. That gap between what agents can do and what we can prove they did is growing fast.

The Pattern Underneath

This is not a new problem. It is a well-understood one wearing new clothes. Distributed systems solved this class of problem decades ago with append-only logs, hash chains, and cryptographic signatures. The pattern is simple: make every action produce a record that is mathematically linked to the one before it. If anyone modifies a record in the middle, every subsequent link breaks. Tampering becomes not just difficult, but visible.

The fact that most agent frameworks do not apply these techniques is not a technology gap. It is an attention gap. The tooling exists. The cryptographic primitives are mature. No one has wired them together for the specific context of AI agent actions.

So I did.

What Sigil Does

Sigil provides tamper-evident audit trails for AI agents. Every agent action becomes an attestation, a signed and timestamped record that includes a hash of the previous attestation. This creates a hash chain. Each attestation is signed with Ed25519, a fast and well-studied signature scheme. The signature proves that the holder of a specific key created the attestation; the timestamp and chain position place it in sequence with that agent's other actions.

Each agent gets its own independent hash chain. No global bottleneck, no cross-contamination between agents. The architecture is deliberately simple because trust infrastructure should be easy to reason about.

Sigil ships as an MCP server. If you are using any MCP-compatible client (Claude Code, OpenHands, or your own), you can add Sigil and start recording attestations immediately.

from sigil import SigilClient

client = SigilClient(api_key="sg_...")

# Record an action
receipt = client.attest(
    action_type="database.query",
    payload={"table": "customers", "rows_returned": 142}
)

# Verify the chain is intact
result = client.verify(receipt.id)
assert result.valid and result.chain_valid

Each attestation includes: agent_id, action_type, payload, timestamp, prev_hash (SHA-256 chain link), and signature (Ed25519). The chain is append-only, queryable, and independently verifiable.

Why Open-Source

The MCP server and Python SDK are MIT-licensed. You can self-host the entire stack. This was a deliberate choice, not a growth strategy. Trust infrastructure should be inspectable. If you cannot read the code that generates your audit trail, you have not actually solved the trust problem. You have just moved it.

What Comes Next

Sigil is structured in layers, each building on the one below:

Notary (available now): Hash-chained attestations and verification
Identity (planned): Agent PKI with Ed25519 keypairs
Delegation (planned): Cryptographic proof of authorization chains

I am also working on integrations with popular agent frameworks for automatic attestation recording. The goal is to make auditable the default, not the exception.

Try It

pip install sigil-notary

GitHub: https://github.com/sly-the-fox/sigil | PyPI: https://pypi.org/project/sigil-notary/

I built this for developers shipping agents into production. If that is you, open an issue, start a discussion, or reach out directly.

Sigil means "a seal of authority." I think AI agents need one.