<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: nostalgicskinco</title>
    <description>The latest articles on Forem by nostalgicskinco (@nostalgicskinco).</description>
    <link>https://forem.com/nostalgicskinco</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3778573%2Fd150ac2f-e4d0-4d51-bd0c-87274ab3c459.png</url>
      <title>Forem: nostalgicskinco</title>
      <link>https://forem.com/nostalgicskinco</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/nostalgicskinco"/>
    <language>en</language>
    <item>
      <title>AI Agents Are Making Decisions Nobody Can Audit</title>
      <dc:creator>nostalgicskinco</dc:creator>
      <pubDate>Sat, 21 Feb 2026 06:08:04 +0000</pubDate>
      <link>https://forem.com/nostalgicskinco/ai-agents-are-making-decisions-nobody-can-audit-2gpn</link>
      <guid>https://forem.com/nostalgicskinco/ai-agents-are-making-decisions-nobody-can-audit-2gpn</guid>
      <description>&lt;p&gt;Last month, a developer posted on Reddit about an AI agent that got stuck in a loop and fired off 50,000 API requests before anyone noticed. Production was down. The bill was ugly. And the worst part? Nobody could tell exactly what the agent had been doing or why.&lt;/p&gt;

&lt;p&gt;This isn't an edge case anymore. It's Tuesday.&lt;/p&gt;

&lt;h2&gt;The problem nobody wants to talk about&lt;/h2&gt;

&lt;p&gt;AI agents are everywhere now. They're calling APIs, querying databases, executing code, and in some cases, spending real money — all autonomously. The frameworks for building them are incredible. CrewAI, LangChain, AutoGen, OpenAI's Agents SDK — they make it shockingly easy to stand up an agent that can do real work.&lt;/p&gt;

&lt;p&gt;But here's what none of these frameworks give you: visibility into what your agent actually did.&lt;/p&gt;

&lt;p&gt;No audit trail. No kill switch. No way to replay what happened after something goes wrong. No policy enforcement before a dangerous action executes. And perhaps most concerning — no PII redaction. Every prompt and completion your agent generates ships directly to your observability backend with customer data, API keys, and internal information fully intact.&lt;/p&gt;

&lt;p&gt;Every team I've talked to handles this differently. Most don't handle it at all.&lt;/p&gt;

&lt;h2&gt;Why this is an infrastructure problem, not an application problem&lt;/h2&gt;

&lt;p&gt;Think about TLS. Nobody implements TLS differently in every microservice. It's a standardized layer that sits below application code and handles encryption for everything above it.&lt;/p&gt;

&lt;p&gt;Agent safety needs to work the same way.&lt;/p&gt;

&lt;p&gt;If every team builds their own logging, their own kill switches, their own policy checks — you get inconsistency, gaps, and the kind of "we'll deal with it later" approach that leads to the 50,000-request incident above.&lt;/p&gt;

&lt;p&gt;The safety layer needs to be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Framework-agnostic&lt;/strong&gt; — works whether you're using CrewAI, LangChain, AutoGen, or something custom&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure-level&lt;/strong&gt; — operates in the network path and telemetry pipeline, not inside agent code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standardized&lt;/strong&gt; — uses OpenTelemetry so it plugs into whatever observability stack you already have&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;What I built&lt;/h2&gt;

&lt;p&gt;I've been working on an open-source project called &lt;strong&gt;AIR Blackbox&lt;/strong&gt; — think of it like a flight recorder for AI agents. It sits between your agents and your LLM providers and captures everything.&lt;/p&gt;

&lt;p&gt;The architecture is straightforward:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your Agent ──→ Gateway ──→ Policy Engine ──→ LLM Provider
                 │               │
                 ▼               ▼
           OTel Collector   Kill Switches
                 │          Trust Scoring
                 ▼          Risk Tiers
           Episode Store
           Jaeger · Prometheus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A one-line change — swap your &lt;code&gt;base_url&lt;/code&gt; — and every agent call flows through it. No SDK changes. No code refactoring.&lt;/p&gt;
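&lt;p&gt;With OpenAI's Python client, the swap looks like this (the localhost address below is an assumption for a local deployment; point it at wherever you run the gateway):&lt;/p&gt;

```python
from openai import OpenAI

# Before: OpenAI() talks to the provider directly.
# After: route every call through the gateway. The address is an
# illustrative local-deployment value, not a fixed default.
client = OpenAI(
    api_key="YOUR_KEY",
    base_url="http://localhost:8080/v1",
)
```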

&lt;p&gt;Here's what each piece does:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gateway&lt;/strong&gt; — An OpenAI-compatible reverse proxy written in Go. It intercepts all LLM traffic, emits structured OpenTelemetry traces, and checks policies before forwarding requests. Any OpenAI-compatible client works without modification.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Policy Engine&lt;/strong&gt; — Evaluates requests against YAML-defined policies in real time. Risk tiers (low, medium, high, critical), trust scoring, programmable kill switches, and human-in-the-loop gates for high-risk operations. This isn't monitoring after the fact — it's governance before the action happens.&lt;/p&gt;
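&lt;p&gt;As a rough sketch of the idea (the tier names come from this post; the mapping and function are illustrative, not the engine's actual API):&lt;/p&gt;

```python
# Illustrative only: the real engine evaluates YAML-defined policies.
# The point is that the decision happens before the provider sees the call.
TIER_ACTIONS = {
    "low": "allow",        # log and forward
    "medium": "allow",     # forward, but flag for review
    "high": "hold",        # require human-in-the-loop approval first
    "critical": "block",   # refuse outright
}

def evaluate(request_tier, kill_switch_on=False):
    """Decide what to do with a request before it reaches the LLM."""
    if kill_switch_on:
        return "block"
    return TIER_ACTIONS.get(request_tier, "block")  # unknown tier: fail closed
```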

&lt;p&gt;&lt;strong&gt;OTel Collector&lt;/strong&gt; — A custom processor for gen_ai telemetry. PII redaction using hash-and-preview (48-character preview + hash, so you can debug without exposing full data). Cost metrics. And loop detection — the thing that would have caught that 50,000-request incident before it became a disaster.&lt;/p&gt;
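&lt;p&gt;Both ideas fit in a few lines of Python. This is a sketch of the mechanism only; the window and threshold values are illustrative, not the collector's defaults:&lt;/p&gt;

```python
import hashlib
from collections import deque

PREVIEW_LEN = 48  # matches the 48-character preview described above

def redact(text):
    """Hash-and-preview: keep a short prefix for debugging, hash the full text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return {"preview": text[:PREVIEW_LEN], "sha256": digest}

class LoopDetector:
    """Flag an agent that keeps repeating the same request."""

    def __init__(self, window=100, threshold=10):
        self.recent = deque(maxlen=window)  # sliding window of fingerprints
        self.threshold = threshold

    def observe(self, request_body):
        """Return True once an identical request recurs threshold times."""
        fingerprint = hashlib.sha256(request_body.encode("utf-8")).hexdigest()
        self.recent.append(fingerprint)
        return self.recent.count(fingerprint) >= self.threshold
```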

&lt;p&gt;&lt;strong&gt;Episode Store&lt;/strong&gt; — Groups individual traces into task-level episodes you can replay. When something goes wrong, you don't sift through raw logs — you replay the episode like rewinding a tape.&lt;/p&gt;

&lt;h2&gt;The part I didn't expect&lt;/h2&gt;

&lt;p&gt;When I started building this, I thought the hard problem would be the technical architecture. It wasn't. OpenTelemetry gives you a solid foundation. Go is great for proxies. The plumbing was actually the straightforward part.&lt;/p&gt;

&lt;p&gt;The hard problem is convincing people they need it before the incident happens.&lt;/p&gt;

&lt;p&gt;Every team I talk to says some version of: "We're being careful." "Our agents are simple." "We'll add monitoring later."&lt;/p&gt;

&lt;p&gt;And then later arrives as a production incident, a leaked API key, or an auditor asking questions nobody prepared for.&lt;/p&gt;

&lt;p&gt;The companies that are thinking about this — the ones deploying agents in regulated industries, in healthcare, in finance — they already know. They're the ones asking: "Can we prove what our agent did? Can we shut it down instantly? Can we guarantee PII doesn't leak into our trace backend?"&lt;/p&gt;

&lt;p&gt;These aren't hypothetical questions. ISO 27001 auditors are starting to ask them. SOC 2 reviewers are starting to ask them. And if your answer is "we log stuff to CloudWatch," that's not going to cut it.&lt;/p&gt;

&lt;h2&gt;What's next&lt;/h2&gt;

&lt;p&gt;AIR Blackbox is open source under Apache 2.0. It spans 21 repositories, all modular: you can run the whole stack or just the pieces you need.&lt;/p&gt;

&lt;p&gt;There are trust plugins for CrewAI, LangChain, AutoGen, and OpenAI's Agents SDK. A five-minute quickstart gets you the full stack running locally with &lt;code&gt;make up&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you're deploying AI agents in production — or planning to — I'd genuinely appreciate your feedback. What gaps are you seeing? What keeps you up at night?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/airblackbox" rel="noopener noreferrer"&gt;github.com/airblackbox&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are interactive demos in the README if you want to explore without installing anything.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm building AIR Blackbox because I think agent safety shouldn't be an afterthought bolted on after the first incident. It should be infrastructure — boring, reliable, and already running when the 50,001st request tries to fire.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>security</category>
      <category>devops</category>
    </item>
    <item>
      <title>Why Your AI Agents Need a Black Box</title>
      <dc:creator>nostalgicskinco</dc:creator>
      <pubDate>Wed, 18 Feb 2026 03:09:21 +0000</pubDate>
      <link>https://forem.com/nostalgicskinco/why-your-ai-agents-need-a-black-box-i12</link>
      <guid>https://forem.com/nostalgicskinco/why-your-ai-agents-need-a-black-box-i12</guid>
      <description>&lt;p&gt;My AI agents went rogue.&lt;/p&gt;

&lt;p&gt;I run an e-commerce store. A few months ago, I deployed AI agents to handle customer emails — returns, refund requests, product questions. It worked great, until it didn't.&lt;/p&gt;

&lt;p&gt;The agents started making promises we couldn't keep: wrong refund amounts, unauthorized discounts, completely fabricated policies. "Sure, we'll refund your shipping even though our policy says otherwise." "Yes, you can return that item after 90 days." None of it was true.&lt;/p&gt;

&lt;p&gt;The worst part wasn't that they failed. That's fixable. The worst part was that &lt;strong&gt;I couldn't prove what they actually said.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When customers disputed AI responses, I had nothing. Logs were scattered across three different services. They were mutable — anyone (or any process) could quietly change them after the fact. And they were incomplete. Half the tool calls weren't captured at all.&lt;/p&gt;

&lt;p&gt;I had no audit trail. No accountability. No evidence.&lt;/p&gt;

&lt;h2&gt;The Gap That Nobody Talks About&lt;/h2&gt;

&lt;p&gt;When I went looking for a solution, I found plenty of observability tools. Langfuse. Helicone. LangSmith. They're all excellent at showing you &lt;em&gt;what happened&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;But I didn't need to see what happened. I needed to &lt;strong&gt;prove&lt;/strong&gt; what happened.&lt;/p&gt;

&lt;p&gt;That distinction sounds subtle. It isn't.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Observability&lt;/strong&gt; answers: "What did the agent do?"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accountability&lt;/strong&gt; answers: "What did the agent do, and can you prove it wasn't changed after the fact?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a regulated world — and we're entering one fast — that difference is everything. The EU AI Act is partially in force. The Colorado AI Act hits in June 2026. Texas TRAIGA is live now. The SEC has made AI governance its top priority for 2026.&lt;/p&gt;

&lt;p&gt;Companies deploying AI agents for anything consequential (approving loans, handling complaints, writing medical summaries, processing transactions) are going to need tamper-evident records of what their AI said and did. Not logs. Proof.&lt;/p&gt;

&lt;h2&gt;What I Built&lt;/h2&gt;

&lt;p&gt;So I built AIR — the open-source black box for AI agents.&lt;/p&gt;

&lt;p&gt;Like the flight recorder on an aircraft, AIR captures every decision, every interaction, every tool call your AI agents make. But unlike scattered logs, AIR creates &lt;strong&gt;cryptographic chains&lt;/strong&gt;: HMAC-SHA256 proof that records haven't been modified after the fact. Change one record and the entire chain breaks.&lt;/p&gt;
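&lt;p&gt;The chaining trick itself is small enough to sketch. This is a minimal illustration of the idea, not AIR's actual record format:&lt;/p&gt;

```python
import hashlib
import hmac

SECRET = b"demo-key"  # in practice, a protected signing key

def append_record(chain, payload):
    """Each record's MAC covers the previous MAC plus its own payload,
    so editing any record invalidates every MAC after it."""
    prev_mac = chain[-1]["mac"] if chain else "genesis"
    mac = hmac.new(SECRET, (prev_mac + payload).encode(), hashlib.sha256).hexdigest()
    chain.append({"payload": payload, "mac": mac})

def verify(chain):
    """Recompute every MAC in order; any mismatch means tampering."""
    prev_mac = "genesis"
    for record in chain:
        expected = hmac.new(SECRET, (prev_mac + record["payload"]).encode(),
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, record["mac"]):
            return False
        prev_mac = record["mac"]
    return True
```

&lt;p&gt;Because verification walks the chain from the start, changing one record makes every subsequent MAC fail to check, which is exactly the "change one record and the entire chain breaks" property.&lt;/p&gt;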

&lt;p&gt;Three lines of code to wrap your existing OpenAI app:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;air&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;air&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;air_wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="c1"&gt;# Every call is now recorded with a tamper-evident audit trail
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Every prompt, completion, tool call, and model decision is captured — with cryptographic integrity — stored on your own infrastructure, never leaving your control.&lt;/p&gt;

&lt;h2&gt;What It Actually Solves&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;When a customer disputes what your AI told them:&lt;/strong&gt; You have a signed, timestamped record of the exact conversation. Not a log file that could have been edited — cryptographic proof of what was said.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When a regulator asks for your AI governance documentation:&lt;/strong&gt; AIR auto-generates compliance reports mapped to SOC 2, ISO 27001, and EU AI Act requirements. 22 controls, pre-mapped.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When your agent goes off-script and you don't know why:&lt;/strong&gt; Deterministic replay lets you reproduce any AI decision exactly as it happened, in isolation, for debugging.&lt;/p&gt;
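&lt;p&gt;Conceptually, replay is a tape of request/response pairs keyed by a canonicalized input. This toy version illustrates the mechanism; the class and method names are made up, not AIR's API:&lt;/p&gt;

```python
import hashlib
import json

class Recorder:
    """Toy record/replay store. Live mode captures each (input, output)
    pair; replay mode returns the recorded output for an identical
    input, with no model call and no nondeterminism."""

    def __init__(self):
        self.tape = {}

    def _key(self, request):
        # sort_keys canonicalizes the request so key order doesn't matter
        blob = json.dumps(request, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()

    def record(self, request, response):
        self.tape[self._key(request)] = response

    def replay(self, request):
        return self.tape[self._key(request)]  # KeyError if never recorded
```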

&lt;p&gt;&lt;strong&gt;When your team changes a model or prompt:&lt;/strong&gt; You have a before/after comparison with the same inputs, so you can prove the change didn't introduce new failure modes.&lt;/p&gt;

&lt;h2&gt;The Ecosystem&lt;/h2&gt;

&lt;p&gt;AIR isn't one repo — it's a complete accountability stack across 19 open-source repositories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/nostalgicskinco/air-blackbox-gateway" rel="noopener noreferrer"&gt;air-blackbox-gateway&lt;/a&gt;&lt;/strong&gt; — OpenAI-compatible reverse proxy that captures every LLM call&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/nostalgicskinco/air-sdk-python" rel="noopener noreferrer"&gt;air-sdk-python&lt;/a&gt;&lt;/strong&gt; — Python SDK wrapping OpenAI, LangChain, and CrewAI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/nostalgicskinco/agent-policy-engine" rel="noopener noreferrer"&gt;agent-policy-engine&lt;/a&gt;&lt;/strong&gt; — Risk-tiered autonomy: policies, kill switches, trust scoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/nostalgicskinco/eval-harness" rel="noopener noreferrer"&gt;eval-harness&lt;/a&gt;&lt;/strong&gt; — Replay episodes, score results, detect regressions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/nostalgicskinco/mcp-policy-gateway" rel="noopener noreferrer"&gt;mcp-policy-gateway&lt;/a&gt;&lt;/strong&gt; — Firewall for AI agent tool access via MCP&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Python SDK is live on PyPI: &lt;code&gt;pip install air-blackbox-sdk&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;There's also an &lt;strong&gt;&lt;a href="https://nostalgicskinco.github.io/air-blackbox-gateway/" rel="noopener noreferrer"&gt;interactive demo&lt;/a&gt;&lt;/strong&gt; you can try right now in your browser — watch an agent run, inspect the audit chain, tamper with a record, and see the chain break.&lt;/p&gt;

&lt;h2&gt;Why Now&lt;/h2&gt;

&lt;p&gt;The timing isn't accidental. Industry surveys put enterprise AI use in daily operations around 90%, while only about 18% of companies have governance frameworks in place. The tools simply don't exist yet for most companies.&lt;/p&gt;

&lt;p&gt;But they will need to exist — and soon. August 2026 is when EU AI Act enforcement begins for high-risk systems. June 2026 for Colorado. January 2026 for Texas (already live).&lt;/p&gt;

&lt;p&gt;If you're building AI agents that affect real people, you need to be thinking about this now, not after your first customer dispute or regulatory inquiry.&lt;/p&gt;




&lt;p&gt;AIR is open source and free to use. The hard part is done — the code is real, the SDK is live, the demo works.&lt;/p&gt;

&lt;p&gt;If you're building AI agents in production, I'd love your feedback. Try the interactive demo, kick the tires on the SDK, file an issue if something doesn't work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/nostalgicskinco/air-blackbox-gateway" rel="noopener noreferrer"&gt;github.com/nostalgicskinco/air-blackbox-gateway&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jason Shotwell builds e-commerce tooling and, apparently, AI infrastructure when his agents go rogue.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
