<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Avi Fenesh</title>
    <description>The latest articles on Forem by Avi Fenesh (@avifenesh).</description>
    <link>https://forem.com/avifenesh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1406394%2F53fd4ce6-829e-4132-bb74-bde79c256679.jpeg</url>
      <title>Forem: Avi Fenesh</title>
      <link>https://forem.com/avifenesh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/avifenesh"/>
    <language>en</language>
    <item>
      <title>If you build your AI coding system</title>
      <dc:creator>Avi Fenesh</dc:creator>
      <pubDate>Sat, 28 Mar 2026 15:55:29 +0000</pubDate>
      <link>https://forem.com/avifenesh/if-you-build-your-ai-coding-system-2154</link>
      <guid>https://forem.com/avifenesh/if-you-build-your-ai-coding-system-2154</guid>
      <description>&lt;p&gt;&lt;a href="https://dev.to/avifenesh/how-ai-workloads-changed-the-queue-i-was-already-building-5989"&gt;How AI workloads changed the queue I was already building&lt;/a&gt; (5 min read, Mar 28)&lt;/p&gt;</description>
      <category>ai</category>
      <category>queue</category>
      <category>valkey</category>
      <category>orchestration</category>
    </item>
    <item>
      <title>How AI workloads changed the queue I was already building</title>
      <dc:creator>Avi Fenesh</dc:creator>
      <pubDate>Sat, 28 Mar 2026 15:50:18 +0000</pubDate>
      <link>https://forem.com/avifenesh/how-ai-workloads-changed-the-queue-i-was-already-building-5989</link>
      <guid>https://forem.com/avifenesh/how-ai-workloads-changed-the-queue-i-was-already-building-5989</guid>
      <description>&lt;p&gt;I did not start glide-mq because of AI.&lt;/p&gt;

&lt;p&gt;I started it because I needed a queue, wanted mechanics I liked better than what was out there, and wanted it built on top of Valkey Glide.&lt;/p&gt;

&lt;p&gt;Up through v0.13, that is basically what it was: a feature-rich queue.&lt;/p&gt;

&lt;p&gt;Then I kept building AI systems on top of it.&lt;/p&gt;

&lt;p&gt;That is where the shape started changing.&lt;/p&gt;

&lt;p&gt;The queue itself was usually fine. The pain was everything around it. Long-running jobs that were not actually stuck. Streaming that wanted to be part of the job instead of a side channel. Budget checks that needed to happen before the spend, not after. Token-aware rate limits. Pause/resume because real flows sometimes need a human or have to wait for CI.&lt;/p&gt;

&lt;p&gt;Different project. Same pile of glue.&lt;/p&gt;

&lt;p&gt;After enough rounds, the pattern stops looking normal. The queue is doing the easy part. Everything AI-specific is leaking out around it.&lt;/p&gt;

&lt;p&gt;That is what pushed glide-mq in a different direction.&lt;/p&gt;

&lt;h2&gt;AI workloads expose the wrong assumptions in normal queue design&lt;/h2&gt;

&lt;p&gt;Most queues were built for ordinary background work.&lt;/p&gt;

&lt;p&gt;AI workloads are not ordinary background work.&lt;/p&gt;

&lt;p&gt;They stream. They run long without being stuck. They spend money while they execute. They wait on humans. They hit limits based on tokens, not job count. They can fail semantically while every operational metric stays green.&lt;/p&gt;

&lt;p&gt;If the queue does not understand that shape, the missing behavior does not disappear. It ends up patched on from the outside.&lt;/p&gt;

&lt;p&gt;That is the part that started feeling wrong to me.&lt;/p&gt;

&lt;p&gt;Not because queues are bad. Because a lot of the assumptions underneath them were built for a different class of workload.&lt;/p&gt;

&lt;h2&gt;1. Locks are the wrong model for long AI jobs&lt;/h2&gt;

&lt;p&gt;Most queues assume the worker proves it is alive by renewing a lock. Miss the renewal, retry the job.&lt;/p&gt;

&lt;p&gt;That is reasonable for short jobs.&lt;/p&gt;

&lt;p&gt;It is a bad model for LLM workloads.&lt;/p&gt;

&lt;p&gt;A long generation is not dead. A reasoning-heavy call sitting on a hard prompt for 90 seconds is not dead. But a normal queue cannot tell the difference, so it retries, and now you are paying twice for the same work.&lt;/p&gt;

&lt;p&gt;Nothing crashed. Nothing exploded. You just had the wrong execution model.&lt;/p&gt;

&lt;p&gt;The usual workaround is increasing a global lock timeout.&lt;/p&gt;

&lt;p&gt;That is also wrong.&lt;/p&gt;

&lt;p&gt;A tiny classifier and a two-minute generation should not share the same timeout assumptions just because they live in the same queue.&lt;/p&gt;

&lt;p&gt;So in glide-mq, lock duration is per job:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;await queue.add("classify", { text: "short" }, { lockDuration: 10_000 });
await queue.add("research", { topic: "complex" }, { lockDuration: 180_000 });
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;



&lt;p&gt;Same queue. Different expectations.&lt;/p&gt;

&lt;p&gt;That should be normal.&lt;/p&gt;

&lt;h2&gt;2. Budget control belongs in the execution path&lt;/h2&gt;

&lt;p&gt;The ugly failures in agent systems are often not technical failures.&lt;/p&gt;

&lt;p&gt;The requests succeed.&lt;br&gt;
The responses are valid.&lt;br&gt;
The logs look clean.&lt;br&gt;
The dashboards stay green.&lt;/p&gt;

&lt;p&gt;And meanwhile the system is stuck in some useless loop spending money.&lt;/p&gt;

&lt;p&gt;That is why budget control does not belong only in dashboards, analytics, or some side service that tells you later what happened. By then the spend is already gone.&lt;/p&gt;

&lt;p&gt;The queue is the choke point. Every step passes through it. That is where the budget check belongs.&lt;/p&gt;

&lt;p&gt;So I put budgets on flows:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;await flowProducer.add(
  {
    name: "pipeline",
    queueName: "ai",
    data: { topic: "research" },
    children: [
      { name: "search", queueName: "ai", data: {} },
      { name: "analyze", queueName: "ai", data: {} },
      { name: "draft", queueName: "ai", data: {} },
    ],
  },
  {
    budget: {
      maxTotalTokens: 50_000,
      maxTotalCost: 2.0,
      tokenWeights: { reasoning: 2.0 },
      onExceeded: "fail",
    },
  },
);
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;



&lt;p&gt;When a job reports usage, the budget check happens atomically in Valkey. If the flow is out of budget, the next step stops there.&lt;/p&gt;

&lt;p&gt;Not later. Not after the invoice. At the point where it still matters.&lt;/p&gt;
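To make the weight semantics concrete, here is a small sketch of how weighted token accounting of the kind `tokenWeights` describes can work. The helper below is hypothetical and only illustrates the arithmetic; it is not glide-mq's implementation, which runs atomically in Valkey.

```typescript
// Hypothetical sketch of weighted token accounting, mirroring the
// tokenWeights option above. Not glide-mq's actual implementation.
type Usage = Record<string, number>;   // e.g. { input: 1200, reasoning: 800 }
type Weights = Record<string, number>; // e.g. { reasoning: 2.0 }

// Each token category counts against the budget at its weight;
// categories without an explicit entry default to weight 1.0.
function weightedTokens(usage: Usage, weights: Weights): number {
  let total = 0;
  for (const [category, count] of Object.entries(usage)) {
    total += count * (weights[category] ?? 1.0);
  }
  return total;
}

// Under { reasoning: 2.0 }, 800 reasoning tokens count as 1600
// toward a maxTotalTokens cap.
const spent = weightedTokens({ input: 1200, reasoning: 800 }, { reasoning: 2.0 });
```

The point of the weight is that expensive token categories burn the flow budget faster, so a reasoning-heavy loop hits the cap sooner than a plain-generation one.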

&lt;h2&gt;3. Streaming should not be a second system&lt;/h2&gt;

&lt;p&gt;If the model is producing the result incrementally, that stream is part of the job.&lt;/p&gt;

&lt;p&gt;Treating it as a separate system is already the smell.&lt;/p&gt;

&lt;p&gt;But most queues only understand one shape: job starts, job finishes, here is the result.&lt;/p&gt;

&lt;p&gt;LLMs do not behave like that.&lt;/p&gt;

&lt;p&gt;So people bolt on pub/sub. Or WebSockets. Or SSE through some different route. Then reconnect logic. Then ordering. Then another pile of glue to keep the stream state and the job state from drifting apart.&lt;/p&gt;

&lt;p&gt;Now the job lives in one place and the live output lives somewhere else.&lt;/p&gt;

&lt;p&gt;That split is artificial.&lt;/p&gt;

&lt;p&gt;So in glide-mq, the stream stays on the job:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;const worker = new Worker(
  "ai",
  async (job) =&amp;gt; {
    for await (const token of generateTokens(job.data.prompt)) {
      await job.streamChunk("token", token);
    }
    await job.streamChunk("done");
    return { completed: true };
  },
  { connection },
);

const chunks = await queue.readStream(jobId, { block: 5000 });
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;



&lt;p&gt;No extra pub/sub layer. No second system just to watch a job do its work. The stream is attached to the job, stored in Valkey, and resumable after disconnect.&lt;/p&gt;

&lt;p&gt;That is the model that actually matches the workload.&lt;/p&gt;

&lt;h2&gt;The rest is the same mismatch in different clothes&lt;/h2&gt;

&lt;p&gt;Once you accept that AI workloads are a different class of work, the rest stops looking like extra features and starts looking like missing queue behavior.&lt;/p&gt;

&lt;p&gt;Pause/resume for human approval.&lt;br&gt;
Wake the flow back up when CI finishes.&lt;br&gt;
Fallbacks across models.&lt;br&gt;
Rate limiting based on tokens instead of job count.&lt;br&gt;
Usage tracking that does not break when the next model adds a new token category.&lt;/p&gt;

&lt;p&gt;These are not edge cases. They are part of the shape of the work.&lt;/p&gt;
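As an illustration of what token-based rate limiting means in practice, here is a minimal fixed-window sketch. It is a toy, not glide-mq's limiter: the names and the windowing strategy are assumptions chosen to show the idea that admission is based on tokens spent, not jobs started.

```typescript
// Toy token-aware throttle (not glide-mq's API): admit work based on
// tokens consumed in the current window, not on job count.
class TokenRateLimiter {
  private spent = 0;
  constructor(private maxTokensPerWindow: number) {}

  // A 50-token classification and a 20_000-token generation are
  // not the same "one job" to this limiter.
  tryAcquire(estimatedTokens: number): boolean {
    if (this.spent + estimatedTokens > this.maxTokensPerWindow) return false;
    this.spent += estimatedTokens;
    return true;
  }

  // Called at each window boundary, e.g. by a timer.
  resetWindow(): void {
    this.spent = 0;
  }
}

const limiter = new TokenRateLimiter(100_000);
const smallOk = limiter.tryAcquire(50);    // tiny classifier: admitted
const bigOk = limiter.tryAcquire(99_951);  // would overflow the window: rejected
```

A job-count limiter would have admitted both; a token-aware one treats them as very different amounts of load.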

&lt;h2&gt;That is what v0.14 is really about&lt;/h2&gt;

&lt;p&gt;Not AI branding. Not pretending glide-mq started as an AI queue from day one.&lt;/p&gt;

&lt;p&gt;It did not.&lt;/p&gt;

&lt;p&gt;It started as the queue I wanted to have.&lt;/p&gt;

&lt;p&gt;Then building real AI systems on top of it kept exposing the same gaps, and eventually patching around them started feeling like the wrong move.&lt;/p&gt;

&lt;h2&gt;So v0.14 moves those behaviors into the queue&lt;/h2&gt;

&lt;p&gt;That is what changed.&lt;/p&gt;

&lt;p&gt;glide-mq is still a queue. But v0.14 is where it started absorbing the behaviors that AI systems kept forcing into side systems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Per-job &lt;code&gt;lockDuration&lt;/code&gt; so long jobs stop fighting short ones.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;job.reportUsage()&lt;/code&gt; so budgets and accounting live in the execution path.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;job.streamChunk()&lt;/code&gt; so streaming stays attached to the job.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;job.suspend()&lt;/code&gt; and &lt;code&gt;queue.signal()&lt;/code&gt; for human-in-the-loop flows.&lt;/li&gt;
&lt;li&gt;Ordered fallbacks.&lt;/li&gt;
&lt;li&gt;Token-aware throttling.&lt;/li&gt;
&lt;li&gt;Flow budgets that fail before the spend gets worse.&lt;/li&gt;
&lt;/ul&gt;
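The suspend/signal pairing is easiest to see as a small state transition. The toy model below only illustrates the lifecycle; the real state lives in Valkey, and the actual glide-mq signatures may differ from this sketch.

```typescript
// Toy in-memory model of suspend/signal semantics. Hypothetical:
// glide-mq keeps this state in Valkey, not in process memory.
type JobState = "active" | "suspended" | "completed";

class ToyJob {
  state: JobState = "active";
  payload: unknown;

  // The worker calls this when the flow needs a human (or CI)
  // before it can continue.
  suspend(): void {
    if (this.state === "active") this.state = "suspended";
  }

  // A queue-level signal delivers data and wakes the job back up.
  signal(data: unknown): void {
    if (this.state === "suspended") {
      this.payload = data;
      this.state = "active";
    }
  }
}

const job = new ToyJob();
job.suspend();                    // waiting on approval
const paused = job.state;         // "suspended"
job.signal({ approved: true });   // approval arrives, flow wakes up
const resumed = job.state;        // "active"
```

The useful property is that a suspended job is a first-class queue state, not a worker blocked on a promise somewhere, so it survives restarts and costs nothing while it waits.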

&lt;p&gt;That is the direction now.&lt;/p&gt;

&lt;p&gt;Not queue plus five things you will bolt on later anyway.&lt;/p&gt;

&lt;p&gt;The queue should understand more of the workload it is running.&lt;/p&gt;

&lt;h2&gt;Final thought&lt;/h2&gt;

&lt;p&gt;I do not think this is only a glide-mq story.&lt;/p&gt;

&lt;p&gt;I think AI workloads are exposing the wrong assumptions in a lot of older tooling.&lt;/p&gt;

&lt;p&gt;The problem is not just that queues need a few more integrations. The problem is that many of the abstractions we still lean on were designed for ordinary jobs, requests, and background work. AI systems have a different shape, and when the abstraction does not match, the missing behavior leaks out into glue.&lt;/p&gt;

&lt;p&gt;That is the part I stopped wanting to patch from the outside.&lt;/p&gt;






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;glide-mq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://github.com/avifenesh/glide-mq" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; | &lt;a href="https://github.com/avifenesh/glidemq-examples" rel="noopener noreferrer"&gt;Examples&lt;/a&gt; | &lt;a href="https://glidemq.dev" rel="noopener noreferrer"&gt;Docs&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>queue</category>
      <category>valkey</category>
      <category>orchestration</category>
    </item>
    <item>
      <title>Your AI agent configs are probably silently broken -- I built a linter that catches it</title>
      <dc:creator>Avi Fenesh</dc:creator>
      <pubDate>Thu, 12 Feb 2026 00:54:18 +0000</pubDate>
      <link>https://forem.com/avifenesh/your-ai-agent-configs-are-probably-silently-broken-i-built-a-linter-that-catches-it-5g78</link>
      <guid>https://forem.com/avifenesh/your-ai-agent-configs-are-probably-silently-broken-i-built-a-linter-that-catches-it-5g78</guid>
      <description>&lt;p&gt;&lt;a href="https://dev.to/avifenesh/your-ai-agent-configs-are-probably-broken-and-you-dont-know-it-16n1"&gt;Your AI Agent Configs Are Probably Broken (and You Don't Know It)&lt;/a&gt; (Avi Fenesh, Feb 12)&lt;/p&gt;</description>
      <category>ai</category>
      <category>tooling</category>
      <category>devtools</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Your AI Agent Configs Are Probably Broken (and You Don't Know It)</title>
      <dc:creator>Avi Fenesh</dc:creator>
      <pubDate>Thu, 12 Feb 2026 00:51:42 +0000</pubDate>
      <link>https://forem.com/avifenesh/your-ai-agent-configs-are-probably-broken-and-you-dont-know-it-16n1</link>
      <guid>https://forem.com/avifenesh/your-ai-agent-configs-are-probably-broken-and-you-dont-know-it-16n1</guid>
      <description>&lt;p&gt;I've spent the last year wiring up AI coding tools: Claude Code, Cursor, Copilot, Cline, Codex CLI, and friends. Skills, hooks, memory files, MCP servers, agent defs, plugin manifests — the whole stack.&lt;/p&gt;

&lt;p&gt;Here's the annoying truth: a huge chunk of these configs are silently broken. And the tools mostly don't tell you.&lt;/p&gt;

&lt;h2&gt;The problem nobody talks about&lt;/h2&gt;

&lt;p&gt;If you misconfigure ESLint, it screams. If you misconfigure a &lt;code&gt;SKILL.md&lt;/code&gt;, nothing happens. The skill just… never triggers. No error. No warning. It's like the file doesn't exist.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals" rel="noopener noreferrer"&gt;Vercel even measured this&lt;/a&gt;: skills invoke at &lt;strong&gt;0%&lt;/strong&gt; without correct syntax. Not "less often". Zero. One wrong frontmatter field and your carefully written skill is invisible to the agent.&lt;/p&gt;

&lt;p&gt;And then you get the classic "almost right" output. &lt;a href="https://survey.stackoverflow.co/2025/ai" rel="noopener noreferrer"&gt;Stack Overflow's 2025 developer survey&lt;/a&gt; has 66% of devs calling "almost right" their biggest AI frustration — and honestly, misconfigured agents produce exactly that. You think you gave the model the right rules/tools, but you didn't (or you did, just in the wrong format).&lt;/p&gt;

&lt;p&gt;It gets worse if you use multiple tools. Cursor for editing, Claude Code for terminal work, maybe Copilot for inline completions. Now you're maintaining parallel configs in different formats that are supposed to stay consistent. A rule that works in &lt;code&gt;.cursor/rules/testing.mdc&lt;/code&gt; might contradict what you put in &lt;code&gt;CLAUDE.md&lt;/code&gt;. Nobody catches it.&lt;/p&gt;

&lt;h2&gt;What actually goes wrong in the real world&lt;/h2&gt;

&lt;p&gt;After digging through official specs, research, and a lot of "why is this not working?" debugging, the failure modes repeat:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skills that never trigger:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;YAML frontmatter is missing / malformed (most common)&lt;/li&gt;
&lt;li&gt;Name is &lt;code&gt;PascalCase&lt;/code&gt; instead of &lt;code&gt;kebab-case&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;No trigger phrases → agent has no way to discover it&lt;/li&gt;
&lt;li&gt;Invalid values in model/context fields&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Skills that trigger when they shouldn't:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A deploy/publish/delete skill without &lt;code&gt;disable-model-invocation: true&lt;/code&gt; — Claude can auto-trigger it without you asking&lt;/li&gt;
&lt;/ul&gt;
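For illustration, a `SKILL.md` frontmatter block that avoids the traps above might look like the sketch below. The skill name, description wording, and trigger phrases are made up; the fields shown are the ones named in the failure modes above, and you should check your tool's current spec before copying.

```yaml
---
# kebab-case, not PascalCase, so the tool actually recognizes the skill
name: deploy-to-staging
# concrete trigger phrases so the agent can discover when to use it
description: >
  Deploys the current branch to staging. Use when the user asks to
  "deploy", "push to staging", or "ship this branch".
# destructive action: require explicit invocation, never auto-trigger
disable-model-invocation: true
---
```

The last field is the one that separates "the agent can call this whenever it decides to" from "the agent only runs this when asked".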

&lt;p&gt;&lt;strong&gt;Hooks that quietly fail:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Event name is wrong (&lt;code&gt;PreToolExecution&lt;/code&gt; instead of &lt;code&gt;PreToolUse&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;command&lt;/code&gt; missing (so the hook is basically a comment)&lt;/li&gt;
&lt;li&gt;Script path points to nothing&lt;/li&gt;
&lt;li&gt;Dangerous commands sneaking in (&lt;code&gt;rm -rf&lt;/code&gt; in a hook with no guard… yeah)&lt;/li&gt;
&lt;/ul&gt;
```
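For context, a hook entry that dodges those failure modes would look roughly like this. The shape below follows Claude Code's settings format as I understand it; the script path is a placeholder, so treat this as a sketch rather than a copy-paste config.

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "./scripts/check-command.sh" }
        ]
      }
    ]
  }
}
```

The two things the linter checks here are exactly the two things that fail silently: the event name must be a real event (`PreToolUse`, not `PreToolExecution`), and the `command` must be present and point at a script that exists.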

&lt;p&gt;&lt;strong&gt;Memory files that make the agent worse:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generic fluff like "be helpful and accurate" (wastes context; the model already knows)&lt;/li&gt;
&lt;li&gt;Important rules buried mid-file (&lt;a href="https://en.wikipedia.org/wiki/Serial-position_effect" rel="noopener noreferrer"&gt;primacy/recency&lt;/a&gt; applies to LLM context too)&lt;/li&gt;
&lt;li&gt;Files that are too long for the tool to reliably respect — Windsurf caps rules files at 12,000 chars, Copilot global instructions work best under ~4,000 — past those ceilings the tool either truncates or deprioritizes your instructions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cross-tool conflicts:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your &lt;code&gt;CLAUDE.md&lt;/code&gt; says &lt;code&gt;npm test&lt;/code&gt; but your &lt;code&gt;AGENTS.md&lt;/code&gt; says &lt;code&gt;pnpm test&lt;/code&gt; — one agent runs tests correctly, the other doesn't&lt;/li&gt;
&lt;li&gt;Cursor rules allow unrestricted &lt;code&gt;Bash&lt;/code&gt;, but your &lt;code&gt;CLAUDE.md&lt;/code&gt; disallows it — the agent's permissions depend on which tool you're using&lt;/li&gt;
&lt;li&gt;Multiple instruction layers with contradictions (and good luck knowing which one actually takes precedence)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;MCP servers with protocol violations:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Missing required fields in tool defs&lt;/li&gt;
&lt;li&gt;Invalid transport config&lt;/li&gt;
&lt;li&gt;Schema mismatches that lead to tools "existing" but not actually working&lt;/li&gt;
&lt;/ul&gt;
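
&lt;p&gt;For comparison, a minimal server entry with the required fields in place (the server name and package are made up):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "mcpServers": {
    "example-server": {
      "command": "npx",
      "args": ["-y", "@example/mcp-server"]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
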

&lt;h2&gt;
  
  
  So I built a linter for this
&lt;/h2&gt;

&lt;p&gt;I couldn't find a tool that just tells you, plainly and across all of these tools, "this config won't work".&lt;/p&gt;

&lt;p&gt;So I made one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/avifenesh/agnix" rel="noopener noreferrer"&gt;agnix&lt;/a&gt; is a linter for AI agent configurations. It validates skills, hooks, memory files, plugins, MCP configs, and agent definitions across Claude Code, Cursor, GitHub Copilot, Cline, Codex CLI, OpenCode, and Gemini CLI.&lt;/p&gt;

&lt;p&gt;It currently ships 156 rules. Every rule links to its source — an official spec, vendor docs, or a research paper — with an RFC 2119 severity level (MUST vs SHOULD vs BEST_PRACTICE), so you know what's a hard requirement and what's just a recommendation. It's not "trust me bro".&lt;/p&gt;

&lt;p&gt;What it looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;npx agnix &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;span class="go"&gt;
CLAUDE.md:15:1 warning: Generic instruction 'Be helpful and accurate' [fixable]
  help: Remove generic instructions. Claude already knows this.

.claude/skills/review/SKILL.md:3:1 error: Invalid name 'Review-Code' [fixable]
  help: Use lowercase letters and hyphens only (e.g., 'code-review')

.claude/settings.json:12:5 error: Script file not found: './scripts/lint.sh'
  help: Create the file or fix the path

.cursor/rules/testing.mdc:1:1 error: Missing required frontmatter
  help: Add YAML frontmatter with description, globs, and alwaysApply fields

Found 3 errors, 1 warning
  2 issues are automatically fixable

hint: Run with --fix to apply fixes
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It also auto-fixes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;agnix --fix .&lt;/code&gt; rewrites configs to comply with specs&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;agnix --fix-safe .&lt;/code&gt; applies only high-certainty fixes (things like normalizing a skill name to kebab-case — not changes that might alter semantic meaning)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What it covers
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Claude Code:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CLAUDE.md&lt;/code&gt;, hooks, agents, plugins, skills&lt;/li&gt;
&lt;li&gt;Frontmatter errors, invalid models, broken script paths, generic instructions, dangerous auto-invocation, manifest issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cursor:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.cursor/rules/*.mdc&lt;/code&gt;, &lt;code&gt;.cursorrules&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Missing frontmatter, invalid globs, deprecated formats, boolean-vs-string &lt;code&gt;alwaysApply&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
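
&lt;p&gt;A &lt;code&gt;.mdc&lt;/code&gt; file that passes those checks looks something like this (the rule content is illustrative; note &lt;code&gt;alwaysApply&lt;/code&gt; as a real boolean, not the string &lt;code&gt;"false"&lt;/code&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;---
description: Testing conventions for this repo
globs: ["**/*.test.ts"]
alwaysApply: false
---
Use the existing test runner; don't introduce a new framework.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
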

&lt;p&gt;&lt;strong&gt;GitHub Copilot:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.github/copilot-instructions.md&lt;/code&gt;, &lt;code&gt;.github/instructions/*.instructions.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Empty files, missing &lt;code&gt;applyTo&lt;/code&gt; frontmatter in scoped instructions, invalid glob patterns, file length limits&lt;/li&gt;
&lt;/ul&gt;
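
&lt;p&gt;For a scoped instruction file, that means frontmatter like this (the glob and rule are examples):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;---
applyTo: "src/**/*.ts"
---
Prefer explicit return types on exported functions.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
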

&lt;p&gt;&lt;strong&gt;MCP:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;*.mcp.json&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Protocol violations, schema errors, transport config&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AGENTS.md:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;AGENTS.md&lt;/code&gt;, &lt;code&gt;AGENTS.local.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Tool-specific size limits, platform-specific features without guards, nesting issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cline:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.clinerules&lt;/code&gt;, &lt;code&gt;.clinerules/*.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Structure + validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Codex CLI:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;.codex/config.toml&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Config validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;OpenCode:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;opencode.json&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Schema validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Gemini CLI:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;GEMINI.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Format validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are also cross-platform rules (XP-*) that detect conflicts between tools, and prompt-engineering rules (PE-*) that catch patterns that consistently degrade behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works (quick)
&lt;/h2&gt;

&lt;p&gt;It's written in Rust.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Walk the repo (respect &lt;code&gt;.gitignore&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Detect file type by path (&lt;code&gt;SKILL.md&lt;/code&gt;, &lt;code&gt;.mdc&lt;/code&gt;, &lt;code&gt;settings.json&lt;/code&gt;, etc.)&lt;/li&gt;
&lt;li&gt;Validate in parallel across CPU cores&lt;/li&gt;
&lt;li&gt;Emit diagnostics + suggested fixes&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Typical performance: single file under 10ms, 100-file project ~200ms, 1000-file project ~2s.&lt;/p&gt;

&lt;p&gt;The rules stay current — a CI workflow monitors upstream specs weekly and flags when vendor documentation drifts from what agnix expects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Not just a CLI
&lt;/h2&gt;

&lt;p&gt;agnix also runs as an LSP, so you get real-time validation in your editor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;VS Code&lt;/strong&gt;: &lt;a href="https://marketplace.visualstudio.com/items?itemName=avifenesh.agnix" rel="noopener noreferrer"&gt;Extension on the marketplace&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JetBrains&lt;/strong&gt;: &lt;a href="https://plugins.jetbrains.com/plugin/30087-agnix" rel="noopener noreferrer"&gt;Plugin on the marketplace&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Neovim&lt;/strong&gt;: plugin available&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zed&lt;/strong&gt;: extension available&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There's also an MCP server (agents can lint their own configs), plus a GitHub Action for CI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Validate agent configs&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;avifenesh/agnix@v0&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;claude-code'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx agnix &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Run it on your repo. In my experience, almost every repo with agent configs has a few issues — and they're usually the silent ones that have been dragging quality down for weeks.&lt;/p&gt;

&lt;p&gt;If you want a "real" install:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; agnix          &lt;span class="c"&gt;# npm&lt;/span&gt;
brew tap avifenesh/agnix &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; brew &lt;span class="nb"&gt;install &lt;/span&gt;agnix  &lt;span class="c"&gt;# Homebrew&lt;/span&gt;
cargo &lt;span class="nb"&gt;install &lt;/span&gt;agnix-cli       &lt;span class="c"&gt;# Cargo&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MIT/Apache-2.0, open source: &lt;a href="https://github.com/avifenesh/agnix" rel="noopener noreferrer"&gt;github.com/avifenesh/agnix&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I think this matters
&lt;/h2&gt;

&lt;p&gt;The ecosystem is fragmenting fast. Every tool invented its own format, its own conventions, and its own special ways to fail quietly.&lt;/p&gt;

&lt;p&gt;We have linters for code. For configs. For IaC. But the layer that tells AI agents how to behave — the stuff sitting between you and basically every AI interaction — had nothing.&lt;/p&gt;

&lt;p&gt;agnix doesn't tell you what to write. It tells you when what you wrote won't work.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;agnix is open source and free. &lt;a href="https://github.com/avifenesh/agnix" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; | &lt;a href="https://avifenesh.github.io/agnix/" rel="noopener noreferrer"&gt;Docs&lt;/a&gt; | &lt;a href="https://www.npmjs.com/package/agnix" rel="noopener noreferrer"&gt;npm&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tooling</category>
      <category>devtools</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
