I Built a RAG-like Context Engine for Claude Code — Without Vector DB

Leo KIM — Tue, 31 Mar 2026 12:23:17 +0000

The Problem

Claude Code reads your CLAUDE.md once at session start. Skills, which the model has to choose to invoke, are not a reliable fix either: Vercel's engineering team found that skills were skipped in 56% of their eval cases. The model simply didn't invoke them.

I run Claude Code as my daily coding assistant across 26+ custom resources. After months of watching Claude forget rules, ignore conventions, and skip critical project knowledge, I built a system to fix it.

The Solution: Context Feeder

Context Feeder is a lightweight context injection engine that runs on Claude Code hooks. It doesn't ask the model what's relevant; it checks every message and force-injects the context that matches.

No vector database. No embeddings. No cloud API. Just JSON tags + shell scripts.

🔗 GitHub: github.com/friends0485-cyber/context-feeder

How It Works: 3-Stage Chain

Your Message
→ Parser (keyword match against tags.json)
→ Counter (track frequency, assign rank: best/normal/worst)
→ Injector (check rank threshold, read file, output to Claude's context)

Stage 1: Parser (`tag_search.py`)

Reads the user message from Claude Code's UserPromptSubmit hook via stdin, scans for keywords defined in tags.json, and saves matched file paths.
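
As a rough illustration of this stage, here is a minimal sketch, not the repo's actual `tag_search.py`. It assumes the hook payload is a JSON object whose `prompt` field holds the user's text, and that tags.json maps each single-word keyword to a list of context file paths. The `match_tags` function name, that layout, and the whole-word matching rule are all assumptions made for the example.

```python
import json
import re


def match_tags(payload: str, tags: dict[str, list[str]]) -> list[str]:
    """Return context file paths whose keywords appear in the prompt."""
    # The hook input arrives as a JSON string; the user's text is
    # assumed to be in the "prompt" field.
    prompt = json.loads(payload).get("prompt", "").lower()
    # Whole-word, case-insensitive matching (an assumption for this sketch).
    words = set(re.findall(r"[a-z0-9_]+", prompt))
    matched: list[str] = []
    for keyword, paths in tags.items():
        if keyword.lower() in words:
            for path in paths:
                if path not in matched:  # keep each file once, in match order
                    matched.append(path)
    return matched


# Hypothetical tags.json content, inlined so the example is self-contained.
example_tags = {
    "error": ["contexts/api_errors.toml"],
    "catch": ["contexts/api_errors.toml"],
    "migration": ["contexts/db.toml"],
}
payload = json.dumps({"prompt": "Why does this catch block swallow the error?"})
print(match_tags(payload, example_tags))  # ['contexts/api_errors.toml']
```

In a real hook the payload would come from `sys.stdin.read()` rather than a hard-coded string.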

Stage 2: Counter (`counter.py`)

Tracks how many times each tag has been matched in the current session and assigns each tag a rank. The rank sets how many matches it takes before the tag's file is injected:

Rank	Threshold	Meaning
best	1st match → inject	Frequently used tag
normal	2nd match → inject	Standard frequency
worst	3rd match → inject	Rarely used, auto-deleted after 30 days
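
One way to read the table is as a per-rank match threshold. The sketch below follows that reading and is not the actual `counter.py`. The article does not say how a tag earns its rank, so ranks are simply passed in here, and the `TagCounter` class and its default rank are hypothetical.

```python
from collections import Counter

# Matches required before a tag's file is injected, per the table above.
THRESHOLDS = {"best": 1, "normal": 2, "worst": 3}


class TagCounter:
    """Counts matches per tag within one session."""

    def __init__(self, ranks: dict[str, str]):
        self.ranks = ranks        # tag -> "best" | "normal" | "worst"
        self.counts = Counter()   # tag -> matches seen so far this session

    def record(self, tag: str) -> bool:
        """Record one match; return True once the rank's threshold is reached."""
        self.counts[tag] += 1
        rank = self.ranks.get(tag, "normal")  # default rank is an assumption
        return self.counts[tag] >= THRESHOLDS[rank]


counter = TagCounter({"error": "best", "deploy": "worst"})
print(counter.record("error"))                        # True
print([counter.record("deploy") for _ in range(3)])   # [False, False, True]
```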

Stage 3: Injector (`tag_injector.sh`)

Reads the ranked results, checks a 30-minute cooldown (prevents re-injection of the same file), and outputs matched TOML content to stdout — which Claude Code injects into the conversation context.
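
The cooldown check is the part most worth seeing in code. The real injector is a shell script; the sketch below restates the idea in Python for illustration only. The `files_to_inject` helper and the in-memory timestamp dictionary are assumptions; a real hook runs as a fresh process on each message, so it would need to persist these timestamps somewhere, for example in a small JSON file.

```python
import time

COOLDOWN_SECONDS = 30 * 60  # the 30-minute window described above


def files_to_inject(paths, last_injected, now=None):
    """Return paths not injected within the cooldown, updating timestamps."""
    now = time.time() if now is None else now
    due = []
    for path in paths:
        last = last_injected.get(path)
        if last is None or now - last >= COOLDOWN_SECONDS:
            due.append(path)
            last_injected[path] = now  # restart the cooldown for this file
    return due


state = {}
print(files_to_inject(["contexts/api.toml"], state, now=0))     # ['contexts/api.toml']
print(files_to_inject(["contexts/api.toml"], state, now=600))   # []
print(files_to_inject(["contexts/api.toml"], state, now=1800))  # ['contexts/api.toml']
```

Passing `now` explicitly keeps the function easy to test; in normal use it falls back to the current time.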

Context File Format

Rules and knowledge are stored as .toml files:

[rule_001]
title = "API Error Handling"
tags = ["error", "catch", "try", "exception"]
content = '''
All API endpoints must:
- Wrap async handlers with error middleware
- Return structured error responses
- Never expose stack traces to clients
'''

The tags field is what the parser matches against. The content is what gets injected.

Why Not RAG?

Claude Code's own team confirmed they dropped vector DB-based RAG early on:

"We tried RAG… we tried a few different kinds of search tools. And eventually, we landed on just agentic search… One is it outperformed everything. By a lot."

Context Feeder follows the same philosophy — deterministic keyword matching instead of probabilistic similarity search. It's faster, simpler, and requires zero infrastructure.

Key Differences from Existing Tools

Tool	Its approach	Context Feeder's approach
CLAUDE.md	Read once, model decides relevance	Force-injected on every match
RAG + Vector DB	Embeddings + infrastructure	JSON keywords + shell scripts
Claude-Mem	Session memory (past observations)	Rule injection (present context)
Skills	Model chooses to invoke	System forces delivery

Quick Start

1. Clone the repo
2. Add your rules as .toml files in contexts/
3. Register keywords in config/tags.json
4. Connect to Claude Code hooks in .claude/settings.json
5. Done: the next message triggers automatic injection
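
The hook registration step can be sketched as follows. Claude Code settings group hooks by event name under a `hooks` key, with each entry listing commands to run. The snippet prints such a fragment for `UserPromptSubmit`; the command string and its path are placeholders, so check the repo for the real entry point before copying anything into .claude/settings.json.

```python
import json

# Hypothetical command; the repo's actual entry point and path may differ.
HOOK_COMMAND = "python3 /path/to/context-feeder/tag_search.py"

settings = {
    "hooks": {
        "UserPromptSubmit": [
            {"hooks": [{"type": "command", "command": HOOK_COMMAND}]}
        ]
    }
}

# Print the fragment so it can be merged into an existing settings file by hand.
print(json.dumps(settings, indent=2))
```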

Beyond the Core Engine

The open-source release is the core 3-stage chain. My production system has 10 interconnected modules including a logger (21 categories), watchdog (real-time dashboard), reminder (workflow violation detection), and a live console UI. The full architecture is documented in the repo.

What's Next

- Auto-scanner that rebuilds tags.json from your TOML files (included)
- Community-contributed context templates for popular frameworks
- Integration patterns for monorepos

I'd love to hear how others are handling context injection with Claude Code hooks. What patterns have worked for you?

Built by Leo KIM — AI Automation Engineer
GitHub: github.com/friends0485-cyber/context-feeder