<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Shudipto Trafder</title>
    <description>The latest articles on Forem by Shudipto Trafder (@shudiptotrafder).</description>
    <link>https://forem.com/shudiptotrafder</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3681541%2Fd686de74-59f1-4097-9bf4-81b3837b0aba.jpg</url>
      <title>Forem: Shudipto Trafder</title>
      <link>https://forem.com/shudiptotrafder</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/shudiptotrafder"/>
    <language>en</language>
    <item>
      <title>AgentFlow — From Agent Code to Production API in Minutes</title>
      <dc:creator>Shudipto Trafder</dc:creator>
      <pubDate>Sun, 03 May 2026 17:09:58 +0000</pubDate>
      <link>https://forem.com/10xscale/agentflow-from-agent-code-to-production-api-in-minutes-p3e</link>
      <guid>https://forem.com/10xscale/agentflow-from-agent-code-to-production-api-in-minutes-p3e</guid>
      <description>&lt;h2&gt;
  
  
  AgentFlow — The Python Framework for Production AI Agents
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Stop rebuilding the same agent infrastructure. AgentFlow gives you auth, streaming, persistence, and a React frontend — out of the box.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;AgentFlow (&lt;code&gt;10xscale-agentflow&lt;/code&gt; on PyPI) is an open-source Python framework for building and deploying multi-agent AI systems. Write your agent graph once. Run it locally. Ship it to production without rewriting your backend.&lt;/p&gt;

&lt;p&gt;Built by &lt;a href="https://10xscale.ai/" rel="noopener noreferrer"&gt;10xScale&lt;/a&gt;. MIT licensed. No vendor lock-in.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why AgentFlow?
&lt;/h2&gt;

&lt;p&gt;Most agent frameworks stop at the prototype. You get a cute demo, then spend weeks bolting on auth, rate limiting, persistence, and a frontend. AgentFlow is built for what comes after the demo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One framework. From first &lt;code&gt;pip install&lt;/code&gt; to production Docker deploy.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔗 Links
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;URL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Core Python Library&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/10xHub/Agentflow" rel="noopener noreferrer"&gt;github.com/10xHub/Agentflow&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API &amp;amp; CLI&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/10xHub/agentflow-cli" rel="noopener noreferrer"&gt;github.com/10xHub/agentflow-cli&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Documentation&lt;/td&gt;
&lt;td&gt;&lt;a href="https://10xhub.github.io/agentflow-docs" rel="noopener noreferrer"&gt;10xhub.github.io/agentflow-docs&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PyPI — Core&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pypi.org/project/10xscale-agentflow/" rel="noopener noreferrer"&gt;pypi.org/project/10xscale-agentflow&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PyPI — CLI&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pypi.org/project/10xscale-agentflow-cli/" rel="noopener noreferrer"&gt;pypi.org/project/10xscale-agentflow-cli&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Full Stack
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agentflow            →  Core Python orchestration engine
agentflow-cli        →  FastAPI server + CLI tooling
agentflow-client     →  TypeScript/React SDK (@10xscale/agentflow-client)
agentflow-playground →  Hosted UI for testing agents
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use any layer alone. Use them together for a complete AI product stack — from LLM call to browser UI — without stitching four different libraries together.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Running in 60 Seconds
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;10xscale-agentflow-cli

agentflow init   &lt;span class="c"&gt;# scaffold a new project&lt;/span&gt;
agentflow api    &lt;span class="c"&gt;# start the dev server&lt;/span&gt;
agentflow play   &lt;span class="c"&gt;# open the playground UI&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Your agent is running, streamed, and explorable in under a minute.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You Get
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Graph-Based Agent Orchestration
&lt;/h3&gt;

&lt;p&gt;AgentFlow uses a &lt;code&gt;StateGraph&lt;/code&gt; — directed nodes, conditional edges, and full control over execution flow. No black boxes. No magic routing you can't debug.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agentflow.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ToolNode&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agentflow.state&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Message&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agentflow.utils.constants&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Get weather for a location.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The weather in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; is sunny, 72°F&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MAIN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini/gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="n"&gt;tool_node_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TOOL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TOOL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ToolNode&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;tools_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TOOL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MAIN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TOOL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TOOL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TOOL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MAIN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_entry_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MAIN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Weather in NYC?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]},&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stateful. Tool-calling. Under 30 lines.&lt;/p&gt;




&lt;h3&gt;
  
  
  LLM-Agnostic
&lt;/h3&gt;

&lt;p&gt;Pass the model string. AgentFlow routes it.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI (GPT-4o, o3, etc.)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pip install openai&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google Gemini + Vertex AI&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pip install google-genai&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic Claude&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;pip install anthropic&lt;/code&gt; &lt;em&gt;(coming soon)&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;No provider-specific abstractions to learn. Swap models without touching your agent logic.&lt;/p&gt;
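&lt;p&gt;A minimal sketch of what a swap looks like, reusing the &lt;code&gt;Agent&lt;/code&gt; node from the example above. The &lt;code&gt;gemini/&lt;/code&gt; model string is confirmed earlier in this post; the alternative string is an assumption that other providers follow the same pattern:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from agentflow.graph import Agent

# Only the model string changes; the node definition stays identical.
# "gemini/gemini-2.5-flash" is the format shown earlier in this post.
agent = Agent(
    model="gemini/gemini-2.5-flash",   # swap for e.g. "openai/gpt-4o" (assumed format)
    system_prompt=[{"role": "system", "content": "You are a helpful assistant."}],
    tool_node_name="TOOL",
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;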




&lt;h3&gt;
  
  
  Parallel Tool Execution — Automatic
&lt;/h3&gt;

&lt;p&gt;When an LLM calls multiple tools at once, AgentFlow runs them concurrently. No config required.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Other frameworks:  1.0s + 1.5s + 0.8s = 3.3s
AgentFlow:         max(1.0s, 1.5s, 0.8s) = 1.5s  ⚡ 2.2x faster
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
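

&lt;p&gt;Conceptually it is the same win you get from &lt;code&gt;asyncio.gather&lt;/code&gt;. A plain-Python illustration of the pattern (not AgentFlow internals):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import asyncio

async def call_tool(name: str, seconds: float) -&gt; str:
    # Stand-in for a real tool call (API request, DB query, etc.).
    await asyncio.sleep(seconds)
    return f"{name} done"

async def main() -&gt; None:
    # Sequential: 1.0 + 1.5 + 0.8 = 3.3s. Concurrent: max(...) = 1.5s.
    results = await asyncio.gather(
        call_tool("weather", 1.0),
        call_tool("search", 1.5),
        call_tool("calendar", 0.8),
    )
    print(results)

asyncio.run(main())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;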






&lt;h3&gt;
  
  
  Production Memory — Three Layers
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Working Memory    →  Current execution state (AgentState)
Session Memory    →  Redis (hot) + PostgreSQL (durable) checkpointer
Knowledge Memory  →  Qdrant vector store + Mem0 semantic recall
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Redis keeps hot conversation state fast. PostgreSQL keeps it durable and horizontally scalable. Both run together — you don't pick one.&lt;/p&gt;




&lt;h3&gt;
  
  
  Streaming
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;stream_gen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;astream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;inp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;response_granularity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ResponseGranularity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LOW&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stream_gen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three granularity levels: token-by-token (ChatGPT-style), message-by-message, or node-by-node graph traces. Your frontend decides what to show.&lt;/p&gt;




&lt;h3&gt;
  
  
  Auth and Security — Built In, Not Bolted On
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Most frameworks leave auth as an exercise for the reader. AgentFlow ships it.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"auth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"jwt"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"auth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"auth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"custom"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"auth.my_backend:MyAuth"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One line in &lt;code&gt;agentflow.json&lt;/code&gt;. Switch from dev to production auth without touching your graph code.&lt;/p&gt;
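&lt;p&gt;The &lt;code&gt;auth.my_backend:MyAuth&lt;/code&gt; path above points at your own backend class. A hypothetical sketch of the idea; the real base class and hook names are in the docs, and everything here is illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative only: the actual interface AgentFlow expects may differ.
class MyAuth:
    async def authenticate(self, headers: dict) -&gt; dict:
        token = headers.get("Authorization", "").removeprefix("Bearer ")
        if token != "expected-api-key":  # swap in OAuth2 / session checks here
            raise PermissionError("invalid credentials")
        return {"user_id": "demo", "roles": ["user"]}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;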

&lt;p&gt;Security features included:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JWT authentication with configurable secrets&lt;/li&gt;
&lt;li&gt;Custom auth backends for OAuth2, API keys, and sessions&lt;/li&gt;
&lt;li&gt;Role-Based Access Control (RBAC)&lt;/li&gt;
&lt;li&gt;Sliding-window rate limiting (memory or Redis backends)&lt;/li&gt;
&lt;li&gt;Configurable request size limits (DoS protection, default 10 MB)&lt;/li&gt;
&lt;li&gt;Auto-redaction of tokens and secrets from logs&lt;/li&gt;
&lt;li&gt;Startup validation — warns about insecure CORS and debug mode before you accidentally deploy them&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Lifecycle Callbacks
&lt;/h3&gt;

&lt;p&gt;Hook into every layer of execution — before and after each LLM call, tool call, or MCP invocation. Hook into the graph itself for start, end, checkpoint, interrupt, resume, and error events.&lt;/p&gt;

&lt;p&gt;Use them for audit logs, billing meters, policy enforcement, prompt-injection checks, or any business logic that shouldn't live inside the prompt.&lt;/p&gt;
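&lt;p&gt;Here is the kind of logic those hooks carry, sketched as a plain-Python wrapper. AgentFlow's actual callback signatures may differ; this only illustrates the use case:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import logging
import time

logger = logging.getLogger("audit")

def audited(tool_fn):
    # Illustrative before/after hook: log the call and time it.
    # With AgentFlow you would register equivalent logic as lifecycle
    # callbacks instead of wrapping each tool by hand.
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        logger.info("tool start: %s", tool_fn.__name__)
        try:
            return tool_fn(*args, **kwargs)
        finally:
            logger.info("tool end: %s (%.2fs)", tool_fn.__name__, time.monotonic() - start)
    return wrapper
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;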




&lt;h3&gt;
  
  
  The CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agentflow init              &lt;span class="c"&gt;# scaffold project + config&lt;/span&gt;
agentflow api               &lt;span class="c"&gt;# dev server with auto-reload&lt;/span&gt;
agentflow play              &lt;span class="c"&gt;# open playground against local backend&lt;/span&gt;
agentflow build &lt;span class="nt"&gt;--docker-compose&lt;/span&gt;  &lt;span class="c"&gt;# generate Dockerfile + compose&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Auto-generated FastAPI endpoints:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Endpoint&lt;/th&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/invoke&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;POST — synchronous agent call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/stream&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;POST — streaming agent call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/threads&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;GET — list conversation threads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/threads/{id}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;GET — fetch thread history&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/threads/{id}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;DELETE — delete thread&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Your agent graph becomes a production API. No FastAPI boilerplate to write.&lt;/p&gt;
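&lt;p&gt;Calling the generated API then looks like any other HTTP service. A hedged example: the payload below mirrors the Python &lt;code&gt;invoke()&lt;/code&gt; call earlier and is an assumption, not a documented schema:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import requests

# Request body is assumed, mirroring app.invoke(...) from the graph example.
resp = requests.post(
    "http://localhost:8000/invoke",  # default host/port assumed
    json={
        "messages": [{"role": "user", "content": "Weather in NYC?"}],
        "config": {"thread_id": "1"},
    },
    headers={"Authorization": "Bearer &lt;your-jwt&gt;"},  # if JWT auth is enabled
)
print(resp.json())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;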




&lt;h3&gt;
  
  
  Dependency Injection with InjectQ
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agentflow.utils&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;weather&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;UserService&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Get weather for a location.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Weather for user &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: sunny&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean tools. Testable tools. Per-request context without global state.&lt;/p&gt;




&lt;h3&gt;
  
  
  Human-in-the-Loop
&lt;/h3&gt;

&lt;p&gt;Pause execution mid-graph. Inject a human decision. Resume with full state intact. No re-running prior steps.&lt;/p&gt;

&lt;p&gt;Approval workflows, moderation gates, interactive debugging — all supported without custom state management.&lt;/p&gt;
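&lt;p&gt;As a concept sketch in plain Python (not AgentFlow's actual API): the graph checkpoints its state, a human verdict comes in, and execution resumes from the same point:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Conceptual approval gate; field names and resume semantics are hypothetical.
def run_with_approval(app, state, config):
    result = app.invoke(state, config=config)        # runs until an interrupt
    if result.get("status") == "interrupted":        # hypothetical field
        if input("Approve this action? [y/N] ") == "y":
            result = app.invoke(None, config=config)  # resume from checkpoint
    return result
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;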




&lt;h3&gt;
  
  
  Event Publishing
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Publisher&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Redis Pub/Sub&lt;/td&gt;
&lt;td&gt;Lightweight real-time distribution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kafka&lt;/td&gt;
&lt;td&gt;High-volume event streaming&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RabbitMQ&lt;/td&gt;
&lt;td&gt;Reliable queuing, distributed systems&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Console&lt;/td&gt;
&lt;td&gt;Local debugging&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Custom&lt;/td&gt;
&lt;td&gt;Any backend you want&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  React/TypeScript Client SDK
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;@10xscale/agentflow-client&lt;/code&gt; gives you React hooks (&lt;code&gt;useAgent&lt;/code&gt;, &lt;code&gt;useStream&lt;/code&gt;, &lt;code&gt;useThreads&lt;/code&gt;), token-level streaming for ChatGPT-style UIs, and client-side tool execution. The frontend talks to your AgentFlow API without custom integration code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feature Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;AgentFlow&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;LangGraph&lt;/th&gt;
&lt;th&gt;CrewAI&lt;/th&gt;
&lt;th&gt;AutoGen&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Architecture&lt;/td&gt;
&lt;td&gt;Graph&lt;/td&gt;
&lt;td&gt;Graph&lt;/td&gt;
&lt;td&gt;Role-Based&lt;/td&gt;
&lt;td&gt;Conversational&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full Stack (Backend + Frontend SDK)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Parallel Tool Execution&lt;/td&gt;
&lt;td&gt;✅ Auto&lt;/td&gt;
&lt;td&gt;⚠️ Config&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence&lt;/td&gt;
&lt;td&gt;✅ Redis + Postgres&lt;/td&gt;
&lt;td&gt;⚠️ Postgres/SQLite&lt;/td&gt;
&lt;td&gt;⚠️ Local&lt;/td&gt;
&lt;td&gt;⚠️ Local&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dependency Injection&lt;/td&gt;
&lt;td&gt;✅ Native&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CLI + Docker Deployment&lt;/td&gt;
&lt;td&gt;✅ One command&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth Built-In&lt;/td&gt;
&lt;td&gt;✅ JWT + Custom&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rate Limiting&lt;/td&gt;
&lt;td&gt;✅ Memory + Redis&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lifecycle Callbacks&lt;/td&gt;
&lt;td&gt;✅ Full&lt;/td&gt;
&lt;td&gt;⚠️ Manual&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;⚠️ Manual&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Support&lt;/td&gt;
&lt;td&gt;✅ Native&lt;/td&gt;
&lt;td&gt;⚠️ Partial&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event Publishing&lt;/td&gt;
&lt;td&gt;✅ Kafka/Redis/AMQP&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Open Source (MIT)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Core library&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;10xscale-agentflow

&lt;span class="c"&gt;# Full CLI + API server&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;10xscale-agentflow-cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Optional extras:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;10xscale-agentflow[pg_checkpoint]   &lt;span class="c"&gt;# PostgreSQL + Redis persistence&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;10xscale-agentflow[mcp]             &lt;span class="c"&gt;# Model Context Protocol&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;10xscale-agentflow[google-genai]    &lt;span class="c"&gt;# Google GenAI adapter&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;10xscale-agentflow[kafka]           &lt;span class="c"&gt;# Kafka event publishing&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;10xscale-agentflow[redis]           &lt;span class="c"&gt;# Redis publisher + rate limiting&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Current Version
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;10xscale-agentflow&lt;/code&gt; (core)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.7.4&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;10xscale-agentflow-cli&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.3.2&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Added in v0.7.x:&lt;/strong&gt; multimodal support (images, audio, video), extended reasoning / chain-of-thought, 3-layer memory, callback and lifecycle hooks, agent skills, Vertex AI support, structured Pydantic outputs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Roadmap
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;✅ Graph engine with nodes, edges, and conditional routing&lt;/li&gt;
&lt;li&gt;✅ Redis + PostgreSQL state checkpointing&lt;/li&gt;
&lt;li&gt;✅ Tool integration — local Python, MCP, optional adapters&lt;/li&gt;
&lt;li&gt;✅ Parallel tool execution&lt;/li&gt;
&lt;li&gt;✅ Lifecycle callbacks and graph hooks&lt;/li&gt;
&lt;li&gt;✅ Streaming + event publishing&lt;/li&gt;
&lt;li&gt;✅ Human-in-the-loop&lt;/li&gt;
&lt;li&gt;✅ Multimodal agents&lt;/li&gt;
&lt;li&gt;🚧 Remote node execution for distributed processing&lt;/li&gt;
&lt;li&gt;🚧 OpenTelemetry tracing&lt;/li&gt;
&lt;li&gt;🚧 More persistence backends (DynamoDB, etc.)&lt;/li&gt;
&lt;li&gt;🚧 Visual graph editor&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Privacy and Licensing
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MIT License&lt;/strong&gt; — use freely in commercial products&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No data collection&lt;/strong&gt; — your conversations and agent data stay on your infrastructure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No per-call billing&lt;/strong&gt; — you pay for your LLM API and infra, not our licensing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy anywhere&lt;/strong&gt; — Docker, Kubernetes, AWS ECS, Cloud Run, Azure, Heroku&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Core Library&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/10xHub/Agentflow" rel="noopener noreferrer"&gt;https://github.com/10xHub/Agentflow&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API &amp;amp; CLI&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/10xHub/agentflow-cli" rel="noopener noreferrer"&gt;https://github.com/10xHub/agentflow-cli&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Documentation&lt;/td&gt;
&lt;td&gt;&lt;a href="https://10xhub.github.io/agentflow-docs" rel="noopener noreferrer"&gt;https://10xhub.github.io/agentflow-docs&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PyPI Core&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pypi.org/project/10xscale-agentflow/" rel="noopener noreferrer"&gt;https://pypi.org/project/10xscale-agentflow/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PyPI CLI&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pypi.org/project/10xscale-agentflow-cli/" rel="noopener noreferrer"&gt;https://pypi.org/project/10xscale-agentflow-cli/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Issues &amp;amp; Requests&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/10xHub/Agentflow/issues" rel="noopener noreferrer"&gt;https://github.com/10xHub/Agentflow/issues&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Discussions&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/10xHub/Agentflow/discussions" rel="noopener noreferrer"&gt;https://github.com/10xHub/Agentflow/discussions&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;Built by &lt;a href="https://10xscale.ai/" rel="noopener noreferrer"&gt;10xScale&lt;/a&gt; and the community. MIT licensed.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>opensource</category>
      <category>langgraph</category>
    </item>
    <item>
      <title>TOON for LLMs: A Benchmark Performance Analysis</title>
      <dc:creator>Shudipto Trafder</dc:creator>
      <pubDate>Sat, 27 Dec 2025 15:36:50 +0000</pubDate>
      <link>https://forem.com/shudiptotrafder/toon-for-llms-a-comparative-performance-analysis-against-json-52am</link>
      <guid>https://forem.com/shudiptotrafder/toon-for-llms-a-comparative-performance-analysis-against-json-52am</guid>
      <description>&lt;p&gt;Every API call you make with JSON is costing you more than you think.&lt;/p&gt;

&lt;p&gt;I ran real-world extractions using Gemini 2.5 Flash, and the results were startling: JSON consistently used 30–40% more output tokens than TOON format. In one test, JSON consumed 471 output tokens while TOON used just 227 — a 51% reduction.&lt;/p&gt;

&lt;p&gt;But here’s where it gets interesting: &lt;strong&gt;TOON initially failed 70% of the time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After optimization, I achieved 100% parsing success and discovered something counterintuitive: the reliable version needs more prompt tokens, so whether TOON actually saves you money depends on the workload. When I tested structured outputs with Pydantic models, SDK-managed JSON needed only 389 output tokens, undercutting my optimized TOON prompts.&lt;/p&gt;

&lt;p&gt;The hidden goldmine? &lt;strong&gt;Tool/function calling.&lt;/strong&gt; That’s where TOON’s compact format shines brightest, slashing token costs in agentic workflows where responses become the next prompt.&lt;/p&gt;

&lt;p&gt;This isn’t theoretical. I’m sharing the actual prompts, parsing errors, token counts, and code that took TOON from a 70% failure rate to production-ready. Whether TOON beats JSON depends on your use case — and I have the data to prove exactly when.&lt;/p&gt;

&lt;p&gt;Let’s break down the numbers.&lt;/p&gt;




&lt;h2&gt;
  
  
  Experiment #1: The Initial TOON Failure (70% Success Rate)
&lt;/h2&gt;

&lt;p&gt;I started with what seemed like a straightforward test: extracting structured job description data using TOON instead of JSON.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Setup:
&lt;/h3&gt;

&lt;p&gt;My prompt was simple — ask Gemini 2.5 Flash to extract role, skills, experience, location, and responsibilities from a job posting. For the output format, I did what seemed logical: I showed TOON’s encoded structure as a bare template, essentially treating TOON as a drop-in replacement for JSON.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Extract Role, Primary Skills, Secondary Skills,
Minimum Experience, Maximum Experience,
Location, Employment Type, Summary, and Responsibilities

Job Description:
&amp;lt;JD Text&amp;gt;

Output in TOON format:

Role: ""
"Primary Skills"[2]: Python,JavaScript
"Secondary Skills"[2]: Responsibility,Communication
"Minimum Experience": ""
"Maximum Experience": ""
Location: ""
"Employment Type": ""
Summary: ""
Responsibilities[2]: Task A,Task B
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's what I suspected would work: By showing the encoded format with empty strings and generic placeholders, the model would understand the structure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reality check: 70% failure rate.&lt;/strong&gt;&lt;br&gt;
The errors were telling:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Error parsing TOON format for JD#2: Expected 10 values, but got 16&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Error parsing TOON format for JD#5: Missing colon after key&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model was confused about arrays. Sometimes it output &lt;code&gt;Skills: Python, JavaScript, React&lt;/code&gt; as a flat string. Other times it attempted brackets but malformed the syntax.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The hypothesis:&lt;/strong&gt; Maybe showing encoded/empty examples was the problem. The model needed to see real data patterns, especially for arrays.&lt;/p&gt;
&lt;h3&gt;
  
  
  Token Usage (Failed Attempts, 70% Success Rate):
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Prompt:&lt;/strong&gt; 729 tokens&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Output:&lt;/strong&gt; 227 tokens&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Success Rate:&lt;/strong&gt; ~30% initially, improved to 70% after adding two real examples with populated arrays&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
JSON Token Usage:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Prompt:&lt;/strong&gt; 723 tokens&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Output:&lt;/strong&gt; 471 tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Insight:&lt;/strong&gt;&lt;br&gt;
TOON's compact syntax is unforgiving. JSON has redundancy (&lt;code&gt;{"key": "value"}&lt;/code&gt;) that helps models self-correct. TOON's &lt;code&gt;Key: value&lt;/code&gt; format offers no such safety net. The model needed concrete examples, not abstract templates.&lt;/p&gt;
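&lt;p&gt;To make the redundancy concrete, here is the same record serialized both ways, with character counts as a rough proxy for tokens:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json

record = {"role": "Data Scientist", "skills": ["Python", "SQL"]}

as_json = json.dumps(record)
# Hand-written TOON-style rendering of the same record:
as_toon = 'role: "Data Scientist"\nskills[2]: Python,SQL'

print(len(as_json), as_json)  # braces, quotes, and commas add up
print(len(as_toon), as_toon)  # same data, less ceremony
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;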

&lt;p&gt;But 70% wasn't good enough for production. Time to fix this properly.&lt;/p&gt;


&lt;h2&gt;
  
  
  Experiment #2: Achieving 100% Parsing Success (And the Token Trade-off)
&lt;/h2&gt;

&lt;p&gt;I needed to fix the 70% success rate. The solution? Stop being minimalist with examples.&lt;/p&gt;

&lt;p&gt;Instead of showing encoded/empty structures, I gave the model a complete, realistic example with proper TOON formatting — especially for arrays.&lt;/p&gt;
&lt;h3&gt;
  
  
  The Revised Prompt:
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Extract Role, Primary Skills, Secondary Skills,
Minimum Experience, Maximum Experience,
Location, Employment Type, Summary, and Responsibilities

Job Description:
&amp;lt;JD Text&amp;gt;

Output in TOON format. Example structure:

Role: "Senior Data Scientist"
Primary_Skills:
 [0]: "Machine Learning"
 [1]: "Statistical Analysis"
Secondary_Skills:
 [0]: "Big Data"
 [1]: "Cloud Platforms"
Minimum_Experience: "5 years"
Maximum_Experience: "10 years"
Location: "New York, NY or Remote"
Employment_Type: "Full-time"
Summary: "Lead data science initiatives"
Responsibilities:
 [0]: "Design ML models"
 [1]: "Analyze datasets"


Now provide the extraction in TOON format. Keep the format exactly
as shown above.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; 100% parsing. No more malformed arrays. No more missing colons.&lt;/p&gt;

&lt;p&gt;But here's the catch—the prompt got heavier.&lt;/p&gt;
&lt;h3&gt;
  
  
  The Token Comparison: TOON vs JSON
&lt;/h3&gt;

&lt;p&gt;Let me show you the actual numbers across the same 10 job descriptions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JSON Approach: Token Usage&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Prompt tokens:&lt;/strong&gt; 723&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Output tokens:&lt;/strong&gt; 471&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Success rate:&lt;/strong&gt; 100% (JSON is forgiving)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;TOON Approach (Initial — 70% success)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Prompt tokens:&lt;/strong&gt; 729&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Output tokens:&lt;/strong&gt; 227 ✅ (51.8% reduction vs JSON)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Total:&lt;/strong&gt; 956 tokens (238 fewer than JSON’s 1,194)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Success rate:&lt;/strong&gt; 70% ❌&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;TOON Approach (Optimized — 100% success)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Prompt tokens:&lt;/strong&gt; 802 ❌ (+11% vs JSON)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Output tokens:&lt;/strong&gt; 455 ✅ (3.4% reduction vs JSON)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Total:&lt;/strong&gt; 1,257 tokens (+5.3% vs JSON’s 1,194)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Success rate:&lt;/strong&gt; 100% ✅&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  The Uncomfortable Truth
&lt;/h2&gt;

&lt;p&gt;For basic extraction tasks, optimized TOON costs MORE than JSON.&lt;/p&gt;

&lt;p&gt;Yes, the output is slightly more compact (455 vs 471 tokens), but the verbose prompting needed to achieve 100% reliability completely erases any savings. In fact, you’re paying 5% more per request.&lt;/p&gt;

&lt;p&gt;So why am I still testing TOON?&lt;/p&gt;

&lt;p&gt;Because this experiment revealed something crucial: the baseline comparison is misleading. Real-world LLM applications don’t just extract data once — they use structured outputs for:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Pydantic model validation (native SDK support)&lt;/li&gt;
&lt;li&gt; Tool/function calling (where output becomes input)&lt;/li&gt;
&lt;li&gt; Multi-turn agentic workflows (repeated serialization)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That’s where the math changes completely. Let me show you.&lt;/p&gt;


&lt;h2&gt;
  
  
  Experiment #3: Pydantic Models — Where the SDK Does the Heavy Lifting
&lt;/h2&gt;

&lt;p&gt;Here’s where things get interesting. Modern LLM SDKs have first-class support for structured outputs using Pydantic models. Instead of prompt engineering, you define a schema and let the SDK handle formatting.&lt;/p&gt;

&lt;p&gt;The key difference: You don’t need to explain the output format in your prompt — the SDK extracts it from your Pydantic model automatically.&lt;/p&gt;
&lt;h3&gt;
  
  
  The Setup: Google’s GenAI SDK
&lt;/h3&gt;

&lt;p&gt;I used the same job extraction task, but this time with a Pydantic model:&lt;br&gt;
&lt;/p&gt;
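&lt;p&gt;A sketch of what that model could look like (field names mirror the extraction task; the exact schema I used is in the gist linked at the end):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from pydantic import BaseModel

class JobModel(BaseModel):
    # Fields mirror the extraction prompt; exact names are illustrative.
    role: str
    primary_skills: list[str]
    secondary_skills: list[str]
    minimum_experience: str
    maximum_experience: str
    location: str
    employment_type: str
    summary: str
    responsibilities: list[str]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;And the call itself:&lt;/p&gt;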

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response_mime_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response_schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;JobModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice what’s missing: No output format instructions. No examples. No “Output as JSON with these exact keys.”&lt;/p&gt;

&lt;p&gt;The SDK injects the schema behind the scenes.&lt;/p&gt;
&lt;h3&gt;
  
  
  Token Comparison: Pydantic JSON vs Manual TOON
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pydantic + JSON (SDK-Managed)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Prompt tokens:&lt;/strong&gt; 647 ✅ (19.3% less than optimized TOON)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Output tokens:&lt;/strong&gt; 389 ✅ (14.5% less than optimized TOON)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Success rate:&lt;/strong&gt; 100% ✅&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Parsing:&lt;/strong&gt; Native (SDK returns typed Python objects)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Manual TOON (From Experiment #2)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Prompt tokens:&lt;/strong&gt; 802 ❌&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Output tokens:&lt;/strong&gt; 455 ❌&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Success rate:&lt;/strong&gt; 100% ✅&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Parsing:&lt;/strong&gt; Custom (you write the parser)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  The Brutal Takeaway
&lt;/h3&gt;

&lt;p&gt;For structured extraction with strong SDK support, Pydantic really shines. Native Pydantic integration delivers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  ✅ Cleaner prompts (~155 fewer prompt tokens)&lt;/li&gt;
&lt;li&gt;  ✅ Smaller outputs (~66 fewer output tokens)&lt;/li&gt;
&lt;li&gt;  ✅ No custom parsing logic&lt;/li&gt;
&lt;li&gt;  ✅ Built-in type validation&lt;/li&gt;
&lt;li&gt;  ✅ Parsed objects returned directly, ready to use&lt;/li&gt;
&lt;li&gt;  ✅ A much smoother developer experience&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because of this, I’ll increasingly rely on Pydantic and native parsing support for structured extraction. It’s simply more reliable and maintainable than handling parsing and validation manually.&lt;/p&gt;

&lt;p&gt;That said, there’s one scenario where JSON’s verbosity becomes a genuine liability: tool calling in agentic workflows.&lt;br&gt;
That’s where TOON finally proves its worth.&lt;/p&gt;


&lt;h2&gt;
  
  
  Experiment #4: Tool Calling — Where TOON Finally Wins
&lt;/h2&gt;

&lt;p&gt;This is where everything clicked.&lt;/p&gt;

&lt;p&gt;In agentic workflows, your LLM doesn’t just extract data once — it calls tools, receives results, and uses those results to reason further. The tool’s response becomes part of the next prompt. And if that response is bloated with JSON syntax, you’re paying for it twice: once as output, once as input.&lt;/p&gt;

&lt;p&gt;The insight: Tool results are pure token waste. The model doesn’t need &lt;code&gt;{"key": "value"}&lt;/code&gt; ceremony—it needs the data, efficiently encoded.&lt;/p&gt;
&lt;h3&gt;
  
  
  The Setup: Weather Agent with Function Calling
&lt;/h3&gt;

&lt;p&gt;I built a simple agent that calls a &lt;code&gt;get_current_weather&lt;/code&gt; function. The user asks for weather, the model calls the tool, the function returns data, and the model synthesizes a response.&lt;/p&gt;

&lt;p&gt;The critical moment: What format should &lt;code&gt;get_current_weather&lt;/code&gt; return?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Version A: JSON Tool Response&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;location&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;current&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;72 F&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;condition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sunny&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;forecast&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;forecast&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Returns JSON string
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Version B: TOON Tool Response&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;location&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;current&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;72 F&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;condition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sunny&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;forecast&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;forecast&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Returns TOON-encoded string
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Main code&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the weather like in New York? Share next 15 days forecast as well.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;get_current_weather&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
Resulting Token Usage (TOON Version):
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Initial prompt tokens:&lt;/strong&gt; 152 (user message + tool definition)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Tool response tokens (becomes input):&lt;/strong&gt; 480 ✅ (24% reduction vs JSON’s 632)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Model’s final output:&lt;/strong&gt; 384 (slightly longer, but reasonable)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Total tokens:&lt;/strong&gt; 1,016 ✅ (11.5% reduction overall)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why TOON Wins in Agentic Workflows
&lt;/h3&gt;

&lt;p&gt;Here’s the math that matters:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single Tool Call&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  JSON approach: 632 tokens for tool result&lt;/li&gt;
&lt;li&gt;  TOON approach: 480 tokens for tool result&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Savings: 152 tokens per tool call (24%)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Multi-Turn Agent (5 tool calls)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  JSON approach: 632 × 5 = 3,160 tokens in tool results&lt;/li&gt;
&lt;li&gt;  TOON approach: 480 × 5 = 2,400 tokens in tool results&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Savings: 760 tokens (24%)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
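
&lt;p&gt;A quick back-of-envelope calculator using these measured numbers:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;JSON_TOOL_RESULT = 632  # tokens per tool result, measured above
TOON_TOOL_RESULT = 480

def savings(tool_calls: int) -&gt; int:
    return (JSON_TOOL_RESULT - TOON_TOOL_RESULT) * tool_calls

for n in (1, 5, 20):
    print(f"{n} tool calls: {savings(n)} tokens saved")  # 152, 760, 3040
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;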

&lt;h3&gt;
  
  
  The Compounding Effect
&lt;/h3&gt;

&lt;p&gt;Why this matters more than single extractions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Tool results are pure input tokens&lt;/strong&gt; — You pay for them every single time&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Verbosity multiplies&lt;/strong&gt; — JSON’s &lt;code&gt;{}: ,&lt;/code&gt; syntax adds 20-30% overhead for nested data&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;No parsing penalty&lt;/strong&gt; — The model consumes TOON just as easily (we verified this in follow-up tests)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Scales with agent complexity&lt;/strong&gt; — More tools = more savings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The difference? Where the efficiency matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;After running these tests across four different scenarios, here’s what the data tells us:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TOON loses at single extractions.&lt;/strong&gt; Whether you’re doing manual prompting or using Pydantic models, JSON with SDK support is cleaner, cheaper, and more reliable. The 17.6% token savings from native schema integration beats TOON’s manual approach every time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But TOON wins where it counts for agents: tool calling workflows.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When your LLM’s output becomes the next prompt — when data cycles between model and functions repeatedly — TOON’s 24% reduction per tool call transforms from interesting to impactful. An agent making 20 tool calls saves 3,040 tokens per session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The decision matrix is simple:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Building a chatbot that extracts structured data? &lt;strong&gt;Use JSON + Pydantic.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; Building an agent that calls tools 10+ times per session? &lt;strong&gt;Test TOON.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; Building anything else? &lt;strong&gt;Profile first, optimize later.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;I’ve open-sourced all the experiments, prompts, and token measurements: &lt;a href="https://gist.github.com/the-m-u-s-h-r-o-o-m/080c9e697843339946850d5353e9343c" rel="noopener noreferrer"&gt;View complete code and results on GitHub Gist&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repository includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  ✅ All four experiment setups with actual prompts&lt;/li&gt;
&lt;li&gt;  ✅ Token usage logs for every test case&lt;/li&gt;
&lt;li&gt;  ✅ Side-by-side comparison scripts&lt;/li&gt;
&lt;li&gt;  ✅ The job descriptions I used for testing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;TOON isn’t magic — it’s math. And the math only works when token efficiency genuinely matters. For most applications, JSON’s ecosystem advantages outweigh the savings. But for token-heavy agentic workflows? TOON might just pay for itself.&lt;/p&gt;

&lt;p&gt;Now you have the data to decide.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>ai</category>
      <category>python</category>
      <category>gemini</category>
    </item>
  </channel>
</rss>
