Forem

# llm

Posts

Reducing LLM Cost and Latency Using Semantic Caching

Comments 3
5 min read
I caught Claude Sonnet 4 inventing facts about a fake tool

Comments
9 min read
Claude Designed Its Own Rule System — A Public Experiment

1
Comments 1
4 min read
Qwen3.5 Running Locally: Super Fast and with Great Quality

Comments
2 min read
The Great LLM Inference Engine Showdown: vLLM vs TGI vs TensorRT-LLM vs SGLang vs llama.cpp vs Ollama

Comments
10 min read
I Built a Benchmark That Proves Most LLM Agents Are Statistically Blind And Why That Costs Companies Real Money

Comments
3 min read
How We Use Gherkin, Envelopes, and Schemas to Shape Agent Behavior

Behavioral science over ignored rule lists

3
Comments 4
7 min read
The Evolution of Developer Tunnels: Bridging Local AI Experiments to the Cloud

Comments
9 min read
Running AI in the Browser with Gemma 4 (No API, No Server)

2
Comments 1
2 min read
JGuardrails: Production-Ready Safety Rails for Java LLM Applications

1
Comments
14 min read
I built a constitution for AI agents — budgets, permissions, and audits enforced before execution

1
Comments
2 min read
I turned OpenAI Symphony into a one-command local workflow for any repo

1
Comments
1 min read
The Model Isn't the Bottleneck — Your Prompt Structure Is

Comments
3 min read
When Proxies Become the Attack Vectors in Web Architectures

1
Comments
5 min read
From Chatting to Reading: Teaching Pebbles to See My Code

Comments
5 min read