Forem

# llm

Posts

Prompt Caching Works. Your Prompt Assembly Code Does Not.

4 min read
Opus 4.7 vs GLM 5.1: is mixing models worth it?

13 min read
Upgrading Kiwi-chan’s Brain: Pushing a 30GB "Frankenstein" GPU Rig to the Limit with Qwen 3.6-35B-A3B

4 min read
Mistral Medium 3.5 GGUF, FlashQLA Boost for Qwen, & Ollama Playground

3 min read
When the Reranker Hurts: Recall@5 Cases Where Two-Stage Retrieval Loses to One

7 min read
Why Strict JSON Mode Doesn't Stop Hallucinated Tool Calls

7 min read
Every LLM Eval Library Has the Same Bug: Stochastic Judges Used as Deterministic Oracles

7 min read
Local AI Accessibility, JetBrains’ 2026 IDE Plans, and Agentic Architecture Pitfalls

2 min read
Announcing Cliche

3 min read
Anthropic Prompt Caching Saves 90% - Here's the One Caveat Nobody Mentions

7 min read
Why I Built an AI That Tries to Destroy Your Legal Argument

11 min read
Building an AI Agent That Owns Post-Call Execution: Architecture Decisions

6 min read
Tokenizer Quirks: Claude, GPT, and Gemini Don't Count the Same Text the Same Way

6 min read
The Hidden Tax of Structured Output: How Much Extra You Pay for JSON Mode

7 min read
When 'Take a Deep Breath' Stopped Working: Prompt Tricks With an Expiry Date

7 min read