Forem

# llm

Posts

How I Think About Reliability in LLM Applications

3
Comments 1
6 min read
Title: Why we built a P2P inference network instead of another AI API wrapper

Comments
2 min read
When the AI's memory explodes: context overflow and compaction failures in production

Comments
3 min read
Why the f*** does AI always use em dashes — the involuntary AI watermark

5
Comments 15
2 min read
SGLang vs vLLM: Which is Better for Your Needs in 2026?

Comments
5 min read
6 JavaScript Patterns That Turn LLM APIs Into Production AI Systems

Comments
4 min read
From 66% to 96%: How I Fixed a Drive-Thru Voice Agent Before It Took a Single Real Call

1
Comments
4 min read
Your MCP Agents Are Over-Privileged. Here's How to Fix It.

1
Comments
9 min read
Unleashing AI in Quantum Research: Why TensorCircuit-NG is the Ultimate Foundation for the Agent Era

1
Comments
3 min read
AI in machines: why the problem runs deeper than we think

3
Comments 2
3 min read
Stop Using JSON in Claude Prompts. I Tested 4 Formats — One Won by 30%.

Comments
17 min read
NVIDIA AI Releases Nemotron-Terminal: A Systematic Data Engineering Pipeline for Scaling LLM Terminal Agents

Comments
4 min read
Anthropic Built a 300K-Query Behavioral Auditing Tool Because Model Behavior Changes. Here's the Production Version.

Comments
4 min read
No GPU? No problem!, running local AI efficiently on my CPU.

Comments
5 min read
# Understanding RAPTOR: A Powerful Architecture for Hierarchical Retrieval in RAG Systems

1
Comments
6 min read