Forem

# llm

Posts

DeepSeek-V4: Finally, a Context Window Built for Agents

2
Comments 2
2 min read
I built a client-side LLM token counter because I kept guessing at prompt costs

Comments 1
4 min read
AI Weekly: Voice Models, Custom Silicon, MCP Goes Enterprise (May 7–13, 2026)

Comments
16 min read
The AI Stack For 2026: LLMs, Vector Databases, Tool Calling, Agents, And Observability

6
Comments
7 min read
Four LLM Engines, One ClickHouse Cluster: An Agentic AI Architecture

Comments 1
23 min read
The AI system that worked in staging destroyed us in production. Here's what we missed.

Comments
4 min read
LLM Token Costs: Why Your Prompt Might Cost 10x More Than You Think

Comments 1
1 min read
Right Model, Right Time: Why Model Routing Is Becoming Core to GenAI Platforms

Comments 2
3 min read
RLHF trained Claude to be verbose. Here's the proof

Comments
5 min read
Microsoft Says 50% AI Code. Here's What That Actually Means for Engineers

Comments
3 min read
Google's I/O 2024 announcements just reset the AI developer stack

Comments
3 min read
How LumiClip Finds the Best Moments in Your Video and Reframes Them for Mobile

Comments
5 min read
Compass v1.1.0 · we shipped a memory plugin that catches its own consumption drift

Comments
5 min read
⚙️ NoLife Models - Toward a Local Infrastructure for AI Runtimes with Symfony

1
Comments
7 min read
Replay Production Call Transcripts Against Your Voice Agent's Current Graph

Comments
5 min read