Forem

# llm

## Posts

- **The Real Cost of LLM Inference: Memory Bandwidth, Not FLOPs** · 3 min read
- **The MCP Maturity Model: Evaluating Your Multi-Agent Context Strategy** · 2 reactions · 1 comment · 20 min read
- **What You’re Getting Wrong When Building AI Applications in 2025** · 7 min read
- **Long Long Ago — The History of Generative AI** · 5 min read
- **DragonMemory: Neural Sequence Compression for Production RAG** · 3 reactions · 8 min read
- **Skills, MCPs, and Commands are the same context engineering trend.** · 2 reactions · 1 comment · 8 min read
- **Is the Monolith Dead? Introducing MQ-AGI: A Modular, Neuro-Symbolic Architecture for Scalable AI** · 1 reaction · 2 min read
- **The Scaling Arms Race Is Over - The Application Age Has Begun** · 7 min read
- **Running Local AI on Linux With GPU: Ollama + Open WebUI + Gemma** · 24 reactions · 1 comment · 4 min read
- **Shrinking Giants: A Word on Floating-Point Precision in LLM Domain for Faster, Cheaper Models** · 1 reaction · 2 comments · 8 min read
- **Building with LLMs at Scale: Part 3 - Higher-Level Abstractions** · 9 min read
- **Building with LLMs at Scale: Part 2 - Ergonomics and Observability** · 6 min read
- **I Let an LLM Write JavaScript Inside My AI Runtime. Here’s What Happened** · 6 reactions · 1 comment · 5 min read
- **Prompt Tracker: Turn Your Coding Sessions into a Star Wars Opening Crawl** · 8 min read
- **TOON: The Token Ninja** · 1 reaction · 3 min read