Forem

# llm

Posts

Addressing LLM Benchmarking Obsolescence: Strategies for Timely and Relevant Model Evaluation

1
Comments
13 min read
The "Always" Trap: Why Your AI Ignores Nuance (And How to Fix It)

1
Comments
7 min read
AI Personalization_Blog Post

Comments
3 min read
I Built an LLM Gateway That Learns Which Model to Use — Here's How the Routing Works

1
Comments
5 min read
GPT-5.4 Is Here, Cursor Just Got Agentic, and Open-Source LLMs Are Winning — Here's What's Happening in AI Right Now

Comments
3 min read
I shipped an LLM feature, got 11 users, then the model silently changed on me. Here's what I built to stop it happening again.

Comments
3 min read
Disassembling AI Agents - Part 2: Claude Code

Comments
15 min read
🏗️ 📐 Harness Engineering: The Emerging Discipline of Making AI Agents Reliable 🤖

10
Comments 2
20 min read
LLM vs RAG

1
Comments 1
1 min read
How to Add LLM Drift Monitoring to Your CI/CD Pipeline (Free, 5 Minutes)

Comments
3 min read
Is GitHub Copilot open source or proprietary?

1
Comments
7 min read
Gemini 1.5 Pro Also Drifts: Known Regression Patterns and How to Monitor Them

1
Comments
3 min read
LLM Wiki: I Set Up Karpathy's Local Knowledge Base — Here's What Actually Works [2026 Guide]

1
Comments
7 min read
I Found a 0.575 Drift Score Between Two Consecutive LLM Runs. Here's Exactly What Changed.

Comments
3 min read
My LLM Started Lying to My App and I Didn't Notice for Three Days

Comments
4 min read