Forem

# llm

Posts

Build an End-to-End RAG Pipeline for LLM Applications
12 min read
I Tested 6 Attacks on Multi-Agent Systems — Here's Which Ones Agents Can't See
4 min read
MoE Beat Dense 27B by 2.4x on 8GB VRAM — The 35B-A3B Benchmark Nobody Expected
5 min read
LLM Inference Optimization: Techniques That Actually Reduce Latency and Cost
9 min read
AI Weekly: 3/27–4/1 | Anthropic's Triple Shock, Arm's First-Ever Chip, Apple Opens Siri to Rivals
6 min read
Building a Real-Time Security Camera System with Local Vision LLMs
5 min read
Distributed LLM Inference Across NVIDIA Blackwell and Apple Silicon Over 10GbE
4 min read
Multi-Agent AI Systems: Architecture Patterns That Actually Work
7 min read
Agentic AI Fails in Production for Simple Reasons — What MLDS 2026 Taught Me
3 min read
TurboQuant and RaBitQ: What the Public Story Gets Wrong
7 min read
Cross-Model Persona Fidelity: Is Your AI Agent Still 'Them' on a Different LLM?
2 min read
From Prompting to Programming: Making LLM Outputs More Predictable with Structure
2 min read
Codex Fast Mode vs Claude Fast Mode: What’s Actually Different?
7 min read
I built a context engine that saves Claude Code 73% of its tokens on large codebases
2 min read
The Model Isn't the Bottleneck — Your Prompt Structure Is
3 min read