Forem

# rag

Retrieval-augmented generation (RAG) is an architectural approach that can improve large language model (LLM) applications by retrieving relevant custom data and supplying it to the model as context.

Posts

Why AI Video Feels Unreliable — and What Reference-to-Video Fixes
2 min read
Routing, Load Balancing, and Failover in LLM Systems
3 min read
Human-in-the-Loop Systems: Building AI That Knows When to Ask for Help
17 min read
Prompt -> RAG -> Eval: System Overview for LLM Engineers
3 min read
I Built a PDF Chat App in Under an Hour Using RAG- Here's How You Can Too
3 min read
Implementing Retrieval-Augmented Generation (RAG) with Real-World Constraints
3 min read
RAG Pipeline: How Retrieval-Augmented Generation Really Works in Production?
3 min read
Functional MCP AI System Diagram
1 min read
Engineers who explore build better AI products
2 min read
Why GenAI Observability Breaks in Production
2 min read
Launching your personal assistant
14 min read
Why RAG is the Future of Search (And How Elastic Search Makes it Possible)
4 min read
Before You Build a Client RAG/Agent: My Pre-Build Checklist (With Examples + What to Automate)
5 min read
Multi-Step Reasoning and Agentic Workflows: Building AI That Plans and Executes
16 min read
I made a fast, structured PDF extractor for RAG; 300 pages a second
3 min read
RAG for Developers — Built for Code, Not Just Text (Review Requested)
1 min read
Why Static Load Balancing Fails for LLM Infrastructure (And What Works Instead)
7 min read
Beyond RAG: Building an Autonomous "Epistemic Engine" to Fight AI Hallucination
2 min read
Building a RAG-Powered Documentation Assistant: Why I Used Bifrost LLM Gateway Instead of Direct API Calls
5 min read
Stop feeding garbage to your LLM: How to get clean Markdown from Documentation
1 min read
My hands-on experience with Qdrant and Docling (and Ollama)
11 min read
RAG-Augmented Agile Story Generation: An Architectural Framework for LLM-Powered Backlog Automation
8 min read
Building a Simple RAG System Using FAISS
3 min read
Reranking and Two-Stage Retrieval: Precision When It Matters Most
2 min read
LLMs Hallucinate. RAG Fixes That — Here’s How We Built a Reliable Healthcare AI
3 min read