Forem

# llm

Posts

The Squeezing Effect: Why Your Aligned AI Model Gets Worse

Comments
3 min read
A Guide to HITL, HOTL, and HOOTL Workflows

1
Comments
3 min read
How to run Cisco Foundation-Sec-8B on Colab for FREE

Comments
3 min read
Optimal Chunking for Ontology RAG: Empirical Analysis & Orphan Axiom Problem

Comments
12 min read
Developers Love Tools. AI Needs Better Instructions.

4
Comments
3 min read
How to Build Multi-Provider Failover Strategies with Bifrost for Ultra‑Reliable AI Applications

5
Comments
8 min read
Semantic Caching with Bifrost: Reduce LLM Costs and Latency by Up to 70%

Comments
7 min read
Dec 19, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab

Comments
5 min read
The Art of Context Windows: Our AI Had Alzheimer's: Here's How We Taught It To Remember

3
Comments
9 min read
📌 Most models use Grouped Query Attention. That doesn’t mean yours should.📌

1
Comments
1 min read
Mooncake Memory Deep Dive: KVCache, Token Cost, DRAM Usage, and Saturation Analysis

Comments
5 min read
Part 1: Why Transformers Still Forget

Comments
5 min read
How to Use Synthetic Data to Evaluate LLM Prompts: A Step-by-Step Guide

Comments
8 min read
I Didn’t Build a Chatbot — I Built an AI That Runs the System

Comments
2 min read
Architecting LLM Reliability for Compliance Workflows

3
Comments
6 min read