Forem

# llm

Posts

Building LLMs for Bharat: What 6 Months of Rural AI Deployment Taught Us
4 min read
GPU cloud servers for AI workloads: how to choose the right instance and deploy without waste
15 min read
Qwen 3.6, llama.cpp Speculative Decoding, Deepseek TileKernels for Local AI on Consumer GPUs
3 min read
I built a new file format to cut AI token costs by 70% — here's how it works
5 min read
Doby: How I Cut Claude Code's Navigation Tokens by 95% with a Spec-First Workflow
1 min read
I evaluated the leaked system prompts of the biggest AI coding tools. Here's what I found.
4 min read
LocalForge: I built a self-hosted LLM control plane with intelligent routing and LoRA finetuning
2 min read
48 Hours After Publishing: Second-Order Injection Field Notes
2 min read
The Actual Cost of Self-Hosting Your LLM (Nobody Does This Math First)
4 min read
A Minimal ~9M Parameter Transformer LLM Trained from Scratch
2 min read
LLM Observability tool
1 min read
AI Duel on Building Retro RPG Quest Journal
3 min read
Qwen3.6-Plus Benchmark: It Is Trying to Finish the Job, Not Just Win Chat Scores
5 min read
Context Compression and Persistent Memory Design for Terminal AI Assistants
7 min read
qwen3.6-27b scores 77.2% on SWE-bench. the dense model is winning against MoE.
3 min read