Forem

Machine Learning

A branch of artificial intelligence (AI) and computer science that uses data and algorithms to imitate the way humans learn, gradually improving in accuracy.

Posts

How to compress your LLM's KV cache by 33x without training

Comments
3 min read
One line of Python to extend your LLM's context window 10x

Comments
1 min read
KV cache memory calculator: how much does your LLM actually use?

Comments
3 min read
What the Hell is a Token?

1
Comments
4 min read
Stuart Russell's 2026 AI Update Rewrites the Rulebook

Comments
5 min read
How Much GPU Memory Does NexusQuant Actually Save?

Comments
4 min read
What I Learned Testing 12 Compression Approaches That Failed

Comments
6 min read
The Math Behind E8 Lattice Quantization (with Code)

Comments
6 min read
Six Characters Fixed My AI's Personality: A Fine-Tuning Story

Comments
4 min read
How to deploy NexusQuant in production (and what's missing)

Comments
4 min read
NexusQuant benchmarks: every number, honestly

Comments
5 min read
NexusQuant vs KVTC vs TurboQuant vs CommVQ — honest comparison

Comments
4 min read
I Built a Semantic Cache That Cuts LLM API Costs by 72% - What Actually Worked and What Didn't

Comments
6 min read
Building Privacy-Preserving Machine Learning: A Practical Guide to Federated Learning

2
Comments
4 min read
Longer contexts are easier to compress (not harder)

Comments
2 min read