LLM Page 170

Pooya Golchian

Apr 7

Local AI in 2026: Ollama Benchmarks, $0 Inference, and the End of Per-Token Pricing

#ai #ollama #llm #localai

6 min read

Aaryan Shukla

Apr 7

Google's Gemma 4 Explained: The Open-Source Agent Toolkit We've Been Waiting For

#agents #google #llm #opensource

3 min read

João André Gomes Marques

Apr 7

The 12 approaches I tested before finding one that works

#machinelearning #python #ai #llm

5 min read

Ceyhun Aksan

Mar 27

I Benchmarked 5 File Editing Strategies for AI Coding Agents. Here's What Actually Works.

#ai #llm #productivity #developertools

2 min read

Uses all six Claude Code hooks

Mike Dolan

Apr 2

How I Built Persistent Memory for Claude Code

#showdev #ai #llm #productivity

9 min read

Moon Robert

Mar 8

RAG in the Wild: What I Learned After Two Weeks of Chunking Experiments

#rag #vectordatabases #llm #embeddings

7 min read

Daniel R. Foster for OptyxStack

Mar 8

How to Reduce OpenAI Bill Without Hurting Quality: A Practical Audit Framework

#ai #openai #llm #softwareengineering

6 min read

João André Gomes Marques

Apr 7

Running 1M-token context on a single GPU (the math)

#ai #gpu #llm #infrastructure

2 min read

Aaryan Shukla

Mar 4

I Read a Paper That Genuinely Made Me Stop and Think — AI is Now Jailbreaking Other AI

#discuss #ai #llm #machinelearning

3 min read

João André Gomes Marques

Apr 7

One line of Python to extend your LLM's context window 10x

#python #machinelearning #ai #llm

1 min read

João André Gomes Marques

Apr 7

KV cache memory calculator: how much does your LLM actually use?

#llm #machinelearning #python #gpu

3 min read

Zafer Dace

Apr 7

Build Your Own AI-Powered Knowledge Base with LLMs and Obsidian

#ai #llm #productivity #tutorial

6 min read

João André Gomes Marques

Apr 7

How Much GPU Memory Does NexusQuant Actually Save?

#machinelearning #gpu #llm #python

4 min read

João André Gomes Marques

Apr 7

What I Learned Testing 12 Compression Approaches That Failed

#machinelearning #llm #research #python

6 min read

João André Gomes Marques

Apr 7

The Math Behind E8 Lattice Quantization (with Code)

#machinelearning #math #python #llm

6 min read
