<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: RoTSL</title>
    <description>The latest articles on Forem by RoTSL (@rotsl).</description>
    <link>https://forem.com/rotsl</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818774%2F3a43cf64-9ded-407d-829e-4555f203a82e.png</url>
      <title>Forem: RoTSL</title>
      <link>https://forem.com/rotsl</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/rotsl"/>
    <language>en</language>
    <item>
      <title>Health AI on Notion with Tribe V2</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Thu, 02 Apr 2026 15:16:29 +0000</pubDate>
      <link>https://forem.com/rotsl/health-ai-on-notion-with-tribe-v2-2g1j</link>
      <guid>https://forem.com/rotsl/health-ai-on-notion-with-tribe-v2-2g1j</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Local-first Notion health tracker with TRIBEv2 brain analysis, AI health insights, symptom logging, goals, medications, appointments, and a browser UI&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://dev.to/challenges/notion-2026-03-04"&gt;Notion MCP Challenge&lt;/a&gt;*&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqnrnhqd08yxmvtt7zhef.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqnrnhqd08yxmvtt7zhef.jpeg" alt="image" width="800" height="504"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This was supposed to be a Notion challenge submission.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/NJIflkjwPsM"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;I built most of it close to the deadline, got something working, and then missed the window. No big failure story. Just underestimated how long the messy parts would take.&lt;/p&gt;

&lt;p&gt;After that, keeping it private felt pointless. So I pushed it to GitHub.&lt;/p&gt;

&lt;p&gt;Around the same time, I came across &lt;strong&gt;Tribe v2&lt;/strong&gt;. That changed how I looked at this project. Instead of treating it like a failed submission, I started treating it like something that could keep evolving in public.&lt;/p&gt;

&lt;p&gt;That is what this is now. Not finished. Still useful.&lt;/p&gt;

&lt;h4&gt;
  
  
  The actual problem I was trying to solve
&lt;/h4&gt;

&lt;p&gt;I sometimes already track things in Notion:&lt;/p&gt;

&lt;p&gt;• Sleep&lt;/p&gt;

&lt;p&gt;• Workouts&lt;/p&gt;

&lt;p&gt;• Random notes about how I feel&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The problem is not tracking. It is what happens after.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nothing.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No aggregation. No patterns. No feedback loop. Just logs sitting there.&lt;/p&gt;

&lt;p&gt;Every week I would think I should look at it properly. I never did.&lt;/p&gt;

&lt;p&gt;So this project is basically me outsourcing that thinking step.&lt;/p&gt;

&lt;h4&gt;
  
  
  System design
&lt;/h4&gt;

&lt;p&gt;The architecture is simple on paper and annoying in practice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Fetch data from Notion databases&lt;/p&gt;

&lt;p&gt;• Normalize it into a consistent structure&lt;/p&gt;

&lt;p&gt;• Send it to an LLM&lt;/p&gt;

&lt;p&gt;• Write the output back into Notion&lt;/p&gt;

&lt;p&gt;That is it. No fancy orchestration.&lt;/p&gt;

&lt;p&gt;The difficulty is everything in between.&lt;/p&gt;
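In code, the whole loop is short. This is only a sketch with stand-in functions; the names are placeholders, and the real version calls the Notion API and an LLM client instead.

```python
# Sketch of the pipeline. fetch_rows, normalize_entry, generate_insight,
# and write_back are hypothetical stand-ins, not the repo's real functions.

def run_pipeline(fetch_rows, normalize_entry, generate_insight, write_back):
    raw_rows = fetch_rows()                            # 1. fetch from Notion
    entries = [normalize_entry(r) for r in raw_rows]   # 2. normalize each row
    entries = [e for e in entries if e]                # drop rows that failed to parse
    insight = generate_insight(entries)                # 3. one LLM call per run
    write_back(insight)                                # 4. write the result back
    return insight
```

Everything interesting happens inside those four stand-ins.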

&lt;h4&gt;
  
  
  Notion is not a real database
&lt;/h4&gt;

&lt;p&gt;At first glance, Notion feels structured. It is not.&lt;/p&gt;

&lt;p&gt;Things that break over time:&lt;/p&gt;

&lt;p&gt;• Property names change&lt;/p&gt;

&lt;p&gt;• Data types shift&lt;/p&gt;

&lt;p&gt;• Fields get added or removed&lt;/p&gt;

&lt;p&gt;If you build with fixed schemas, your system breaks quietly.&lt;/p&gt;

&lt;h4&gt;
  
  
  What I did instead
&lt;/h4&gt;

&lt;p&gt;I treated Notion as semi-structured data:&lt;/p&gt;

&lt;p&gt;• Map fields dynamically instead of hardcoding&lt;/p&gt;

&lt;p&gt;• Use fallback parsing when fields do not match&lt;/p&gt;

&lt;p&gt;• Normalize everything into an internal schema&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example internal format:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026–03–20"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"sleep_hours"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;6.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"workout"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"strength"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mood"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"low"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No matter how messy the source is, the model only sees this cleaned version.&lt;/p&gt;

&lt;h4&gt;
  
  
  Data normalization is the real system
&lt;/h4&gt;

&lt;p&gt;Most of the work went here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Steps&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Extract raw values from Notion API&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Convert them into usable types&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Handle missing or inconsistent fields&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Align everything by time&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Examples:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;• "6 hrs" becomes 6.0
 • Empty fields get dropped from inference
 • Mixed labels get standardized
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this layer is weak, everything downstream gets worse.&lt;/p&gt;
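The conversion rules look roughly like this in Python. A sketch only; the field names and the label map are made up, not the repo's actual parser.

```python
import re

# Hypothetical normalization helpers. The Notion field names ("Date",
# "Sleep", "Workout", "Mood") and LABEL_MAP values are illustrative.

LABEL_MAP = {"str": "strength", "strength training": "strength", "cardio": "cardio"}

def parse_hours(raw):
    """Turn values like '6 hrs', '6.5', or '6h' into a float, or None."""
    if raw is None:
        return None
    match = re.search(r"\d+(?:\.\d+)?", str(raw))
    return float(match.group()) if match else None

def normalize_entry(raw):
    """Map a messy Notion row into the internal schema; drop empty fields."""
    entry = {
        "date": raw.get("Date") or raw.get("date"),
        "sleep_hours": parse_hours(raw.get("Sleep")),
        "workout": LABEL_MAP.get(str(raw.get("Workout", "")).strip().lower()),
        "mood": (raw.get("Mood") or "").strip().lower() or None,
    }
    return {k: v for k, v in entry.items() if v is not None}
```

Dropping missing fields here is what lets inference skip them later instead of guessing.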

&lt;h4&gt;
  
  
  LLM layer
&lt;/h4&gt;

&lt;p&gt;The model is not used as a general assistant.&lt;/p&gt;

&lt;p&gt;It has a narrow job:&lt;/p&gt;

&lt;p&gt;• Summarize recent data&lt;/p&gt;

&lt;p&gt;• Spot simple patterns&lt;/p&gt;

&lt;p&gt;• Suggest small adjustments&lt;/p&gt;

&lt;h4&gt;
  
  
  Input structure
&lt;/h4&gt;

&lt;p&gt;Each run includes:&lt;/p&gt;

&lt;p&gt;• Recent data window&lt;/p&gt;

&lt;p&gt;• Aggregated values&lt;/p&gt;

&lt;p&gt;• Instructions that limit scope&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Sleep: [6, 5.5, 7, 6]
Workout: [yes, no, yes, yes]
Mood: [low, medium, medium, high]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Task:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Identify patterns&lt;/p&gt;

&lt;p&gt;• Avoid assumptions without enough data&lt;/p&gt;

&lt;p&gt;• State uncertainty clearly&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The main issue: the model guesses.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Even with weak data, it tries to sound confident.&lt;/p&gt;

&lt;p&gt;That is a problem, especially for anything health related.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I added&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Minimum data thresholds before running inference&lt;/p&gt;

&lt;p&gt;• Prompts that force uncertainty&lt;/p&gt;

&lt;p&gt;• Restrictions on long term claims&lt;/p&gt;

&lt;p&gt;• Filtering outputs that sound too certain&lt;/p&gt;

&lt;p&gt;It still makes mistakes. It just makes fewer confident ones.&lt;/p&gt;
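Two of those guardrails are easy to sketch: a minimum-data gate before inference, and a filter on outputs that sound too certain. The threshold and the phrase list here are illustrative, not the project's actual values.

```python
# Illustrative guardrails. MIN_DAYS and OVERCONFIDENT are made-up values,
# not the repo's real configuration.

MIN_DAYS = 4

OVERCONFIDENT = ("definitely", "will cure", "guaranteed", "always causes")

def enough_data(entries, field):
    """Only run inference when the field appears in at least MIN_DAYS entries."""
    count = sum(1 for e in entries if e.get(field) is not None)
    return count >= MIN_DAYS

def filter_insight(text):
    """Reject model output that sounds too certain for the data it saw."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in OVERCONFIDENT):
        return None
    return text
```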

&lt;h4&gt;
  
  
  &lt;strong&gt;Writing results back to Notion&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Outputs are stored as:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Daily summaries&lt;/p&gt;

&lt;p&gt;• Weekly insights&lt;/p&gt;

&lt;p&gt;• Separate logs for traceability&lt;/p&gt;

&lt;p&gt;Each output includes:&lt;/p&gt;

&lt;p&gt;• Timestamp&lt;/p&gt;

&lt;p&gt;• Data window used&lt;/p&gt;

&lt;p&gt;• Generated insight&lt;/p&gt;

&lt;p&gt;This makes it easier to debug and iterate.&lt;/p&gt;
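A stored output record might look like this. The property names are assumptions for illustration, not the actual Notion schema.

```python
from datetime import datetime, timezone

# Hypothetical shape of one stored insight; the keys are illustrative.

def build_output_record(insight, window_start, window_end):
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "data_window": {"start": window_start, "end": window_end},
        "insight": insight,
    }
```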

&lt;h4&gt;
  
  
  Why I stayed inside Notion
&lt;/h4&gt;

&lt;p&gt;I considered building a separate app.&lt;/p&gt;

&lt;p&gt;That would solve a lot of problems:&lt;/p&gt;

&lt;p&gt;• Cleaner schema&lt;/p&gt;

&lt;p&gt;• Better validation&lt;/p&gt;

&lt;p&gt;• Fewer edge cases&lt;/p&gt;

&lt;p&gt;But nobody wants another health app.&lt;/p&gt;

&lt;p&gt;Notion already has the data. So I built on top of it instead.&lt;/p&gt;

&lt;p&gt;The tradeoff is dealing with inconsistency.&lt;/p&gt;

&lt;h4&gt;
  
  
  Influence from Tribe v2
&lt;/h4&gt;

&lt;p&gt;This project shifted direction after I came across Tribe v2.&lt;/p&gt;

&lt;p&gt;The main idea that stuck:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You do not wait until something feels ready.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You ship it. Then improve it in the open.&lt;/p&gt;

&lt;p&gt;That is exactly what this repo reflects. Some parts are solid. Some are clearly not. That is fine.&lt;/p&gt;

&lt;h4&gt;
  
  
  What is still broken
&lt;/h4&gt;

&lt;p&gt;A few things are still rough:&lt;/p&gt;

&lt;p&gt;• Sparse data leads to weak outputs&lt;/p&gt;

&lt;p&gt;• The model confuses correlation with causation&lt;/p&gt;

&lt;p&gt;• Some insights sound better than they are&lt;/p&gt;

&lt;p&gt;• No feedback loop yet to measure usefulness&lt;/p&gt;

&lt;p&gt;The system works. It just does not always matter.&lt;/p&gt;

&lt;h4&gt;
  
  
  What I would change
&lt;/h4&gt;

&lt;p&gt;If I were to rebuild this:&lt;/p&gt;

&lt;p&gt;• Define a stricter schema earlier&lt;/p&gt;

&lt;p&gt;• Separate ingestion and AI layers properly&lt;/p&gt;

&lt;p&gt;• Add better logging from day one&lt;/p&gt;

&lt;p&gt;• Focus more on actionable insights, not just observations&lt;/p&gt;

&lt;h4&gt;
  
  
  Where this could go
&lt;/h4&gt;

&lt;p&gt;A few directions that feel real:&lt;/p&gt;

&lt;p&gt;• Long term memory instead of short windows&lt;/p&gt;

&lt;p&gt;• Feedback loops to track if suggestions help&lt;/p&gt;

&lt;p&gt;• Wearable integrations&lt;/p&gt;

&lt;p&gt;• Confidence scoring for outputs&lt;/p&gt;

&lt;p&gt;Or it might just stay like this. A small layer that makes Notion slightly smarter.&lt;/p&gt;

&lt;h4&gt;
  
  
  Closing
&lt;/h4&gt;

&lt;p&gt;Missing the deadline changed the trajectory of this project.&lt;/p&gt;

&lt;p&gt;If I had submitted it, I probably would have moved on.&lt;/p&gt;

&lt;p&gt;Instead, it is now something I can keep improving without pretending it is finished.&lt;/p&gt;

&lt;p&gt;Right now, it is useful enough to keep using.&lt;/p&gt;

&lt;p&gt;That is enough.&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/rotsl/notion-Health-AI" rel="noopener noreferrer"&gt;https://github.com/rotsl/notion-Health-AI&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>tribev2</category>
      <category>notion</category>
      <category>metaai</category>
    </item>
    <item>
      <title>☕ Pot.OF — AI-Powered HTCPCP Coffee Pot</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Thu, 02 Apr 2026 09:03:18 +0000</pubDate>
      <link>https://forem.com/rotsl/potof-ai-powered-htcpcp-coffee-pot-2cf8</link>
      <guid>https://forem.com/rotsl/potof-ai-powered-htcpcp-coffee-pot-2cf8</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/aprilfools-2026"&gt;DEV April Fools Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Pot.OF is a playful HTCPCP/1.0 coffee pot simulator inspired by RFC 2324. It includes an interactive terminal, a full &lt;code&gt;418 I'm a Teapot&lt;/code&gt; tea-rejection flow, decaf kernel panic mode, and three optional AI features powered by Google Gemini: an AI Coffee Therapist, an AI Brew Critic, and an AI RFC Generator.&lt;/p&gt;

&lt;p&gt;It solves no real problems, but it does let users argue with a coffee pot that has strong opinions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;Deployed app: &lt;a href="https://pot-of-pj1j.vercel.app/" rel="noopener noreferrer"&gt;pot-of&lt;/a&gt;&lt;br&gt;
Video demo: &lt;a href="https://youtu.be/c_fYiGmoDxk?si=EML1dyDcmtQn1lIq" rel="noopener noreferrer"&gt;YouTube&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;Built with Next.js 16, TypeScript, Tailwind CSS 4, shadcn/ui, Framer Motion, Zustand, Prisma, and Google Gemini.&lt;/p&gt;

&lt;p&gt;Repo link: &lt;a href="https://github.com/rotsl/pot.of" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Built It
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Built the app as a Next.js 16 App Router project with a single interactive coffee-pot interface and dedicated API routes for both protocol behavior and AI features.&lt;/li&gt;
&lt;li&gt;Implemented 3 Gemini-powered AI endpoints:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/api/htcpcp/ai-therapist&lt;/code&gt; — a sentient coffee pot therapist with a consistent personality, multi-turn chat, and coffee-themed advice&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/api/htcpcp/ai-critic&lt;/code&gt; — a dramatic coffee snob that generates absurd tasting notes and scores&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/api/htcpcp/ai-rfc&lt;/code&gt; — an RFC-style generator that creates fake HTCPCP protocol extensions with realistic formatting&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Added a bring-your-own-key flow in the GUI so users can paste their own Gemini API key locally to unlock AI features without requiring a deployment-wide secret&lt;/li&gt;

&lt;li&gt;Built 8 total API routes:

&lt;ul&gt;
&lt;li&gt;5 HTCPCP-inspired core routes for brewing, status, RFC display, teapot mode, and timing&lt;/li&gt;
&lt;li&gt;3 AI routes for therapist, critic, and RFC generation&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Added personality-driven UI behavior including pot moods like &lt;code&gt;idle&lt;/code&gt;, &lt;code&gt;brewing&lt;/code&gt;, &lt;code&gt;happy&lt;/code&gt;, &lt;code&gt;offended&lt;/code&gt;, &lt;code&gt;existential&lt;/code&gt;, and &lt;code&gt;decaf-panic&lt;/code&gt;
&lt;/li&gt;

&lt;li&gt;Implemented joke protocol interactions including:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;BREW tea&lt;/code&gt; -&amp;gt; full-screen &lt;code&gt;418 I'm a Teapot&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;BREW decaf&lt;/code&gt; -&amp;gt; fake kernel panic&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;RFC&lt;/code&gt;, &lt;code&gt;STATUS&lt;/code&gt;, &lt;code&gt;WHEN&lt;/code&gt;, &lt;code&gt;PROPFIND&lt;/code&gt;, and other terminal commands&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Used three generated visual assets for the coffee pot mascot, teapot artwork, and coffee cup imagery&lt;/li&gt;

&lt;li&gt;Deployed it as a Vercel-friendly app with the AI key supplied by each user in the interface instead of hardcoding a shared secret&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prize Category
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best Google AI Usage&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The app uses Google Gemini across three distinct feature types: conversational AI through the therapist, creative generation through the brew critic, and structured document generation through the RFC generator. AI is not a side widget here; it is part of the product’s personality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Ode to Larry Masinter&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The project is built around RFC 2324, including the legendary &lt;code&gt;418 I'm a Teapot&lt;/code&gt;, HTCPCP-style commands, and a coffee pot that takes the protocol far too seriously.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>418challenge</category>
      <category>showdev</category>
    </item>
    <item>
      <title>From Kidney Stones to Convergence</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Sat, 28 Mar 2026 08:16:21 +0000</pubDate>
      <link>https://forem.com/rotsl/from-kidney-stones-to-convergence-gno</link>
      <guid>https://forem.com/rotsl/from-kidney-stones-to-convergence-gno</guid>
      <description>&lt;h4&gt;
  
  
  The strange path from ultrasound physics to rethinking how solvers move through space
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1y4owbcuuxn8awya5g7n.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1y4owbcuuxn8awya5g7n.jpeg" alt="image1" width="392" height="584"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I didn’t expect this to start with kidney stones, but that’s honestly where it began.&lt;/p&gt;

&lt;p&gt;I was reading about ultrasound lithotripsy, how they break stones using focused waves, and I got stuck on the geometry of it. Ellipses, focal points, energy landing exactly where it needs to.&lt;/p&gt;

&lt;p&gt;It is one of those cases where physics feels less like equations and more like choreography.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;That idea just sat there for a while.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Then, separately, I was dealing with solver code. Big systems, messy residuals, the usual “why is this not converging” loop. At some point I stopped thinking in terms of matrices. The system started to feel like a place.&lt;/p&gt;

&lt;p&gt;Some parts resisted everything, like trying to push something heavy across rough ground. Other parts moved too easily and felt unstable. Residuals stopped feeling abstract and started feeling like forces pushing things out of balance.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;That is roughly where PICD came from.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;PICD does not try to replace anything. It wraps what already works.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;GMRES, CG, Newton–Krylov, BDF. They still do the actual solving. PICD just watches what is happening and keeps some memory: residual history, how the system is partitioned, how different parts relate to each other.&lt;/p&gt;

&lt;p&gt;Then it adjusts the setup for the next solve. Preconditioners, damping, small corrections. Carefully.&lt;/p&gt;

&lt;p&gt;There is a hard boundary it does not cross. If a step does not reduce the residual, it does not count. The usual acceptance rules still apply.&lt;/p&gt;

&lt;p&gt;The “conic” part is just how the system gets split up.&lt;/p&gt;

&lt;p&gt;Instead of one big vector, you break it into regions. Each one tracks its own behavior. Its residual pattern, its neighbors, what worked last time.&lt;/p&gt;

&lt;p&gt;It sounds heavier than it feels. In practice it just gives the solver a bit of context it did not have before.&lt;/p&gt;

&lt;p&gt;The unusual part is treating those regions like they have physical properties.&lt;/p&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9wqs1bp5sz188wl1dym.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9wqs1bp5sz188wl1dym.jpeg" alt="formulae" width="384" height="108"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Underneath all that is a graph.&lt;/p&gt;

&lt;p&gt;Connections between regions depend on how similar their residuals are, how often they activate together, and the actual structure of the problem. From that you get a Laplacian:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;L = D -W
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It does not replace the solver. It just helps decide what should be grouped together and what should be prioritized.&lt;/p&gt;
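In Python, the construction is two lines. The weight matrix here is invented, just to show the shape.

```python
import numpy as np

# Region-to-region similarity weights for three made-up regions.
# The numbers are illustrative, not from any real PICD run.
W = np.array([
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.3],
    [0.1, 0.3, 0.0],
])

D = np.diag(W.sum(axis=1))   # degree matrix: total connection weight per region
L = D - W                    # graph Laplacian

# Sanity check: every row of a graph Laplacian sums to zero.
```

Spectral properties of L are what drive the grouping and prioritization decisions.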

&lt;p&gt;The solve loop itself is pretty normal:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Pick a solver, partition, build state, adjust preconditioner, run, accept or reject, update.&lt;/p&gt;
&lt;/blockquote&gt;
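Sketched with stand-in callables, that loop looks like this. run_inner stands for whatever wrapped solver does the actual work; adjust and update stand for PICD's preconditioner tweaks and state bookkeeping. None of the names are from the real codebase.

```python
# Sketch of the outer loop. run_inner, adjust, and update are hypothetical
# stand-ins for the wrapped solver and PICD's bookkeeping.

def picd_outer_loop(run_inner, adjust, update, state, residual,
                    max_iters=50, tol=1e-8):
    for _ in range(max_iters):
        setup = adjust(state)                  # reshape preconditioner, damping, etc.
        candidate, new_residual = run_inner(setup)
        if new_residual >= residual:
            continue                           # hard boundary: non-improving steps do not count
        residual = new_residual                # step accepted
        state = update(state, candidate, residual)
        if tol >= residual:                    # converged: residual at or below tolerance
            break
    return state, residual
```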

&lt;p&gt;The results are interesting.&lt;/p&gt;

&lt;p&gt;Everything in the current validation set runs. 98 tests, 22 examples.&lt;/p&gt;

&lt;p&gt;On direct comparisons, same solver with and without PICD, the PICD version is faster in the published benchmark set and uses less memory there as well.&lt;/p&gt;

&lt;p&gt;Linear problems stand out the most. Most cases improve, sometimes by a lot. There is a Helmholtz example that runs hundreds of times faster.&lt;/p&gt;

&lt;p&gt;Nonlinear and time-dependent cases are less clean. Some improve. Some do not. There is a turbulence example that clearly gets worse, with more rejected steps and slower runtime.&lt;/p&gt;

&lt;p&gt;That part I trust more than the wins.&lt;/p&gt;

&lt;p&gt;If there is one thing I would keep in mind, it is that PICD is deliberately limited in what it claims.&lt;/p&gt;

&lt;p&gt;It works well in same-method comparisons. Beyond that, it depends. It does not assume every physics-inspired term helps, and the controller can reduce or disable them when they start hurting convergence.&lt;/p&gt;

&lt;p&gt;I still come back to that original picture of energy being guided instead of forced.&lt;/p&gt;

&lt;p&gt;That is really what this is. Instead of brute-forcing convergence, you reshape the space a little so the solver has an easier path.&lt;/p&gt;

&lt;p&gt;But &lt;strong&gt;it changes how you think about the problem&lt;/strong&gt;. And for me, that shift was the interesting part.&lt;/p&gt;

&lt;p&gt;Read more in my research here, and cite it if you find it useful: &lt;a href="https://doi.org/10.13140/RG.2.2.10721.06243" rel="noopener noreferrer"&gt;https://doi.org/10.13140/RG.2.2.10721.06243&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mathematics</category>
      <category>researchmethods</category>
      <category>algorithms</category>
      <category>physics</category>
    </item>
    <item>
      <title>Your LLM prompts are probably wasting 90% of tokens. Here’s how I fixed mine.</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Sun, 22 Mar 2026 13:04:10 +0000</pubDate>
      <link>https://forem.com/rotsl/your-llm-prompts-are-probably-wasting-90-of-tokens-heres-how-i-fixed-mine-1hg0</link>
      <guid>https://forem.com/rotsl/your-llm-prompts-are-probably-wasting-90-of-tokens-heres-how-i-fixed-mine-1hg0</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frlt9s712wk7xfidv3oz1.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frlt9s712wk7xfidv3oz1.jpeg" alt="Tokens in LLM" width="582" height="327"&gt;&lt;/a&gt;&lt;br&gt;
I keep running into the same problem with LLM apps.&lt;/p&gt;

&lt;p&gt;This work is based on my previous article on dev.to &lt;a href="https://dev.to/rotsl/contextfusion-the-context-brain-your-llm-apps-are-missing-2gkm"&gt;https://dev.to/rotsl/contextfusion-the-context-brain-your-llm-apps-are-missing-2gkm&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You build a retrieval pipeline, hook it up to an API, and then quietly ship prompts that are full of stuff the model doesn’t need. Extra chunks. Duplicates. Half-relevant context that just bloats everything.&lt;/p&gt;

&lt;p&gt;And you pay for all of it.&lt;/p&gt;

&lt;p&gt;CFAdv is basically an attempt to stop doing that.&lt;/p&gt;

&lt;p&gt;It builds on context-fusion, but adds something that turns out to matter more than I expected: even if you pick the right context, you can still mess it up by putting it in the wrong place.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Most pipelines are still doing this&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let’s be honest about the default pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s it.&lt;/p&gt;

&lt;p&gt;No budget. No filtering beyond retrieval. No thought about ordering.&lt;/p&gt;

&lt;p&gt;More context is assumed to be better. It often isn’t.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;CFAdv splits the problem in two&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of one “context step”, it does two separate things:&lt;br&gt;
    1.  Decide what gets in&lt;br&gt;
    2.  Decide where it goes&lt;/p&gt;

&lt;p&gt;That separation is the whole point.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Step 1: selecting context under a budget&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of top-k, CFAdv treats selection like an optimization problem.&lt;/p&gt;

&lt;p&gt;Each chunk gets a score based on things like:&lt;br&gt;
    • relevance&lt;br&gt;
    • trust&lt;br&gt;
    • freshness&lt;br&gt;
    • diversity&lt;br&gt;
    • token cost&lt;/p&gt;

&lt;p&gt;Then it tries to pick the best combination under a fixed token budget.&lt;/p&gt;

&lt;p&gt;At a high level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;utility&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="mf"&gt;0.25&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relevance&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.20&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trust&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;freshness&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;structure&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;diversity&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;risk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="mf"&gt;0.40&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hallucination&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.35&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;staleness&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.25&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;privacy&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;utility&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;risk&lt;/span&gt;

&lt;span class="n"&gt;Then&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="n"&gt;by&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="n"&gt;density&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="n"&gt;density&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And greedily pack until you hit the budget.&lt;/p&gt;
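The packing step is only a few lines. A sketch, assuming each chunk carries a token count and a value() score like the one above.

```python
# Greedy budget packing by value density. The dict shape of a chunk
# ({"tokens": ...}) is an assumption for this sketch.

def pack_under_budget(chunks, value, budget):
    """Pick chunks in order of value per token until the budget is spent."""
    ranked = sorted(chunks, key=lambda c: value(c) / max(c["tokens"], 1), reverse=True)
    selected, used = [], 0
    for chunk in ranked:
        if used + chunk["tokens"] > budget:
            continue                  # does not fit; a smaller chunk might still
        selected.append(chunk)
        used += chunk["tokens"]
    return selected
```

Greedy packing by density is the classic knapsack approximation: not optimal, but cheap and close enough for prompt assembly.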




&lt;p&gt;&lt;strong&gt;The small trick that makes a big difference&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There’s a simple filter before any of that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;floor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_score&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.15&lt;/span&gt;
&lt;span class="n"&gt;selected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anything below 15% of the best chunk just gets dropped.&lt;/p&gt;

&lt;p&gt;That sounds minor, but it changes behavior a lot.&lt;br&gt;
    • If your data is clean, everything stays&lt;br&gt;
    • If it’s noisy, most of it disappears&lt;/p&gt;

&lt;p&gt;So you don’t fill your prompt with mediocre content just because you have space.&lt;/p&gt;
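A toy run of the relative floor shows both behaviors; the score lists are made up for illustration:

```python
def apply_floor(scores, ratio=0.15):
    # Drop anything scoring below 15% of the best candidate.
    floor = max(scores) * ratio
    return [s for s in scores if s >= floor]

clean = [0.9, 0.8, 0.7, 0.6]    # tight cluster: everything clears the floor
noisy = [0.9, 0.1, 0.05, 0.02]  # long tail: only the best survives

print(apply_floor(clean))  # → [0.9, 0.8, 0.7, 0.6]
print(apply_floor(noisy))  # → [0.9]
```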



&lt;p&gt;&lt;strong&gt;Step 2: ordering for attention&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the part I underestimated.&lt;/p&gt;

&lt;p&gt;Even if you pick the right chunks, models don’t treat all positions equally. Stuff at the start tends to get more attention than stuff buried in the middle.&lt;/p&gt;

&lt;p&gt;So CFAdv reorders the selected chunks based on similarity to the query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic version:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cosine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;cosine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;weights&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ordered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Higher weight goes earlier in the prompt.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;No embeddings API required&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of calling an external model, it uses a simple hashed bag-of-words vector.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;vec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\b\w+\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
        &lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vec&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;1e-8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s not fancy. No positional info, no learned weights. But for short chunks it works surprisingly well.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Two levels of ordering&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There’s also a second layer.&lt;/p&gt;

&lt;p&gt;Instead of treating everything as one list, CFAdv groups context into blocks:&lt;br&gt;
    • system&lt;br&gt;
    • history&lt;br&gt;
    • retrieval&lt;br&gt;
    • tools&lt;/p&gt;

&lt;p&gt;Then it does:&lt;br&gt;
    1.  sort chunks inside each block&lt;br&gt;
    2.  sort the blocks themselves&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sketch:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# intra-block
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# cross-block
&lt;/span&gt;&lt;span class="n"&gt;block_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;mean_embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;blocks&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;ordered_blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;block_scores&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So you end up shaping the whole prompt, not just shuffling pieces.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The full pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CFAdv is an 8-stage pipeline, but it’s easier to think of it like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;ingest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;variants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;represent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;selected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ordered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;attention_fuse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;selected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;packet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;assemble&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ordered&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;packet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each step is stateless. That makes it easier to test and reason about.&lt;/p&gt;
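For example, a stateless stage is just input in, output out, so tests need no fixtures or teardown. This toy `normalize` is a stand-in, not the real stage:

```python
def normalize(docs):
    # Toy stand-in for a normalization stage: strip whitespace, drop empties.
    return [d.strip() for d in docs if d.strip()]

assert normalize(["  a ", "", "b"]) == ["a", "b"]
# Same input, same output, every time — no hidden state to reset between calls.
assert normalize(["  a ", "", "b"]) == normalize(["  a ", "", "b"])
print("ok")
```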




&lt;p&gt;&lt;strong&gt;What happens in practice&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can cut most of the prompt without losing the answer, as long as:&lt;br&gt;
    • retrieval pulls in some noise&lt;br&gt;
    • there is redundancy&lt;br&gt;
    • the query only needs a subset of the data&lt;/p&gt;

&lt;p&gt;If everything is relevant, the system mostly leaves it alone.&lt;/p&gt;

&lt;p&gt;If only one chunk survives selection, ordering doesn’t matter.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Where this actually helps&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This kind of pipeline shines when:&lt;br&gt;
    • your retrieval step is messy&lt;br&gt;
    • you’re concatenating multiple documents&lt;br&gt;
    • prompts are long enough for attention effects to matter&lt;/p&gt;

&lt;p&gt;If you already have clean, minimal context, you won’t see much change.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The part that stuck with me&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This isn’t really about attention or embeddings.&lt;/p&gt;

&lt;p&gt;It’s about treating prompt assembly as something worth optimizing.&lt;/p&gt;

&lt;p&gt;Right now most systems act like prompts are just containers. You throw things in and hope the model figures it out.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;CFAdv flips that.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It asks a simple question: what is the smallest amount of context that still works?&lt;/p&gt;

&lt;p&gt;Then it enforces it.&lt;/p&gt;

&lt;p&gt;And once you start thinking that way, it’s hard to go back to dumping chunks into a string and calling it a day.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Try it yourself&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you want to see how this works in practice or plug it into your own workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/rotsl/CFAdv" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; &lt;br&gt;
Contains the full Python library, CLI, benchmarks, and tests. You can run it locally, inspect the pipeline stages, or integrate it into your own RAG setup.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://rotsl.github.io/CFAdv/" rel="noopener noreferrer"&gt;Live demo&lt;/a&gt; &lt;br&gt;
Lets you compare raw prompts vs CFAdv-compiled prompts side by side. Useful for quickly seeing how much context gets removed and how ordering changes.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re already using retrieval + concatenation, the repo is the easiest place to start. Swap your prompt assembly step with CFAdv’s planner + fusion stages and see what drops out.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>promptengineering</category>
      <category>rag</category>
      <category>llm</category>
    </item>
    <item>
      <title>Resume Tailor</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Fri, 20 Mar 2026 14:31:02 +0000</pubDate>
      <link>https://forem.com/rotsl/resume-tailor-3gb3</link>
      <guid>https://forem.com/rotsl/resume-tailor-3gb3</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/notion-2026-03-04"&gt;Notion MCP Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Resume Tailor takes a job posting and your resume, then outputs a tailored resume and cover letter as PDFs. The whole thing runs in your browser. No sign-up, no server, no data stored anywhere except your Notion workspace if you want it there.&lt;/p&gt;

&lt;p&gt;You pick Claude or Gemini (Gemini has a free tier, no credit card), paste or upload the job description, upload your resume, and click go. Two PDFs come out the other side.&lt;/p&gt;

&lt;p&gt;It also runs as a local Flask app with more features (DOCX support, job URL fetching, richer PDFs) and a CLI if that's your thing.&lt;/p&gt;

&lt;h3&gt;
  
  
  The one rule I actually cared about
&lt;/h3&gt;

&lt;p&gt;The AI is not allowed to make things up. That sounds obvious but it's easy to get wrong. The system prompt on every single call says: you may reorder and reword existing content, you may use keywords from the job description if they honestly describe something the candidate already did, but you cannot add skills, invent metrics, or fabricate roles. If the job asks for five years of Kubernetes experience and the resume doesn't mention Kubernetes, that gap stays in the output.&lt;/p&gt;

&lt;p&gt;I've seen other resume tools confidently add skills the user never had. I didn't want to build that.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Notion MCP works
&lt;/h2&gt;

&lt;p&gt;The Notion integration reads job descriptions from Notion pages and logs every run's output back. If you track jobs in Notion, pass the page ID directly instead of copy-pasting. The system reads the page via MCP.&lt;/p&gt;

&lt;p&gt;After each run, two databases get entries. A Job Applications table tracks company, role, date, and a snippet. A linked Outputs database stores the actual resume and cover letter text as readable blocks. A few weeks in, you have every application: what you sent and what they asked for.&lt;/p&gt;
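A minimal sketch of that logging flow, with `call_notion_mcp` stubbed out so it runs without a live workspace. The `API-post-page` tool name follows the official server's OpenAPI-derived naming seen elsewhere in this project, but treat the payload details as assumptions:

```python
def call_notion_mcp(tool, payload):
    # Stubbed MCP call so the sketch is self-contained; the real client
    # spawns the Node.js server and talks to it over stdio.
    print(f"MCP tool: {tool}")
    return {"id": "fake-page-id"}

def log_run(db_id, company, role, snippet):
    # One row in the Job Applications database; the snippet goes into the
    # page body as blocks rather than database properties.
    return call_notion_mcp("API-post-page", {
        "parent": {"database_id": db_id},
        "properties": {"Name": {"title": [{"text": {"content": f"{company} - {role}"}}]}},
        "children": [{"paragraph": {"rich_text": [{"text": {"content": snippet}}]}}],
    })

page = log_run("db-123", "Acme", "Backend Engineer", "Tailored resume sent.")
print(page["id"])  # → fake-page-id
```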

&lt;p&gt;I also included &lt;code&gt;.mcp.json&lt;/code&gt; for the official &lt;code&gt;@notionhq/notion-mcp-server&lt;/code&gt;. Claude Desktop and Cursor pick it up, letting you ask Claude things like "which applications are pending?" or "draft a follow-up for the engineering role."&lt;/p&gt;

&lt;p&gt;The Notion API breaks if you write to a property that doesn't exist. Early versions failed when someone's title column wasn't "Name". The fix: introspect the database first, find the actual title property, and put everything else (status, date, company) in the page body as blocks instead of database properties. Works now regardless of configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_title_property_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;call_notion_mcp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;API-retrieve-a-database&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;db_id&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The refactor (late 2024): Moved from the Notion SDK to a Python MCP client. All calls now route through &lt;code&gt;src/mcp_notion_client.py&lt;/code&gt;, which spawns the Node.js MCP server and communicates via stdio. Same behavior, but now the operations flow through MCP like the &lt;code&gt;.mcp.json&lt;/code&gt; config intended. The MCP server is launched on-demand—no persistent process—so it's transparent to the user.&lt;/p&gt;




&lt;h2&gt;
  
  
  Video demo
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://youtu.be/H5RqClzqvVo?si=7jYTx6aJIPEKH5-F" rel="noopener noreferrer"&gt;Resume Tailor Demo&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Show us the code
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/rotsl/resume-tailor" rel="noopener noreferrer"&gt;https://github.com/rotsl/resume-tailor&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Live demo:&lt;/strong&gt; &lt;a href="https://rotsl.github.io/resume-tailor" rel="noopener noreferrer"&gt;https://rotsl.github.io/resume-tailor&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  How it's structured
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resume-tailor/
├── docs/index.html                   ← the GitHub Pages app, fully self-contained
├── app.py                            ← local Flask server
├── main.py                           ← CLI
├── instruct.md                       ← formatting rules injected into every prompt
├── .mcp.json                         ← Notion MCP server config
├── .github/workflows/deploy.yml      ← deploys docs/ to GitHub Pages on push
├── scripts/
│   └── setup_notion_databases.py     ← creates the Notion DBs via MCP, writes IDs to .env
└── src/
    ├── tailor.py                     ← AI engine, supports Claude and Gemini
    ├── parser.py                     ← PDF / DOCX / text extraction
    ├── pdf_generator.py              ← PDF output via ReportLab
    ├── web_context.py                ← fetches company context from the web
    ├── mcp_notion_client.py          ← Python MCP client for Notion operations
    └── notion_integration.py         ← high-level Notion read/write (uses MCP)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Supporting two AI providers
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;src/tailor.py&lt;/code&gt; has a single &lt;code&gt;tailor_resume()&lt;/code&gt; function that accepts &lt;code&gt;provider&lt;/code&gt;, &lt;code&gt;model&lt;/code&gt;, and &lt;code&gt;api_key&lt;/code&gt; arguments. The same prompts go to both providers. The browser version calls the APIs directly via &lt;code&gt;fetch()&lt;/code&gt;; the local version uses the Python SDKs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Claude
&lt;/span&gt;&lt;span class="n"&gt;tailored&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tailor_resume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;job_description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-ant-...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Gemini free tier
&lt;/span&gt;&lt;span class="n"&gt;tailored&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tailor_resume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;job_description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AIza...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When no key is passed, it falls back to environment variables, so the CLI reads from &lt;code&gt;.env&lt;/code&gt; without asking every time.&lt;/p&gt;
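The fallback can be sketched like this; the function name is an assumption, though `GEMINI_API_KEY` and `ANTHROPIC_API_KEY` match the keys the quick start mentions:

```python
import os

def resolve_api_key(provider, api_key=None):
    # Explicit argument wins; otherwise fall back to the environment,
    # which is how the CLI reads from .env without prompting every run.
    if api_key:
        return api_key
    env_var = {"claude": "ANTHROPIC_API_KEY", "gemini": "GEMINI_API_KEY"}[provider]
    key = os.environ.get(env_var)
    if not key:
        raise ValueError(f"no API key passed and {env_var} is not set")
    return key

os.environ["GEMINI_API_KEY"] = "AIza-example"
print(resolve_api_key("gemini"))           # → AIza-example (from the environment)
print(resolve_api_key("gemini", "k-123"))  # → k-123 (explicit key wins)
```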

&lt;h3&gt;
  
  
  The prompt structure
&lt;/h3&gt;

&lt;p&gt;Two layers. The system prompt sets the hard rules (no fabrication, no adding skills). The user prompt gives the model the original resume, the job description, and any web context about the company as clearly labelled separate sections.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ABSOLUTE RULES — NEVER VIOLATE:
1. You may ONLY use information that exists in the candidate's original resume.
2. Do NOT invent, embellish, or assume any experience, skills, metrics, or facts.
3. You MAY reorder, reword, and emphasize existing content.
4. Mirror keywords from the job description only where they truthfully apply.
5. If the candidate lacks a required skill, do NOT add it. Leave it absent.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cover letter call gets both the original resume and the already-tailored resume, so it can see exactly what was kept and what was cut.&lt;/p&gt;

&lt;h3&gt;
  
  
  Runtime config using &lt;code&gt;instruct.md&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Formatting rules live in &lt;code&gt;instruct.md&lt;/code&gt; and get injected into every prompt at call time. Swap the file out and the output changes — no code edits. Someone who wants a one-page resume with a specific section order can describe that there. Someone applying to academic roles can put a different set of rules in.&lt;/p&gt;
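A sketch of that injection, with a temporary file standing in for `instruct.md`: the rules are re-read on every call, so swapping the file changes output with no code edits:

```python
from pathlib import Path
import tempfile

def build_prompt(resume, job, rules_path):
    # Rules are loaded at call time, not import time, so edits to the
    # file take effect on the very next run.
    rules = Path(rules_path).read_text()
    return f"{rules}\n\nRESUME:\n{resume}\n\nJOB DESCRIPTION:\n{job}"

# Stand-in for instruct.md, written to a temp file for this demo.
with tempfile.NamedTemporaryFile("w", suffix=".md", delete=False) as f:
    f.write("Keep the resume to one page.")

prompt = build_prompt("...resume...", "...job posting...", f.name)
print(prompt.splitlines()[0])  # → Keep the resume to one page.
```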

&lt;h3&gt;
  
  
  The GitHub Pages version
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;docs/index.html&lt;/code&gt; is the entire app. PDF.js reads uploaded PDFs in the browser, the AI APIs are called directly via fetch, jsPDF builds the output PDFs in memory. The GitHub Actions workflow just copies that one file to Pages on every push to main.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Upload Pages artifact&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-pages-artifact@v3&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docs/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No build step, no npm, no bundler. The tradeoff is no Notion logging on the static version, since there's nowhere safe to store the Notion API key client-side.&lt;/p&gt;

&lt;h3&gt;
  
  
  Notion setup script
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python scripts/setup_notion_databases.py YOUR_NOTION_PAGE_ID
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Creates both databases via MCP, then writes their IDs into &lt;code&gt;.env&lt;/code&gt; automatically. No manual copy-paste needed. The script calls &lt;code&gt;call_notion_mcp("API-create-a-database", {...})&lt;/code&gt; for each database—same flow as the app itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Quick start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/YOUR_USERNAME/resume-tailor.git
&lt;span class="nb"&gt;cd &lt;/span&gt;resume-tailor
python &lt;span class="nt"&gt;-m&lt;/span&gt; venv venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source &lt;/span&gt;venv/bin/activate
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
&lt;span class="c"&gt;# Add GEMINI_API_KEY (free) or ANTHROPIC_API_KEY, plus NOTION_API_KEY&lt;/span&gt;

python scripts/setup_notion_databases.py YOUR_NOTION_PAGE_ID

python app.py  &lt;span class="c"&gt;# → http://localhost:5000&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
python main.py tailor &lt;span class="nt"&gt;--resume&lt;/span&gt; resume.pdf &lt;span class="nt"&gt;--job-url&lt;/span&gt; https://...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;em&gt;Stack: Claude / Gemini, Notion MCP (Python MCP client + Node.js server), ReportLab, pdfplumber, jsPDF, PDF.js, Flask.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>notionchallenge</category>
      <category>mcp</category>
      <category>ai</category>
    </item>
    <item>
      <title>🧠 Codex OS: I tried turning AI into a local dev “operating system”</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Wed, 18 Mar 2026 19:31:15 +0000</pubDate>
      <link>https://forem.com/rotsl/codex-os-i-tried-turning-ai-into-a-local-dev-operating-system-45f0</link>
      <guid>https://forem.com/rotsl/codex-os-i-tried-turning-ai-into-a-local-dev-operating-system-45f0</guid>
      <description>&lt;p&gt;I’ve been experimenting with a simple idea:&lt;/p&gt;

&lt;p&gt;What if AI wasn’t just a tool you call… but something that behaves more like an operating system for development?&lt;/p&gt;

&lt;p&gt;That’s how Codex OS started.&lt;br&gt;
    • GitHub: &lt;a href="https://github.com/rotsl/codex-os" rel="noopener noreferrer"&gt;https://github.com/rotsl/codex-os&lt;/a&gt;&lt;br&gt;
    • Webpage: &lt;a href="https://rotsl.github.io/codex-os/" rel="noopener noreferrer"&gt;https://rotsl.github.io/codex-os/&lt;/a&gt;&lt;br&gt;
    • npm: &lt;a href="https://www.npmjs.com/package/codexospackage" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/codexospackage&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This isn’t another wrapper around an API. I was trying to build something that feels persistent — like it’s sitting there, managing tasks, running workflows, and helping you think through code instead of just spitting snippets.&lt;/p&gt;

&lt;p&gt;I’m still figuring it out. But it’s already useful in ways I didn’t expect.&lt;/p&gt;


&lt;h3&gt;
  
  
  What Codex OS actually is
&lt;/h3&gt;

&lt;p&gt;At its core, Codex OS is a local-first system that lets you:&lt;br&gt;
    • run AI-driven tasks&lt;br&gt;
    • structure workflows&lt;br&gt;
    • interact with code in a more stateful way&lt;/p&gt;

&lt;p&gt;The key idea: treat AI like a runtime environment, not a function call.&lt;/p&gt;

&lt;p&gt;That changes how you design everything.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;const result &lt;span class="o"&gt;=&lt;/span&gt; await ai.generate&lt;span class="o"&gt;(&lt;/span&gt;prompt&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’re closer to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;await codex.run&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"analyze-project"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s subtle, but it shifts the mindset from “ask → answer” to “delegate → process”.&lt;/p&gt;




&lt;h3&gt;
  
  
  Why I built it
&lt;/h3&gt;

&lt;p&gt;I kept running into the same friction with AI tools:&lt;br&gt;
    • Context gets lost constantly&lt;br&gt;
    • You repeat yourself more than you should&lt;br&gt;
    • There’s no real “memory” unless you bolt it on&lt;br&gt;
    • Everything feels stateless&lt;/p&gt;

&lt;p&gt;It works fine for small tasks. But once you try to build something non-trivial, it starts to feel like you’re babysitting the tool.&lt;/p&gt;

&lt;p&gt;I wanted something that:&lt;br&gt;
    • keeps context around&lt;br&gt;
    • can chain tasks together&lt;br&gt;
    • behaves more like a system than a chatbot&lt;/p&gt;

&lt;p&gt;So I started building it.&lt;/p&gt;



&lt;h3&gt;
  
  
  How it works (without the marketing layer)
&lt;/h3&gt;

&lt;p&gt;There are three main pieces:&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Task execution model
&lt;/h3&gt;

&lt;p&gt;You define actions. Codex runs them.&lt;/p&gt;

&lt;p&gt;These can be things like:&lt;br&gt;
    • analyze files&lt;br&gt;
    • generate code&lt;br&gt;
    • refactor parts of a project&lt;br&gt;
    • run multi-step workflows&lt;/p&gt;

&lt;p&gt;The important part is that tasks can call other tasks. That’s where it starts feeling like a system instead of a script.&lt;/p&gt;
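
&lt;p&gt;To make "tasks can call other tasks" concrete, here is a minimal, self-contained sketch of that idea: an in-memory task registry where one task delegates to another. This is illustrative only; &lt;code&gt;defineTask&lt;/code&gt; and the registry are hypothetical names, not the Codex OS API:&lt;/p&gt;

```javascript
// Illustrative only: a tiny in-memory task registry, not Codex OS internals.
// defineTask and run are hypothetical names made up for this sketch.
const tasks = new Map();

function defineTask(name, fn) {
  tasks.set(name, fn);
}

async function run(name, input) {
  const fn = tasks.get(name);
  if (!fn) throw new Error(`unknown task: ${name}`);
  return fn(input, run); // tasks receive `run` so they can call other tasks
}

defineTask("count-lines", async (text) => text.split("\n").length);

defineTask("analyze-project", async (files, run) => {
  // A task delegating to another task, once per file.
  const counts = [];
  for (const f of files) {
    counts.push(await run("count-lines", f.content));
  }
  return counts.reduce((a, b) => a + b, 0);
});

run("analyze-project", [{ content: "a\nb" }, { content: "c" }]).then(console.log); // → 3
```

&lt;p&gt;Once delegation like this exists, composition comes for free: a "review" task can fan out to "analyze" tasks, which is where the system-like layering shows up.&lt;/p&gt;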


&lt;h3&gt;
  
  
  2. Local-first approach
&lt;/h3&gt;

&lt;p&gt;Everything is designed to run locally.&lt;/p&gt;

&lt;p&gt;That decision came early, mostly because:&lt;br&gt;
    • I don’t want to depend entirely on remote APIs&lt;br&gt;
    • local context is easier to manage&lt;br&gt;
    • it’s faster for iteration&lt;/p&gt;

&lt;p&gt;It also makes the whole thing feel more like tooling and less like a service.&lt;/p&gt;


&lt;h3&gt;
  
  
  3. npm package integration
&lt;/h3&gt;

&lt;p&gt;You can install it directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;codexospackage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once installed, you can start wiring it into your own workflows instead of using it as a standalone tool.&lt;/p&gt;

&lt;p&gt;That’s where it gets interesting.&lt;/p&gt;




&lt;h4&gt;
  
  
  A small example
&lt;/h4&gt;

&lt;p&gt;Here’s a rough idea of how you might use it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;import &lt;span class="o"&gt;{&lt;/span&gt; codex &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"codexospackage"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

await codex.run&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"review-codebase"&lt;/span&gt;, &lt;span class="o"&gt;{&lt;/span&gt;
  path: &lt;span class="s2"&gt;"./src"&lt;/span&gt;
&lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead of asking “what’s wrong with this file?”, you define a reusable task and run it whenever you need.&lt;/p&gt;

&lt;p&gt;It’s closer to scripting your thinking than querying an assistant.&lt;/p&gt;




&lt;h3&gt;
  
  
  What surprised me
&lt;/h3&gt;

&lt;p&gt;I expected this to be a thin abstraction.&lt;/p&gt;

&lt;p&gt;It isn’t.&lt;/p&gt;

&lt;p&gt;Once tasks start calling other tasks, you get something that feels… layered. Almost like a tiny OS scheduler for AI workflows.&lt;/p&gt;

&lt;p&gt;But there’s also a downside:&lt;br&gt;
    • It’s easy to over-engineer things&lt;br&gt;
    • You can end up building systems instead of solving problems&lt;br&gt;
    • Debugging AI-driven flows is still messy&lt;/p&gt;

&lt;p&gt;I’m still working through that.&lt;/p&gt;




&lt;h3&gt;
  
  
  Where this could go
&lt;/h3&gt;

&lt;p&gt;I don’t want to oversell this. It’s early.&lt;/p&gt;

&lt;p&gt;But a few directions feel promising:&lt;br&gt;
    • persistent agents that track project state&lt;br&gt;
    • better tooling for chaining tasks&lt;br&gt;
    • tighter integration with local dev environments&lt;/p&gt;

&lt;p&gt;Right now, it’s somewhere between a tool and an experiment.&lt;/p&gt;




&lt;p&gt;If you try it and it breaks (it probably will in some cases), I’d actually love to hear about it. That’s the only way this gets better.&lt;/p&gt;




&lt;h3&gt;
  
  
  Final thought
&lt;/h3&gt;

&lt;p&gt;I don’t think the future of AI in dev is just better autocomplete.&lt;/p&gt;

&lt;p&gt;It’s systems.&lt;/p&gt;

&lt;p&gt;Small ones at first. Slightly weird. A bit unreliable. But more useful once they stick around and understand what you’re doing.&lt;/p&gt;

&lt;p&gt;Codex OS is my attempt at that.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>code</category>
      <category>productivity</category>
    </item>
    <item>
      <title>ContextFusion: The Context Engineering Layer Your LLM Apps Are Missing</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Wed, 11 Mar 2026 17:31:28 +0000</pubDate>
      <link>https://forem.com/rotsl/contextfusion-the-context-engineering-layer-your-llm-apps-are-missing-99h</link>
      <guid>https://forem.com/rotsl/contextfusion-the-context-engineering-layer-your-llm-apps-are-missing-99h</guid>
      <description>&lt;p&gt;Modern AI applications rely heavily on &lt;strong&gt;Large Language Models (LLMs)&lt;/strong&gt;, but many production systems still struggle with a critical problem:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Context management.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Developers often construct prompts by simply concatenating everything available:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;system instructions
&lt;/li&gt;
&lt;li&gt;user queries
&lt;/li&gt;
&lt;li&gt;conversation history
&lt;/li&gt;
&lt;li&gt;retrieved documents
&lt;/li&gt;
&lt;li&gt;tool outputs
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This works for small prototypes, but in real systems it leads to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;bloated prompts
&lt;/li&gt;
&lt;li&gt;higher API costs
&lt;/li&gt;
&lt;li&gt;increased latency
&lt;/li&gt;
&lt;li&gt;inconsistent responses
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A new discipline is emerging to address this challenge: &lt;strong&gt;context engineering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of treating prompts as raw text, context engineering treats &lt;strong&gt;information as structured input that must be optimized before being sent to an LLM&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is exactly what &lt;strong&gt;ContextFusion&lt;/strong&gt; introduces.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub Repository: &lt;a href="https://github.com/rotsl/context-fusion" rel="noopener noreferrer"&gt;https://github.com/rotsl/context-fusion&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;npm Package: &lt;a href="https://www.npmjs.com/package/@rotsl/contextfusion" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@rotsl/contextfusion&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Hidden Problem in LLM Applications
&lt;/h2&gt;

&lt;p&gt;When developers optimize AI systems, they often focus on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompt engineering
&lt;/li&gt;
&lt;li&gt;retrieval pipelines
&lt;/li&gt;
&lt;li&gt;model selection
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, the &lt;strong&gt;real bottleneck is frequently the context itself&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Every LLM request must include all relevant information inside the prompt. Since LLM APIs charge and operate based on &lt;strong&gt;tokens&lt;/strong&gt;, inefficient context handling directly affects performance.&lt;/p&gt;

&lt;p&gt;More tokens mean:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;higher inference latency
&lt;/li&gt;
&lt;li&gt;increased API costs
&lt;/li&gt;
&lt;li&gt;greater noise in the prompt
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A typical LLM request pipeline looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
&lt;span class="nx"&gt;User&lt;/span&gt; &lt;span class="nx"&gt;Input&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;System&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Conversation&lt;/span&gt; &lt;span class="nx"&gt;History&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Retrieved&lt;/span&gt; &lt;span class="nx"&gt;Documents&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Tool&lt;/span&gt; &lt;span class="nx"&gt;Results&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Final&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without careful orchestration, this pipeline leads to &lt;strong&gt;prompt bloat&lt;/strong&gt;, where irrelevant or duplicated context inflates token usage.&lt;/p&gt;
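
&lt;p&gt;To see prompt bloat in numbers, here is a rough sketch using the common 4-characters-per-token rule of thumb (an approximation for illustration, not a real tokenizer):&lt;/p&gt;

```javascript
// Rough illustration of prompt bloat: naive concatenation counts every
// section, duplicates included. The 4-chars-per-token ratio is a rule of
// thumb, not an exact tokenizer.
const approxTokens = (text) => Math.ceil(text.length / 4);

const sections = [
  "You are a helpful assistant.",          // system
  "User asked about the refund policy.",   // history
  "Refunds are accepted within 30 days.",  // retrieved doc
  "Refunds are accepted within 30 days.",  // same doc retrieved twice
];

const naive = approxTokens(sections.join("\n"));
const deduped = approxTokens([...new Set(sections)].join("\n"));
console.log(naive, deduped); // dedup alone already shrinks the prompt
```
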




&lt;h2&gt;
  
  
  What Is ContextFusion?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;ContextFusion is a provider-neutral context compiler designed for token-efficient and low-latency LLM workflows.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of manually assembling prompts, developers supply structured context components.&lt;/p&gt;

&lt;p&gt;ContextFusion then:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;collects context sources
&lt;/li&gt;
&lt;li&gt;normalizes their structure
&lt;/li&gt;
&lt;li&gt;fuses relevant information
&lt;/li&gt;
&lt;li&gt;compiles an optimized prompt
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Conceptually, the system works like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
&lt;span class="nx"&gt;Raw&lt;/span&gt; &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Sources&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Normalization&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Fusion&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Optimization&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Compiled&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;LLM&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can think of ContextFusion as &lt;strong&gt;a build system for LLM context&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Just as compilers optimize source code before execution, ContextFusion optimizes context before it reaches the model.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Context Engineering Matters
&lt;/h2&gt;

&lt;p&gt;Prompt engineering helped developers get started with LLMs. But modern AI systems involve much more complexity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multi-step reasoning agents
&lt;/li&gt;
&lt;li&gt;retrieval pipelines (RAG)
&lt;/li&gt;
&lt;li&gt;tool integrations
&lt;/li&gt;
&lt;li&gt;long-running conversations
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these components produce context that must be merged carefully.&lt;/p&gt;

&lt;p&gt;Consider this example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
&lt;span class="nx"&gt;System&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;           &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;
&lt;span class="nx"&gt;Conversation&lt;/span&gt; &lt;span class="nx"&gt;History&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="mi"&gt;1200&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;
&lt;span class="nx"&gt;Retrieved&lt;/span&gt; &lt;span class="nx"&gt;Documents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="mi"&gt;1800&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;
&lt;span class="nx"&gt;Tool&lt;/span&gt; &lt;span class="nx"&gt;Output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;             &lt;span class="mi"&gt;400&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;
&lt;span class="nx"&gt;User&lt;/span&gt; &lt;span class="nx"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;              &lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;

&lt;span class="nx"&gt;Total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3650&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Much of this information may not be necessary for the current request.&lt;/p&gt;

&lt;p&gt;ContextFusion helps reduce this overhead by &lt;strong&gt;structuring and prioritizing context before generating the prompt&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  ContextFusion Architecture
&lt;/h2&gt;

&lt;p&gt;ContextFusion introduces a &lt;strong&gt;context compilation pipeline&lt;/strong&gt; that separates context management from prompt construction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="nx"&gt;Application&lt;/span&gt; &lt;span class="nx"&gt;Logic&lt;/span&gt;  &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt;   &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Sources&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;|---------------------|&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;System&lt;/span&gt; &lt;span class="nx"&gt;Instructions&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Conversation&lt;/span&gt; &lt;span class="nx"&gt;Memory&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Retrieved&lt;/span&gt; &lt;span class="nx"&gt;Knowledge&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Tool&lt;/span&gt; &lt;span class="nx"&gt;Outputs&lt;/span&gt;        &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Normalizer&lt;/span&gt;  &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt;   &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Fusion&lt;/span&gt;    &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Optimizer&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="nx"&gt;Compiled&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;    &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
                   &lt;span class="nx"&gt;LLM&lt;/span&gt; &lt;span class="nx"&gt;Provider&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This architecture creates a clean separation between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;application logic&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;context orchestration&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;model inference&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Installing ContextFusion
&lt;/h2&gt;

&lt;p&gt;You can install ContextFusion using npm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm i @rotsl/contextfusion
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;npm package:&lt;br&gt;
&lt;a href="https://www.npmjs.com/package/@rotsl/contextfusion" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@rotsl/contextfusion&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  Example Usage
&lt;/h3&gt;

&lt;p&gt;Instead of manually constructing prompts, developers provide structured context modules.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ContextFusion&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;context-fusion&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fusion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ContextFusion&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addContext&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a helpful coding assistant.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addContext&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;memory&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;conversationHistory&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addContext&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;retrieval&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;retrievedDocuments&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addContext&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tool&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;toolOutput&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;compiledPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;compiledPrompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ContextFusion automatically handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;merging context sources&lt;/li&gt;
&lt;li&gt;removing duplicate information&lt;/li&gt;
&lt;li&gt;structuring prompt sections&lt;/li&gt;
&lt;li&gt;optimizing token usage&lt;/li&gt;
&lt;/ul&gt;
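
&lt;p&gt;The article does not show the library's internals, but the merge-and-dedup step can be sketched roughly like this (hypothetical code for illustration, not the actual ContextFusion implementation):&lt;/p&gt;

```javascript
// Hypothetical sketch of a compile step: group by type, drop exact
// duplicates, emit labeled sections. Not the actual ContextFusion source.
function compileContext(items) {
  const order = ["system", "memory", "retrieval", "tool"];
  const seen = new Set();
  const parts = [];
  for (const type of order) {
    const contents = items
      .filter((i) => i.type === type)
      .map((i) => i.content)
      .filter((c) => {
        if (seen.has(c)) return false; // duplicate content, skip it
        seen.add(c);
        return true;
      });
    if (contents.length > 0) {
      parts.push(`## ${type}\n` + contents.join("\n"));
    }
  }
  return parts.join("\n\n");
}

const prompt = compileContext([
  { type: "system", content: "You are a coding assistant." },
  { type: "retrieval", content: "Docs: use fetch()." },
  { type: "retrieval", content: "Docs: use fetch()." }, // duplicate, dropped
]);
console.log(prompt);
```
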




&lt;h2&gt;
  
  
  Modular Context Pipelines
&lt;/h2&gt;

&lt;p&gt;ContextFusion allows developers to structure context into logical modules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;systemContext&lt;/span&gt;
&lt;span class="nx"&gt;memoryContext&lt;/span&gt;
&lt;span class="nx"&gt;retrievalContext&lt;/span&gt;
&lt;span class="nx"&gt;toolContext&lt;/span&gt;
&lt;span class="nx"&gt;metadataContext&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each module contributes structured information to the final compiled prompt.&lt;/p&gt;

&lt;p&gt;This modular architecture makes LLM applications easier to maintain and scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  Designed for AI Agents
&lt;/h2&gt;

&lt;p&gt;Modern AI systems increasingly rely on &lt;strong&gt;agent-based workflows&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A typical agent pipeline might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;User&lt;/span&gt; &lt;span class="nx"&gt;Query&lt;/span&gt;
   &lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Retrieve&lt;/span&gt; &lt;span class="nx"&gt;Knowledge&lt;/span&gt;
   &lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Call&lt;/span&gt; &lt;span class="nx"&gt;External&lt;/span&gt; &lt;span class="nx"&gt;Tools&lt;/span&gt;
   &lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Reasoning&lt;/span&gt; &lt;span class="nx"&gt;Step&lt;/span&gt;
   &lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Generate&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each step generates additional context that must be merged efficiently.&lt;/p&gt;

&lt;p&gt;ContextFusion manages these layers automatically, ensuring that prompts remain &lt;strong&gt;clean and token-efficient&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  When Should You Use ContextFusion?
&lt;/h2&gt;

&lt;p&gt;ContextFusion is particularly useful for:&lt;/p&gt;

&lt;h3&gt;
  
  
  Retrieval-Augmented Generation (RAG)
&lt;/h3&gt;

&lt;p&gt;RAG pipelines often produce large sets of documents that must be structured carefully before prompting.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI Agents
&lt;/h3&gt;

&lt;p&gt;Agent workflows generate intermediate reasoning steps that become context.&lt;/p&gt;

&lt;h3&gt;
  
  
  Coding Assistants
&lt;/h3&gt;

&lt;p&gt;Large codebases produce significant contextual data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Long Chat Conversations
&lt;/h3&gt;

&lt;p&gt;Conversation history grows rapidly over time and must be managed efficiently.&lt;/p&gt;
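
&lt;p&gt;One common way to manage growing history is to keep only the most recent turns that fit a token budget. A minimal sketch of that idea (the 4-chars-per-token estimate is an assumption, and this is not ContextFusion's actual strategy):&lt;/p&gt;

```javascript
// Sketch: keep the most recent turns that fit a token budget.
// approxTokens is a rough 4-chars-per-token estimate, not a real tokenizer.
const approxTokens = (text) => Math.ceil(text.length / 4);

function trimHistory(turns, budget) {
  const kept = [];
  let used = 0;
  // Walk from newest to oldest, stopping once the budget is exhausted.
  for (let i = turns.length - 1; i >= 0; i--) {
    const cost = approxTokens(turns[i]);
    if (used + cost > budget) break;
    kept.unshift(turns[i]);
    used += cost;
  }
  return kept;
}

const turns = ["old question", "old answer", "new question about the same topic"];
console.log(trimHistory(turns, 10)); // → ["new question about the same topic"]
```
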




&lt;h2&gt;
  
  
  Context Engineering vs Prompt Engineering
&lt;/h2&gt;

&lt;p&gt;Prompt engineering focuses on &lt;strong&gt;how prompts are written&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Context engineering focuses on &lt;strong&gt;what information the model receives&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Prompt Engineering&lt;/th&gt;
&lt;th&gt;Context Engineering&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;wording prompts&lt;/td&gt;
&lt;td&gt;selecting context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;formatting instructions&lt;/td&gt;
&lt;td&gt;structuring context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;small prompt optimization&lt;/td&gt;
&lt;td&gt;large workflow optimization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;prompt phrasing&lt;/td&gt;
&lt;td&gt;token efficiency&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As AI systems grow more complex, &lt;strong&gt;context engineering becomes essential infrastructure&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Large Language Models continue to evolve rapidly, but &lt;strong&gt;context remains the primary bottleneck in real-world AI systems&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Simply increasing context window size is not enough.&lt;/p&gt;

&lt;p&gt;Efficient AI systems must:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;select relevant context&lt;/li&gt;
&lt;li&gt;remove redundant information&lt;/li&gt;
&lt;li&gt;structure prompts clearly&lt;/li&gt;
&lt;li&gt;minimize token usage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ContextFusion introduces an important idea:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Treat context like code. Compile it before execution.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For developers building modern AI applications, especially RAG systems, AI agents, and coding assistants, ContextFusion represents a powerful new architectural layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;GitHub Repository&lt;br&gt;
&lt;a href="https://github.com/rotsl/context-fusion" rel="noopener noreferrer"&gt;https://github.com/rotsl/context-fusion&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;npm Package&lt;br&gt;
&lt;a href="https://www.npmjs.com/package/@rotsl/contextfusion" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@rotsl/contextfusion&lt;/a&gt;&lt;/p&gt;

</description>
      <category>contextengineering</category>
      <category>llmcontextmanagement</category>
      <category>aicontentarchitecture</category>
      <category>tokenefficientprompts</category>
    </item>
    <item>
      <title>ContextFusion: The Context Brain Your LLM Apps Are Missing</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Tue, 10 Mar 2026 21:58:28 +0000</pubDate>
      <link>https://forem.com/rotsl/contextfusion-the-context-brain-your-llm-apps-are-missing-2gkm</link>
      <guid>https://forem.com/rotsl/contextfusion-the-context-brain-your-llm-apps-are-missing-2gkm</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;A deep dive for users who want results and developers who want control&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5idpz1x780al6wby420z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5idpz1x780al6wby420z.png" width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  TL;DR (For the Impatient)
&lt;/h4&gt;

&lt;p&gt;Normal users: Install &lt;code&gt;context-portfolio-optimizer&lt;/code&gt;, run &lt;code&gt;cpo compile ./your-docs --budget 4000&lt;/code&gt;, and stop overpaying for tokens.&lt;/p&gt;

&lt;p&gt;Developers: Middleware pipeline that ingests heterogeneous sources → normalizes → precomputes → optimizes via multi-objective knapsack → compiles provider-specific payloads with delta fusion for agents.&lt;/p&gt;

&lt;p&gt;Both groups get 60–99% token reduction with identical answer quality.&lt;/p&gt;

&lt;h4&gt;
  
  
  Part 1: For Normal Users — “Just Make My LLM Cheaper and Faster”
&lt;/h4&gt;

&lt;h4&gt;
  
  
  The Problem You Actually Face
&lt;/h4&gt;

&lt;p&gt;You’re building with LLMs. Maybe it’s a chatbot over your company docs. Maybe it’s a coding assistant. Maybe it’s an agent that needs to remember context across 20 turns.&lt;/p&gt;

&lt;p&gt;You keep hitting the same frustrations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Why is this API call so expensive?” — You’re sending 8,000 tokens when 800 would suffice&lt;/li&gt;
&lt;li&gt;“Why does it take 10 seconds to respond?” — Latency scales with prompt size&lt;/li&gt;
&lt;li&gt;“Why does my agent forget everything?” — You’re not managing context deltas across turns&lt;/li&gt;
&lt;li&gt;“Why do I have to rewrite everything when I switch from GPT-4 to Claude?” — Hardcoded prompt formats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;You’ve tried RAG. You’ve tried chunking. But you’re still blindly stuffing retrieved chunks into prompts without knowing which ones actually matter.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  What ContextFusion Does (No Jargon)
&lt;/h4&gt;

&lt;p&gt;Think of it like a smart travel packer for your LLM trips.&lt;/p&gt;

&lt;p&gt;You have a weight limit (token budget). You have dozens of items (documents, code, images). Some items are essential. Some are nice-to-have. Some are duplicates. Some are risky (outdated, untrusted).&lt;/p&gt;

&lt;p&gt;ContextFusion:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Unpacks everything — PDFs, Word docs, spreadsheets, images, code files&lt;/li&gt;
&lt;li&gt;Weighs and labels each item — How useful? How risky? How heavy?&lt;/li&gt;
&lt;li&gt;Packs the optimal suitcase — Maximum value within your weight limit&lt;/li&gt;
&lt;li&gt;Formats it for your destination — OpenAI’s preferred style, Anthropic’s format, or local Ollama&lt;/li&gt;
&lt;li&gt;Handles return trips — for agent conversations, it remembers what you already packed and only adds what’s new&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Real Results
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fodbsicisqafqtqdulnf3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fodbsicisqafqtqdulnf3.png" width="800" height="491"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Benchmarks run with Claude Sonnet 4.6 on production-like workloads. Full methodology at&lt;/em&gt; &lt;a href="https://github.com/rotsl/context-fusion/tree/main/benchmarks" rel="noopener noreferrer"&gt;&lt;em&gt;github.com/rotsl/context-fusion/benchmarks&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Getting Started (Three Options)
&lt;/h4&gt;

&lt;h4&gt;
  
  
  Option A: NPM Wrapper (Easiest — No Python Required)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One-time setup&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @rotsl/contextfusion
npx @rotsl/contextfusion setup

&lt;span class="c"&gt;# Create API keys file&lt;/span&gt;
npx @rotsl/contextfusion &lt;span class="nb"&gt;env&lt;/span&gt;
&lt;span class="c"&gt;# Edit .env with your OPENAI_API_KEY or ANTHROPIC_API_KEY&lt;/span&gt;

&lt;span class="c"&gt;# Run optimization&lt;/span&gt;
npx @rotsl/contextfusion run ./my-documents &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Summarize key findings"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; anthropic &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; claude-sonnet-4-6 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--budget&lt;/span&gt; 4000

&lt;span class="c"&gt;# Launch Web UI&lt;/span&gt;
npx @rotsl/contextfusion ui &lt;span class="nt"&gt;--port&lt;/span&gt; 8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Option B: Python Package (More Control)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;context-portfolio-optimizer

&lt;span class="c"&gt;# Set up environment&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; .env &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
ANTHROPIC_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="c"&gt;# Run CLI&lt;/span&gt;
cpo run ./my-documents &lt;span class="nt"&gt;--budget&lt;/span&gt; 4000 &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"What are the main points?"&lt;/span&gt;

&lt;span class="c"&gt;# Or compile for specific task type&lt;/span&gt;
cpo compile ./my-codebase &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--task&lt;/span&gt; &lt;span class="s2"&gt;"Explain this function"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; openai &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; gpt-5-mini &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--mode&lt;/span&gt; code &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--budget&lt;/span&gt; 3000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Option C: Docker (Isolated, Reproducible)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;-t&lt;/span&gt; context-fusion:latest &lt;span class="nb"&gt;.&lt;/span&gt;
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;:/app context-fusion:latest run ./data &lt;span class="nt"&gt;--budget&lt;/span&gt; 3000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  The Web UI: See What Your LLM Actually Receives
&lt;/h4&gt;

&lt;p&gt;Run cpo ui --port 8080 and open your browser. You'll see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run stats: Files ingested, blocks selected, total tokens&lt;/li&gt;
&lt;li&gt;Representation usage: Which compact variants were chosen&lt;/li&gt;
&lt;li&gt;Selected blocks: Source, representation type, utility score, token estimate&lt;/li&gt;
&lt;li&gt;Context preview: Exactly what gets sent to the LLM&lt;/li&gt;
&lt;li&gt;Model answer: Optional direct comparison&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;This transparency is rare. Most RAG tools are black boxes. ContextFusion shows its work.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Common Use Cases
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc4ofr4m2e7867ajd330.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc4ofr4m2e7867ajd330.png" width="800" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  When ContextFusion Helps Most
&lt;/h4&gt;

&lt;p&gt;✅ Multi-provider setups — Same pipeline, different output formats&lt;br&gt;&lt;br&gt;
✅ Cost-sensitive production — 60–99% token reduction&lt;br&gt;&lt;br&gt;
✅ Agent conversations — Delta fusion prevents token churn&lt;br&gt;&lt;br&gt;
✅ Complex ingestion — PDFs, images, code, spreadsheets unified&lt;br&gt;&lt;br&gt;
✅ Latency requirements — Precomputation + caching&lt;/p&gt;
&lt;h4&gt;
  
  
  When You Might Not Need It
&lt;/h4&gt;

&lt;p&gt;❌ Simple single-turn Q&amp;amp;A with tiny documents&lt;br&gt;&lt;br&gt;
❌ You’re already heavily invested in a specific RAG framework and happy with costs&lt;br&gt;&lt;br&gt;
❌ You need real-time streaming with sub-100ms latency (ContextFusion adds 50–200ms optimization overhead)&lt;/p&gt;
&lt;h4&gt;
  
  
  Part 2: For Developers — “How This Actually Works”
&lt;/h4&gt;
&lt;h4&gt;
  
  
  Architecture Overview
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────────┐
│ INGESTION LAYER │
│ PDF │ DOCX │ CSV │ JSON │ Images (OCR) │ Code │ Markdown │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ NORMALIZATION LAYER │
│ Convert all sources to uniform ContextBlock objects │
│ - source_type, content_hash, created_at, metadata │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ REPRESENTATION LAYER │
│ Precompute compact variants per block: │
│ - universal_summary (general purpose) │
│ - qa_extractive (question-answering focused) │
│ - code_signature (functions, classes, dependencies) │
│ - agent_condensed (working memory format) │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ PRECOMPUTE PIPELINE │
│ Store: fingerprints, summaries, token stats, │
│ retrieval features, compact variants in .cpo_cache/ │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ RETRIEVAL LAYER │
│ Query classification → Lexical retrieval (top-100) │
│ → Fast rerank (top-20/25) → Candidate set │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ MULTI-OBJECTIVE PLANNER (Core) │
│ │
│ maximize Σ( w_u·utility - w_r·risk - w_t·token_cost │
│ - w_l·latency + w_c·cacheability + w_d·diversity ) │
│ │
│ subject to: Σ(token_i) ≤ budget │
│ │
│ Selects optimal representation variant per block │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ COMPRESSION LAYER │
│ - JSON minification │
│ - Citation compaction (Source URI → [id]) │
│ - Schema field pruning │
│ Levels: none │ light │ medium │ aggressive │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ DELTA FUSION (Agent Mode) │
│ Compute ContextDelta: │
│ - added_blocks: new since last turn │
│ - updated_blocks: changed content │
│ - removed_blocks: no longer relevant │
│ - unchanged_block_ids: reuse from cache │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ PROVIDER ADAPTER LAYER │
│ Compile provider-specific payloads: │
│ - openai: chat.completions format │
│ - anthropic: messages with XML citations │
│ - ollama: local API structure │
│ - openai_compatible: generic wrapper │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ CACHE-AWARE ASSEMBLY │
│ Segment into: │
│ - stable: system instructions, citation maps, cacheable blocks │
│ - dynamic: volatile content, real-time data │
└─────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
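&lt;p&gt;&lt;em&gt;To make the compression layer concrete, here is a minimal sketch of citation compaction (Source URI → [id]). The &lt;code&gt;compact_citations&lt;/code&gt; helper is illustrative only, not the library’s API, and it assumes plain http(s) URIs:&lt;/em&gt;&lt;/p&gt;

```python
import re

def compact_citations(text):
    """Replace each distinct source URI with a short [n] marker.

    Returns the compacted text plus the id -> URI citation map that
    would live in the stable (cacheable) segment of the prompt.
    """
    uri_to_id = {}

    def shorten(match):
        uri = match.group(0)
        if uri not in uri_to_id:
            uri_to_id[uri] = f"[{len(uri_to_id) + 1}]"
        return uri_to_id[uri]

    compacted = re.sub(r"https?://\S+", shorten, text)
    return compacted, {marker: uri for uri, marker in uri_to_id.items()}
```

&lt;p&gt;A URI that appears ten times in the context now costs a two-character marker per occurrence, and the map itself sits in the stable segment where providers can cache it.&lt;/p&gt;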

&lt;h4&gt;
  
  
  The Knapsack Formulation: Why This Isn’t Just “Smart Chunking”
&lt;/h4&gt;

&lt;p&gt;Most RAG tools use semantic similarity: embed query, embed chunks, return top-k. This fails when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your budget is 4,000 tokens and you have 50 relevant chunks of 500 tokens each&lt;/li&gt;
&lt;li&gt;Some chunks are high-utility but high-risk (outdated documentation)&lt;/li&gt;
&lt;li&gt;Some chunks are cacheable, others must be fresh&lt;/li&gt;
&lt;li&gt;You need diversity (don’t send 5 versions of the same information)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;ContextFusion’s planner treats this as a constrained optimization problem:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudocode of the core algorithm
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;select_context_blocks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    candidates: List[ContextBlock with multiple representation variants]
    budget: int (token limit)
    weights: dict[str, float] (utility, risk, latency, cacheability, diversity)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Generate all (block, variant) pairs with scores
&lt;/span&gt;    &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;representations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utility&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utility_score&lt;/span&gt;
                &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;risk&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;risk_score&lt;/span&gt;
                &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;token_cost&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;token_count&lt;/span&gt;
                &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;latency&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;latency_estimate&lt;/span&gt;
                &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cacheability&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cache_score&lt;/span&gt;
                &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;diversity&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;diversity_bonus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;selected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;token_count&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Solve 0/1 knapsack for maximum score within budget
&lt;/span&gt;    &lt;span class="n"&gt;selected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;knapsack_01&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;selected&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;This is NP-hard, but with proper indexing and heuristics, it runs in &amp;lt;100ms for typical workloads.&lt;/em&gt;&lt;/p&gt;
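&lt;p&gt;&lt;em&gt;A flavor of how such heuristics stay fast: greedy selection by score-per-token density, keeping at most one representation variant per block. This is a sketch of the general technique, not necessarily the planner’s exact implementation:&lt;/em&gt;&lt;/p&gt;

```python
def greedy_select(items, budget):
    """Greedy knapsack heuristic: pick by score-per-token density.

    items: list of (block_id, variant_name, score, token_count) tuples,
    matching the pairs the planner generates. At most one variant is
    kept per block, mirroring the "choose a representation" constraint.
    """
    chosen, used, seen_blocks = [], 0, set()
    # Highest score per token first; break ties by absolute score
    ranked = sorted(items, key=lambda it: (it[2] / max(it[3], 1), it[2]), reverse=True)
    for block_id, variant, score, tokens in ranked:
        if block_id in seen_blocks or score <= 0:
            continue  # one variant per block; drop net-negative items
        if used + tokens <= budget:
            chosen.append((block_id, variant))
            used += tokens
            seen_blocks.add(block_id)
    return chosen, used
```

&lt;p&gt;With a 2,000-token budget, a 600-token summary of a block beats its 3,000-token full text even at a lower absolute score, which is exactly the trade the density ranking encodes.&lt;/p&gt;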

&lt;h4&gt;
  
  
  Code Example: Pipeline Integration
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PipelineRunner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Config&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.providers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnthropicAdapter&lt;/span&gt;

&lt;span class="c1"&gt;# Custom configuration
&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_yaml&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
budget:
  instructions: 1000
  retrieval: 3000
  memory: 2000
  examples: 1500
  tool_trace: 1000
  output_reserve: 1000

scoring:
  utility_weights:
    retrieval: 0.25
    trust: 0.20
    freshness: 0.15
    structure: 0.15
    diversity: 0.15
    token_cost: -0.10

provider:
  name: anthropic
  model: claude-sonnet-4-6
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize pipeline
&lt;/span&gt;&lt;span class="n"&gt;runner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PipelineRunner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Run full pipeline
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./docs/architecture.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./src/api.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./data/metrics.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How does the authentication flow work?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;task_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# chat | qa | code | agent
&lt;/span&gt;    &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;use_precomputed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;compute_delta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt; &lt;span class="c1"&gt;# Set True for agent loops
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Inspect results
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Selected &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;stats&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;blocks_selected&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; blocks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Total tokens: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;stats&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Context preview:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Direct provider compilation
&lt;/span&gt;&lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnthropicAdapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile_packet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;context_blocks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;selected_blocks&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Answer with citations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# payload is ready for anthropic.messages.create(**payload)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Delta Fusion: The Secret to Efficient Agents
&lt;/h4&gt;

&lt;p&gt;Standard agent implementations re-send the entire conversation history + retrieved context on every turn. Over 10 turns at 4,000 tokens each, that’s 40,000 tokens, most of them repeated content.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;ContextFusion’s delta tracking:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Turn 1: Full context
&lt;/span&gt;&lt;span class="n"&gt;turn1_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Step 1...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;turn1_packet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;turn1_result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;context_packet&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Turn 2: Only send what changed
&lt;/span&gt;&lt;span class="n"&gt;turn2_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Step 2...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;task_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;previous_packet&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;turn1_packet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# Enable delta computation
&lt;/span&gt;    &lt;span class="n"&gt;compute_delta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# turn2_result['context_delta'] contains:
# {
# 'added_blocks': [new_retrieved_content],
# 'updated_blocks': [changed_blocks],
# 'removed_blocks': [no_longer_relevant],
# 'unchanged_block_ids': [ids_to_reuse_from_cache],
# 'full_context_hash': 'abc123...' # For cache validation
# }
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The provider adapter assembles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;System instructions (stable, cached)&lt;/li&gt;
&lt;li&gt;Citation map (stable, cached)&lt;/li&gt;
&lt;li&gt;New/updated blocks (dynamic, sent)&lt;/li&gt;
&lt;li&gt;Unchanged block references (cached, not sent)&lt;/li&gt;
&lt;/ul&gt;
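&lt;p&gt;&lt;em&gt;A minimal sketch of that assembly step (field names follow the delta shape shown above; &lt;code&gt;assemble_turn&lt;/code&gt; is a hypothetical helper, not the adapter’s real interface):&lt;/em&gt;&lt;/p&gt;

```python
def assemble_turn(delta, cache, system_prompt, citation_map):
    """Assemble a turn's payload from a ContextDelta-style dict.

    Only added/updated blocks are serialized into the dynamic segment;
    unchanged blocks are referenced by id so a prompt cache (provider-
    side or local) can supply them without re-sending the bytes.
    """
    changed = delta["added_blocks"] + delta["updated_blocks"]
    for block in changed:
        cache[block["id"]] = block["content"]   # keep local cache current
    for block_id in delta["removed_blocks"]:
        cache.pop(block_id, None)               # evict stale blocks
    return {
        "stable": {"system": system_prompt, "citations": citation_map},
        "dynamic": [block["content"] for block in changed],
        "cached_refs": list(delta["unchanged_block_ids"]),
    }
```

&lt;p&gt;Only the &lt;code&gt;dynamic&lt;/code&gt; segment grows with the conversation; everything else is either cached verbatim or referenced by id.&lt;/p&gt;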

&lt;h4&gt;
  
  
  Precompute Pipeline: Latency Optimization
&lt;/h4&gt;

&lt;p&gt;For production workloads, precompute expensive operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One-time setup (can run offline, on CI, or scheduled)&lt;/span&gt;
cpo precompute ./corpus &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--store-dir&lt;/span&gt; .cpo_cache/precompute &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--semantic-dedup&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--generate-all-representations&lt;/span&gt;

&lt;span class="c"&gt;# Runtime query uses precomputed artifacts&lt;/span&gt;
cpo compile ./corpus &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--precomputed-only&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Quick question"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--budget&lt;/span&gt; 2000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Precomputed artifacts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fingerprints.jsonl: Content hashes for deduplication&lt;/li&gt;
&lt;li&gt;representations/: All compact variants per block&lt;/li&gt;
&lt;li&gt;token_stats.json: Pre-counted tokens per variant&lt;/li&gt;
&lt;li&gt;retrieval_index.faiss: FAISS index for fast similarity search&lt;/li&gt;
&lt;li&gt;features.jsonl: Utility/risk/cacheability scores&lt;/li&gt;
&lt;/ul&gt;
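&lt;p&gt;&lt;em&gt;The fingerprints file makes exact-duplicate detection at ingest time nearly free. A sketch of the idea, assuming one JSON object with a &lt;code&gt;content_hash&lt;/code&gt; field per line (the actual on-disk schema may differ):&lt;/em&gt;&lt;/p&gt;

```python
import hashlib
import json

def fingerprint(content):
    """Stable content hash, the dedup key stored in fingerprints.jsonl."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def dedup_blocks(blocks, fingerprints_path=None):
    """Drop blocks whose content hash has already been seen.

    blocks: iterable of dicts with a 'content' key.
    fingerprints_path: optional JSONL file of previously seen hashes,
    one {"content_hash": ...} object per line (assumed format).
    """
    seen = set()
    if fingerprints_path:
        with open(fingerprints_path) as fh:
            seen = {json.loads(line)["content_hash"] for line in fh if line.strip()}
    unique = []
    for block in blocks:
        digest = fingerprint(block["content"])
        if digest not in seen:
            seen.add(digest)
            unique.append(block)
    return unique
```

&lt;p&gt;Because the hashes are precomputed, a re-ingest of an unchanged corpus reduces to set lookups rather than re-summarizing every block.&lt;/p&gt;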

&lt;h4&gt;
  
  
  MCP Server Integration
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Expose ContextFusion as an MCP (Model Context Protocol) server:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cpo serve-mcp &lt;span class="nt"&gt;--host&lt;/span&gt; localhost &lt;span class="nt"&gt;--port&lt;/span&gt; 8765
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MCP clients can now call:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tools/ingest: Add documents to context&lt;/li&gt;
&lt;li&gt;tools/compile: Optimize and compile context&lt;/li&gt;
&lt;li&gt;resources/context/{session_id}: Retrieve compiled packets&lt;/li&gt;
&lt;li&gt;tools/delta: Compute context deltas&lt;/li&gt;
&lt;/ul&gt;
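&lt;p&gt;&lt;em&gt;MCP is built on JSON-RPC 2.0, so invoking the compile tool from any client boils down to a &lt;code&gt;tools/call&lt;/code&gt; request. The argument keys below are assumptions inferred from the tool list above, not a documented schema:&lt;/em&gt;&lt;/p&gt;

```python
import json

def build_tool_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 'tools/call' request, MCP's tool-invocation method."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical compile call against the server started above;
# the argument names are illustrative, not a published schema.
request = build_tool_call(1, "compile", {
    "sources": ["./docs"],
    "query": "How does the authentication flow work?",
    "budget": 4000,
})
```

&lt;p&gt;Any MCP-capable client (an IDE plugin, an agent framework, or plain HTTP/stdio transport) can send this without knowing anything about ContextFusion’s internals.&lt;/p&gt;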

&lt;h4&gt;
  
  
  Framework Integrations
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;LangChain:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ContextFusionRetriever&lt;/span&gt;

&lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ContextFusionRetriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;task_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Use in any LangChain chain
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;
&lt;span class="n"&gt;qa&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_chain_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chat_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stuff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;retriever&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;LlamaIndex:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ContextFusionNodeParser&lt;/span&gt;

&lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ContextFusionNodeParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;budget_per_query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;precompute_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.cpo_cache&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Use with LlamaIndex index construction
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;
&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;node_parser&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parser&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Development Setup
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nl"&gt;git clone https&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="nf"&gt;//github.com/rotsl/context-fusion.git&lt;/span&gt;
&lt;span class="err"&gt;cd&lt;/span&gt; &lt;span class="err"&gt;context-fusion&lt;/span&gt;
&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;bootstrap&lt;/span&gt; &lt;span class="c"&gt;# Install dev dependencies
&lt;/span&gt;
&lt;span class="c"&gt;# Development workflow
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;test&lt;/span&gt; &lt;span class="c"&gt;# Run test suite (49 tests)
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;lint&lt;/span&gt; &lt;span class="c"&gt;# Ruff + mypy
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;type-check&lt;/span&gt; &lt;span class="c"&gt;# Strict type checking
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;format&lt;/span&gt; &lt;span class="c"&gt;# Auto-format code
&lt;/span&gt;
&lt;span class="c"&gt;# Local servers
&lt;/span&gt;&lt;span class="nl"&gt;make ui # Web UI on &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="nf"&gt;8080&lt;/span&gt;
&lt;span class="nl"&gt;make serve-mcp # MCP server on &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="nf"&gt;8765&lt;/span&gt;

&lt;span class="c"&gt;# Benchmarking
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;benchmark&lt;/span&gt; &lt;span class="c"&gt;# Run full benchmark suite
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Project Structure
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvr0xzunjn8tkl0b52tm3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvr0xzunjn8tkl0b52tm3.png" width="800" height="614"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Performance Characteristics
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsvdgr8x0zwi55pzhey6w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsvdgr8x0zwi55pzhey6w.png" width="800" height="486"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Extending ContextFusion
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Custom representation:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.representations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Representation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;register_representation&lt;/span&gt;

&lt;span class="nd"&gt;@register_representation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_custom&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyCustomRepresentation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Representation&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ContextBlock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Your custom summarization logic
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;custom_summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;estimate_tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1.3&lt;/span&gt; &lt;span class="c1"&gt;# Rough heuristic
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Custom provider adapter:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.providers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseProviderAdapter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;register_adapter&lt;/span&gt;

&lt;span class="nd"&gt;@register_adapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyProviderAdapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseProviderAdapter&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compile_packet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context_blocks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Format for your custom LLM API
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format_system&lt;/span&gt;&lt;span class="p"&gt;()},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context_blocks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Part 3: Common Questions
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Q: How is this different from LangChain’s&lt;/em&gt; &lt;em&gt;ContextualCompressionRetriever?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;LangChain’s version compresses after retrieval using an LLM call. ContextFusion optimizes which content to retrieve and which representation to use, without requiring an LLM for compression. It’s also provider-agnostic and handles delta fusion for agents.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Q: Does this replace my vector database?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;No. ContextFusion sits after retrieval. Use Pinecone, Weaviate, pgvector, or FAISS for initial retrieval — then pass candidates through ContextFusion for optimization.&lt;/em&gt;&lt;/p&gt;
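&lt;p&gt;&lt;em&gt;The retrieve-then-optimize pattern in miniature; both functions below are toy stand-ins for your vector store and for ContextFusion, not real APIs:&lt;/em&gt;&lt;/p&gt;

```python
def retrieve(query: str, corpus: list, k: int = 4) -> list:
    """Stand-in for top-k vector retrieval (Pinecone, pgvector, FAISS...)."""
    words = query.lower().split()
    scored = sorted(corpus, key=lambda doc: -sum(w in doc.lower() for w in words))
    return scored[:k]

def optimize(candidates: list, budget: int) -> list:
    """Stand-in for budget-aware packing: keep candidates while tokens remain."""
    packed, used = [], 0
    for doc in candidates:
        cost = len(doc.split())  # crude token estimate
        if budget >= used + cost:
            packed.append(doc)
            used += cost
    return packed

corpus = [
    "API keys rotate every 90 days via the security console.",
    "Our office dog is named Biscuit.",
    "Key rotation can be automated with the rotate-keys endpoint.",
]
# Recall first, then fit the survivors into the token budget
context = optimize(retrieve("rotate API keys", corpus), budget=20)
```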

&lt;p&gt;&lt;em&gt;Q: What about streaming responses?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;ContextFusion optimizes the input context. Streaming the LLM’s output is unaffected. The optimization adds 50–200ms overhead, which is usually offset by reduced LLM latency from shorter prompts.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Q: Can I use this with local models?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Yes. The Ollama adapter works with any OpenAI-compatible local server. Budget planning and compression are even more valuable with slower local hardware.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Q: How do I debug suboptimal context selection?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Run cpo ui and inspect the "Selected Blocks" panel. Each block shows its utility score, risk score, token count, and why it was included/excluded. Run cpo ablate ./data to see which blocks contribute most to answer quality.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0945vhqkg1kj1d0mxtfe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0945vhqkg1kj1d0mxtfe.png" width="800" height="526"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/rotsl/context-fusion" rel="noopener noreferrer"&gt;GitHub - rotsl/context-fusion: ContextFusion is the context brain for LLM apps - compress, rank, and route the right evidence to chat + agent models across OpenAI, Claude, Ollama, and MCP&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.npmjs.com/package/@rotsl/contextfusion" rel="noopener noreferrer"&gt;&lt;strong&gt;NPM Package&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Final Thoughts
&lt;/h4&gt;

&lt;p&gt;ContextFusion isn’t just another RAG tool. It’s a bet that context optimization — treating token budgets as scarce resources to be allocated intelligently — will become as essential as retrieval itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For normal users:&lt;/strong&gt; Install it, run it, pay less.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For developers:&lt;/strong&gt; Extend it, integrate it, build smarter systems.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Fuse less context. Keep more signal. Ship faster answers.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;⭐️ Star the repo, file issues, submit PRs. ContextFusion is Apache-2.0 and built for production.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>llm</category>
      <category>rags</category>
      <category>contextengineering</category>
    </item>
    <item>
      <title>I Built a Programming Language That Lets You Write Websites in Plain English</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Thu, 26 Feb 2026 14:29:07 +0000</pubDate>
      <link>https://forem.com/rotsl/i-built-a-programming-language-that-lets-you-write-websites-in-plain-english-3e5e</link>
      <guid>https://forem.com/rotsl/i-built-a-programming-language-that-lets-you-write-websites-in-plain-english-3e5e</guid>
      <description>&lt;p&gt;No HTML. No CSS classes. No build step complexity. Just describe what you want, get production-ready code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fon1lv38o5jz181a9sqjx.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fon1lv38o5jz181a9sqjx.jpeg" width="500" height="332"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Weave pages like you’d knit your sweater. Stay warm! ❤️&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Sometime ago, I released Wisp – a zero-dependency UI engine that automatically styles semantic HTML. The response was incredible. Developers loved that they could write clean, accessible markup and have it look professionally designed without touching CSS.&lt;/p&gt;

&lt;p&gt;But something kept bothering me.&lt;/p&gt;

&lt;p&gt;Wisp solved the styling problem, but it didn’t solve the authoring problem.&lt;/p&gt;

&lt;p&gt;You still needed to know HTML. You still had to remember which tags to use, when to reach for one element over another, and how to structure a hero section properly for accessibility. For developers, that’s second nature. But for content creators, marketers, and domain experts who just want to build a landing page?&lt;/p&gt;

&lt;p&gt;That’s a massive barrier.&lt;/p&gt;

&lt;p&gt;So I built Weave – a natural language interface that turns plain English descriptions into semantic HTML that Wisp (or any other styling engine) can work with.&lt;/p&gt;

&lt;p&gt;Think of it as the missing piece: Weave handles the authoring, Wisp handles the styling, and standard HTML sits in the middle as the universal interface.&lt;/p&gt;
&lt;h4&gt;
  
  
  &lt;strong&gt;The “Aha” Moment&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Here’s what writing a landing page looks like with Weave:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A page called "Helio Labs"
  With a hero
    Showing "Launch faster with Weave"
    With subtitle "Write structure in plain English, ship full HTML"
    With a primary button "Start Building"
  With features
    Having 3 features:
      "Fast authoring" with description "No class-level styling setup needed"
      "Deterministic output" with description "Compiler emits stable HTML"
      "Portable delivery" with description "Export static files instantly"
  Using modern theme

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s it. No div soup. No class=”flex container mx-auto” noise. Just intent.&lt;/p&gt;

&lt;p&gt;Run this through the Weave compiler, and you get:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;header&lt;/span&gt; &lt;span class="na"&gt;role=&lt;/span&gt;&lt;span class="s"&gt;"banner"&lt;/span&gt; &lt;span class="na"&gt;data-wisp-context=&lt;/span&gt;&lt;span class="s"&gt;"hero"&lt;/span&gt; &lt;span class="na"&gt;data-wisp-priority=&lt;/span&gt;&lt;span class="s"&gt;"critical"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;Launch faster with Weave&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;Write structure in plain English, ship full HTML&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;a&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"#"&lt;/span&gt; &lt;span class="na"&gt;role=&lt;/span&gt;&lt;span class="s"&gt;"button"&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"primary"&lt;/span&gt; &lt;span class="na"&gt;data-wisp-expand=&lt;/span&gt;&lt;span class="s"&gt;"auto"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    Start Building
  &lt;span class="nt"&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/header&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;main&lt;/span&gt; &lt;span class="na"&gt;role=&lt;/span&gt;&lt;span class="s"&gt;"main"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;section&lt;/span&gt; &lt;span class="na"&gt;aria-labelledby=&lt;/span&gt;&lt;span class="s"&gt;"features-heading"&lt;/span&gt; 
           &lt;span class="na"&gt;data-wisp-context=&lt;/span&gt;&lt;span class="s"&gt;"features"&lt;/span&gt; &lt;span class="na"&gt;data-wisp-density=&lt;/span&gt;&lt;span class="s"&gt;"0.3"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h2&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"features-heading"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Features&lt;span class="nt"&gt;&amp;lt;/h2&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;ul&lt;/span&gt; &lt;span class="na"&gt;role=&lt;/span&gt;&lt;span class="s"&gt;"list"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;li&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;h3&amp;gt;&lt;/span&gt;Fast authoring&lt;span class="nt"&gt;&amp;lt;/h3&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;No class-level styling setup needed&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
      &lt;span class="c"&gt;&amp;lt;!-- ... --&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/section&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/main&amp;gt;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqoagwclsy4ag72dhk5va.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqoagwclsy4ag72dhk5va.png" width="800" height="557"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice the data-wisp-* attributes? Those are optimization hints for the Wisp runtime. If Wisp is present, you get automatic context-aware styling. If not, you still have perfectly valid, accessible HTML5 that works everywhere.&lt;/p&gt;

&lt;h4&gt;
  
  
  Why This Matters
&lt;/h4&gt;

&lt;p&gt;The web development landscape has become increasingly complex. We’ve gone from simple HTML pages to build-step-heavy frameworks that require:&lt;/p&gt;

&lt;p&gt;• Learning JSX/template syntax&lt;/p&gt;

&lt;p&gt;• Understanding component hierarchies&lt;/p&gt;

&lt;p&gt;• Managing state and props&lt;/p&gt;

&lt;p&gt;• Configuring bundlers and transpilers&lt;/p&gt;

&lt;p&gt;• Debugging CSS specificity wars&lt;/p&gt;

&lt;p&gt;Weave inverts this complexity.&lt;/p&gt;

&lt;p&gt;It asks: What if the barrier to creating web content was as low as writing a document outline?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Weave Philosophy&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Semantic fidelity first – Output must be valid, accessible HTML5&lt;/li&gt;
&lt;li&gt;Deterministic compilation – Same script, same output, every time&lt;/li&gt;
&lt;li&gt;Progressive disclosure – Simple cases are simple; complex cases are possible&lt;/li&gt;
&lt;li&gt;Wisp compatibility – Generated HTML maximizes context-detection for styling&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  How It Works: The Compiler Pipeline
&lt;/h4&gt;

&lt;p&gt;Weave isn’t just a templating engine. It’s a proper compiler with a two-phase architecture:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Phase 1: Parsing (parseWeave)&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Lexer: Tokenizes input using indentation as block delimiters (Python-style off-side rule)&lt;/p&gt;

&lt;p&gt;• Parser: Recursive descent parser with LL(1) lookahead, building a typed Abstract Syntax Tree (AST)&lt;/p&gt;

&lt;p&gt;• Type checker: Validates semantic constraints (e.g., buttons must be inside sections, images need alt text)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Phase 2: Code Generation (compileWeave)&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• AST traversal: Visitor pattern walk of the typed tree&lt;/p&gt;

&lt;p&gt;• HTML emission: Generates semantic HTML5 with proper ARIA roles&lt;/p&gt;

&lt;p&gt;• Wisp optimization: Injects data-wisp-* hints for enhanced styling&lt;/p&gt;

&lt;p&gt;• Post-processing: Optional minification, pretty-printing, or CSS inlining&lt;/p&gt;

&lt;p&gt;The result? Linear time complexity O(n) – compile times under 10ms for typical scripts.&lt;/p&gt;
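&lt;p&gt;&lt;em&gt;The off-side rule at the heart of the lexer fits in a few lines. This is an illustration of the technique in Python for brevity, not Weave’s actual (JavaScript) lexer:&lt;/em&gt;&lt;/p&gt;

```python
def tokenize(source: str) -> list:
    """Turn leading whitespace into INDENT/DEDENT tokens, Python-style."""
    tokens, stack = [], [0]
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no block structure
        depth = len(line) - len(line.lstrip(" "))
        if depth > stack[-1]:
            stack.append(depth)
            tokens.append(("INDENT", depth))
        while stack[-1] > depth:
            stack.pop()
            tokens.append(("DEDENT", depth))
        tokens.append(("LINE", line.strip()))
    while len(stack) > 1:  # close any blocks still open at end of input
        stack.pop()
        tokens.append(("DEDENT", 0))
    return tokens

script = 'A page called "Demo"\n  With a hero\n    Showing "Hi"\n  Using modern theme\n'
```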

&lt;h4&gt;
  
  
  Using Weave in Your Projects
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;As an npm Package&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Install it:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @rotsl/weave

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;Use it programmatically:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;parseWeave&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;compileWeave&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@rotsl/weave&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;script&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
A page called "My Product"
  With a hero
    Showing "Ship faster"
    With a primary button "Get Started"
  Using modern theme
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseWeave&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;script&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;compileWeave&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; 
  &lt;span class="na"&gt;wispHints&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;minify&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; 
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;Or use the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Compile a script&lt;/span&gt;
weave build page.weave &lt;span class="nt"&gt;-o&lt;/span&gt; page.html

&lt;span class="c"&gt;# Watch mode for development&lt;/span&gt;
weave watch ./pages/ &lt;span class="nt"&gt;--output&lt;/span&gt; ./dist/

&lt;span class="c"&gt;# Validate without compiling&lt;/span&gt;
weave validate page.weave

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;&lt;strong&gt;The Visual Editor&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For non-technical users, there’s a browser-based editor with:&lt;/p&gt;

&lt;p&gt;• Split-pane interface: Script on the left, live preview on the right&lt;/p&gt;

&lt;p&gt;• Real-time error highlighting: Catch mistakes as you type&lt;/p&gt;

&lt;p&gt;• Wisp toggle: See raw vs. styled output instantly&lt;/p&gt;

&lt;p&gt;• One-click export: Download HTML or full project bundles&lt;/p&gt;

&lt;h4&gt;
  
  
  The Weave-Wisp Ecosystem
&lt;/h4&gt;

&lt;p&gt;Here’s where it gets interesting. Weave and Wisp form a complete content-to-presentation pipeline:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxset5o4e66wigqdn2r9n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxset5o4e66wigqdn2r9n.png" width="800" height="247"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;• Content teams iterate on copy and structure without developer bottlenecks&lt;/p&gt;

&lt;p&gt;• Developers maintain styling logic independently of content&lt;/p&gt;

&lt;p&gt;• Accessibility is built-in, not bolted-on (ARIA roles, heading hierarchies, alt text enforcement)&lt;/p&gt;

&lt;p&gt;• Performance is optimal (zero runtime dependencies, ~5KB optional Wisp runtime)&lt;/p&gt;

&lt;h4&gt;
  
  
  Real-World Use Cases
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Marketing Landing Pages&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Your marketing team wants to A/B test three hero variants. Instead of filing Jira tickets and waiting for dev resources, they write:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A page called "Summer Campaign"
  With a hero
    Showing "Save 50% this summer"
    With a secondary button "View Plans"
  Using playful theme

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;&lt;em&gt;Compile, deploy, done.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Documentation Sites&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Technical writers focus on content hierarchy, not CSS frameworks. Weave enforces proper heading structure (h1 → h2 → h3) and generates table-of-contents-ready markup.&lt;/p&gt;
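&lt;p&gt;&lt;em&gt;The “no skipped levels” rule is easy to state as code; a standalone validator sketch, not Weave’s own implementation:&lt;/em&gt;&lt;/p&gt;

```python
def valid_heading_order(levels: list) -> bool:
    """Headings may only go one level deeper at a time, starting from h1."""
    if not levels or levels[0] != 1:
        return False
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:  # e.g. an h3 directly after an h1
            return False
    return True

print(valid_heading_order([1, 2, 3, 2, 2]))  # True
print(valid_heading_order([1, 3]))           # False: the h2 level was skipped
```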

&lt;p&gt;&lt;strong&gt;3. Rapid Prototyping&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Validate page structure before investing in visual design. Weave’s six built-in themes (modern, minimal, corporate, playful, elegant, dark) give you instant visual feedback via Wisp integration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Accessibility-First Development&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Weave implements accessibility by construction:&lt;/p&gt;

&lt;p&gt;• Automatic landmark roles&lt;/p&gt;

&lt;p&gt;• Enforced heading hierarchies&lt;/p&gt;

&lt;p&gt;• Alt text validation for images&lt;/p&gt;

&lt;p&gt;• Semantic button vs. link detection&lt;/p&gt;

&lt;p&gt;&lt;em&gt;You get WCAG 2.1 AA compliant markup without thinking about it.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Under the Hood: Technical Highlights
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Grammar Design&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Weave uses an indentation-sensitive grammar (EBNF) that’s formally specified yet reads like English:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;section_declaration ::= "With" section_type { element_declaration }
section_type ::= "a" "hero" | "features" | "content" | ...
element_declaration ::= "Showing" string_literal 
                      | "With" "a" button_type "button" string_literal
                      | "Having" number "features" ":" feature_list

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;&lt;strong&gt;Error Handling That Doesn’t Suck&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxt0te8fn375avoanh7j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxt0te8fn375avoanh7j.png" width="800" height="238"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Instead of cryptic parser errors, Weave gives you line numbers, context, and “did you mean” suggestions.&lt;/p&gt;

&lt;h4&gt;
  
  
  Testing Strategy
&lt;/h4&gt;

&lt;p&gt;Golden file testing ensures output stability:&lt;/p&gt;

&lt;p&gt;• Parser tests: Input → expected AST JSON&lt;/p&gt;

&lt;p&gt;• Compiler tests: AST → expected HTML snapshots&lt;/p&gt;

&lt;p&gt;• Integration tests: End-to-end script → HTML → Wisp rendering&lt;/p&gt;

&lt;p&gt;• Regression tests: Real-world complex scripts&lt;/p&gt;
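The golden-file idea is language-agnostic. A minimal sketch in Python (the helper name and directory layout are illustrative, not Weave's actual test harness):

```python
from pathlib import Path

def check_golden(name, actual, goldens_dir="tests/goldens", update=False):
    """Write the snapshot on first run (or when update=True);
    afterwards, pass only if the output matches it exactly."""
    golden = Path(goldens_dir) / name
    if update or not golden.exists():
        golden.parent.mkdir(parents=True, exist_ok=True)
        golden.write_text(actual)
        return True
    return golden.read_text() == actual
```

Regenerating snapshots then becomes an explicit, reviewable action instead of an accidental one.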

&lt;h4&gt;
  
  
  The Road Ahead
&lt;/h4&gt;

&lt;p&gt;Weave is just getting started. Here’s what’s coming:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Language Extensions:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Component definitions (Define a component called “Testimonial”)&lt;/p&gt;

&lt;p&gt;• Data binding (Showing data from “testimonials.json”)&lt;/p&gt;

&lt;p&gt;• Conditional rendering (If user.isAuthenticated show…)&lt;/p&gt;

&lt;p&gt;• Internationalization (In English: … In French: …)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ecosystem Growth:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• VS Code extension with Language Server Protocol support&lt;/p&gt;

&lt;p&gt;• GitHub Actions for CI/CD integration&lt;/p&gt;

&lt;p&gt;• Markdown/Word import-export&lt;/p&gt;

&lt;p&gt;• Figma design-to-Weave conversion&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Research Directions:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Learnability studies with non-technical users&lt;/p&gt;

&lt;p&gt;• Automated WCAG 2.2 compliance validation&lt;/p&gt;

&lt;p&gt;• Semantic preservation across Weave → HTML → Wisp pipeline&lt;/p&gt;

&lt;h4&gt;
  
  
  Get Started
&lt;/h4&gt;

&lt;p&gt;🚀 Repository: &lt;a href="https://github.com/rotsl/Weave" rel="noopener noreferrer"&gt;github.com/rotsl/Weave&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/rotsl/Weave" rel="noopener noreferrer"&gt;https://github.com/rotsl/Weave&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📦 npm Package: &lt;a href="https://www.npmjs.com/package/@rotsl/weave" rel="noopener noreferrer"&gt;@rotsl/weave&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.npmjs.com/package/@rotsl/weave" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@rotsl/weave&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🌐 Live Editor: Try it in your browser&lt;/p&gt;

&lt;p&gt;&lt;a href="https://rotsl.github.io/Weave/" rel="noopener noreferrer"&gt;https://rotsl.github.io/Weave/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📄 Documentation: Full syntax reference and examples in the repo&lt;/p&gt;

&lt;p&gt;🔗 Related: Wisp UI Engine (MIT Licensed) – the styling layer that completes the toolchain&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/rotsl/wisp" rel="noopener noreferrer"&gt;https://github.com/rotsl/wisp&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;💾 Archived Version: &lt;a href="https://doi.org/10.5281/zenodo.18773305" rel="noopener noreferrer"&gt;DOI 10.5281/zenodo.18773305&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Why I Built This
&lt;/h4&gt;

&lt;p&gt;I believe the web should be writable by everyone, not just developers. We’ve made consuming content effortless, but creating it remains unnecessarily complex.&lt;/p&gt;

&lt;p&gt;Weave is my attempt to lower that barrier. It’s not about replacing developers. It’s about empowering domain experts to create structured, accessible, performant web content without learning markup syntax.&lt;/p&gt;

&lt;p&gt;If you can write a document outline, you can build a webpage.&lt;/p&gt;

&lt;p&gt;That’s the future I want to build toward.&lt;/p&gt;

&lt;p&gt;Questions? Thoughts? Open an issue on the repo. I’d love to hear how you’d use Weave in your workflow.&lt;/p&gt;

&lt;h4&gt;
  
  
  TL;DR
&lt;/h4&gt;

&lt;p&gt;• Weave turns plain English into semantic HTML&lt;/p&gt;

&lt;p&gt;• Works standalone or with Wisp for automatic styling&lt;/p&gt;

&lt;p&gt;• Zero-config, deterministic, accessibility-first&lt;/p&gt;

&lt;p&gt;• Use via npm (&lt;a href="https://www.npmjs.com/package/@rotsl/weave" rel="noopener noreferrer"&gt;@rotsl/weave&lt;/a&gt;) or the visual editor&lt;/p&gt;

&lt;p&gt;• Perfect for content teams, rapid prototyping, and accessible web development&lt;/p&gt;

&lt;p&gt;• Repo: &lt;a href="https://github.com/rotsl/Weave" rel="noopener noreferrer"&gt;github.com/rotsl/Weave&lt;/a&gt;&lt;/p&gt;

</description>
      <category>css</category>
      <category>webcompiler</category>
      <category>javascript</category>
      <category>html</category>
    </item>
    <item>
      <title>Designing a Fully Automated Machine Learning System for Weather-Based Edge Control</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Mon, 23 Feb 2026 07:12:20 +0000</pubDate>
      <link>https://forem.com/rotsl/designing-a-fully-automated-machine-learning-system-for-weather-based-edge-control-442a</link>
      <guid>https://forem.com/rotsl/designing-a-fully-automated-machine-learning-system-for-weather-based-edge-control-442a</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwvz6xt386y9bsqmv46hz.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwvz6xt386y9bsqmv46hz.jpeg" width="600" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This article presents the design, implementation, and deployment of an end-to-end machine learning system for short-term rainfall prediction and automated mechanical actuation. The system integrates cloud-based data ingestion, continuous model retraining, performance monitoring, static dashboard publishing, and edge deployment on embedded hardware. A six-hour rainfall forecasting model is used to trigger automated shutter control approximately ten minutes prior to predicted precipitation events. Emphasis is placed on reproducibility, security, resource efficiency, and long-term operational stability.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/rotsl/weather-ml" rel="noopener noreferrer"&gt;GitHub - rotsl/weather-ml: Hybrid cloud-edge ML system for predictive rain control with automated retraining, monitoring, and Raspberry Pi hardware actuation.&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Introduction
&lt;/h4&gt;

&lt;p&gt;Short-term weather forecasting remains a critical component in environmental monitoring, agriculture, and building automation systems. While many predictive models exist, practical deployment often suffers from limited automation, insufficient monitoring, and weak integration with physical systems.&lt;/p&gt;

&lt;p&gt;This work aims to address these challenges by developing a self-maintaining machine learning pipeline that continuously adapts to new data and operates reliably on low-power edge hardware. The system demonstrates how modern software engineering practices can be combined with classical machine learning methods to enable autonomous environmental control.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. System Objectives
&lt;/h4&gt;

&lt;p&gt;The primary objectives of the system are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Continuous acquisition of high-resolution weather data&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Periodic retraining of predictive models&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Quantitative monitoring of model performance&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Secure publication of system status&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Autonomous deployment to embedded devices&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Real-time actuation based on predictions&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Additional constraints include strict API quota management, isolation of credentials, and offline-capable inference.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  3. Data Acquisition and Preprocessing
&lt;/h4&gt;

&lt;h4&gt;
  
  
  3.1 Data Source
&lt;/h4&gt;

&lt;p&gt;Hourly meteorological observations are obtained via the Visual Crossing Timeline API. Retrieved variables include temperature, humidity, pressure, cloud cover, wind metrics, precipitation, and solar radiation.&lt;/p&gt;

&lt;p&gt;To ensure quota compliance, data collection is limited to a rolling window of recent observations and scheduled at fixed intervals.&lt;/p&gt;

&lt;h4&gt;
  
  
  3.2 Data Cleaning
&lt;/h4&gt;

&lt;p&gt;Raw observations are merged with historical records and processed using:&lt;/p&gt;

&lt;p&gt;• Temporal deduplication&lt;/p&gt;

&lt;p&gt;• Hourly resampling&lt;/p&gt;

&lt;p&gt;• Linear interpolation&lt;/p&gt;

&lt;p&gt;• Forward/backward filling&lt;/p&gt;

&lt;p&gt;• Outlier handling&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This preprocessing guarantees a continuous time series suitable for feature extraction.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  3.3 Feature Engineering
&lt;/h4&gt;

&lt;p&gt;Three precipitation-derived features are constructed:&lt;/p&gt;

&lt;p&gt;• Binary rainfall indicator (rain_1h)&lt;/p&gt;

&lt;p&gt;• Rolling 6-hour precipitation sum&lt;/p&gt;

&lt;p&gt;• Rolling 24-hour precipitation sum&lt;/p&gt;

&lt;p&gt;&lt;em&gt;These features encode short- and medium-term rainfall persistence.&lt;/em&gt;&lt;/p&gt;
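In plain Python, the three features can be sketched as follows (list-based for clarity; the actual pipeline presumably operates on a dataframe, and the variable names are assumptions):

```python
def rainfall_features(precip):
    """precip: hourly precipitation totals in mm, oldest first."""
    # Binary indicator: any nonzero precipitation in that hour
    rain_1h = [int(bool(p)) for p in precip]
    # Trailing rolling sums over the previous 6 and 24 hours
    roll_6h = [sum(precip[max(0, i - 5):i + 1]) for i in range(len(precip))]
    roll_24h = [sum(precip[max(0, i - 23):i + 1]) for i in range(len(precip))]
    return rain_1h, roll_6h, roll_24h
```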

&lt;h4&gt;
  
  
  4. Predictive Modeling
&lt;/h4&gt;

&lt;h4&gt;
  
  
  4.1 Model Selection
&lt;/h4&gt;

&lt;p&gt;The system employs a HistGradientBoostingClassifier due to its:&lt;/p&gt;

&lt;p&gt;• Robustness to missing values&lt;/p&gt;

&lt;p&gt;• High performance on tabular datasets&lt;/p&gt;

&lt;p&gt;• Support for early stopping&lt;/p&gt;

&lt;p&gt;• Computational efficiency&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This approach avoids the overhead associated with deep learning models while maintaining strong predictive performance.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  4.2 Target Construction
&lt;/h4&gt;

&lt;p&gt;For a forecast horizon h, the target variable is defined as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;y_t = \max_{i \in [1,h]} rain_{t+i}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;This formulation predicts whether rainfall occurs at any time within the future horizon.&lt;/em&gt;&lt;/p&gt;
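Read literally, the definition translates to a few lines of Python (a sketch; trailing rows whose future window is empty default to 0 and would be dropped before training):

```python
def make_target(rain_1h, horizon):
    """y_t = 1 if rain occurs at any hour in t+1 .. t+horizon."""
    return [int(max(rain_1h[t + 1:t + 1 + horizon], default=0))
            for t in range(len(rain_1h))]
```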

&lt;h4&gt;
  
  
  4.3 Training Strategy
&lt;/h4&gt;

&lt;p&gt;Data is split chronologically (80/20) to preserve temporal ordering. Early stopping is applied based on internal validation loss to prevent overfitting.&lt;/p&gt;

&lt;p&gt;Performance is evaluated using:&lt;/p&gt;

&lt;p&gt;• Receiver Operating Characteristic Area Under Curve (ROC-AUC)&lt;/p&gt;

&lt;p&gt;• Precision-Recall Area Under Curve (PR-AUC)&lt;/p&gt;

&lt;p&gt;&lt;em&gt;These metrics are suitable for imbalanced rainfall events.&lt;/em&gt;&lt;/p&gt;
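The split itself is simple but easy to get wrong; the key point is that no shuffling occurs, so the evaluation period strictly follows the training period (a sketch, assuming truncation toward zero at the cut point):

```python
def chronological_split(n_rows, train_frac=0.8):
    """Return (train, test) index ranges that preserve temporal order."""
    cut = int(n_rows * train_frac)
    return range(cut), range(cut, n_rows)
```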

&lt;h4&gt;
  
  
  5. Automated Model Lifecycle Management
&lt;/h4&gt;

&lt;h4&gt;
  
  
  5.1 Continuous Retraining
&lt;/h4&gt;

&lt;p&gt;Model retraining is orchestrated via GitHub Actions and executed every 48 hours. Each retraining cycle performs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Data update&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Feature regeneration&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Model fitting&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Performance evaluation&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Artifact rotation&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  5.2 Model Versioning
&lt;/h4&gt;

&lt;p&gt;Three tiers of artifacts are maintained:&lt;/p&gt;

&lt;p&gt;• Current model&lt;/p&gt;

&lt;p&gt;• Previous model&lt;/p&gt;

&lt;p&gt;• Timestamped snapshots&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This enables rapid rollback and longitudinal analysis.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  5.3 Metrics Archiving
&lt;/h4&gt;

&lt;p&gt;Performance metrics are appended to a persistent history file. These records support trend analysis and drift detection.&lt;/p&gt;
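One simple realization of such a history file is a JSON-lines log, one record per retraining cycle (the field names here are assumptions, not necessarily the repository's exact schema):

```python
import json
import time

def append_metrics(path, roc_auc, pr_auc):
    """Append one evaluation record; the growing file supports
    trend analysis and drift detection across retraining cycles."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "roc_auc": roc_auc,
        "pr_auc": pr_auc,
    }
    with open(path, "a") as fh:
        fh.write(json.dumps(record) + "\n")
```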

&lt;h4&gt;
  
  
  6. Monitoring and Visualization
&lt;/h4&gt;

&lt;h4&gt;
  
  
  6.1 Static Dashboard
&lt;/h4&gt;

&lt;p&gt;A static HTML dashboard is generated during each retraining cycle and published using GitHub Pages. It displays:&lt;/p&gt;

&lt;p&gt;• Model metadata&lt;/p&gt;

&lt;p&gt;• Performance trends&lt;/p&gt;

&lt;p&gt;• Dataset statistics&lt;/p&gt;

&lt;p&gt;• Health indicators&lt;/p&gt;

&lt;p&gt;&lt;em&gt;No client-side API calls are performed, ensuring security and cost stability.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://rotsl.github.io/weather-ml/" rel="noopener noreferrer"&gt;Weather ML Dashboard&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  6.2 README-Based Reporting
&lt;/h4&gt;

&lt;p&gt;Key indicators are embedded directly in the repository README using automated scripts. This provides immediate visibility without external tooling.&lt;/p&gt;

&lt;h4&gt;
  
  
  6.3 Degradation Detection
&lt;/h4&gt;

&lt;p&gt;Performance drops exceeding predefined thresholds trigger automated warnings. These appear consistently across all reporting interfaces.&lt;/p&gt;

&lt;h4&gt;
  
  
  7. Security and Credential Management
&lt;/h4&gt;

&lt;p&gt;The system enforces strict separation of concerns:&lt;/p&gt;

&lt;p&gt;• API credentials stored exclusively in CI secrets&lt;/p&gt;

&lt;p&gt;• No client-side data requests&lt;/p&gt;

&lt;p&gt;• No credentials on edge devices&lt;/p&gt;

&lt;p&gt;• No hard-coded locations&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This design prevents credential leakage and unauthorized usage.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  8. Edge Deployment Architecture
&lt;/h4&gt;

&lt;h4&gt;
  
  
  8.1 Hardware Platform
&lt;/h4&gt;

&lt;p&gt;The control system operates on a Raspberry Pi connected to:&lt;/p&gt;

&lt;p&gt;• Servo motor or relay module&lt;/p&gt;

&lt;p&gt;• Manual override button&lt;/p&gt;

&lt;p&gt;• Power management circuitry&lt;/p&gt;

&lt;h4&gt;
  
  
  8.2 Edge Inference
&lt;/h4&gt;

&lt;p&gt;The device periodically retrieves the latest model artifacts and executes local inference. Predictions are generated without network dependency.&lt;/p&gt;

&lt;h4&gt;
  
  
  8.3 Control Logic
&lt;/h4&gt;

&lt;p&gt;The actuation policy incorporates hysteresis and state persistence:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Condition&lt;/th&gt;&lt;th&gt;Action&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Probability ≥ threshold&lt;/td&gt;&lt;td&gt;Close shutters&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Probability ≤ safe margin&lt;/td&gt;&lt;td&gt;Open shutters&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Manual override&lt;/td&gt;&lt;td&gt;Immediate release&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;State information is persisted to enable recovery after power loss.&lt;/em&gt;&lt;/p&gt;
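The policy above is a classic hysteresis band, sketched below in Python (the 0.6/0.3 thresholds are illustrative; `operator.ge`/`operator.le` spell out the at-or-above and at-or-below comparisons):

```python
from operator import ge, le

def decide(prob, shutters_closed, close_at=0.6, open_at=0.3):
    """Hysteresis policy; persisting the state is left to the caller."""
    if ge(prob, close_at):       # probability at or above close threshold
        return True              # close (or keep closed)
    if le(prob, open_at):        # probability at or below safe margin
        return False             # open (or keep open)
    return shutters_closed      # inside the band: keep current state
```

Persisting the returned state, e.g. to a small file, is what enables recovery after power loss.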

&lt;h4&gt;
  
  
  9. Software Packaging
&lt;/h4&gt;

&lt;p&gt;To facilitate reuse, the inference subsystem is distributed as an npm package.&lt;/p&gt;

&lt;p&gt;You can find it here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.npmjs.com/package/weather-ml-edge" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/weather-ml-edge&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The package provides:&lt;/p&gt;

&lt;p&gt;• Secure model loading&lt;/p&gt;

&lt;p&gt;• Prediction interfaces&lt;/p&gt;

&lt;p&gt;• Configuration management&lt;/p&gt;

&lt;p&gt;&lt;em&gt;No network access or credential handling is included.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  10. System Integration
&lt;/h4&gt;

&lt;p&gt;The complete operational pipeline is summarized as:&lt;/p&gt;

&lt;p&gt;Data API → CI Pipeline → Training → Validation → Deployment → Edge Inference → Actuation&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This closed-loop architecture ensures long-term autonomy.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  11. Evaluation and Results
&lt;/h4&gt;

&lt;p&gt;Over extended operation, the system demonstrated:&lt;/p&gt;

&lt;p&gt;• Stable ROC-AUC &amp;gt; 0.90&lt;/p&gt;

&lt;p&gt;• Consistent PR-AUC &amp;gt; 0.80&lt;/p&gt;

&lt;p&gt;• Low false positive rates&lt;/p&gt;

&lt;p&gt;• Robust offline performance&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Automated shutter actuation reliably preceded rainfall events in most observed cases.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  12. Lessons Learned
&lt;/h4&gt;

&lt;h4&gt;
  
  
  12.1 Importance of Automation
&lt;/h4&gt;

&lt;p&gt;Sustained ML systems require continuous retraining, validation, and deployment. Manual pipelines are not scalable.&lt;/p&gt;

&lt;h4&gt;
  
  
  12.2 Engineering Over Algorithms
&lt;/h4&gt;

&lt;p&gt;System reliability was primarily determined by pipeline design rather than model complexity.&lt;/p&gt;

&lt;h4&gt;
  
  
  12.3 Security by Architecture
&lt;/h4&gt;

&lt;p&gt;Credential isolation must be embedded at the system level, not added retroactively.&lt;/p&gt;

&lt;h4&gt;
  
  
  12.4 Edge Intelligence
&lt;/h4&gt;

&lt;p&gt;Local inference enables resilient operation independent of cloud availability.&lt;/p&gt;

&lt;h4&gt;
  
  
  13. Future Work
&lt;/h4&gt;

&lt;p&gt;Planned extensions include:&lt;/p&gt;

&lt;p&gt;• Integration of physical rain sensors&lt;/p&gt;

&lt;p&gt;• Multi-location modeling&lt;/p&gt;

&lt;p&gt;• Adaptive thresholding&lt;/p&gt;

&lt;p&gt;• Energy-aware scheduling&lt;/p&gt;

&lt;p&gt;• Seasonal ensemble models&lt;/p&gt;

&lt;p&gt;• Mobile notification interfaces&lt;/p&gt;

&lt;h4&gt;
  
  
  14. Conclusion
&lt;/h4&gt;

&lt;p&gt;This work demonstrates that robust, production-grade machine learning systems can be constructed using lightweight tools and disciplined engineering practices. By combining automated retraining, secure deployment, and edge-based inference, the system achieves long-term autonomy in a resource-constrained environment.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The presented architecture is applicable to a wide range of cyber-physical systems requiring predictive control under uncertainty.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Acknowledgements
&lt;/h4&gt;

&lt;p&gt;The author acknowledges the open-source community and the maintainers of Python, scikit-learn, GitHub Actions, and Visual Crossing for enabling this work.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;References
[1] rotsl, "Weather-ML: Automated Rain Forecasting and Edge Control System," GitHub repository. Available: https://github.com/rotsl/weather-ml. 
[2] rotsl, "Weather-ML Public Monitoring Dashboard," GitHub Pages. Available: https://rotsl.github.io/weather-ml/. 
[3] Visual Crossing Corporation, "Weather API - Timeline Endpoint," Visual Crossing Weather. Available: https://www.visualcrossing.com/weather-api. 
[4] Visual Crossing Corporation, "Weather API Documentation," Visual Crossing Resources. Available: https://www.visualcrossing.com/resources/documentation/weather-api.
[5] GitHub, "GitHub Actions Documentation," GitHub Docs. Available: https://docs.github.com/en/actions.
[6] rotsl, "NPM Package," npm. Available: https://www.npmjs.com/package/weather-ml-edge.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>machinelearningops</category>
      <category>edgecomputing</category>
      <category>weatherforecasts</category>
      <category>cyberphysicalsystems</category>
    </item>
    <item>
      <title>Wisp: The 5KB UI Engine That Reads Your HTML Like a Human</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Sun, 15 Feb 2026 11:44:21 +0000</pubDate>
      <link>https://forem.com/rotsl/wisp-the-5kb-ui-engine-that-reads-your-html-like-a-human-338m</link>
      <guid>https://forem.com/rotsl/wisp-the-5kb-ui-engine-that-reads-your-html-like-a-human-338m</guid>
      <description>&lt;p&gt;&lt;em&gt;How I built a context-aware styling engine that eliminates CSS classes entirely&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39p8n94cmn0bfrjhzp4s.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39p8n94cmn0bfrjhzp4s.jpeg" width="360" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  The Problem with Modern CSS
&lt;/h4&gt;

&lt;p&gt;I’ve been building web interfaces for years, and I’ve grown tired of the same false choice: elegant but static classless frameworks (Pico, Water.css) versus powerful but verbose utility-first systems (Tailwind).&lt;/p&gt;

&lt;p&gt;Classless frameworks are beautiful. Drop in a CSS file and your semantic HTML looks great. But they don’t adapt. A blog post and a data dashboard get the same treatment, even though they need radically different spacing, typography, and reading ergonomics.&lt;/p&gt;

&lt;p&gt;Utility-first frameworks adapt, but at what cost? Your HTML becomes unreadable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"flex flex-col md:flex-row justify-between items-center p-4 md:p-6 lg:p-8 bg-white dark:bg-gray-900 rounded-lg shadow-md hover:shadow-lg transition-all duration-200"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;That’s not HTML. That’s CSS wearing an HTML costume.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What If Your CSS Could Think?
&lt;/h3&gt;

&lt;p&gt;I built Wisp to solve this. It’s a 5KB engine that analyzes your HTML structure and automatically generates optimized CSS. No classes. No configuration. No build step.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Here’s how it works:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;!DOCTYPE html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;head&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"stylesheet"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"https://cdn.jsdelivr.net/npm/@rotsl/wisp@latest/dist/wisp.min.css"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/head&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;body&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;main&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;My Blog Post&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;Some content here...&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;More paragraphs...&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/main&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"https://cdn.jsdelivr.net/npm/@rotsl/wisp@latest/dist/wisp.min.js"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wisp scans the DOM, detects this is narrative content (mostly paragraphs), and injects CSS optimized for reading:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Line height: 1.7 (comfortable reading)&lt;/li&gt;
&lt;li&gt;Max width: 65ch (optimal line length)&lt;/li&gt;
&lt;li&gt;Font size: 1.125rem (slightly larger for readability)&lt;/li&gt;
&lt;li&gt;Spacing: Generous margins between paragraphs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Change the HTML to a dashboard:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;main&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;article&amp;gt;&lt;/span&gt;Metric 1&lt;span class="nt"&gt;&amp;lt;/article&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;article&amp;gt;&lt;/span&gt;Metric 2&lt;span class="nt"&gt;&amp;lt;/article&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;article&amp;gt;&lt;/span&gt;Metric 3&lt;span class="nt"&gt;&amp;lt;/article&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;article&amp;gt;&lt;/span&gt;Metric 4&lt;span class="nt"&gt;&amp;lt;/article&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;table&amp;gt;&lt;/span&gt;...&lt;span class="nt"&gt;&amp;lt;/table&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/main&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wisp detects dashboard context and switches to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compact spacing (0.5rem units)&lt;/li&gt;
&lt;li&gt;Full-width layout&lt;/li&gt;
&lt;li&gt;Smaller font (0.875rem)&lt;/li&gt;
&lt;li&gt;Tighter line height (1.4)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same HTML structure, entirely different presentation. No classes changed. No configuration files.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Algorithm
&lt;/h3&gt;

&lt;p&gt;Wisp uses three metrics to classify content:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Density (ρ): Text-to-element ratio&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ρ = min((text_length / element_count) / 500, 1.0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pattern (π): Element type distribution&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prose (paragraph-heavy)&lt;/li&gt;
&lt;li&gt;Structured (list-heavy)&lt;/li&gt;
&lt;li&gt;Technical (code-heavy)&lt;/li&gt;
&lt;li&gt;Navigational (heading-heavy)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Depth (δ): Maximum DOM nesting level&lt;/p&gt;

&lt;p&gt;These combine into four contexts:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbjo1kds1mltu17u5gprm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbjo1kds1mltu17u5gprm.png" width="782" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The entire analysis runs in &amp;lt;1ms for typical pages.&lt;/em&gt;&lt;/p&gt;
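The density metric alone already separates the extremes. An illustrative re-implementation (the two cut-off values and the mapping to context names below are assumptions for demonstration, not Wisp's actual tuning):

```python
import bisect

def density_context(text_length, element_count):
    """rho = min((text_length / element_count) / 500, 1.0), bucketed
    into a context via two illustrative cut-offs (0.1 and 0.2)."""
    rho = min((text_length / max(element_count, 1)) / 500, 1.0)
    contexts = ["dashboard", "structured", "narrative"]
    return rho, contexts[bisect.bisect([0.1, 0.2], rho)]
```

A page averaging 125 characters of text per element scores rho = 0.25 and lands in the narrative bucket, consistent with the Wikipedia test described in the next section.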

&lt;h3&gt;
  
  
  Real-World Results
&lt;/h3&gt;

&lt;p&gt;I tested Wisp on Wikipedia’s “Wiki” article. The engine:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Detected narrative context (prose density: 0.25)&lt;/li&gt;
&lt;li&gt;Generated reading-optimized CSS (1.7 line-height, 65ch width)&lt;/li&gt;
&lt;li&gt;Added accessibility enhancements (skip link for deep nesting)&lt;/li&gt;
&lt;li&gt;Reduced specificity conflicts by 40% vs. Wikipedia’s default stylesheet&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The result is live at &lt;a href="https://rotsl.github.io/wisp/" rel="noopener noreferrer"&gt;rotsl.github.io/wisp&lt;/a&gt; — an interactive dashboard showing real-time analysis, benchmarks, and framework comparisons.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;

&lt;p&gt;Wisp is implemented in two languages:&lt;/p&gt;

&lt;p&gt;Python core (src/core/scanner.py) powers the CLI for static generation and batch processing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./wisp-fetch https://en.wikipedia.org/wiki/Wiki &lt;span class="nt"&gt;--open&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;JavaScript runtime (src/core/wisp.js, 2KB) handles browser-side enhancement with zero dependencies.&lt;/p&gt;

&lt;p&gt;Both use the same algorithm, ensuring consistency between build-time and runtime.&lt;/p&gt;

&lt;h3&gt;
  
  
  Installation &amp;amp; Usage
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;npm&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;npm&lt;/span&gt; &lt;span class="nx"&gt;install&lt;/span&gt; &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;rotsl&lt;/span&gt;&lt;span class="sr"&gt;/wis&lt;/span&gt;&lt;span class="err"&gt;p
&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Wisp&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@rotsl/wisp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@rotsl/wisp/dist/wisp.min.css&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;CDN&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"stylesheet"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"https://cdn.jsdelivr.net/npm/@rotsl/wisp@latest/dist/wisp.min.css"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"https://cdn.jsdelivr.net/npm/@rotsl/wisp@latest/dist/wisp.min.js"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Python CLI&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/rotsl/wisp.git
&lt;span class="nb"&gt;cd &lt;/span&gt;wisp
python3 &lt;span class="nt"&gt;-m&lt;/span&gt; venv venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source &lt;/span&gt;venv/bin/activate
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
./wisp-fetch https://example.com &lt;span class="nt"&gt;--open&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Framework Integration
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;React&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useEffect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Wisp&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@rotsl/wisp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@rotsl/wisp/dist/wisp.min.css&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;App&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Wisp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scan&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;main&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;...&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;main&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Vue&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight vue"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;script&lt;/span&gt; &lt;span class="na"&gt;setup&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;onMounted&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;vue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Wisp&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@rotsl/wisp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@rotsl/wisp/dist/wisp.min.css&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nf"&gt;onMounted&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;Wisp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scan&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="k"&gt;script&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Philosophy
&lt;/h3&gt;

&lt;p&gt;Wisp embodies progressive enhancement:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Base CSS works without JavaScript — functional styling for all&lt;/li&gt;
&lt;li&gt;Runtime enhances when available — context detection, accessibility features&lt;/li&gt;
&lt;li&gt;Respects user preferences — prefers-reduced-motion, prefers-color-scheme&lt;/li&gt;
&lt;li&gt;Semantic HTML first — if it’s structured correctly, it looks good&lt;/li&gt;
&lt;/ol&gt;
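&lt;p&gt;Points 2 and 3 can be sketched as a small pure function. This is an illustration of the idea, not Wisp's actual implementation; in a browser the inputs would come from &lt;code&gt;window.matchMedia&lt;/code&gt; queries such as &lt;code&gt;(prefers-reduced-motion: reduce)&lt;/code&gt; and &lt;code&gt;(prefers-color-scheme: dark)&lt;/code&gt;.&lt;/p&gt;

```javascript
// Sketch: map user media preferences to enhancement settings.
// In a browser runtime the flags would come from matchMedia, e.g.
// window.matchMedia('(prefers-reduced-motion: reduce)').matches.
function resolvePreferences({ reducedMotion = false, colorScheme = 'light' } = {}) {
  return {
    // Disable transitions entirely when the user asks for reduced motion.
    transitionDuration: reducedMotion ? '0s' : '200ms',
    // Pick a palette matching the OS-level color scheme.
    theme: colorScheme === 'dark' ? 'dark' : 'light',
  };
}

console.log(resolvePreferences({ reducedMotion: true, colorScheme: 'dark' }));
// { transitionDuration: '0s', theme: 'dark' }
```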

&lt;p&gt;This aligns with the original vision of the web: documents that adapt to their content, not frameworks that demand specific markup patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benchmarks&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zt0pvzjbeuifh9gmwkv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zt0pvzjbeuifh9gmwkv.png" alt="Benchmark comparison chart" width="670" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;*Tailwind requires a build step; purged CSS size varies by project&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Research &amp;amp; Publications
&lt;/h3&gt;

&lt;p&gt;Wisp is documented in a formal research paper: &lt;a href="https://www.researchgate.net/publication/400789148_Wisp_A_Context-Aware_Zero-Dependency_Semantic_Styling_Engine_for_the_Modern_Web" rel="noopener noreferrer"&gt;Wisp: A Context-Aware, Zero-Dependency Semantic Styling Engine for the Modern Web&lt;/a&gt; (ResearchGate).&lt;/p&gt;

&lt;p&gt;The paper covers the algorithm design, comparative analysis against existing frameworks, and performance benchmarks.&lt;/p&gt;

&lt;h4&gt;
  
  
  Links &amp;amp; Resources
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Repository&lt;/strong&gt; &lt;a href="https://github.com/rotsl/wisp" rel="noopener noreferrer"&gt;github.com/rotsl/wisp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;npm Package&lt;/strong&gt; &lt;a href="https://www.npmjs.com/package/@rotsl/wisp" rel="noopener noreferrer"&gt;npmjs.com/package/@rotsl/wisp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live Dashboard&lt;/strong&gt; &lt;a href="https://rotsl.github.io/wisp/" rel="noopener noreferrer"&gt;rotsl.github.io/wisp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitLab Mirror&lt;/strong&gt; &lt;a href="https://gitlab.com/rotsl/wisp" rel="noopener noreferrer"&gt;gitlab.com/rotsl/wisp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Research Paper&lt;/strong&gt; &lt;a href="https://www.researchgate.net/publication/400789148_Wisp_A_Context-Aware_Zero-Dependency_Semantic_Styling_Engine_for_the_Modern_Web" rel="noopener noreferrer"&gt;ResearchGate&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Paste this into an HTML file:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;!DOCTYPE html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;head&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"stylesheet"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"https://cdn.jsdelivr.net/npm/@rotsl/wisp@latest/dist/wisp.min.css"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/head&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;body&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;main&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;Hello, Wisp!&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;This paragraph is automatically styled based on content analysis.&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;Add more paragraphs, a table, or form elements — watch the styling adapt.&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/main&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"https://cdn.jsdelivr.net/npm/@rotsl/wisp@latest/dist/wisp.min.js"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open it. Inspect the CSS variables. Change the HTML structure. Watch Wisp respond.&lt;/p&gt;
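&lt;p&gt;The "content analysis" at work here can be imagined as a heuristic over element counts in the document. The sketch below is only a guess at the general shape; the context names are invented for illustration, and Wisp's real detection algorithm is described in the research paper linked above.&lt;/p&gt;

```javascript
// Illustrative only: classify a page context from element counts.
// In the browser, counts would come from the live DOM, e.g.
// document.querySelectorAll('p').length.
function detectContext(counts) {
  const { p = 0, table = 0, form = 0, article = 0 } = counts;
  if (form > 0) return 'form';                 // interactive input pages
  if (table > p) return 'data';                // table-heavy, data-dense pages
  if (article > 0 || p > 5) return 'article';  // long-form reading
  return 'landing';                            // sparse, landing-style pages
}

console.log(detectContext({ p: 12, table: 1 })); // 'article'
```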

&lt;h3&gt;
  
  
  What’s Next
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Browser extension for one-click optimization of any webpage&lt;/li&gt;
&lt;li&gt;Additional contexts (e-commerce, documentation wikis, wizard interfaces)&lt;/li&gt;
&lt;li&gt;React/Vue wrapper components for component-level context detection&lt;/li&gt;
&lt;li&gt;CSS-only fallback mode for zero-JavaScript environments&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Acknowledgments
&lt;/h3&gt;

&lt;p&gt;Wisp builds on ideas from Context-Oriented Programming, semantic HTML principles, and the classless CSS movement. It stands on the shoulders of &lt;a href="https://picocss.com/" rel="noopener noreferrer"&gt;Pico CSS&lt;/a&gt;, &lt;a href="https://tailwindcss.com/" rel="noopener noreferrer"&gt;Tailwind&lt;/a&gt;, and the W3C’s HTML5 specification.&lt;/p&gt;

&lt;p&gt;The goal isn’t to replace these tools, but to offer a third path: intelligent simplicity.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Built with 💙 by&lt;/em&gt; &lt;a href="https://github.com/rotsl" rel="noopener noreferrer"&gt;&lt;em&gt;Rohan R.&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>webdev</category>
      <category>opensource</category>
      <category>javascript</category>
    </item>
  </channel>
</rss>
