<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Daniel Lenton</title>
    <description>The latest articles on Forem by Daniel Lenton (@danlenton).</description>
    <link>https://forem.com/danlenton</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1280924%2Fcf2069a1-cfac-4f19-bcbd-c2d8e3e07e25.JPG</url>
      <title>Forem: Daniel Lenton</title>
      <link>https://forem.com/danlenton</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/danlenton"/>
    <language>en</language>
    <item>
      <title>Why we built nested steerable tool loops for AI agents</title>
      <dc:creator>Daniel Lenton</dc:creator>
      <pubDate>Mon, 30 Mar 2026 20:12:10 +0000</pubDate>
      <link>https://forem.com/danlenton/why-we-built-nested-steerable-tool-loops-for-ai-agents-2f0n</link>
      <guid>https://forem.com/danlenton/why-we-built-nested-steerable-tool-loops-for-ai-agents-2f0n</guid>
      <description>&lt;p&gt;Here's something that's bothered us for the past year: the moment you ask an AI agent to do something, it disappears. You prompt, you wait, you get a result. If you realise halfway through that you forgot to mention something, you cancel and start over. If you want to know what it's doing, you can't. If something more urgent comes up, you can't pause it and come back later.&lt;/p&gt;

&lt;p&gt;This is a fundamental limitation of how agent frameworks are built. One LLM, one loop, one tool call at a time. The model picks a tool, calls it, reads the result, picks the next tool. There's no interface for the outside world to interact with the loop while it's running.&lt;/p&gt;

&lt;p&gt;We needed something different. We're building AI assistants that you onboard like new hires — share your screen, walk them through your tools, hop on a call. They need to be doing things &lt;em&gt;while you're talking to them&lt;/em&gt;. They need to handle "actually, also check train options" without starting over.&lt;/p&gt;

&lt;p&gt;So we built steerable tool loops. Today we're open-sourcing the engine under MIT: &lt;a href="https://github.com/unifyai/unity" rel="noopener noreferrer"&gt;github.com/unifyai/unity&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Every operation returns a handle
&lt;/h2&gt;

&lt;p&gt;This is the core idea. When you ask the assistant to do something, you don't get a promise that eventually resolves. You get a live handle:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;act&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Research flights to Tokyo and draft an itinerary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Twenty seconds later, while it's still working:
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;interject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Also check train options from Tokyo to Osaka&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Something urgent comes up:
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pause&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# ... deal with it ...
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Or just ask what's happening:
&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Have you found anything under $800?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ask&lt;/code&gt;, &lt;code&gt;interject&lt;/code&gt;, &lt;code&gt;pause&lt;/code&gt;, &lt;code&gt;resume&lt;/code&gt;, &lt;code&gt;stop&lt;/code&gt;. That's the interface. Every operation in the system returns one of these — from a simple contact lookup to a multi-hour task execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handles nest
&lt;/h2&gt;

&lt;p&gt;This is where it gets interesting. The assistant isn't one loop. It's a hierarchy of them.&lt;/p&gt;

&lt;p&gt;The Actor receives your request and writes a Python program that calls typed primitives — &lt;code&gt;await primitives.contacts.ask(...)&lt;/code&gt;, &lt;code&gt;await primitives.knowledge.update(...)&lt;/code&gt;. Each of those calls starts its own LLM tool loop inside the relevant manager, which returns its own handle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;handle.pause()
 │
 ▼
Actor (pauses)
 ├── ContactManager.ask (pauses)
 │    └── inner search operation (pauses)
 └── KnowledgeManager.update (pauses)
      └── inner write operation (pauses)

All layers pause. Resume propagates the same way. So does stop.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can steer a complex multi-step operation at any depth without knowing or caring about the internal structure. Pause the whole thing, or ask a specific sub-operation what it's doing.&lt;/p&gt;
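&lt;p&gt;The fan-out itself is conceptually simple: a parent handle keeps references to its children and forwards each control signal down the tree. Here's a toy model (names and structure are illustrative, not the unity implementation):&lt;/p&gt;

```python
import asyncio


class NestedHandle:
    """Illustrative parent handle that fans control signals out to children."""

    def __init__(self, name: str):
        self.name = name
        self.children: list["NestedHandle"] = []
        self.paused = False

    def spawn(self, name: str) -> "NestedHandle":
        """Start a nested operation and keep a reference to its handle."""
        child = NestedHandle(name)
        self.children.append(child)
        return child

    async def pause(self) -> None:
        self.paused = True
        # Propagate to every nested loop concurrently.
        await asyncio.gather(*(c.pause() for c in self.children))

    async def resume(self) -> None:
        self.paused = False
        await asyncio.gather(*(c.resume() for c in self.children))
```

&lt;p&gt;Because every child exposes the same interface, one &lt;code&gt;pause()&lt;/code&gt; at the root reaches an arbitrarily deep tree of operations.&lt;/p&gt;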

&lt;h2&gt;
  
  
  What this actually enables
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Talk to your assistant while it works.&lt;/strong&gt; The system has a dual-brain architecture: a slow deliberation brain that sees the full picture and makes decisions, plus a fast real-time voice agent (on LiveKit) that handles the conversation at sub-second latency. They communicate over IPC. When the slow brain finishes a background task, it tells the fast brain to weave the results into whatever you're currently discussing. You never have to wait in silence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Redirect mid-task.&lt;/strong&gt; "Actually, don't send that email — call them instead." The interject mechanism injects new instructions into the running loop between LLM turns. If an LLM call is already in flight, it's cancelled and restarted with the interjection included. No restart, no lost context.&lt;/p&gt;
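&lt;p&gt;A toy model of the between-turns mechanics: the loop drains a queue of interjections before each turn. (The real engine also cancels and restarts an in-flight LLM call so the interjection is never missed; this sketch omits that part.)&lt;/p&gt;

```python
import asyncio


async def tool_loop(steps, interjections: asyncio.Queue):
    """Toy loop: drain pending interjections into context before each turn."""
    context = []
    for step in steps:
        # Pick up anything the user said since the last turn.
        while not interjections.empty():
            context.append(("user_interjection", interjections.get_nowait()))
        context.append(("turn", step))
        await asyncio.sleep(0)  # yield point, analogous to an LLM turn boundary


    return context


async def demo():
    q: asyncio.Queue = asyncio.Queue()

    async def steer():
        # Arrives while the loop is mid-task.
        q.put_nowait("actually, don't send that email, call them instead")

    context, _ = await asyncio.gather(
        tool_loop(["draft email", "send email"], q), steer()
    )
    return context
```

&lt;p&gt;The interjection lands in the same context as the original instructions, so nothing the loop has already learned is thrown away.&lt;/p&gt;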

&lt;p&gt;&lt;strong&gt;Run multiple things at once.&lt;/strong&gt; The conversation manager tracks concurrent in-flight actions, each with its own steerable handle. You can say "how's the flight search going?" and it routes to the right handle's &lt;code&gt;ask()&lt;/code&gt; method, while the other operations keep running.&lt;/p&gt;
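&lt;p&gt;The bookkeeping for this is essentially a map from action to handle. A hypothetical sketch:&lt;/p&gt;

```python
import asyncio


class ConversationManager:
    """Toy tracker: one steerable handle per concurrent in-flight action."""

    def __init__(self):
        self.in_flight: dict[str, object] = {}

    def track(self, name: str, handle) -> None:
        self.in_flight[name] = handle

    async def ask_action(self, name: str, question: str) -> str:
        # Route the question to the matching handle; other actions keep running.
        return await self.in_flight[name].ask(question)
```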

&lt;p&gt;&lt;strong&gt;Memory that doesn't reset.&lt;/strong&gt; Every ~50 messages, a background process extracts contacts, relationships, domain knowledge, and task commitments into structured, queryable tables. After a month, the assistant has a working model of your world — not a chat log, but typed records it can filter, join, and search.&lt;/p&gt;
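&lt;p&gt;To make "typed records, not a chat log" concrete, here is a hypothetical minimal schema; the real tables are richer, but the point is that records can be filtered and joined:&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Contact:
    name: str
    relationship: str


@dataclass
class Commitment:
    owner: str
    task: str
    due: str


# Typed records support the queries a raw transcript can't.
contacts = [Contact("Aiko", "travel agent"), Contact("Ben", "colleague")]
commitments = [Commitment("Ben", "send itinerary", "Friday")]

# "Join": which known contacts owe me something?
known = {c.name for c in contacts}
owed = [c for c in commitments if c.owner in known]
```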

&lt;h2&gt;
  
  
  The code
&lt;/h2&gt;

&lt;p&gt;The system has been in development for ~10 months. We're a YC company (Unify) and this powers our commercial product. The brain is the open-source part.&lt;/p&gt;

&lt;p&gt;If you want to see how it works, start here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/unifyai/unity/blob/staging/unity/common/async_tool_loop.py" rel="noopener noreferrer"&gt;&lt;code&gt;unity/common/async_tool_loop.py&lt;/code&gt;&lt;/a&gt;&lt;/strong&gt; — the &lt;code&gt;SteerableToolHandle&lt;/code&gt; protocol and &lt;code&gt;AsyncToolLoopHandle&lt;/code&gt; implementation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/unifyai/unity/blob/staging/unity/common/_async_tool/loop.py" rel="noopener noreferrer"&gt;&lt;code&gt;unity/common/_async_tool/loop.py&lt;/code&gt;&lt;/a&gt;&lt;/strong&gt; — the loop engine: interjections, pausing, parallel tool execution, context compression&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/unifyai/unity/blob/staging/ARCHITECTURE.md" rel="noopener noreferrer"&gt;&lt;code&gt;ARCHITECTURE.md&lt;/code&gt;&lt;/a&gt;&lt;/strong&gt; — the full technical walkthrough&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We'd genuinely appreciate feedback — what we got right, what seems over-engineered, what you'd do differently. This is a complex system and outside perspective is valuable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/unifyai/unity" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; · &lt;a href="https://youtu.be/qjSWiCd8Bq8" rel="noopener noreferrer"&gt;Launch video&lt;/a&gt; · &lt;a href="https://unify.ai" rel="noopener noreferrer"&gt;Website&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>automation</category>
      <category>architecture</category>
      <category>agents</category>
    </item>
    <item>
      <title>We Built a Dynamic Router Improving LLM Quality, Cost and Speed ✨</title>
      <dc:creator>Daniel Lenton</dc:creator>
      <pubDate>Wed, 22 May 2024 15:10:58 +0000</pubDate>
      <link>https://forem.com/danlenton/we-built-a-dynamic-router-improving-llm-quality-cost-and-speed-4dlf</link>
      <guid>https://forem.com/danlenton/we-built-a-dynamic-router-improving-llm-quality-cost-and-speed-4dlf</guid>
      <description>&lt;p&gt;&lt;strong&gt;Are you also overwhelmed by all the LLM models and providers constantly coming onto the scene?&lt;/strong&gt; To me it sometimes feels like trying to drink from a firehose, especially when it comes to aligning with my own specific task and prompts. Chosing the wrong model for your task means slower, more expensive, and less competent models, which nobody wants 🫠&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F980rhznqxi6cno3488ge.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F980rhznqxi6cno3488ge.png" alt="Image description" width="800" height="415"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Common Dilemma
&lt;/h2&gt;

&lt;p&gt;The AI landscape is cluttered with options like Llama, Gemini, GPT, and Mistral, leading to a common scenario:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fawb6tobutgmxn2bnbgi8.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fawb6tobutgmxn2bnbgi8.jpeg" alt="Image description" width="577" height="433"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Dynamic Routing with Unify ✨
&lt;/h2&gt;

&lt;p&gt;Before you roll your eyes at yet another buzzword, let me &lt;em&gt;try&lt;/em&gt; to explain what we've built in a bit more detail. Basically, with Unify, you don't have to manually test each model against your requirements or juggle multiple accounts and API keys. All models are available with a single API key, and you can easily &lt;a href="https://console.unify.ai/dashboard"&gt;benchmark your prompts&lt;/a&gt; to assess which LLMs and providers are best for &lt;em&gt;your&lt;/em&gt; own task.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo1ykqq3582tq2so1rxdu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo1ykqq3582tq2so1rxdu.gif" alt="Image description" width="1653" height="538"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Unify can also automatically &lt;a href="https://unify.ai/chat?default=true"&gt;route your prompts&lt;/a&gt; to the most suitable LLM based on your preferences for quality, speed, and cost. This means you can focus on what truly matters - building your exceptional AI-driven applications 🔥&lt;/p&gt;

&lt;p&gt;Feel free to check out a more comprehensive &lt;a href="https://youtu.be/ZpY6SIkBosE"&gt;walkthrough&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqxp0ktmmy82vplwszfva.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqxp0ktmmy82vplwszfva.png" alt="Image description" width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  So, high-level, what does Unify bring to the table?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;⚙️ &lt;strong&gt;Control&lt;/strong&gt;: Choose which models and providers you want to route to and then adjust how important quality, cost, and latency are for you. That's it; now the performance of your LLM app is fully in your hands, not the providers!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;📈 &lt;strong&gt;Self Improvement&lt;/strong&gt;: As each new model and provider comes onto the scene, sit back and watch your LLM application automatically improve over time. We quickly add support for the latest and greatest, ensuring your custom cost-quality-speed requirements are always fully optimized.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;📊 &lt;strong&gt;Observability&lt;/strong&gt;: Don't want to route? No sweat. Quickly compare all models and providers, and see which are truly the best for your own needs, on your own prompts, for your own task.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;⚖️ &lt;strong&gt;Impartiality&lt;/strong&gt;: We treat all models and providers equally, as we don't have a horse in the race. You can trust our benchmarks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🔑 &lt;strong&gt;Convenience&lt;/strong&gt;: The power of all models and providers behind a single endpoint, queryable individually or via the router, all with a single API key. 'pip install unifyai', and away you go!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🧑‍💻 &lt;strong&gt;Focus&lt;/strong&gt;: Don't stress updating the model and provider every few weeks. Just specify your performance needs and get back to building great AI products. We'll handle the rest for you!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4lsbezswfv7sabyf0iq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4lsbezswfv7sabyf0iq.png" alt="Image description" width="500" height="500"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started is a Breeze:
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install unifyai

from unify import Unify

unify = Unify(
    api_key=("UNIFY_KEY"),
    endpoint="router@q:1",
)

response = unify.generate(user_prompt="Hello there")

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's that simple 👌&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnjdf2v4s15gs87vtc6g4.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnjdf2v4s15gs87vtc6g4.gif" alt="Image description" width="498" height="413"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why We Think You'll Like Unify:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;🎨 &lt;strong&gt;Focus on Development&lt;/strong&gt;: Spend more time creating and less time worrying about finding the most appropriate LLM.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;⚙️ &lt;strong&gt;Adaptive and Efficient&lt;/strong&gt;: Your app will self-improve as you automatically &lt;a href="https://console.unify.ai/dashboard"&gt;benchmark&lt;/a&gt; each new LLM on your own prompts and for your own task, enabling you to quickly integrate the latest and greatest LLMs into your workflow.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;⚖️ &lt;strong&gt;Quality, Cost and Speed&lt;/strong&gt;: These are the three pillars for all LLMs. Unify's &lt;a href="//unify.ai/chat?default=true"&gt;router&lt;/a&gt; ensures you never have to compromise on any of them.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;a href="https://console.unify.ai/"&gt;Every signup comes with $50 free credit to get you started!&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76d2tbfjou30f226g0rt.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76d2tbfjou30f226g0rt.gif" alt="Image description" width="498" height="373"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
