<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Endogen</title>
    <description>The latest articles on Forem by Endogen (@endogen).</description>
    <link>https://forem.com/endogen</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F670098%2Fe7c2dd75-e6e7-4216-a03d-20990bf63d6f.png</url>
      <title>Forem: Endogen</title>
      <link>https://forem.com/endogen</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/endogen"/>
    <language>en</language>
    <item>
      <title>Using Claude Code With NVIDIA Build’s Free Models</title>
      <dc:creator>Endogen</dc:creator>
      <pubDate>Sun, 26 Apr 2026 21:36:12 +0000</pubDate>
      <link>https://forem.com/endogen/using-claude-code-with-nvidia-builds-free-models-10j5</link>
      <guid>https://forem.com/endogen/using-claude-code-with-nvidia-builds-free-models-10j5</guid>
      <description>&lt;p&gt;I like Claude Code a lot.  &lt;/p&gt;

&lt;p&gt;Not because it always picks the perfect model, and not because every answer is magical, but because the workflow is good. It feels fast, focused, and genuinely useful for day-to-day coding. The catch, of course, is that Claude Code normally assumes you’re plugged into Anthropic’s own API.  &lt;/p&gt;

&lt;p&gt;But there’s a clever workaround.  &lt;/p&gt;

&lt;p&gt;If all you want is the &lt;strong&gt;Claude Code interface&lt;/strong&gt;—the CLI, the editor integration, the overall UX—you can keep that frontend and swap out the backend model. One of the more interesting ways to do that right now is with &lt;strong&gt;NVIDIA Build&lt;/strong&gt;, which offers a catalog of hosted models and free serverless endpoints for development.  &lt;/p&gt;

&lt;p&gt;The glue between those two worlds is an open-source project called &lt;a href="https://github.com/Alishahryar1/free-claude-code" rel="noopener noreferrer"&gt;&lt;code&gt;free-claude-code&lt;/code&gt;&lt;/a&gt;.  &lt;/p&gt;

&lt;p&gt;This post walks through what the setup actually is, why it’s interesting, and how to get it running.  &lt;/p&gt;

&lt;h2&gt;
  
  
  What this actually means
&lt;/h2&gt;

&lt;p&gt;Let’s clear up the most important point first:  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This does &lt;strong&gt;not&lt;/strong&gt; give you Anthropic’s Claude models for free.  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What it does give you is a way to use &lt;strong&gt;Claude Code as the client&lt;/strong&gt; while routing requests to a different model provider behind the scenes.  &lt;/p&gt;

&lt;p&gt;In this case, that provider is &lt;strong&gt;NVIDIA Build / NVIDIA NIM&lt;/strong&gt;.  &lt;/p&gt;

&lt;p&gt;So the setup looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code -&amp;gt; local compatibility proxy -&amp;gt; NVIDIA-hosted model
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That distinction matters. If you publish this as “free Claude,” people will feel misled. If you publish it as “use Claude Code with NVIDIA’s free models,” that’s accurate—and honestly, still pretty compelling.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is interesting
&lt;/h2&gt;

&lt;p&gt;There are really two separate things people often bundle together:  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The model itself&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The interface used to work with the model&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Claude Code is both a model ecosystem and a very polished coding interface. The neat trick here is that you can separate those concerns.  &lt;/p&gt;

&lt;p&gt;If you enjoy the Claude Code UX, but you want to experiment with lower-cost or free hosted models, this setup gives you that option.  &lt;/p&gt;

&lt;p&gt;And NVIDIA Build is a strong fit for that kind of experimentation because it already exposes a large public model catalog, including a set of free serverless endpoints.  &lt;/p&gt;

&lt;h2&gt;
  
  
  The two pieces you need
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. An NVIDIA Build account and API key
&lt;/h3&gt;

&lt;p&gt;Start here:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://build.nvidia.com/models" rel="noopener noreferrer"&gt;NVIDIA Build model catalog&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://build.nvidia.com/settings/api-keys" rel="noopener noreferrer"&gt;NVIDIA Build API keys&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Create an account, go through NVIDIA’s developer sign-in flow, and generate an API key.  &lt;/p&gt;

&lt;p&gt;That key is what the proxy will use to talk to NVIDIA’s hosted model endpoints.  &lt;/p&gt;
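&lt;p&gt;Before wiring up the proxy, it can be worth checking the key on its own. Here is a minimal sketch, assuming NVIDIA Build follows the usual OpenAI-style &lt;code&gt;/v1/models&lt;/code&gt; listing route with bearer-token auth:&lt;br&gt;
&lt;/p&gt;

```python
import urllib.request

NVIDIA_BASE = "https://integrate.api.nvidia.com/v1"

def build_models_request(api_key):
    """Build a GET request for the model catalog listing.

    The "/models" route and bearer auth are assumptions based on the
    endpoint's OpenAI-compatible API surface.
    """
    return urllib.request.Request(
        NVIDIA_BASE + "/models",
        headers={"Authorization": "Bearer " + api_key},
    )

# To actually check a key (network call):
#   with urllib.request.urlopen(build_models_request("your_key_here")) as resp:
#       print(resp.status)  # 200 means the key is accepted
```

&lt;p&gt;If that call succeeds, the key itself is fine, and any later failure is somewhere in the proxy configuration instead.&lt;/p&gt;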

&lt;h3&gt;
  
  
  2. The &lt;code&gt;free-claude-code&lt;/code&gt; proxy
&lt;/h3&gt;

&lt;p&gt;The project lives here:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/Alishahryar1/free-claude-code" rel="noopener noreferrer"&gt;Alishahryar1/free-claude-code&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What it does is simple in principle:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it exposes an &lt;strong&gt;Anthropic-compatible API surface&lt;/strong&gt; locally
&lt;/li&gt;
&lt;li&gt;Claude Code points at that local server
&lt;/li&gt;
&lt;li&gt;the proxy translates and forwards those requests to another provider
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The project supports several providers, but for this post the NVIDIA path is the one that matters.  &lt;/p&gt;
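&lt;p&gt;Conceptually, that translation step is a reshaping of request bodies. Here is a simplified sketch of the idea — not the project’s actual code; a real proxy also has to handle tool use, streaming, and structured content blocks:&lt;br&gt;
&lt;/p&gt;

```python
def anthropic_to_openai(payload, target_model):
    """Reshape an Anthropic Messages request into an OpenAI-style
    chat completion body (skeleton only, for illustration)."""
    messages = []
    system = payload.get("system")
    if system:
        # Anthropic keeps the system prompt outside the message list;
        # OpenAI-style APIs expect it as the first message
        messages.append({"role": "system", "content": system})
    messages.extend(payload.get("messages", []))
    return {
        "model": target_model,
        "messages": messages,
        "max_tokens": payload.get("max_tokens", 1024),
    }
```

&lt;p&gt;The reverse direction — reshaping the provider’s response back into what Claude Code expects — is the other half of the proxy’s job.&lt;/p&gt;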

&lt;h2&gt;
  
  
  How the NVIDIA route works
&lt;/h2&gt;

&lt;p&gt;Inside the project, NVIDIA is treated as a first-class provider.  &lt;/p&gt;

&lt;p&gt;The relevant bits are straightforward:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the provider name is &lt;code&gt;nvidia_nim&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;the API key variable is &lt;code&gt;NVIDIA_NIM_API_KEY&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;requests go to:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://integrate.api.nvidia.com/v1  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sample configuration in the repo currently defaults to this model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MODEL="nvidia_nim/z-ai/glm4.7"  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That detail matters: the project isn’t just “NVIDIA-compatible” in theory. It ships with a concrete NVIDIA-backed model configuration out of the box.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Installing the proxy
&lt;/h2&gt;

&lt;p&gt;There are two ways to install it.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Option 1: Clone the repo
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Alishahryar1/free-claude-code.git  
&lt;span class="nb"&gt;cd &lt;/span&gt;free-claude-code  
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option 2: Install it as a tool
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv tool &lt;span class="nb"&gt;install &lt;/span&gt;git+https://github.com/Alishahryar1/free-claude-code.git  
fcc-init  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;fcc-init&lt;/code&gt; command creates a config file at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~/.config/free-claude-code/.env  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One thing to note: the current project configuration requires &lt;strong&gt;Python 3.14+&lt;/strong&gt;.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Configuring it for NVIDIA Build
&lt;/h2&gt;

&lt;p&gt;Open the generated &lt;code&gt;.env&lt;/code&gt; file and set at least these values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NVIDIA_NIM_API_KEY=your_nvidia_key_here  
MODEL="nvidia_nim/z-ai/glm4.7"  
VOICE_NOTE_ENABLED=false  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few details are worth knowing:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the model value must include the provider prefix
&lt;/li&gt;
&lt;li&gt;for NVIDIA, that prefix is &lt;code&gt;nvidia_nim/...&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;the repo can also map different backends to Opus, Sonnet, and Haiku-style requests, but you can ignore that at first and just set &lt;code&gt;MODEL&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s enough for a simple initial setup.  &lt;/p&gt;
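&lt;p&gt;If you want a quick sanity check on that file, a small sketch like this catches the most common mistake — a missing provider prefix. These are hypothetical helpers for illustration, not part of the project:&lt;br&gt;
&lt;/p&gt;

```python
def parse_env(text):
    """Minimal KEY=value parser (quotes stripped); not a full dotenv
    implementation — no export statements, no multiline values."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env

def model_has_provider_prefix(env):
    # NVIDIA-backed models must be addressed as nvidia_nim/...
    return env.get("MODEL", "").startswith("nvidia_nim/")
```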

&lt;h2&gt;
  
  
  Starting the local proxy
&lt;/h2&gt;

&lt;p&gt;If you cloned the repo, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run uvicorn server:app &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--port&lt;/span&gt; 8082  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you installed it as a tool, you can usually just run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;free-claude-code  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At that point you have a local server that looks enough like Anthropic’s API for Claude Code to use it.  &lt;/p&gt;
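&lt;p&gt;You can sanity-check that server before involving Claude Code at all. Here is a sketch that builds an Anthropic-style &lt;code&gt;/v1/messages&lt;/code&gt; request aimed at the proxy — the model name and token are placeholders, and whether the proxy checks the token depends on its configuration:&lt;br&gt;
&lt;/p&gt;

```python
import json
import urllib.request

PROXY_BASE = "http://localhost:8082"

def build_smoke_request():
    """Build a minimal Anthropic-style Messages request against the
    local proxy.  "/v1/messages" is the standard Anthropic route that
    Claude Code calls; the proxy holds the real NVIDIA key itself, so
    the token here is just a placeholder.
    """
    body = {
        "model": "placeholder-model",  # the proxy maps requests to the configured backend
        "max_tokens": 128,
        "messages": [{"role": "user", "content": "Say hello."}],
    }
    return urllib.request.Request(
        PROXY_BASE + "/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": "freecc"},
        method="POST",
    )

# Send it with urllib.request.urlopen(build_smoke_request()) while the
# proxy is running; a JSON response back means the chain is alive.
```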

&lt;h2&gt;
  
  
  Pointing Claude Code at the proxy
&lt;/h2&gt;

&lt;p&gt;This is the key handoff.  &lt;/p&gt;

&lt;p&gt;Launch Claude Code like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ANTHROPIC_AUTH_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"freecc"&lt;/span&gt; &lt;span class="nv"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:8082"&lt;/span&gt; claude  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The subtle but important detail is the base URL.  &lt;/p&gt;

&lt;p&gt;It should point to the proxy root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:8082  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;—not to &lt;code&gt;/v1&lt;/code&gt;.  &lt;/p&gt;

&lt;p&gt;That small detail is easy to get wrong, and if you do, the whole setup feels broken for no obvious reason.  &lt;/p&gt;

&lt;h2&gt;
  
  
  What you get out of it
&lt;/h2&gt;

&lt;p&gt;If everything is set up correctly, the result is pretty nice:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you keep the &lt;strong&gt;Claude Code workflow&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;you use &lt;strong&gt;NVIDIA-hosted models&lt;/strong&gt; underneath
&lt;/li&gt;
&lt;li&gt;you don’t need an Anthropic API key for the model calls themselves
&lt;/li&gt;
&lt;li&gt;you can experiment without immediately committing to another paid API bill
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That makes this a good fit for people who:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;like Claude Code’s UX
&lt;/li&gt;
&lt;li&gt;want to try coding with alternative models
&lt;/li&gt;
&lt;li&gt;already have an NVIDIA Build account
&lt;/li&gt;
&lt;li&gt;want a lower-cost or free development setup
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What to expect in practice
&lt;/h2&gt;

&lt;p&gt;This is where expectations matter.  &lt;/p&gt;

&lt;p&gt;Using Claude Code with a non-Claude model is a bit like putting a different engine in a familiar car. The dashboard still looks the same, the steering wheel is where you expect it to be, but the feel changes.  &lt;/p&gt;

&lt;p&gt;Some models will be surprisingly good.&lt;br&gt;&lt;br&gt;
Some will be worse at tool use.&lt;br&gt;&lt;br&gt;
Some will feel faster.&lt;br&gt;&lt;br&gt;
Some will be noticeably less consistent.  &lt;/p&gt;

&lt;p&gt;That’s not a flaw in the proxy—it’s just the reality of using a frontend designed around one ecosystem with models from another.  &lt;/p&gt;

&lt;p&gt;So the right expectation is not:  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I now have free Claude.”  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The right expectation is:  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I now have Claude Code’s interface connected to a different model provider.”  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s still useful. It’s just a different claim.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Why I think this matters
&lt;/h2&gt;

&lt;p&gt;We’re heading toward a world where the interface layer and the model layer are increasingly interchangeable.  &lt;/p&gt;

&lt;p&gt;That’s good news.  &lt;/p&gt;

&lt;p&gt;It means tools people genuinely enjoy using don’t have to stay locked to a single backend forever. If you like a workflow, you should be able to keep it and swap the model depending on cost, speed, quality, or availability.  &lt;/p&gt;

&lt;p&gt;That’s exactly why projects like &lt;code&gt;free-claude-code&lt;/code&gt; are interesting.  &lt;/p&gt;

&lt;p&gt;They make the model layer more replaceable.  &lt;/p&gt;

&lt;p&gt;And NVIDIA Build makes that especially practical because it lowers the barrier to trying a bunch of hosted models without having to build your own inference setup first.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;I wouldn’t pitch this as a magic loophole.  &lt;/p&gt;

&lt;p&gt;I’d pitch it as something more honest—and more useful:  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;a practical way to use &lt;strong&gt;Claude Code as a frontend&lt;/strong&gt; for &lt;strong&gt;NVIDIA Build’s free hosted models&lt;/strong&gt;.  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you already enjoy Claude Code, that’s worth trying.  &lt;/p&gt;

&lt;p&gt;And even if you end up going back to Anthropic’s native stack later, this setup is a nice reminder that the future probably belongs to tools that treat model providers as swappable infrastructure rather than fixed destiny.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://build.nvidia.com/models" rel="noopener noreferrer"&gt;NVIDIA Build models&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://build.nvidia.com/settings/api-keys" rel="noopener noreferrer"&gt;NVIDIA Build API keys&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Alishahryar1/free-claude-code" rel="noopener noreferrer"&gt;free-claude-code on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>nvidia</category>
      <category>free</category>
    </item>
    <item>
      <title>Building a Telegram Bot for Allen AI's Open-Source Models</title>
      <dc:creator>Endogen</dc:creator>
      <pubDate>Sat, 07 Mar 2026 01:33:48 +0000</pubDate>
      <link>https://forem.com/endogen/building-a-telegram-bot-for-allen-ais-open-source-models-68c</link>
      <guid>https://forem.com/endogen/building-a-telegram-bot-for-allen-ais-open-source-models-68c</guid>
      <description>&lt;p&gt;I wanted a Telegram bot that lets me chat with &lt;a href="https://allenai.org/" rel="noopener noreferrer"&gt;Allen AI's&lt;/a&gt; open-source language models — OLMo, Tülu, and Molmo 2 — without running any models locally. No GPU, no inference server, just a lightweight Python bot that talks to Allen AI's free public playground API.&lt;/p&gt;

&lt;p&gt;The result is &lt;a href="https://github.com/Endogen/olmo-bot" rel="noopener noreferrer"&gt;OLMo Bot&lt;/a&gt;, and it ended up with more capabilities than I initially planned: multi-model switching, web search, vision, and even visual object pointing with annotated image overlays.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting to Allen AI
&lt;/h2&gt;

&lt;p&gt;Allen AI runs a &lt;a href="https://playground.allenai.org/" rel="noopener noreferrer"&gt;public playground&lt;/a&gt; with their latest models. There's no official API, but I built &lt;a href="https://dev.to/endogen/web2api-turning-websites-into-rest-apis-and-mcp-tools-be4"&gt;Web2API&lt;/a&gt; — a tool that turns websites into REST APIs — and created a recipe for it. The bot doesn't scrape anything itself; it just calls Web2API endpoints:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;endpoint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODELS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# e.g. "/allenai/olmo-32b"
&lt;/span&gt;    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;WEB2API_URL&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;q&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;full_prompt&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;items&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fields&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Allen AI recipe in Web2API uses a custom scraper that handles their streaming NDJSON chat API directly — no browser automation needed for this one.&lt;/p&gt;
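&lt;p&gt;For intuition, consuming an NDJSON stream boils down to parsing one JSON object per line and joining the text fragments. A sketch — the &lt;code&gt;content&lt;/code&gt; field name here is an assumption for illustration; Allen AI’s actual stream schema may differ:&lt;br&gt;
&lt;/p&gt;

```python
import json

def collect_ndjson(raw):
    """Join text fragments from a newline-delimited JSON stream.
    Each non-empty line is an independent JSON object; the "content"
    field name is an assumed example, not Allen AI's real schema."""
    chunks = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        chunks.append(event.get("content", ""))
    return "".join(chunks)
```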

&lt;h2&gt;
  
  
  Model Switching
&lt;/h2&gt;

&lt;p&gt;The bot supports six text models and two vision models, switchable per user with simple commands:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/olmo32b&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OLMo 3.1 32B Instruct (default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/think&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OLMo 3.1 32B Think (reasoning)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/olmo7b&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OLMo 3 7B Instruct&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/tulu8b&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tülu 3 8B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/tulu70b&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tülu 3 70B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/molmo2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Molmo 2 8B (vision)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/molmo2track&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Molmo 2 8B Tracking&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each user's model choice is stored in memory. Send &lt;code&gt;/think&lt;/code&gt;, and all your subsequent messages go to the reasoning model until you switch again.&lt;/p&gt;

&lt;p&gt;The Think model is particularly interesting — it's Allen AI's chain-of-thought model that shows its reasoning process, similar to what you'd get from o1 or DeepSeek R1, but fully open-source.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conversation Memory
&lt;/h2&gt;

&lt;p&gt;Memory is off by default (stateless, each message is independent) but can be toggled with &lt;code&gt;/memory&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mem_on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Build context from history
&lt;/span&gt;    &lt;span class="n"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;User&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Assistant&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;full_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When enabled, the bot maintains up to 20 turns of conversation per user. The full history is prepended to each prompt so the model has context. &lt;code&gt;/clear&lt;/code&gt; wipes it.&lt;/p&gt;
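&lt;p&gt;One simple way to get that 20-turn cap is a bounded deque per user. This is a sketch of the idea, not necessarily the bot’s exact implementation, and it assumes a “turn” means one user message plus one assistant reply:&lt;br&gt;
&lt;/p&gt;

```python
from collections import deque

MAX_TURNS = 20  # user+assistant pairs kept per user (assumed reading of "turn")

histories = {}  # user_id to deque of {"role", "text"} dicts

def remember(user_id, role, text):
    """Record a message and return the current window.  A deque with
    maxlen drops the oldest entries automatically, so the cap needs no
    explicit trimming code."""
    history = histories.setdefault(user_id, deque(maxlen=MAX_TURNS * 2))
    history.append({"role": role, "text": text})
    return list(history)
```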

&lt;h2&gt;
  
  
  Web Search via Tool Calling
&lt;/h2&gt;

&lt;p&gt;This is where Web2API's &lt;a href="https://dev.to/endogen/web2api-turning-websites-into-rest-apis-and-mcp-tools-be4"&gt;MCP bridge&lt;/a&gt; comes in. Allen AI's models support tool calling — you pass a &lt;code&gt;tools_url&lt;/code&gt; parameter pointing to a tool endpoint, and the model can decide to call those tools during generation.&lt;/p&gt;

&lt;p&gt;I configured the bot to always pass the Brave Search tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config.py
&lt;/span&gt;&lt;span class="n"&gt;DEFAULT_TOOLS_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OLMO_TOOLS_URL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://127.0.0.1:8000/mcp/only/brave-search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# bot.py — included in every text model request
&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;q&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;full_prompt&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;DEFAULT_TOOLS_URL&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;VISION_MODELS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DEFAULT_TOOLS_URL&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The flow works like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User asks "What's the weather in Berlin?"&lt;/li&gt;
&lt;li&gt;Bot sends the prompt to Web2API with &lt;code&gt;tools_url&lt;/code&gt; pointing to the Brave Search bridge&lt;/li&gt;
&lt;li&gt;Web2API's Allen AI scraper passes the tool definition to the model&lt;/li&gt;
&lt;li&gt;OLMo decides it needs current data, calls &lt;code&gt;web_search&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The scraper executes the search via the MCP bridge, feeds results back to the model&lt;/li&gt;
&lt;li&gt;OLMo generates a response incorporating the search results&lt;/li&gt;
&lt;li&gt;Bot sends the answer to the user&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The model decides autonomously whether to search — if you ask "What is 2+2?", it just answers directly. If you ask about current events, it searches. All of this happens inside Web2API's Docker container.&lt;/p&gt;

&lt;p&gt;One detail worth mentioning: the &lt;code&gt;tools_url&lt;/code&gt; points to &lt;code&gt;http://127.0.0.1:8000&lt;/code&gt; (container-internal port), not the external &lt;code&gt;8010&lt;/code&gt;. Since the Allen AI scraper runs inside the same Docker container as the MCP bridge, it can reach it on localhost without going through nginx.&lt;/p&gt;

&lt;p&gt;Vision models skip the tools parameter — Molmo 2 doesn't need web search.&lt;/p&gt;

&lt;h2&gt;
  
  
  Vision: Image and Video Analysis
&lt;/h2&gt;

&lt;p&gt;Send a photo or video to the bot with a caption, and it analyzes it using Molmo 2:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Auto-switch to molmo2 if current model doesn't support vision
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;VISION_MODELS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;molmo2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The bot downloads the file from Telegram, sends it as a multipart POST to Web2API, and returns the model's analysis. If no caption is provided, it defaults to "Describe this image in detail."&lt;/p&gt;

&lt;p&gt;The auto-switch is key for usability — you don't have to manually switch to Molmo 2 before sending a photo. Send an image on any model, and the bot temporarily uses Molmo 2 for that message, then stays on your selected text model for the next one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Point Overlay: "Show Me Where"
&lt;/h2&gt;

&lt;p&gt;This was the feature I didn't plan but couldn't resist building. Molmo 2 has a pointing capability — ask it to point at objects, and it returns coordinates in a normalized 0–1000 coordinate space:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "Point to the eyes" (with photo attached)
Molmo 2: &amp;lt;points coords="1 1 421 430 2 633 352"&amp;gt;eyes&amp;lt;/points&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response format encodes multiple points: the first point carries a two-number prefix before its x,y coordinates; each subsequent point carries a single index before its x,y pair. All values are in a 0–1000 space relative to the image dimensions.&lt;/p&gt;
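&lt;p&gt;Decoding that format is a small parsing exercise. A sketch based on the description above, scaling back from the 0–1000 space to pixel coordinates:&lt;br&gt;
&lt;/p&gt;

```python
def parse_points(coords, width, height):
    """Parse Molmo 2 point coords like "1 1 421 430 2 633 352".

    The first point carries a two-number prefix before x, y; each
    later point carries a single index before x, y.  Values are in a
    normalized 0-1000 space, scaled here to pixel coordinates.
    """
    nums = [int(t) for t in coords.split()]
    points = [(nums[2], nums[3])]  # first point: skip the two-number prefix
    for i in range(4, len(nums), 3):
        # subsequent points arrive as (index, x, y) triplets
        points.append((nums[i + 1], nums[i + 2]))
    return [(x * width / 1000.0, y * height / 1000.0) for x, y in points]
```

&lt;p&gt;With the example response above and a 1000×1000 image, this yields the two eye positions at (421, 430) and (633, 352), ready to be drawn onto the image.&lt;/p&gt;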

&lt;p&gt;The bot parses these coordinates and draws colored markers on the original image using Pillow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_make_marker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;radius&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Render an anti-aliased marker via 4× supersampling.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;sr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;radius&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;
    &lt;span class="n"&gt;marker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RGBA&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;draw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ImageDraw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# White border ring
&lt;/span&gt;    &lt;span class="n"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ellipse&lt;/span&gt;&lt;span class="p"&gt;([...],&lt;/span&gt; &lt;span class="n"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;240&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# Colored circle
&lt;/span&gt;    &lt;span class="n"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ellipse&lt;/span&gt;&lt;span class="p"&gt;([...],&lt;/span&gt; &lt;span class="n"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;230&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# Centered number label
&lt;/span&gt;    &lt;span class="n"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;font&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;anchor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Downscale for smooth anti-aliasing
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;final_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;final_size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LANCZOS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The markers are rendered at 4× resolution and downscaled with LANCZOS filtering for smooth, anti-aliased edges — no jagged circles or pixel artifacts. Each point gets a distinct color (red, blue, green, orange...) with a white border and a numbered label.&lt;/p&gt;

&lt;p&gt;The bot sends the annotated image back as a photo with a caption like "📍 eyes (2 points)". Prompts that trigger pointing include variations of "Point to...", "Find the...", "Where is the...", and "Locate the...".&lt;/p&gt;
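&lt;p&gt;A minimal trigger check could look like this (the exact phrase list in the bot may differ):&lt;/p&gt;

```python
import re

# Hypothetical sketch of the pointing-trigger check; the bot's real
# phrase list may be longer.
POINT_TRIGGERS = re.compile(
    r"^(point (to|at)|find the|where (is|are) the|locate the)\b",
    re.IGNORECASE,
)

def wants_pointing(caption: str) -> bool:
    """True when the caption asks the model to point at something."""
    return bool(POINT_TRIGGERS.match(caption.strip()))
```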

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1m7h36gy8umvrs0bhbzr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1m7h36gy8umvrs0bhbzr.png" alt=" " width="800" height="344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;

&lt;p&gt;The bot is a single &lt;code&gt;bot.py&lt;/code&gt; file plus a config and the pointing module. Dependencies are minimal: &lt;code&gt;python-telegram-bot&lt;/code&gt;, &lt;code&gt;httpx&lt;/code&gt;, and &lt;code&gt;Pillow&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Endogen/olmo-bot.git
&lt;span class="nb"&gt;cd &lt;/span&gt;olmo-bot
python3 &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env  &lt;span class="c"&gt;# set OLMO_BOT_TOKEN&lt;/span&gt;
python bot.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It requires a running Web2API instance with the &lt;code&gt;allenai&lt;/code&gt; recipe (and optionally &lt;code&gt;brave-search&lt;/code&gt; for web search). Access can be restricted to specific Telegram user IDs via the &lt;code&gt;OLMO_ALLOWED_USERS&lt;/code&gt; env var.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The main limitation is Allen AI's native tool calling: the model acknowledges tools and can call them, but it doesn't always invoke them proactively. A bot-side tool loop (parsing tool-call JSON from the model output and executing tools locally) would make this more reliable.&lt;/p&gt;
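&lt;p&gt;Such a loop could be sketched like this (the tool-call markup and the &lt;code&gt;ask_model&lt;/code&gt;/&lt;code&gt;run_tool&lt;/code&gt; helpers are assumptions, not existing bot code):&lt;/p&gt;

```python
import json
import re

# Hypothetical bot-side tool loop: the tool-call JSON shape and the
# ask_model/run_tool helpers are assumptions, not the bot's actual code.
TOOL_CALL = re.compile(r'\{"tool":.*\}', re.DOTALL)

def tool_loop(ask_model, run_tool, prompt, max_rounds=3):
    """Ask the model, execute any tool call it emits, and feed the result
    back until it answers in plain text (or max_rounds is reached)."""
    reply = ask_model(prompt)
    for _ in range(max_rounds):
        m = TOOL_CALL.search(reply)
        if not m:
            return reply  # plain answer, done
        call = json.loads(m.group(0))
        result = run_tool(call["tool"], call.get("args", {}))
        reply = ask_model(f"Tool {call['tool']} returned: {json.dumps(result)}")
    return reply
```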

&lt;p&gt;The pointing coordinate format from Molmo 2 also isn't officially documented — I reverse-engineered it from testing. It works reliably, but the format could change.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Endogen/olmo-bot" rel="noopener noreferrer"&gt;OLMo Bot on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Endogen/web2api" rel="noopener noreferrer"&gt;Web2API on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/endogen/web2api-turning-websites-into-rest-apis-and-mcp-tools-be4"&gt;Web2API blog post&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://playground.allenai.org/" rel="noopener noreferrer"&gt;Allen AI Playground&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>api</category>
      <category>mcp</category>
      <category>ai</category>
    </item>
    <item>
      <title>Web2API — Turning Websites into REST APIs (and MCP Tools)</title>
      <dc:creator>Endogen</dc:creator>
      <pubDate>Sat, 07 Mar 2026 01:13:57 +0000</pubDate>
      <link>https://forem.com/endogen/web2api-turning-websites-into-rest-apis-and-mcp-tools-be4</link>
      <guid>https://forem.com/endogen/web2api-turning-websites-into-rest-apis-and-mcp-tools-be4</guid>
      <description>&lt;p&gt;I needed data from websites that don't have APIs. Not once, not as a quick scrape, but as persistent, queryable endpoints I could hit programmatically. So I built &lt;a href="https://github.com/Endogen/web2api" rel="noopener noreferrer"&gt;Web2API&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Most useful data on the internet lives behind HTML. Some sites offer APIs, many don't. The typical approach is writing one-off scrapers — fragile scripts that break whenever the site changes a CSS class. I wanted something different:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Declarative&lt;/strong&gt; — define what to extract, not how to click through pages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistent&lt;/strong&gt; — a running service with stable endpoints, not a script I run manually&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Modular&lt;/strong&gt; — add new sites without touching the core codebase&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI-ready&lt;/strong&gt; — expose scraped data as tools that language models can call&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;Web2API is a FastAPI service backed by Playwright (headless Chromium). You define &lt;strong&gt;recipes&lt;/strong&gt; in YAML — each recipe describes a website, its endpoints, and what data to extract. The service runs continuously and serves the scraped data as clean JSON REST endpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  A Recipe Looks Like This
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hacker&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;News"&lt;/span&gt;
&lt;span class="na"&gt;slug&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hackernews"&lt;/span&gt;
&lt;span class="na"&gt;base_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://news.ycombinator.com"&lt;/span&gt;
&lt;span class="na"&gt;endpoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;read&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://news.ycombinator.com/news?p={page}"&lt;/span&gt;
    &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;container&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tr.athing"&lt;/span&gt;
      &lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.titleline&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;a"&lt;/span&gt;
          &lt;span class="na"&gt;attribute&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text"&lt;/span&gt;
        &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.titleline&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;a"&lt;/span&gt;
          &lt;span class="na"&gt;attribute&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;href"&lt;/span&gt;
          &lt;span class="na"&gt;transform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;absolute_url"&lt;/span&gt;
        &lt;span class="na"&gt;score&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.score"&lt;/span&gt;
          &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;next_sibling"&lt;/span&gt;
          &lt;span class="na"&gt;attribute&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text"&lt;/span&gt;
          &lt;span class="na"&gt;transform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;regex_int"&lt;/span&gt;
          &lt;span class="na"&gt;optional&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;pagination&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;page_param"&lt;/span&gt;
      &lt;span class="na"&gt;param&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;p"&lt;/span&gt;
      &lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No Python code. Install the recipe, and you get:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:8010/hackernews/read?page&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Show HN: I built a thing"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://example.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"fields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;153&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"pagination"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"current_page"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"has_next"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  When YAML Isn't Enough
&lt;/h2&gt;

&lt;p&gt;Some sites require actual interaction — typing into fields, waiting for dynamic content, handling streaming responses. For those, recipes can include a custom Python scraper alongside the YAML:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;recipes/
  allenai/
    recipe.yaml     # endpoint definitions
    scraper.py      # custom interaction logic
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The scraper gets a blank Playwright page and full control:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Scraper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseScraper&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;supports&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;endpoint&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;olmo-32b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Navigate, interact, parse streaming responses...
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ScrapeResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[...])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Endpoints not handled by the scraper fall back to declarative YAML extraction. This hybrid approach means simple sites stay simple, and complex ones get the flexibility they need.&lt;/p&gt;
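&lt;p&gt;The dispatch boils down to a few lines. A condensed, self-contained sketch (class and parameter names are mine, not Web2API's internals):&lt;/p&gt;

```python
# Condensed sketch of the hybrid dispatch; names are illustrative,
# not Web2API's actual internals.
class HybridDispatcher:
    """Route an endpoint to the custom scraper when it claims it,
    otherwise fall back to declarative YAML extraction."""

    def __init__(self, scraper, yaml_extract):
        self.scraper = scraper          # object with supports()/scrape(), or None
        self.yaml_extract = yaml_extract

    async def run(self, endpoint, page, params):
        if self.scraper is not None and self.scraper.supports(endpoint):
            return await self.scraper.scrape(endpoint, page, params)
        return await self.yaml_extract(endpoint, page, params)
```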

&lt;h2&gt;
  
  
  Recipe Management
&lt;/h2&gt;

&lt;p&gt;Recipes live in a catalog — a git repository with available integrations. The service has a CLI and web UI for managing them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# See what's available&lt;/span&gt;
web2api recipes catalog list

&lt;span class="c"&gt;# Install one&lt;/span&gt;
web2api recipes catalog add hackernews &lt;span class="nt"&gt;--yes&lt;/span&gt;

&lt;span class="c"&gt;# Check dependencies&lt;/span&gt;
web2api recipes doctor hackernews
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also install recipes from local paths, custom git repos, or just drop a folder into the recipes directory. The web UI shows both the catalog and installed recipes with one-click install/uninstall.&lt;/p&gt;

&lt;h2&gt;
  
  
  The MCP Server
&lt;/h2&gt;

&lt;p&gt;This is where it gets interesting. Web2API includes a built-in &lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;MCP (Model Context Protocol)&lt;/a&gt; server that automatically exposes every recipe endpoint as a native tool for AI assistants.&lt;/p&gt;

&lt;p&gt;Install a recipe → it's immediately available as an MCP tool. Uninstall it → the tool disappears. No configuration, no restart needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"web2api"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp-remote"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://your-host/mcp/"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add that to your Claude Desktop config, and suddenly Claude can search the web, translate text, query Hacker News — whatever recipes you have installed.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Tools Are Built
&lt;/h3&gt;

&lt;p&gt;Each recipe endpoint becomes its own MCP tool with a proper name, description, and typed parameters. The tool registration happens dynamically — when recipes change, tools rebuild automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Inside _ToolRegistry
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;execute_recipe_endpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;endpoint_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;q&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;query_params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;format_tool_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tools execute recipes &lt;strong&gt;in-process&lt;/strong&gt; — no HTTP self-calls, no overhead. The function signatures are built dynamically with &lt;code&gt;inspect.Signature&lt;/code&gt; so MCP clients get proper parameter schemas.&lt;/p&gt;
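&lt;p&gt;The &lt;code&gt;inspect.Signature&lt;/code&gt; trick is worth a closer look. A minimal, self-contained sketch (parameter handling is simplified compared to the real registry):&lt;/p&gt;

```python
import inspect

# Minimal sketch of the dynamic-signature trick; parameter handling is
# simplified compared to Web2API's real tool registry.
def make_tool(param_names):
    async def _fn(**kwargs):
        return ", ".join(f"{k}={kwargs.get(k, '')}" for k in param_names)

    # Attach an explicit signature so introspection (and hence MCP
    # clients) sees named, typed parameters instead of **kwargs.
    _fn.__signature__ = inspect.Signature(
        parameters=[
            inspect.Parameter(name, inspect.Parameter.KEYWORD_ONLY,
                              default="", annotation=str)
            for name in param_names
        ],
        return_annotation=str,
    )
    return _fn

search = make_tool(["q", "page"])
print(inspect.signature(search))  # e.g. (*, q: str = '', page: str = '') -> str
```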

&lt;p&gt;Recipes can also define custom &lt;code&gt;tool_name&lt;/code&gt; values for AI-friendly naming:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;endpoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;search&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;tool_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web_search"&lt;/span&gt;  &lt;span class="c1"&gt;# instead of the default "brave-search__search"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This matters more than you'd think — some models struggle with names containing dashes or double underscores.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP Bridge
&lt;/h3&gt;

&lt;p&gt;For non-MCP clients, there's also a simpler HTTP bridge:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# List available tools&lt;/span&gt;
curl https://your-host/mcp/tools

&lt;span class="c"&gt;# Call a tool&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://your-host/mcp/tools/web_search &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"q": "latest news"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The bridge supports filtering by recipe slug — useful when you want to expose only specific tools to a particular consumer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /mcp/only/brave-search/tools     # only brave-search tools
GET /mcp/exclude/allenai/tools       # everything except allenai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  File Uploads
&lt;/h2&gt;

&lt;p&gt;Some recipes need files — vision models that analyze images, document processors, etc. Web2API handles multipart uploads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"http://localhost:8010/allenai/molmo2?q=Describe+this+image"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s2"&gt;"files=@photo.jpg"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Files are saved to a temp directory, passed to the scraper, and cleaned up after the response. Upload filenames are sanitized against path traversal.&lt;/p&gt;
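&lt;p&gt;Sanitization against path traversal can be as simple as keeping only the final path component and whitelisting characters (a sketch of the idea, not Web2API's exact implementation):&lt;/p&gt;

```python
import os
import re

# A minimal sketch of traversal-safe filename handling; the helper name
# is mine, and Web2API's actual implementation may differ.
def safe_filename(name: str) -> str:
    """Keep only the final path component and strip risky characters."""
    # Normalize backslashes first so Windows-style paths are split too
    base = os.path.basename(name.replace("\\", "/"))
    base = re.sub(r"[^A-Za-z0-9._-]", "_", base)  # conservative whitelist
    return base.lstrip(".") or "upload"           # no hidden or empty names
```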

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;The stack is deliberately simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FastAPI&lt;/strong&gt; for the HTTP layer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Playwright&lt;/strong&gt; (Chromium) for browser automation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pydantic&lt;/strong&gt; for config validation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker&lt;/strong&gt; for deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A shared browser pool manages Playwright contexts with configurable concurrency and TTL. An in-memory response cache with stale-while-revalidate keeps things fast for repeated queries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request → Cache check → Browser pool → Playwright page → Extract → Cache store → Response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
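&lt;p&gt;Stale-while-revalidate means a stale entry is still served immediately while a refresh runs in the background. A condensed sketch of the idea (not Web2API's actual cache code):&lt;/p&gt;

```python
import time

# Condensed sketch of stale-while-revalidate caching; not Web2API's
# actual cache code. `schedule` stands in for a background task runner.
class SWRCache:
    def __init__(self, fresh_for: float):
        self.fresh_for = fresh_for
        self.store = {}  # key -> (value, stored_at)

    def get(self, key, fetch, schedule):
        entry = self.store.get(key)
        if entry is None:
            value = fetch()  # cold miss: fetch synchronously
            self.store[key] = (value, time.monotonic())
            return value
        value, stored_at = entry
        if time.monotonic() - stored_at > self.fresh_for:
            # Stale: hand a refresh job to the background runner
            schedule(lambda: self._refresh(key, fetch))
        return value  # always answer immediately

    def _refresh(self, key, fetch):
        self.store[key] = (fetch(), time.monotonic())
```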



&lt;h2&gt;
  
  
  What I Use It For
&lt;/h2&gt;

&lt;p&gt;I run Web2API on a VPS behind nginx with a handful of recipes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Allen AI&lt;/strong&gt; — chat with OLMo and Tülu models, analyze images with Molmo 2&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brave Search&lt;/strong&gt; — web search that my AI tools can call&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DeepL&lt;/strong&gt; — translation between German and English&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hacker News&lt;/strong&gt; — front page and search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wikipedia&lt;/strong&gt; — article search and full content extraction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The MCP server feeds into Claude Desktop for direct tool use, and the HTTP bridge provides web search capabilities to a &lt;a href="https://github.com/Endogen/olmo-bot" rel="noopener noreferrer"&gt;Telegram bot&lt;/a&gt; I built on top of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Endogen/web2api.git
&lt;span class="nb"&gt;cd &lt;/span&gt;web2api
docker compose up &lt;span class="nt"&gt;--build&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;

&lt;span class="c"&gt;# Install a recipe&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;web2api web2api recipes catalog add hackernews &lt;span class="nt"&gt;--yes&lt;/span&gt;

&lt;span class="c"&gt;# Query it&lt;/span&gt;
curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://localhost:8010/hackernews/read?page&lt;span class="o"&gt;=&lt;/span&gt;1 | jq &lt;span class="s1"&gt;'.items[:3]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://github.com/Endogen/web2api-recipes" rel="noopener noreferrer"&gt;recipe catalog&lt;/a&gt; is open — contributions welcome.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Endogen/web2api" rel="noopener noreferrer"&gt;Web2API on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Endogen/web2api-recipes" rel="noopener noreferrer"&gt;Recipe Catalog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Endogen/olmo-bot" rel="noopener noreferrer"&gt;OLMo Telegram Bot&lt;/a&gt; (built on Web2API)&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>api</category>
      <category>mcp</category>
      <category>web</category>
    </item>
  </channel>
</rss>
