<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Varshith V Hegde</title>
    <description>The latest articles on Forem by Varshith V Hegde (@varshithvhegde).</description>
    <link>https://forem.com/varshithvhegde</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F885064%2F4ab304f4-a3f3-409c-8217-9ce130e57c18.jpeg</url>
      <title>Forem: Varshith V Hegde</title>
      <link>https://forem.com/varshithvhegde</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/varshithvhegde"/>
    <language>en</language>
    <item>
      <title>GitHub Broke Git: The Merge Queue Bug That Silently Deleted Your Code</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Sun, 03 May 2026 03:09:35 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/github-broke-git-the-merge-queue-bug-that-silently-deleted-your-code-4f7i</link>
      <guid>https://forem.com/varshithvhegde/github-broke-git-the-merge-queue-bug-that-silently-deleted-your-code-4f7i</guid>
      <description>&lt;p&gt;If you use GitHub's merge queue and had a rough week around April 23rd, 2026, you were not imagining things. Your code actually disappeared. Not because of a bad commit, not because of a rogue team member, but because GitHub itself quietly deleted it.&lt;/p&gt;

&lt;p&gt;This is the story of what happened, why it was way worse than the official numbers suggest, and what it means for the way we all trust the tools we build on.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Day GitHub Stopped Being Git
&lt;/h2&gt;

&lt;p&gt;At 16:05 UTC on April 23rd, 2026, a regression crept into GitHub's merge queue. For the next three and a half hours, engineers around the world were reviewing pull requests, clicking "merge," and watching everything look completely fine. Green checks. Clean diffs. No warnings.&lt;/p&gt;

&lt;p&gt;What was actually happening behind the scenes was quietly horrifying.&lt;/p&gt;

&lt;p&gt;A PR with a perfectly reasonable &lt;code&gt;+29 / -34&lt;/code&gt; diff would get approved and queued. What landed on &lt;code&gt;main&lt;/code&gt; was a commit worth &lt;code&gt;+245 / -1,137&lt;/code&gt;. Thousands of lines of code that other engineers had already shipped, reviewed, and moved on from, just gone. And every merge that came after went in on top of that broken history.&lt;/p&gt;

&lt;p&gt;The UI showed zero problems. The status page showed no outage. The platform was lying to everyone's faces.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4z56m261nmof6a1ed34z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4z56m261nmof6a1ed34z.png" alt="Git commit graph showing incorrect merge base" width="800" height="391"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What Actually Went Wrong Under the Hood
&lt;/h2&gt;

&lt;p&gt;GitHub's merge queue works by creating a temporary branch for each PR in the queue. Normally, that temp branch starts from the tip of &lt;code&gt;main&lt;/code&gt; plus the PR's diff. CI runs against it, it passes, it lands.&lt;/p&gt;

&lt;p&gt;On April 23rd, the queue started building those temp branches from the wrong starting point. Instead of branching from the current tip of &lt;code&gt;main&lt;/code&gt;, it was branching from wherever the feature branch had originally diverged from main, potentially dozens or hundreds of commits back.&lt;/p&gt;

&lt;p&gt;Then it pushed the entire contents of that temp branch to &lt;code&gt;main&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;So if your feature branch was 50 commits behind main when it hit the queue, the "merge" silently removed those 50 commits of other people's work as a side effect of landing yours. CI passed because the temp branch on its own was internally consistent. &lt;code&gt;main&lt;/code&gt; blew up because the temp branch had nothing to do with the current state of &lt;code&gt;main&lt;/code&gt;.&lt;/p&gt;
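&lt;p&gt;As a toy model (all names invented, not GitHub's actual code), the difference between the correct and the buggy merge base looks like this:&lt;/p&gt;

```python
# Toy model of the bug: a branch is represented as the list of commit
# ids it contains. All names are invented for illustration.

def correct_merge(main, pr_commits):
    # A correct merge lands the PR's changes on top of the CURRENT
    # tip of main, keeping everything already merged.
    return main + pr_commits

def buggy_merge(main, divergence_point, pr_commits):
    # The buggy path built the temp branch from where the feature
    # branch originally diverged, then pushed it wholesale, dropping
    # every commit main had gained since the divergence.
    return main[:divergence_point] + pr_commits

main = ["c1", "c2", "c3", "c4", "c5"]   # five commits on main
pr = ["feature"]                        # PR that branched off after c2

ok = correct_merge(main, pr)
bad = buggy_merge(main, divergence_point=2, pr_commits=pr)
lost = [c for c in main if c not in bad]
print(lost)  # ['c3', 'c4', 'c5'] silently gone from the branch tip
```

&lt;p&gt;Notice that the buggy result is internally consistent, which is exactly why CI passed: nothing inside the temp branch itself is broken. Only a comparison against the current tip reveals the loss.&lt;/p&gt;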

&lt;p&gt;The root cause? A new code path that adjusted merge base computation was meant to be gated behind a feature flag for an unreleased feature. The gating was incomplete. The new behavior leaked into production and applied to all squash merge groups.&lt;/p&gt;
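&lt;p&gt;A hedged sketch of what complete gating looks like, again with invented names: the experimental computation must be unreachable unless the flag is explicitly on, so that flag-off behavior is exactly the old behavior.&lt;/p&gt;

```python
# Hypothetical sketch (all names invented) of the gating that was
# missing: with the flag off, the queue must fall back to the old,
# well-understood behavior of branching from the current tip of main.

FLAG_REPOS = set()  # unreleased feature: enabled for no repositories

def temp_branch_base(repo, main_tip, divergence_base):
    if repo in FLAG_REPOS:
        return divergence_base  # experimental path (the one that leaked)
    return main_tip             # old path: build on the current tip

print(temp_branch_base("acme/app", "c5", "c2"))  # c5
```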

&lt;p&gt;Three things made this bug particularly nasty:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The PR UI lied.&lt;/strong&gt; You reviewed &lt;code&gt;+29/-34&lt;/code&gt;. The commit that landed was &lt;code&gt;+245/-1,137&lt;/code&gt;. The thing engineers approved was not the thing that merged. That breaks the most fundamental contract of a code review system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. It was completely silent.&lt;/strong&gt; No merge conflict. No failed check. No banner on the PR. Teams only found out when someone noticed that code which should have been on &lt;code&gt;main&lt;/code&gt; simply was not there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. It scaled with repo activity.&lt;/strong&gt; The faster a repo was merging, the further feature branches had drifted from &lt;code&gt;main&lt;/code&gt;, and the more damage each bad merge did. The teams that relied most on merge queue got hit the hardest.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Human Cost
&lt;/h2&gt;

&lt;p&gt;This was not a theoretical problem. Engineering teams spent entire afternoons in incident mode: combing through commit graphs, reconstructing deleted code by hand, coordinating recovery across multiple repos, and filing support tickets that would take days to hear back on.&lt;/p&gt;

&lt;p&gt;One organization reported that every one of its teams using GitHub's merge queue was hit, with dozens of bad commits each and hundreds of existing commits clobbered before anyone noticed. One company alone claimed more than 200 ruined PRs.&lt;/p&gt;

&lt;p&gt;GitHub later said 2,092 pull requests across 230 repositories were affected during the impact window of April 22 to 23. Earlier messaging from GitHub's COO on X had put the number at 2,804 PRs, and some community members pushed back hard on both figures given what individual companies were experiencing.&lt;/p&gt;

&lt;p&gt;The incident was not detected by GitHub's automated monitoring because it affected merge commit correctness rather than availability. GitHub only became aware of the regression at 19:38 UTC, following an increase in customer support inquiries. The fix, a revert and force-deploy, was complete by 20:43 UTC. Three hours and thirty-three minutes of silent corruption.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why the Status Page Was Useless
&lt;/h2&gt;

&lt;p&gt;Here is the part that stings. If you checked GitHub's status page on April 23rd, you probably saw nothing alarming. There was no major outage reported. No partial outage.&lt;/p&gt;

&lt;p&gt;That is because GitHub's status page methodology specifically excludes "Degraded Performance" from its downtime numbers. The platform itself never went down. Developers could still push code, open PRs, and click merge. The fact that clicking merge was silently destroying their codebase did not register as an incident on the dashboard.&lt;/p&gt;

&lt;p&gt;This is a telling gap. Uptime and correctness are not the same thing. A bank that processes your transactions but records them incorrectly is not "up." GitHub processed the merges. It just produced wrong results. The status page was not built to catch that kind of failure.&lt;/p&gt;




&lt;h2&gt;
  
  
  This Was Not an Isolated Bad Day
&lt;/h2&gt;

&lt;p&gt;It would be easier to move on from this if it were a one-off. But April 2026 was a genuinely rough stretch for GitHub.&lt;/p&gt;

&lt;p&gt;Four days after the merge queue incident, on April 27th, GitHub's Elasticsearch cluster became overloaded, likely from a botnet attack, and search-backed UI surfaces stopped returning results. Pull request lists went blank. Issues disappeared from view. Projects and Actions workflow pages showed nothing. The underlying data was still there, but developers could not see it.&lt;/p&gt;

&lt;p&gt;And then, on April 28th, the same morning GitHub's CTO published an apology post about reliability, a separate security disclosure dropped: researchers at Wiz had found a critical remote code execution vulnerability in GitHub's &lt;code&gt;git push&lt;/code&gt; pipeline (CVE-2026-3854, CVSS 8.7). A single crafted &lt;code&gt;git push&lt;/code&gt; with injected options could reach unsandboxed code execution on GitHub's servers. It was patched in 75 minutes on github.com, but the timing was brutal.&lt;/p&gt;

&lt;p&gt;Three significant failures in five days. Merge queue correctness. Search collapse. An RCE in the core git push path.&lt;/p&gt;

&lt;p&gt;GitHub's CTO, Vlad Fedorov, acknowledged in the April 28th post that none of this is acceptable. He also revealed the scale of what GitHub is dealing with: the company had planned to scale capacity by 10x in October 2025. By February 2026, projections driven by agentic development workflows (AI coding tools like Copilot, Cursor, and Codex flooding the platform with automated PRs) forced a rethink to a 30x redesign. GitHub is now hitting peaks of 90 million merged PRs and 1.4 billion commits.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1555099962-4199c345e5dd%3Fw%3D1200%26q%3D80" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1555099962-4199c345e5dd%3Fw%3D1200%26q%3D80" alt="Developer incident response" width="1200" height="800"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Deeper Architectural Problem
&lt;/h2&gt;

&lt;p&gt;There is a reason this specific failure mode existed. GitHub's merge queue constructs merge commits through a code path that is separate from how a regular PR merge works. Two code paths, two places where behavior can quietly diverge.&lt;/p&gt;

&lt;p&gt;This is the danger that comes with delegation. A merge queue is supposed to automate exactly what a human would do when clicking "Merge pull request." The moment it does something a human would not do, because it has its own logic for building the merge commit, it can silently produce commits nobody wrote and nobody approved.&lt;/p&gt;

&lt;p&gt;This is not just a GitHub problem. It is a pattern that shows up every time we give automated systems write access to things that matter. Queues, bots, AI agents. As long as those systems are doing something equivalent to what a human would do, the failure modes are familiar. When they start doing things a human would not do, the failures become invisible until the damage is already done.&lt;/p&gt;

&lt;p&gt;The lesson is not to avoid merge queues. It is to make sure that whatever writes to &lt;code&gt;main&lt;/code&gt; stays as close as possible to boring, well-understood git operations, with no novel logic in the merge commit path that reviewers cannot audit.&lt;/p&gt;




&lt;h2&gt;
  
  
  Will Anyone Actually Leave?
&lt;/h2&gt;

&lt;p&gt;After something like this, the obvious question is whether developers will migrate off GitHub. And the honest answer is: probably not in any significant numbers.&lt;/p&gt;

&lt;p&gt;GitHub is deeply embedded. CI pipelines, webhook integrations, RBAC policies, Actions workflows, third-party app permissions, team structures, pull request history. Migration is not just switching a remote URL. It is months of work and coordination.&lt;/p&gt;

&lt;p&gt;That stickiness is real, and it is not purely irrational. GitHub is still where most open source lives. It is still where most integrations point. It is still the default. Its hold on the development ecosystem is less that of a premium SaaS product and more that of a utility. You do not switch utilities because of a bad week.&lt;/p&gt;

&lt;p&gt;But what this incident should change is the baseline of trust. GitHub is infrastructure. And infrastructure that silently corrupts your data, even for a few hours, with no visible error, is infrastructure you need to have a recovery plan for.&lt;/p&gt;

&lt;p&gt;The minimum response is not migration. It is verification. Audit squash merges in merge queue groups of two or more PRs from the April 22 to 23 window. Write down which parts of your build and deploy pipeline silently assume git history is correct. Then make that assumption visible somewhere it can be challenged.&lt;/p&gt;
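&lt;p&gt;A minimal sketch of that audit, assuming you can pull reviewed and landed diff stats from your Git host's API (the field names here are hypothetical):&lt;/p&gt;

```python
# Hypothetical audit helper: flag merge commits whose landed diff
# stats differ from what was reviewed on the PR. In practice the
# numbers would come from your Git host's API; here they are inline.

def flag_suspect_merges(merges, tolerance=0):
    suspects = []
    for m in merges:
        drift = (abs(m["landed_add"] - m["reviewed_add"])
                 + abs(m["landed_del"] - m["reviewed_del"]))
        if drift > tolerance:
            suspects.append(m["sha"])
    return suspects

merges = [
    {"sha": "abc123", "reviewed_add": 29, "reviewed_del": 34,
     "landed_add": 245, "landed_del": 1137},   # the April 23rd pattern
    {"sha": "def456", "reviewed_add": 10, "reviewed_del": 2,
     "landed_add": 10, "landed_del": 2},       # matches the review
]
print(flag_suspect_merges(merges))  # ['abc123']
```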




&lt;h2&gt;
  
  
  What GitHub Says It Is Doing About It
&lt;/h2&gt;

&lt;p&gt;GitHub's post-incident response included a few concrete commitments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Expanding test coverage for merge correctness validation&lt;/li&gt;
&lt;li&gt;Adding regression checks that validate resulting git contents across supported merge configurations before reaching production&lt;/li&gt;
&lt;li&gt;Migrating performance-sensitive code from its older Ruby codebase to Go&lt;/li&gt;
&lt;li&gt;Moving systems to public cloud infrastructure to handle the 30x scale requirement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The April 23rd bug specifically was caused by incomplete feature flagging on a new code path. The fix was a revert. The longer-term fix is better test coverage for multi-PR merge queue groups, which were apparently underrepresented in existing test suites.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;GitHub's merge queue, for a few hours on April 23rd, 2026, broke the most fundamental contract of version control: that what you approve is what merges. It did it silently, with clean green UI, no errors, and no status page entry.&lt;/p&gt;

&lt;p&gt;The code was still there in Git object storage. But the branch history was wrong, and no automated system could safely repair it across every affected repository. Engineers had to do it by hand.&lt;/p&gt;

&lt;p&gt;That is the thing that lingers. Git is supposed to be the boring, reliable layer that everything else is built on. When the boring layer gets interesting, it gets interesting in the worst possible way.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this useful, drop a comment below or follow for more deep dives into the tools we trust (sometimes too much).&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>git</category>
      <category>github</category>
      <category>programming</category>
    </item>
    <item>
      <title>7 AI Gateways That Actually Work in Production (2026 Guide)</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Wed, 29 Apr 2026 12:19:13 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/7-ai-gateways-that-actually-work-in-production-2026-guide-2p4d</link>
      <guid>https://forem.com/varshithvhegde/7-ai-gateways-that-actually-work-in-production-2026-guide-2p4d</guid>
      <description>&lt;p&gt;Let me start with an admission. I resisted using an AI gateway for longer than I should have.&lt;/p&gt;

&lt;p&gt;My reasoning was the kind engineers convince themselves is pragmatic. "I'll just call the APIs directly, it's faster to ship, I'll add abstraction later." And for a while, it worked. Until the night an Anthropic outage knocked my app offline for two hours. Until the morning a recursive agent loop racked up thousands of dollars in charges before anyone woke up. Until the security audit flagged raw API keys scattered across four different repos.&lt;/p&gt;

&lt;p&gt;At that point, "later" arrived.&lt;/p&gt;

&lt;p&gt;I've spent the past several months evaluating AI gateways seriously. Not as a researcher, but as someone trying to put them in front of real production workloads. This is what I found.&lt;/p&gt;




&lt;h2&gt;
  
  
  First: What Does an AI Gateway Actually Do?
&lt;/h2&gt;

&lt;p&gt;Before the list, let me be specific about what we're talking about, because the category name is increasingly used to mean very different things.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh8pw99qdxzzwwkwoep1u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh8pw99qdxzzwwkwoep1u.png" alt="LLM API gateway architecture diagram" width="800" height="344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Gartner defines an AI gateway as "a technology or platform that acts as an intermediary between applications and various AI services or models." That is the clean academic definition. In practice, a good AI gateway is the layer that keeps your AI app running when things break. And things always break.&lt;/p&gt;

&lt;p&gt;Concretely, that means handling:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Routing&lt;/strong&gt; - intelligently directing requests to the right model based on cost, latency, or availability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failover&lt;/strong&gt; - automatically switching providers when one goes down, often in under 50ms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost controls&lt;/strong&gt; - per-team or per-key budget limits so no single runaway agent bankrupts you&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key management&lt;/strong&gt; - one secure central store for credentials instead of env vars scattered across repos&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability&lt;/strong&gt; - request-level traces, latency metrics, and token usage across every provider in a single dashboard&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance&lt;/strong&gt; - audit logs, role-based access control, and data residency guarantees&lt;/li&gt;
&lt;/ul&gt;
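&lt;p&gt;The failover and budget-limit behavior above can be sketched in a few lines. This is an illustration of the pattern, not any particular gateway's implementation; the provider callables stand in for real SDK clients:&lt;/p&gt;

```python
# Minimal sketch of the failover pattern described above: try each
# provider in order, fall back on failure, and enforce a per-key
# budget. The provider callables stand in for real SDK clients.
import time

def call_with_failover(providers, prompt, budget, spent):
    if spent >= budget:
        raise RuntimeError("budget exceeded for this key")
    last_err = None
    for name, call in providers:
        start = time.monotonic()
        try:
            reply = call(prompt)
        except Exception as err:   # provider down or erroring: try next
            last_err = err
            continue
        latency_ms = (time.monotonic() - start) * 1000
        return {"provider": name, "reply": reply, "latency_ms": latency_ms}
    raise RuntimeError(f"all providers failed: {last_err!r}")

def flaky(prompt):     # simulates a provider outage
    raise ConnectionError("503 from provider")

def healthy(prompt):
    return f"echo: {prompt}"

result = call_with_failover([("primary", flaky), ("fallback", healthy)],
                            "hello", budget=100.0, spent=12.5)
print(result["provider"])  # fallback
```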

&lt;p&gt;Different gateways prioritize different things. Some are razor-thin proxies optimized for speed. Others are full control planes designed to govern how an entire organization uses AI. The right choice depends entirely on where your pain is.&lt;/p&gt;

&lt;p&gt;Here are the seven worth knowing in 2026.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Gateway&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;MCP Support&lt;/th&gt;
&lt;th&gt;On-Prem/VPC&lt;/th&gt;
&lt;th&gt;Compliance&lt;/th&gt;
&lt;th&gt;Gartner Recognized&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;TrueFoundry&lt;/td&gt;
&lt;td&gt;~3-4ms&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;VPC, On-Prem, Air-Gapped&lt;/td&gt;
&lt;td&gt;SOC 2, HIPAA, ITAR&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Enterprise with compliance + deployment needs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Helicone&lt;/td&gt;
&lt;td&gt;under 5ms P95&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Self-hosted option&lt;/td&gt;
&lt;td&gt;SOC 2&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Observability-first teams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenRouter&lt;/td&gt;
&lt;td&gt;~15ms&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Managed only&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Prototyping, widest model access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Requesty&lt;/td&gt;
&lt;td&gt;~8ms P50&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;GDPR (EU endpoint)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Fast multi-model routing with analytics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Singulr AI&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;In progress&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;AI governance-focused orgs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inworld Router&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Quality-weighted routing experiments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Braintrust Gateway&lt;/td&gt;
&lt;td&gt;Cached under 100ms&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Enterprise tier only&lt;/td&gt;
&lt;td&gt;SOC 2&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Eval + routing in one workflow&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  1. TrueFoundry AI Gateway
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Enterprise Production Pick
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspugnj4tfrc14bdl6xt0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspugnj4tfrc14bdl6xt0.png" alt="TrueFoundry AI Gateway enterprise platform" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'll be honest. TrueFoundry was not the first gateway I tried. It kept coming up in conversations with platform engineers at companies doing serious AI at scale, and once I actually dug in, the reason became clear.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.truefoundry.com/ai-gateway" rel="noopener noreferrer"&gt;TrueFoundry is an enterprise AI gateway&lt;/a&gt; and more specifically, it is the only &lt;a href="https://www.businesswire.com/news/home/20260220396246/en/CORRECTING-and-REPLACING-TrueFoundry-Recognized-as-a-Representative-Vendor-in-Gartner-Market-Guide-for-AI-Gateways" rel="noopener noreferrer"&gt;Gartner-recognized AI gateway&lt;/a&gt; that also handles model deployment and GPU orchestration in the same platform. Most gateways on this list are proxies with dashboards. TrueFoundry is closer to a full AI control plane, the kind of thing a platform team would build internally at a large company, except you do not have to build it yourself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The numbers that matter&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The platform handles over &lt;strong&gt;10 billion requests per month&lt;/strong&gt; for Fortune 1000 customers including NVIDIA and Siemens Healthineers. The gateway adds roughly 3-4ms latency overhead per request and can sustain 350+ RPS on a single vCPU. These are not lab benchmarks. They are the numbers that show up in production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it genuinely stands apart on compliance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;SOC 2, HIPAA, and ITAR certified. For anyone in healthcare, financial services, defense, or any regulated industry, this is often the conversation that ends competitor evaluations. Most other gateways on this list have none of these certifications, or are still working toward them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The deployment flexibility is real&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;VPC, on-premises, and air-gapped deployments are all supported. If your security posture means data cannot touch a public cloud, TrueFoundry actually works. Not as an afterthought, but as a first-class deployment mode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The MCP piece deserves its own moment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As AI agents multiply, teams are suddenly managing not just LLM calls but tool access: MCP servers for code execution, database queries, web search, enterprise integrations. TrueFoundry unifies LLM routing and MCP governance in the same control plane, with OAuth2, RBAC, and audit logging applied to every tool call. You can &lt;a href="https://www.truefoundry.com/ai-gateway" rel="noopener noreferrer"&gt;register internal MCP servers&lt;/a&gt;, define who can access what, and monitor agent tool usage alongside your LLM traffic, all in one place. No other gateway on this list does that.&lt;/p&gt;

&lt;p&gt;On Gartner Peer Insights, one enterprise customer said: "AI Gateway is a single pane where I can see all the models, their associated cost, track requests... it provides an easy way to integrate with MCP servers which does a very heavy lift." That lines up with what I have heard from teams using it at scale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it genuinely falls short&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;TrueFoundry is a heavier platform. If your requirement is "I need a quick proxy to route between GPT-4 and Claude," this is more infrastructure than you need. It is also strongest when there is a dedicated platform or infra team who can own it. Solo developers or very small teams will find the setup investment harder to justify compared to lighter alternatives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The bottom line&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;TrueFoundry is the only Gartner-recognized AI gateway on this list and the only one that unifies LLM routing, MCP governance, and model deployment in a single control plane. If you are running production AI for an enterprise with compliance requirements, it is in a different category from the proxies below.&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.truefoundry.com/ai-gateway" rel="noopener noreferrer"&gt;truefoundry.com/ai-gateway&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Helicone AI Gateway
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Observability-First Pick
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgs0d1phpum5zihjxzz8u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgs0d1phpum5zihjxzz8u.png" alt="Helicone LLM observability and analytics dashboard" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Helicone has earned genuine respect in the developer community for a specific reason. If you want to understand what your AI application is actually doing, it is excellent.&lt;/p&gt;

&lt;p&gt;It is Rust-based, open-source, and fast. The team describes it as "the NGINX of LLMs," and that is not just marketing. The architecture reflects it. You get a unified API for 100+ providers through a single OpenAI-compatible endpoint, with automatic failover, load balancing, and per-request logging built in from the start.&lt;/p&gt;

&lt;p&gt;The analytics dashboard is one of the more useful ones I have seen: per-request cost tracking, model comparison, session-level traces, and usage patterns broken out by team, model, or environment. For understanding where your AI spend is actually going, Helicone is hard to beat.&lt;/p&gt;
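&lt;p&gt;Under the hood, per-request cost tracking is straightforward accounting over token counts. A sketch, with hypothetical models and rates:&lt;/p&gt;

```python
# Sketch of the per-request cost accounting a dashboard like this does
# under the hood. Models and rates are hypothetical, in $ per 1M tokens
# (input rate, output rate).
RATES = {"model-a": (3.00, 15.00), "model-b": (0.25, 1.25)}

def request_cost(model, prompt_tokens, completion_tokens):
    in_rate, out_rate = RATES[model]
    return (prompt_tokens * in_rate
            + completion_tokens * out_rate) / 1_000_000

cost = request_cost("model-a", prompt_tokens=1200, completion_tokens=400)
print(f"${cost:.6f}")  # $0.009600
```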

&lt;p&gt;It is also SOC 2 certified and GDPR compliant, with a self-hosting option for teams that need infrastructure control. That is a meaningful step up from pure managed-only options.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it falls short&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No MCP gateway support. If you are building agents that need governed tool access, you will need to look elsewhere for that layer. Governance features like RBAC depth and policy enforcement are more basic compared to enterprise platforms. It is primarily an observability platform with routing layered on, not a full deployment and governance story.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt; teams where LLM observability and cost analytics are the primary pain point. If you already have routing handled but want real visibility into what is happening across your models, Helicone is a solid, developer-friendly choice.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. OpenRouter
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Widest Model Access, Fastest to Start
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkk7s8kt4d3spl12u2qf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkk7s8kt4d3spl12u2qf.png" alt="OpenRouter unified AI model API interface" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OpenRouter is how I reach 300+ models through one API when I am prototyping. No infrastructure to manage, unified billing across providers, and instant access to everything from GPT-5 to Llama to Mistral variants through a single OpenAI-compatible endpoint.&lt;/p&gt;
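&lt;p&gt;Because the endpoint is OpenAI-compatible, the call shape is the familiar chat-completions payload. A standard-library sketch (the model id is just an example; any model OpenRouter lists would go in the same slot):&lt;/p&gt;

```python
# What an OpenAI-compatible chat call to OpenRouter looks like at the
# HTTP level, using only the standard library. The model id is an
# example; any model OpenRouter lists would go in the same slot.
import json
import urllib.request

def build_request(api_key, model, user_message):
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

req = build_request("sk-or-...", "meta-llama/llama-3-8b-instruct", "Hi")
# urllib.request.urlopen(req) would send it; skipped here to stay offline.
print(req.full_url)
```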

&lt;p&gt;The pricing model is worth understanding correctly. OpenRouter passes through provider pricing at or near cost; the charge is a 5.5% platform fee on credit purchases, not a per-token markup on inference. For most use cases, you are paying what you would pay the provider directly, plus a small convenience fee for the unified access. They do not train on your data, and there is a growing free tier with 25+ zero-cost models for getting started.&lt;/p&gt;

&lt;p&gt;For prototyping, experimenting with different models, or any project where you need breadth over depth, OpenRouter is genuinely hard to beat on speed of getting started.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it falls short&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Managed only, with no self-hosting option. No MCP support. Governance features are minimal: no RBAC, no compliance certifications, no fine-grained access controls built for regulated industries. The default throttle of 100 API calls per 60 seconds can become a real constraint for high-volume agent pipelines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt; prototyping, side projects, or teams that need fast access to the widest range of models and are not yet in a compliance conversation.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Requesty
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsbwraegix52f5c8o2akm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsbwraegix52f5c8o2akm.png" alt="Requesty AI Gateway" width="800" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Smarter Than It Looks
&lt;/h3&gt;

&lt;p&gt;Requesty is a gateway I underestimated at first glance. The website looks simple. Underestimating it turned out to be a mistake.&lt;/p&gt;

&lt;p&gt;Requesty is a unified LLM gateway for 400+ models, and what sets it apart from pure model-access tools is the routing intelligence. It includes smart routing that analyzes request type and auto-selects the cheapest viable model, cross-provider semantic caching (which can cut token costs by up to 80% on repeated queries), real-time PII redaction, and sub-50ms automatic failover when a provider goes down.&lt;/p&gt;

&lt;p&gt;According to their own data, 70,000+ developers use it and it processes 90+ billion tokens daily. Those are numbers that suggest it is more battle-tested than its marketing implies. There is an EU endpoint for GDPR compliance, per-key spending limits, and a genuinely useful analytics dashboard.&lt;/p&gt;

&lt;p&gt;Setup is three lines of code. Swap the base URL. Done.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://router.requesty.ai/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-requesty-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Where it falls short&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Managed only, no self-hosting or VPC deployment. No MCP governance. No enterprise compliance certifications beyond GDPR. For teams in regulated industries or those needing air-gapped deployment, it does not get you there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt; developers who want a capable, managed multi-model gateway with smart routing and cost optimization, without the infrastructure overhead of a full enterprise platform.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Singulr AI
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fckka6m94osvbkfjz5vaq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fckka6m94osvbkfjz5vaq.png" alt="Singulr AI Gateway" width="800" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Governance-Focused Newcomer
&lt;/h3&gt;

&lt;p&gt;Singulr AI is an enterprise AI governance platform backed by Nexus Venture Partners and Dell Technologies Capital. It raised $10M in early 2025 with a specific focus: helping security, IT, privacy, and compliance teams gain visibility and control over how AI is being used across an organization.&lt;/p&gt;

&lt;p&gt;The approach is distinctive. It includes a continuously updated AI risk intelligence system that profiles models and agents, classifies them in real time, and recommends safer alternatives. It also offers application-aware red teaming that simulates real-world threats before deployment.&lt;/p&gt;

&lt;p&gt;For CISOs and compliance teams, this is interesting. It is a governance-first angle that most gateway vendors leave to someone else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it falls short&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It is a newer entrant with limited public production track record at Fortune 1000 scale. The feature set is narrower than full gateway platforms. It is primarily governance and security, not a complete routing, failover, and deployment story. Pricing is not public.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt; organizations where AI governance, risk scoring, and compliance team enablement are the primary requirements, and who are comfortable evaluating a platform that is still building its enterprise reference base.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Inworld Router
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqk2vfuccbayd3w1ip1bl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqk2vfuccbayd3w1ip1bl.png" alt="Inworld Router AI Gateway" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  An Interesting Idea Worth Watching
&lt;/h3&gt;

&lt;p&gt;Inworld Router takes a genuinely different approach to the routing problem. Instead of routing based purely on cost or availability, it routes on business-level metrics: cost per output quality, task complexity, latency targets. The idea is that not every request needs the smartest and most expensive model, and a router that understands the nature of a request can make smarter tradeoffs than one that just round-robins.&lt;/p&gt;

&lt;p&gt;That is a legitimate insight, and as a concept it points toward where sophisticated AI infrastructure is heading.&lt;/p&gt;
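&lt;p&gt;To make the concept concrete, here is a toy sketch of quality-weighted routing: pick the cheapest model whose expected quality clears the bar for a given request. This is my illustration of the idea, not Inworld's actual algorithm, and the model names, prices, and quality scores are all invented.&lt;/p&gt;

```python
# Toy quality-weighted router. Every number below is invented for
# illustration -- real routers would learn these scores per task type.
MODELS = {
    "small":  {"cost_per_1k": 0.15, "quality": 0.70},
    "medium": {"cost_per_1k": 0.60, "quality": 0.85},
    "large":  {"cost_per_1k": 3.00, "quality": 0.97},
}

def route(required_quality: float) -> str:
    # Keep only models expected to meet the quality bar...
    viable = {name: m for name, m in MODELS.items()
              if m["quality"] >= required_quality}
    # ...then minimize cost among them.
    return min(viable, key=lambda name: viable[name]["cost_per_1k"])

print(route(0.65))  # small is good enough for a simple request
print(route(0.90))  # only large clears the bar
```

&lt;p&gt;The interesting engineering problem is the part this sketch waves away: estimating &lt;code&gt;required_quality&lt;/code&gt; and each model's expected quality from the request itself.&lt;/p&gt;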

&lt;p&gt;In practice today, it is primarily built for Inworld's own gaming and character AI use case. The ecosystem is small, community support is limited, and it is not a general-purpose enterprise gateway.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt; teams in gaming or character AI who want to experiment with quality-weighted routing. Worth keeping an eye on as the concept matures.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Braintrust Gateway
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl16lkuzvdxdd2g4a6jy4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl16lkuzvdxdd2g4a6jy4.png" alt="Braintrust Gateway" width="800" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Eval-First Option
&lt;/h3&gt;

&lt;p&gt;Braintrust is fundamentally an evaluation and observability platform that also includes a capable gateway. The integration between the two is the real story. Requests that flow through the gateway automatically feed into Braintrust's tracing and evaluation pipeline. You can run evaluations against production traffic, compare model performance across experiments, and catch regressions in CI/CD before they reach users.&lt;/p&gt;

&lt;p&gt;The gateway supports 100+ models including GPT-5, Claude 4, and Gemini 2.5. Caching is encrypted per-API-key using AES-GCM, with sub-100ms response times for cached requests. There is a generous free tier (1M trace spans, 10k evaluation scores) and SOC 2 Type II certification on the enterprise side.&lt;/p&gt;

&lt;p&gt;One important note: their original AI proxy is now deprecated. They have migrated to a full gateway product, which is a meaningful upgrade for production reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it falls short&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The gateway features are secondary to the eval platform. That is by design, but it means Braintrust is not a complete answer for failover, MCP governance, or compliance-heavy deployments. Self-hosting is only available on the enterprise tier. At $249/month for the Pro plan, it is not the lightest option for teams that only need routing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt; engineering teams doing active prompt optimization and model comparison who want routing and evaluation tightly integrated, and do not want to stitch together separate tools for each.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Actually Choose
&lt;/h2&gt;

&lt;p&gt;After spending real time with all of these, here is my honest decision framework.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The compliance conversation is the first filter.&lt;/strong&gt; If your security team needs SOC 2, HIPAA, or ITAR, or if data cannot leave your cloud, the list immediately narrows to one serious option: TrueFoundry. This is not a sales pitch. It is just where the certifications are.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The MCP question is the second filter.&lt;/strong&gt; If you are building agents that need governed tool access, only TrueFoundry covers this layer natively today.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you clear both of those, the rest is about fit:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pick &lt;strong&gt;&lt;a href="https://www.truefoundry.com/ai-gateway" rel="noopener noreferrer"&gt;TrueFoundry&lt;/a&gt;&lt;/strong&gt; if you need enterprise governance, compliance, and model deployment in one platform&lt;/li&gt;
&lt;li&gt;Pick &lt;strong&gt;Helicone&lt;/strong&gt; if observability and cost analytics are your primary pain and you want something developer-friendly and open-source&lt;/li&gt;
&lt;li&gt;Pick &lt;strong&gt;OpenRouter&lt;/strong&gt; if you are prototyping and want the fastest possible access to the widest range of models&lt;/li&gt;
&lt;li&gt;Pick &lt;strong&gt;Requesty&lt;/strong&gt; if you want a capable managed gateway with smart routing and you are not in a compliance-heavy environment&lt;/li&gt;
&lt;li&gt;Pick &lt;strong&gt;Braintrust&lt;/strong&gt; if prompt evaluation and model quality monitoring are central to your workflow&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Where This Category Is Going
&lt;/h2&gt;

&lt;p&gt;Something I have noticed in 2026 is that the definition of "AI gateway" keeps expanding. A year ago it meant a proxy with routing logic. Now teams are asking their gateway to handle agent tool access via MCP, govern agent-to-agent communication, manage model deployment, and provide compliance audit trails across all of it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feevb9oq9vvwlvbfb7oph.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feevb9oq9vvwlvbfb7oph.png" alt="MCP gateway agent tool orchestration architecture" width="800" height="642"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is a lot to ask of a single layer. Most of the lighter options on this list handle one or two of these well. TrueFoundry is the only one I have seen genuinely attempting the full stack, and it has the production evidence to back that up: 10B+ requests per month, Fortune 1000 customers, and Gartner recognition.&lt;/p&gt;

&lt;p&gt;Whether you want one vendor for all of that, or best-of-breed at each layer, is a real architectural choice. Either can work. The important thing is making it deliberately, rather than discovering two years in that your "lightweight proxy" cannot support what your AI stack has become.&lt;/p&gt;




&lt;p&gt;What has your experience been? I am especially curious whether anyone has moved from a lighter gateway to something heavier, or in the other direction, and what triggered that switch. Drop a comment below.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>programming</category>
      <category>llm</category>
    </item>
    <item>
      <title>I Spent 3 Days Debugging Our LLM Setup. Turns Out We Needed an AI Gateway the Whole Time.</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Wed, 15 Apr 2026 08:46:08 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/i-spent-3-days-debugging-our-llm-setup-turns-out-we-needed-an-ai-gateway-the-whole-time-50a2</link>
      <guid>https://forem.com/varshithvhegde/i-spent-3-days-debugging-our-llm-setup-turns-out-we-needed-an-ai-gateway-the-whole-time-50a2</guid>
      <description>&lt;p&gt;Let me tell you about a Friday afternoon I'd rather forget.&lt;/p&gt;

&lt;p&gt;Three teams, four models, six API keys living in different &lt;code&gt;.env&lt;/code&gt; files, one very angry compliance officer, and me just staring at a terminal trying to figure out why we got a $1,400 OpenAI bill for a feature that was supposed to cost fifty bucks.&lt;/p&gt;

&lt;p&gt;That was my "okay something is genuinely broken here" moment.&lt;/p&gt;

&lt;p&gt;Not some big insight. Just a $1,400 invoice and dead silence on a Slack thread for about ten minutes.&lt;/p&gt;

&lt;p&gt;If you've felt even a small version of that, this post is for you.&lt;/p&gt;




&lt;h2&gt;
  
  
  So what actually is an AI Gateway?
&lt;/h2&gt;

&lt;p&gt;Not the textbook answer. That one goes something like "middleware that abstracts your LLM provider calls." Technically fine, tells you nothing.&lt;/p&gt;

&lt;p&gt;Here's how I actually think about it.&lt;/p&gt;

&lt;p&gt;You know how bigger engineering orgs eventually build out a platform team? Before that team exists, every squad is doing their own thing. Their own CI setup, their own infra configs, their own credentials. It mostly works. Until it doesn't. And then it catastrophically doesn't all at once.&lt;/p&gt;

&lt;p&gt;An AI Gateway is basically that platform layer, except it's for LLMs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkjuor74bks4xvbsncv14.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkjuor74bks4xvbsncv14.webp" alt="AI Gateway" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every single request your app makes to any model (OpenAI, Anthropic, a self-hosted Llama, whatever you're running) goes through it. Because everything flows through one place, you finally get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One set of credentials instead of keys scattered across five repos&lt;/li&gt;
&lt;li&gt;Rate limits and budgets that are actually enforced&lt;/li&gt;
&lt;li&gt;Cost tracking per team, per model, per request&lt;/li&gt;
&lt;li&gt;Guardrails that catch PII before it leaves your infra&lt;/li&gt;
&lt;li&gt;One place to look when something blows up&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One control plane. Every team. Every model.&lt;/p&gt;




&lt;h2&gt;
  
  
  The architecture is simpler than it sounds
&lt;/h2&gt;

&lt;p&gt;Here's what happens when you put a gateway in the middle:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjgbzpismfi3ilj62wmy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjgbzpismfi3ilj62wmy.png" alt="Excalidraw AI gateway" width="800" height="567"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Request comes in from your app, gateway catches it, validates auth, checks rate limits, applies input guardrails, picks the right provider, logs everything, checks the response output, sends it back. That's the whole flow.&lt;/p&gt;

&lt;p&gt;Your application code doesn't change. You stop pointing at &lt;code&gt;api.openai.com&lt;/code&gt; directly and point at your gateway instead. That's literally it from your team's perspective.&lt;/p&gt;

&lt;p&gt;The control layer just sits there doing its job quietly.&lt;/p&gt;
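&lt;p&gt;The flow above fits in a few dozen lines of pseudocode. Here is a deliberately toy Python sketch of the ordering of checks, not any vendor's implementation; every table and error string in it is made up:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class Request:
    team: str
    api_key: str
    prompt: str

# Invented stand-ins for real gateway state.
VALID_KEYS = {"gw-key-1": "team-a"}   # key -> team it belongs to
RATE_LIMITS = {"team-a": 100}         # requests allowed per window
usage = {"team-a": 0}
audit_log = []

def handle(req: Request) -> str:
    # 1. Validate auth
    if VALID_KEYS.get(req.api_key) != req.team:
        return "401 unauthorized"
    # 2. Enforce rate limits
    if usage[req.team] >= RATE_LIMITS[req.team]:
        return "429 rate limited"
    usage[req.team] += 1
    # 3. Input guardrails (toy PII check)
    if "ssn:" in req.prompt.lower():
        return "400 blocked by guardrail"
    # 4. Pick a provider and forward the request (stubbed out here)
    response = f"provider response to: {req.prompt}"
    # 5. Log everything for cost attribution and audit
    audit_log.append((req.team, req.prompt, len(response)))
    return response

print(handle(Request("team-a", "gw-key-1", "hello")))
# prints: provider response to: hello
```

&lt;p&gt;The real versions of steps 2 through 5 are where the products on the market differentiate, but the shape of the pipeline is exactly this.&lt;/p&gt;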




&lt;h2&gt;
  
  
  "But I already have an API gateway. Isn't that enough?"
&lt;/h2&gt;

&lt;p&gt;This is where most people get confused. Including me when I first looked into this.&lt;/p&gt;

&lt;p&gt;Quick answer: no. Here's why.&lt;/p&gt;

&lt;p&gt;Your API gateway (Kong, AWS API Gateway, Nginx, take your pick) understands traffic. It knows Team A sent 10,000 HTTP requests. It can enforce rate limits, handle auth tokens. That's useful.&lt;/p&gt;

&lt;p&gt;Your AI gateway understands what's actually inside those requests. It knows Team A sent &lt;strong&gt;4.2 million tokens to GPT-4o&lt;/strong&gt;, it cost &lt;strong&gt;$84&lt;/strong&gt;, average latency was &lt;strong&gt;340ms&lt;/strong&gt;, and &lt;strong&gt;3 of those requests triggered the PII guardrail&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;One sees requests. The other sees meaning. That's not a small difference.&lt;/p&gt;

&lt;p&gt;For stateless REST APIs, a regular API gateway is totally fine. For LLM workloads where tokens equal money and every prompt is a potential compliance issue, you need something that actually speaks the language.&lt;/p&gt;
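&lt;p&gt;The token math behind that example is easy to sanity-check. A minimal sketch, assuming a flat illustrative rate of $20 per million tokens (real pricing varies by model and by input versus output tokens):&lt;/p&gt;

```python
# Back-of-the-envelope cost attribution. The rate below is illustrative,
# not a real price sheet.
PRICE_PER_MILLION_TOKENS = 20.0

def cost_usd(tokens: int) -> float:
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

print(f"${cost_usd(4_200_000):.2f}")  # prints: $84.00
```

&lt;p&gt;An API gateway never sees the token counts, so it can never do this arithmetic for you. An AI gateway does it on every request.&lt;/p&gt;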




&lt;h2&gt;
  
  
  Do you actually need one right now though?
&lt;/h2&gt;

&lt;p&gt;Let me skip the usual "it depends" and be direct.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You're probably fine without one if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One team, one model, one use case&lt;/li&gt;
&lt;li&gt;Nobody is asking about costs yet&lt;/li&gt;
&lt;li&gt;Zero compliance requirements&lt;/li&gt;
&lt;li&gt;It's a POC or side project&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Don't add infrastructure you don't need. Raw SDK calls are fast to ship. Keep it simple when simple works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You've outgrown the simple setup if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple teams are calling models independently with no visibility into what they're doing&lt;/li&gt;
&lt;li&gt;Swapping providers requires actual code changes&lt;/li&gt;
&lt;li&gt;Someone from legal or security or finance asked a question you couldn't answer&lt;/li&gt;
&lt;li&gt;You've had an API key accidentally committed to a public repo (or almost did)&lt;/li&gt;
&lt;li&gt;You can't answer "what did we spend on AI last month, by team?" without going on a scavenger hunt through billing dashboards&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point is genuinely the biggest tell. If someone asks that question and you have to go digging, you already needed this.&lt;/p&gt;

&lt;h3&gt;
  
  
  What actually pushes teams over the edge
&lt;/h3&gt;

&lt;p&gt;It's never one thing. It's always a pile of smaller things that suddenly feel heavy together.&lt;/p&gt;

&lt;p&gt;DevOps realizes they can't track spend because keys are everywhere. Someone commits a key to a public repo. A team uses GPT-4 Turbo for tasks that GPT-4 Mini handles just fine, and you find out after they've burned $2K. Compliance asks for an audit trail and you have nothing.&lt;/p&gt;

&lt;p&gt;Each of those individually, fine, you deal with it. All of them stacking up at the same time? That's when the "simple" setup reveals it was never actually simple. You were just deferring the complexity.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a production gateway actually looks like
&lt;/h2&gt;

&lt;p&gt;Okay enough talking around it. Here's what it gives you in practice, using TrueFoundry as the concrete example.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7wqj5il9c2jcll6z8a9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7wqj5il9c2jcll6z8a9.png" alt="TrueFoundry MainPage" width="800" height="497"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One API key across all providers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpzba6qd46w4cqyrq3nse.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpzba6qd46w4cqyrq3nse.png" alt="Model Unify" width="800" height="331"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your teams stop touching raw OpenAI or Anthropic keys entirely. One key, routed through the gateway, with access to every approved model. Rotate it in one place. Done.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-team budgets with real enforcement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh4l0e4t4h7tccj3rwm7r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh4l0e4t4h7tccj3rwm7r.png" alt="team" width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Not "we log it and send you a Slack alert." Actual hard limits. Team hits their monthly budget, the next request gets rejected with a clear error. No surprise bills, no awkward retros about where the spend went.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Automatic failover&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;OpenAI goes down. It happens. Your app doesn't go down with it because requests automatically route to Anthropic or your self-hosted model. No code changes. No one gets paged. It just keeps working.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full request tracing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhlrnranv1rkxfvv22jyt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhlrnranv1rkxfvv22jyt.png" alt="request tracing" width="800" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every prompt, every response, every token count, every cost attribution. Logged and queryable. Pull a request from six months ago and reconstruct exactly what happened. This feature alone has saved me more debugging time than I can measure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Guardrails that actually run everywhere&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42qwsft6nxn4x574gv87.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42qwsft6nxn4x574gv87.png" alt="Guardrails" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;PII filtering, prompt injection detection, custom output policies. You define the rule once and it applies across every team and every model. No per-team implementation, no "oops we forgot to add the check in this service."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Runs inside your own environment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;VPC, on-prem, air-gapped. Data doesn't leave your infra. SOC 2, HIPAA, GDPR compliant. If your compliance team has ever asked "but where does the data actually go," this is finally a clean answer.&lt;/p&gt;

&lt;p&gt;Performance-wise, it handles 350+ RPS on a single vCPU with sub-3ms latency, so you're not adding meaningful overhead to your request path.&lt;/p&gt;

&lt;p&gt;TrueFoundry is in the 2026 Gartner Market Guide for AI Gateways and processes 10B+ requests per month for companies like Siemens Healthineers, NVIDIA, Resmed, and Automation Anywhere. I mention that not as a flex but to give a sense of scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  The question that actually helped me decide
&lt;/h2&gt;

&lt;p&gt;Forget "do I need an AI gateway."&lt;/p&gt;

&lt;p&gt;Ask this instead: when does the cost of NOT having one start to exceed the cost of setting one up?&lt;/p&gt;

&lt;p&gt;For most teams that crossover happens way earlier than expected. For us it wasn't one event. It was the accumulation. The audit trail we didn't have. The $1,400 bill nobody could explain. The near-miss with a key in a public repo.&lt;/p&gt;

&lt;p&gt;Setting up TrueFoundry honestly took less time than the post-mortem meeting for that billing incident.&lt;/p&gt;




&lt;p&gt;Try TrueFoundry free at &lt;strong&gt;&lt;a href="https://truefoundry.com" rel="noopener noreferrer"&gt;truefoundry.com&lt;/a&gt;&lt;/strong&gt; (no credit card required, deploys on your cloud in under 10 minutes).&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What does your current setup look like? Still on raw SDK calls or have you already hit the wall? Drop a comment, genuinely curious where people are when they start asking this question.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>beginners</category>
      <category>performance</category>
    </item>
    <item>
      <title>The Great Claude Code Leak of 2026: Accident, Incompetence, or the Best PR Stunt in AI History?</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Wed, 01 Apr 2026 02:29:18 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/the-great-claude-code-leak-of-2026-accident-incompetence-or-the-best-pr-stunt-in-ai-history-3igm</link>
      <guid>https://forem.com/varshithvhegde/the-great-claude-code-leak-of-2026-accident-incompetence-or-the-best-pr-stunt-in-ai-history-3igm</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; On March 31, 2026, Anthropic accidentally shipped the &lt;em&gt;entire source code&lt;/em&gt; of Claude Code to the public npm registry via a single misconfigured debug file. 512,000 lines. 1,906 TypeScript files. 44 hidden feature flags. A Tamagotchi pet. And one very uncomfortable question: was it really an accident?&lt;/p&gt;

&lt;h2&gt;
  
  
  1. What Actually Happened
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Root Cause: One Missing Line in &lt;code&gt;.npmignore&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is both the most embarrassing and most instructive part of the story. Let me walk through the technical chain of events.&lt;/p&gt;

&lt;p&gt;When you publish a JavaScript/TypeScript package to npm, your build toolchain (Webpack, esbuild, Bun, etc.) optionally generates &lt;strong&gt;source map files&lt;/strong&gt;, which have a &lt;code&gt;.map&lt;/code&gt; extension. Their entire purpose is debugging: they bridge the gap between the minified, bundled production code and your original readable source. When a crash happens, a source map lets the stack trace point to your actual TypeScript file at line 47 rather than &lt;code&gt;main.js:1:284729&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Source maps are strictly for internal debugging. They should never ship to users.&lt;/p&gt;

&lt;p&gt;The way you exclude them from npm packages is with an &lt;code&gt;.npmignore&lt;/code&gt; file, or a &lt;code&gt;files&lt;/code&gt; field in &lt;code&gt;package.json&lt;/code&gt;. Here's the mistake in plain English:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# What Claude Code's .npmignore should have had:&lt;/span&gt;
&lt;span class="k"&gt;*&lt;/span&gt;.map
dist/&lt;span class="k"&gt;*&lt;/span&gt;.map

&lt;span class="c"&gt;# What it apparently had:&lt;/span&gt;
&lt;span class="c"&gt;# (nothing about .map files)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. That's the whole disaster.&lt;/p&gt;

&lt;p&gt;But it gets worse. The source map didn't contain the source code directly. It &lt;em&gt;referenced&lt;/em&gt; it, pointing to a URL of a &lt;code&gt;.zip&lt;/code&gt; file hosted on Anthropic's own Cloudflare R2 storage bucket. A publicly accessible one, with no authentication required.&lt;/p&gt;

&lt;p&gt;So the full chain looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;npm install @anthropic-ai/claude-code
  → downloads package including main.js.map (59.8 MB)
    → .map file contains URL pointing to src.zip
      → src.zip is hosted publicly on Anthropic's R2 bucket
        → anyone can download and unzip 512,000 lines of TypeScript
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two separate configuration failures, stacked on top of each other.&lt;/p&gt;

&lt;p&gt;As software engineer Gabriel Anhaia put it in his &lt;a href="https://dev.to/gabrielanhaia/claude-codes-entire-source-code-was-just-leaked-via-npm-source-maps-heres-whats-inside-cjo"&gt;deep dive&lt;/a&gt;: "A single misconfigured &lt;code&gt;.npmignore&lt;/code&gt; or &lt;code&gt;files&lt;/code&gt; field in &lt;code&gt;package.json&lt;/code&gt; can expose everything."&lt;/p&gt;
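&lt;p&gt;The defensive pattern here is an allowlist rather than a blocklist: a &lt;code&gt;files&lt;/code&gt; field in &lt;code&gt;package.json&lt;/code&gt; ships only what you explicitly name, so a stray &lt;code&gt;.map&lt;/code&gt; can never sneak into a publish. A minimal sketch (the package name and globs are illustrative):&lt;/p&gt;

```json
{
  "name": "@example/cli",
  "version": "1.0.0",
  "files": [
    "dist/**/*.js",
    "README.md"
  ]
}
```

&lt;p&gt;Running &lt;code&gt;npm pack --dry-run&lt;/code&gt; before publishing prints the exact file list that will ship, which would have surfaced the 59.8 MB source map immediately.&lt;/p&gt;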

&lt;h3&gt;
  
  
  The Bun Factor
&lt;/h3&gt;

&lt;p&gt;There's a third layer. Anthropic acquired the &lt;strong&gt;Bun JavaScript runtime&lt;/strong&gt; at the end of 2025, and Claude Code is built on top of it. A known Bun bug (&lt;a href="https://github.com/oven-sh/bun/issues/28001" rel="noopener noreferrer"&gt;issue #28001&lt;/a&gt;, filed on March 11, 2026) reports that source maps are served in production builds even when the documentation says they shouldn't be.&lt;/p&gt;

&lt;p&gt;The bug was open for 20 days before this happened. Nobody caught it. Anthropic's own acquired toolchain contributed to exposing Anthropic's own product.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. The Timeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;00:21 UTC — March 31, 2026
Malicious axios versions (1.14.1 / 0.30.4) appear on npm
with an embedded Remote Access Trojan. Unrelated to Anthropic,
but catastrophically bad timing.

~04:00 UTC
Claude Code v2.1.88 is pushed to npm. The 59.8 MB source map
ships with it. The R2 bucket containing all source code is live
and publicly accessible.

04:23 UTC
Chaofan Shou (@Fried_rice), an intern at Solayer Labs,
tweets the discovery with a direct download link.
16 million people descend on the thread.

Next 2 hours
GitHub repositories spring up. The fastest repo in history
to hit 50,000 stars does it in under 2 hours.
41,500+ forks proliferate. DMCA requests begin.

~08:00 UTC
Anthropic pulls the npm package from the registry.
Issues the "human error, not a security breach" statement
to VentureBeat, The Register, CNBC, Fortune, Axios, Decrypt.

Same day
A Python clean-room rewrite appears, legally DMCA-proof.
Decentralized mirrors on Gitlawb go live with the message:
"Will never be taken down."
The code is permanently in the wild.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  By the Numbers
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lines of code exposed&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;512,000+&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TypeScript files&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1,906&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source map file size&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;59.8 MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub forks (peak)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;41,500+&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stars on fastest repo&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;50,000 in 2 hours&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hidden feature flags&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;44&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code ARR&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$2.5 billion&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic total ARR&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$19 billion&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Views on original tweet&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;16 million&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  3. SECURITY ALERT: The axios RAT
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Stop. Read this before anything else if you updated Claude Code that morning.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Coinciding with the leak, but entirely unrelated to it, was a real supply chain attack on npm. Malicious versions of the widely used &lt;code&gt;axios&lt;/code&gt; HTTP library were published:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;axios@1.14.1&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;axios@0.30.4&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both versions pull in a malicious dependency called &lt;code&gt;plain-crypto-js&lt;/code&gt;, which embeds a &lt;strong&gt;Remote Access Trojan (RAT)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you ran &lt;code&gt;npm install&lt;/code&gt; or updated Claude Code between 00:21 UTC and 03:29 UTC on March 31, 2026:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check your lockfiles immediately:&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"1.14.1&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;0.30.4&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;plain-crypto-js"&lt;/span&gt; package-lock.json
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"1.14.1&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;0.30.4&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;plain-crypto-js"&lt;/span&gt; yarn.lock
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"1.14.1&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;0.30.4&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;plain-crypto-js"&lt;/span&gt; bun.lockb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you find a match:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Treat the machine as fully compromised&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Rotate all credentials, API keys, and secrets immediately&lt;/li&gt;
&lt;li&gt;Perform a clean OS reinstallation&lt;/li&gt;
&lt;li&gt;File incident reports for any organizational data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Going forward, Anthropic has designated the &lt;strong&gt;Native Installer&lt;/strong&gt; as the recommended installation method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://claude.ai/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The native installer uses a standalone binary that doesn't rely on the npm dependency chain.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. What Was Inside: The Full Breakdown
&lt;/h2&gt;

&lt;p&gt;The leaked codebase is the &lt;code&gt;src/&lt;/code&gt; directory of Claude Code, the "agentic harness" that wraps the underlying Claude model and gives it the ability to use tools, manage files, run bash commands, and orchestrate multi-agent workflows. This is not the model weights (those weren't exposed), but in many ways this is &lt;em&gt;more&lt;/em&gt; strategically valuable.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Architecture
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Tool System (~40 tools, ~29,000 lines)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Claude Code isn't a chat wrapper. It's a plugin-style architecture where every capability is a discrete, permission-gated tool:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;BashTool&lt;/code&gt; — shell command execution with safety guards&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;FileReadTool&lt;/code&gt;, &lt;code&gt;FileWriteTool&lt;/code&gt;, &lt;code&gt;FileEditTool&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;WebFetchTool&lt;/code&gt; — live web access&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LSPTool&lt;/code&gt; — Language Server Protocol integration for IDE features&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GlobTool&lt;/code&gt;, &lt;code&gt;GrepTool&lt;/code&gt; — codebase search&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;NotebookReadTool&lt;/code&gt;, &lt;code&gt;NotebookEditTool&lt;/code&gt; — Jupyter support&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MultiEditTool&lt;/code&gt; — atomic multi-file edits&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;TodoReadTool&lt;/code&gt;, &lt;code&gt;TodoWriteTool&lt;/code&gt; — task tracking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each tool has its own permission model, validation logic, and output formatting; together, the tool definitions account for roughly 29,000 lines.&lt;/p&gt;
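
&lt;p&gt;As a rough illustration of that plugin pattern (my own sketch of the behavior described above; the interface, names, and dispatch logic are assumptions, not the leaked definitions):&lt;br&gt;
&lt;/p&gt;

```typescript
// Hypothetical sketch of a permission-gated tool interface,
// based only on the behavior described in the article.
interface ToolResult {
  ok: boolean;
  output: string;
}

interface Tool {
  name: string;
  // Each tool validates its own input before anything runs.
  validate(input: string): boolean;
  // Each tool declares whether the permission layer must approve the call.
  requiresPermission(input: string): boolean;
  run(input: string): ToolResult;
}

class GrepTool implements Tool {
  name = "GrepTool";
  validate(input: string): boolean {
    return input.length > 0;
  }
  requiresPermission(_input: string): boolean {
    return false; // read-only search is auto-approved in this sketch
  }
  run(input: string): ToolResult {
    return { ok: true, output: `searched for ${input}` };
  }
}

// A single dispatcher enforces the same validate/permit/run sequence
// for every tool, which is what makes the design plugin-style.
function dispatch(tool: Tool, input: string, approved: boolean): ToolResult {
  if (!tool.validate(input)) {
    return { ok: false, output: "invalid input" };
  }
  if (tool.requiresPermission(input)) {
    if (!approved) {
      return { ok: false, output: "permission denied" };
    }
  }
  return tool.run(input);
}
```

&lt;p&gt;The point of the shape is that a destructive tool like &lt;code&gt;BashTool&lt;/code&gt; would simply return &lt;code&gt;true&lt;/code&gt; from &lt;code&gt;requiresPermission&lt;/code&gt;, and the dispatcher, not the tool, enforces the gate.&lt;/p&gt;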

&lt;p&gt;&lt;strong&gt;The Query Engine (46,000 lines)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Gabriel Anhaia's &lt;a href="https://dev.to/gabrielanhaia"&gt;analysis&lt;/a&gt; labels this "the brain of the operation." It handles all LLM API calls and response streaming, token caching and context management, multi-agent orchestration, and retry logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Memory Architecture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is what competitors will study most carefully. Anthropic built a solution to "context entropy," the tendency for long-running AI sessions to degrade into hallucination as the context grows. Their answer is a three-layer memory system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Layer 1: MEMORY.md
  → A lightweight index of pointers (~150 chars per entry)
  → Always loaded in context
  → Stores LOCATIONS, not data

Layer 2: Topic Files
  → Actual project knowledge, fetched on-demand
  → Never fully in context simultaneously

Layer 3: Raw Transcripts
  → Never re-read fully
  → Only grep'd for specific identifiers when needed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight is what they call &lt;strong&gt;Strict Write Discipline&lt;/strong&gt;. The agent can only update its memory index after a confirmed successful file write. This prevents the agent from polluting its context with failed attempts. The agent also treats its own memory as a "hint" and verifies facts against the actual codebase before acting, rather than trusting its stored beliefs.&lt;/p&gt;
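
&lt;p&gt;In sketch form, the discipline amounts to something like this (my own reconstruction of the described rule, not the leaked implementation):&lt;br&gt;
&lt;/p&gt;

```typescript
// Sketch of "Strict Write Discipline": the memory index is only
// updated after a file write is confirmed to have succeeded.
// Names and structure are illustrative, not from the leaked source.
interface MemoryIndex {
  [topic: string]: string; // topic points at a file LOCATION, not the data
}

function writeTopicFile(path: string, content: string): boolean {
  // Stand-in for a real file write; returns whether the write succeeded.
  return content.length > 0;
}

function recordMemory(
  index: MemoryIndex,
  topic: string,
  path: string,
  content: string
): boolean {
  const ok = writeTopicFile(path, content);
  if (!ok) {
    // Failed writes never touch the index, so failed attempts
    // cannot pollute future context.
    return false;
  }
  index[topic] = path; // store the pointer only (~150 chars), never the data
  return true;
}
```

&lt;p&gt;The index stays tiny because it holds pointers, and it stays trustworthy because nothing lands in it without a confirmed write.&lt;/p&gt;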




&lt;h2&gt;
  
  
  5. Hidden Features Anthropic Never Meant to Ship
&lt;/h2&gt;

&lt;h3&gt;
  
  
  KAIROS: Always-On Autonomous Agent
&lt;/h3&gt;

&lt;p&gt;KAIROS (from the Ancient Greek for "the right moment") is mentioned 150+ times in the source. It's an unreleased always-on background daemon mode: it runs sessions while you're idle and executes a nightly memory-consolidation process called &lt;code&gt;autoDream&lt;/code&gt; that merges disparate observations, removes logical contradictions, and converts vague insights into verified facts. It also has a special &lt;code&gt;Brief&lt;/code&gt; output mode designed for a persistent assistant, plus access to tools regular Claude Code doesn't have.&lt;/p&gt;

&lt;p&gt;Think of it as Claude Code actively maintaining its understanding of your project while you sleep, not just sitting there waiting.&lt;/p&gt;

&lt;h3&gt;
  
  
  ULTRAPLAN: 30-Minute Remote Planning Sessions
&lt;/h3&gt;

&lt;p&gt;ULTRAPLAN offloads a complex planning task to a remote Cloud Container Runtime (CCR) session running Opus, gives it up to 30 minutes to think, and lets you approve the result from your phone or browser. When approved, a special sentinel value &lt;code&gt;__ULTRAPLAN_TELEPORT_LOCAL__&lt;/code&gt; brings the result back to your local terminal. Remote cloud-powered reasoning, delivered locally.&lt;/p&gt;

&lt;h3&gt;
  
  
  Coordinator Mode: Multi-Agent Orchestration
&lt;/h3&gt;

&lt;p&gt;One Claude spawning and managing multiple worker Claude agents in parallel. The Coordinator handles task distribution, result aggregation, and conflicts between worker outputs. It's infrastructure for AI teams, not just AI assistants.&lt;/p&gt;

&lt;h3&gt;
  
  
  BUDDY: The Part Nobody Expected
&lt;/h3&gt;

&lt;p&gt;The most talked-about find, not for its strategic implications but because it's genuinely fun.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;buddy/companion.ts&lt;/code&gt; implements a full Tamagotchi-style AI pet that lives in a speech bubble next to your terminal input.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Species (18 total, hidden via String.fromCharCode() arrays):
duck, dragon, axolotl, capybara, mushroom, ghost, nebulynx...

Rarity tiers:
Common &amp;gt; Uncommon &amp;gt; Rare &amp;gt; Epic &amp;gt; Legendary
1% shiny chance, independent of rarity

Stats:
DEBUGGING / PATIENCE / CHAOS / WISDOM / SNARK

Determined by:
Mulberry32 PRNG seeded from your userId hash + salt 'friend-2026-401'
(Same user always gets the same buddy species -- deterministic)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude generates a custom name and personality description for your buddy on first hatch. There are sprite animations and a floating heart effect. The planned rollout window in the source code: &lt;strong&gt;April 1-7, 2026&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Someone at Anthropic is clearly having a very good time.&lt;/p&gt;
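
&lt;p&gt;For the curious, Mulberry32 is a tiny, well-known public-domain PRNG, and seeding it from a stable user hash is exactly why the buddy is deterministic. A sketch of the idea (the species list is abbreviated and the hashing scheme is my guess; only the PRNG and the salt string come from the leak):&lt;br&gt;
&lt;/p&gt;

```typescript
// Mulberry32: a small 32-bit PRNG. Seeding it with a stable hash of
// userId + salt means the same user always rolls the same buddy.
function mulberry32(seed: number): () => number {
  return function (): number {
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Illustrative string hash (not Anthropic's); any stable hash works.
function hashString(s: string): number {
  let h = 0;
  for (const ch of s) {
    h = (Math.imul(h, 31) + ch.charCodeAt(0)) | 0;
  }
  return h >>> 0;
}

// Abbreviated species list; the leak describes 18.
const SPECIES = ["duck", "dragon", "axolotl", "capybara", "ghost"];

function buddyFor(userId: string): string {
  // Salt value taken from the article.
  const rng = mulberry32(hashString(userId + "friend-2026-401"));
  return SPECIES[Math.floor(rng() * SPECIES.length)];
}
```

&lt;p&gt;No server round-trip needed: the determinism falls out of seeding alone.&lt;/p&gt;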

&lt;h3&gt;
  
  
  Anti-Distillation: Poisoning Competitor Training Data
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;claude.ts&lt;/code&gt; (lines 301-313), a flag called &lt;code&gt;ANTI_DISTILLATION_CC&lt;/code&gt;, when enabled, sends &lt;code&gt;anti_distillation: ['fake_tools']&lt;/code&gt; in API requests. This tells the server to inject decoy tool definitions into the system prompt. The idea: if a competitor is recording Claude Code's API traffic to train their own model, the fake tool definitions corrupt that training data.&lt;/p&gt;
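
&lt;p&gt;Based on that description, the client-side half is presumably little more than a conditional field on the request body. A rough sketch (the flag name and field value come from the article; everything else here is an assumption):&lt;br&gt;
&lt;/p&gt;

```typescript
// Illustrative only: how a feature flag like ANTI_DISTILLATION_CC
// might gate an extra field on outgoing API requests.
interface ApiRequest {
  model: string;
  messages: string[];
  anti_distillation?: string[];
}

function buildRequest(messages: string[], antiDistillationEnabled: boolean): ApiRequest {
  const req: ApiRequest = { model: "claude-sonnet", messages };
  if (antiDistillationEnabled) {
    // Per the article, the server responds by injecting decoy
    // tool definitions into the system prompt.
    req.anti_distillation = ["fake_tools"];
  }
  return req;
}
```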

&lt;p&gt;There's a second mechanism in &lt;code&gt;betas.ts&lt;/code&gt; (lines 279-298): server-side connector-text summarization. When enabled, the API buffers the assistant's reasoning between tool calls, returns only summaries, and cryptographically signs them. Competitors recording traffic get the summaries, not the full reasoning chain.&lt;/p&gt;

&lt;p&gt;As Alex Kim &lt;a href="https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/" rel="noopener noreferrer"&gt;notes in his analysis&lt;/a&gt;: "Anyone serious about distilling from Claude Code traffic would find the workarounds in about an hour of reading the source. The real protection is probably legal, not technical."&lt;/p&gt;

&lt;h3&gt;
  
  
  Frustration Detection via Regex
&lt;/h3&gt;

&lt;p&gt;Found in &lt;code&gt;userPromptKeywords.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nf"&gt;b&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;wtf&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;wth&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;ffs&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;omfg&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nf"&gt;shit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ty&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;tiest&lt;/span&gt;&lt;span class="p"&gt;)?&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;dumbass&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;horrible&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;awful&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="nf"&gt;piss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ed&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;ing&lt;/span&gt;&lt;span class="p"&gt;)?&lt;/span&gt; &lt;span class="nx"&gt;off&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;piece&lt;/span&gt; &lt;span class="k"&gt;of &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;shit&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;crap&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;junk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;what&lt;/span&gt; &lt;span class="nf"&gt;the &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fuck&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;hell&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="nx"&gt;fucking&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;broken&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;useless&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;terrible&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;awful&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;horrible&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;fuck&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="nf"&gt;screw &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;you&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;so&lt;/span&gt; &lt;span class="nx"&gt;frustrating&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt; &lt;span class="nx"&gt;sucks&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;damn&lt;/span&gt; &lt;span class="nx"&gt;it&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A multi-billion-dollar AI company is detecting user frustration with a regex. The Hacker News thread lost it. To be fair though, it's faster, cheaper, and more predictable than running an LLM inference every time to check if the user is angry at the tool.&lt;/p&gt;
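
&lt;p&gt;A trimmed-down version of the same idea, for anyone curious how cheap this check really is (simplified from the leaked pattern, not a verbatim copy):&lt;br&gt;
&lt;/p&gt;

```typescript
// Simplified frustration detector in the spirit of userPromptKeywords.ts.
// The real pattern is much longer; this keeps a few representative branches.
const FRUSTRATION = /\b(wtf|ffs|so frustrating|this sucks|piece of (shit|crap|junk))\b/i;

function seemsFrustrated(prompt: string): boolean {
  return FRUSTRATION.test(prompt);
}
```

&lt;p&gt;One regex test per prompt, microseconds of CPU, zero extra tokens. That's the trade-off the Hacker News crowd was laughing at, and it's not obviously wrong.&lt;/p&gt;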

&lt;h3&gt;
  
  
  250,000 Wasted API Calls Per Day
&lt;/h3&gt;

&lt;p&gt;The most candid internal admission in the entire codebase. From &lt;code&gt;autoCompact.ts&lt;/code&gt; (lines 68-70):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures 
(up to 3,272) in a single session, wasting ~250K API calls/day globally."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix was three lines: &lt;code&gt;MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3&lt;/code&gt;. After 3 consecutive compaction failures, it just stops trying. Sometimes good engineering is knowing when to give up.&lt;/p&gt;
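
&lt;p&gt;The described fix is a textbook give-up counter. Something like this (my reconstruction of the idea, not the actual diff):&lt;br&gt;
&lt;/p&gt;

```typescript
// Sketch of the consecutive-failure cutoff described above.
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

let consecutiveFailures = 0;

function shouldAttemptCompaction(): boolean {
  // Stop trying once the cap is hit; no backoff, just give up.
  if (consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
    return false;
  }
  return true;
}

function recordCompactionResult(succeeded: boolean): void {
  if (succeeded) {
    consecutiveFailures = 0; // any success resets the counter
  } else {
    consecutiveFailures += 1;
  }
}
```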




&lt;h2&gt;
  
  
  6. The "Capybara" Model Confirmed
&lt;/h2&gt;

&lt;p&gt;The leak didn't expose Claude's model weights, but it did expose multiple references to Anthropic's next major model family. Internal codenames: &lt;strong&gt;Capybara&lt;/strong&gt; (also referred to as &lt;strong&gt;Mythos&lt;/strong&gt; in a separate leaked document from the prior week).&lt;/p&gt;

&lt;p&gt;The beta flags in the source reference specific API version strings for Capybara, suggesting it's well beyond concept stage. Security researcher Roy Paz from LayerX Security, who reviewed the code for Fortune, indicated it will likely ship in fast and slow variants with a significantly larger context window than anything currently on the market.&lt;/p&gt;

&lt;p&gt;These references also confirmed the existence of &lt;code&gt;undercover.ts&lt;/code&gt;, a module that actively instructs Claude Code never to mention internal codenames like "Capybara" or "Tengu" when used in external repositories. There's a hard-coded &lt;code&gt;NO force-OFF&lt;/code&gt; rule: you can force Undercover Mode on, but you cannot force it off. In external builds, the function gets dead-code-eliminated entirely.&lt;/p&gt;

&lt;p&gt;The implication raised in the &lt;a href="https://news.ycombinator.com/item?id=47584540" rel="noopener noreferrer"&gt;Hacker News thread&lt;/a&gt;: AI-authored commits from Anthropic employees in open source repos will have no indication an AI wrote them. The tool actively conceals its own involvement.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Alternative Theory: Was This Anthropic's PR Play?
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;I'm not saying I believe this. I'm saying the circumstantial evidence is strange enough that it deserves to be stated clearly.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Anthropic is the self-proclaimed "safety-first AI lab." They're racing for developer mindshare against OpenAI (better brand) and Google (better distribution). Claude Code is their breakout product. They're preparing for an IPO. And they'd just made themselves unpopular with the developer community ten days earlier by sending legal threats to OpenCode for using their internal APIs.&lt;/p&gt;

&lt;p&gt;So let's look at what this "leak" actually did for Anthropic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit A: The April Fools' Timing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The leak occurred on March 31, the day before April 1st. The Buddy/companion system had a planned rollout window of April 1-7 coded directly into the source. The "leak" gave developers a sneak peek at what was about to launch anyway. Was this a controlled preview dressed up as an accident?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit B: The Bun Bug Nobody Fixed&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Anthropic acquired Bun. They own the runtime. The bug causing source maps to ship in production was filed 20 days before the leak and was still open. If you own the runtime and its bug tracker, and that bug causes your own code to leak... why hadn't anyone internally marked it as critical?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit C: The Undercover Mode Irony&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Claude Code has an entire subsystem called Undercover Mode, purpose-built to prevent internal codenames from leaking through AI-generated content. They built AI-powered leak prevention into the product. Then humans accidentally shipped the entire source code. The gap between their AI safety engineering and their human release engineering is either tragic or theatrical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit D: The OpenCode Reputation Reversal&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ten days before the leak, Anthropic sent cease-and-desist letters to OpenCode, a popular third-party tool. The developer community was furious. The narrative was "Anthropic is acting like a gatekeeping megacorp."&lt;/p&gt;

&lt;p&gt;Then a "leak" happens that shows Anthropic's impressive engineering to the world, makes them look like the underdog, generates three days of breathless coverage about KAIROS, BUDDY, and ULTRAPLAN, and completely reverses developer sentiment. Within 48 hours, developers went from "Anthropic sucks" to "holy shit look what Anthropic is building."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit E: The Permanent Mirror Problem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Anthropic filed DMCA takedowns. GitHub complied immediately. But the decentralized mirror at Gitlawb, with a public message saying "Will never be taken down," has been live since day one. Anthropic has a legal team, deep pockets, and relationships. A serious legal effort could make life difficult for every mirror operator. They chose not to go that hard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit F: The "Second Leak in a Week" Pattern&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This wasn't Anthropic's first incident that week. A draft blog post about the Capybara/Mythos model had "accidentally" been publicly accessible just days before, as Fortune reported on Thursday. Two high-profile "leaks" in five days, both generating enormous excitement about Anthropic's upcoming roadmap, both very conveniently timed.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Counter-Arguments (Why It's Probably Just Incompetence)
&lt;/h3&gt;

&lt;p&gt;To be fair:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strategic roadmap exposure is genuinely damaging.&lt;/strong&gt; Cursor, Copilot, and Windsurf now know exactly what Anthropic has already built and what's nearly ready to ship. That's real competitive intelligence permanently in the public domain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The IPO narrative cuts both ways.&lt;/strong&gt; "We shipped our source code to npm" is not a line you want in your S-1.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The axios RAT timing.&lt;/strong&gt; Nobody would engineer a PR stunt to overlap with an active malware attack on npm. That part made a bad news day significantly worse for anyone who updated Claude Code that morning, and there's no upside to being associated with a supply chain attack.&lt;/p&gt;

&lt;p&gt;The most likely answer is plain human error. A misconfigured &lt;code&gt;.npmignore&lt;/code&gt;. A known Bun bug nobody had marked as critical. A public R2 bucket that should have been private. Three configuration failures that compounded into a disaster.&lt;/p&gt;

&lt;p&gt;The PR outcome though? Undeniably good. The strategic damage? Real but survivable. The timing? Genuinely strange.&lt;/p&gt;

&lt;p&gt;Draw your own conclusions.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Why DMCA Won't Fix This
&lt;/h2&gt;

&lt;p&gt;DMCA takedowns work on centralized platforms. GitHub complied within hours. But the code spread to places that are harder to reach.&lt;/p&gt;

&lt;p&gt;Gitlawb, with its explicit "Will never be taken down" message, operates outside the DMCA's practical reach. The Python port that appeared the same day was &lt;a href="https://decrypt.co/362917/anthropic-accidentally-leaked-claude-code-source-internet-keeping-forever" rel="noopener noreferrer"&gt;declared DMCA-proof&lt;/a&gt; by The Pragmatic Engineer's Gergely Orosz, who noted the rewrite is a new creative work that violates no copyright. There's also the AI copyright question: Anthropic's own CEO has implied that significant portions of Claude Code were written by Claude. The DC Circuit upheld in March 2025 that AI-generated work doesn't carry automatic copyright. If Anthropic's copyright claim over Claude-authored code is legally murky, the entire takedown strategy weakens.&lt;/p&gt;

&lt;p&gt;And then there are torrents. Content once on the internet at scale doesn't come back.&lt;/p&gt;

&lt;p&gt;The practical reality: 512,000 lines of Claude Code are permanently in the wild, regardless of what any court decides.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. What This Means For You
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you're using Claude Code:&lt;/strong&gt; Update immediately past v2.1.88 and use the native installer going forward (&lt;code&gt;curl -fsSL https://claude.ai/install.sh | bash&lt;/code&gt;). If you updated via npm between 00:21 and 03:29 UTC on March 31, do the axios/RAT check above.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're building AI coding tools:&lt;/strong&gt; The leaked source is now the most detailed public documentation of how to build a production-grade AI agent harness that exists. The three-layer memory architecture, the permission system, the tool plugin design, the multi-agent coordination patterns. It's all there, already analyzed by thousands of developers. The bar for what "production-grade" means just got documented in detail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're at Anthropic:&lt;/strong&gt; The code is out. KAIROS, ULTRAPLAN, and BUDDY are already built. Ship them. The community already knows they're coming. Turn the leak into a launch.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. Lessons for Every Dev Team
&lt;/h2&gt;

&lt;p&gt;This incident is a clear example of how release pipeline failures compound. Regardless of your opinion on Anthropic, every team should run through this checklist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Audit your .npmignore / package.json "files" field&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; .npmignore
&lt;span class="c"&gt;# Do you explicitly exclude *.map, dist/*.map, *.d.ts.map?&lt;/span&gt;

&lt;span class="c"&gt;# 2. Check if source maps ship in your production build&lt;/span&gt;
&lt;span class="nb"&gt;ls &lt;/span&gt;dist/ | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;map$"&lt;/span&gt;
&lt;span class="c"&gt;# If you see anything: your bundler config needs review&lt;/span&gt;

&lt;span class="c"&gt;# 3. Audit your cloud storage permissions&lt;/span&gt;
&lt;span class="c"&gt;# Are any buckets referenced in your build artifacts publicly accessible?&lt;/span&gt;

&lt;span class="c"&gt;# 4. Check your build toolchain for known bugs&lt;/span&gt;
&lt;span class="c"&gt;# If you're on Bun, check issue #28001 status&lt;/span&gt;

&lt;span class="c"&gt;# 5. Review your npm publish workflow&lt;/span&gt;
npm pack &lt;span class="nt"&gt;--dry-run&lt;/span&gt;
&lt;span class="c"&gt;# Review EVERY file that would be published before actually publishing&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The line that came out of the Hacker News thread: &lt;strong&gt;"Your .npmignore is load-bearing. Treat it like a security boundary."&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Here's what we know for certain: a misconfigured &lt;code&gt;.npmignore&lt;/code&gt; and a public cloud storage bucket exposed 512,000 lines of Claude Code, the code spread instantly and is now permanently in the wild, the leak revealed a technically impressive product with a compelling feature roadmap, and Anthropic's brand among developers bounced back remarkably fast.&lt;/p&gt;

&lt;p&gt;What we'll probably never know: whether anyone inside Anthropic saw the Bun bug and made a judgment call, whether the April Fools' timing of the BUDDY rollout was coincidence, and whether Anthropic's relative restraint on DMCA enforcement is legal strategy or resource allocation.&lt;/p&gt;

&lt;p&gt;What's not in question is that the engineering inside Claude Code is genuinely impressive. The memory architecture, the anti-distillation mechanisms, the multi-agent coordination, the DRM-at-the-HTTP-layer attestation. This is a serious piece of software doing things that are actually hard.&lt;/p&gt;

&lt;p&gt;Accident or not, the world now knows what Anthropic is capable of building.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;And maybe that was the point.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Alex Kim's technical deep-dive&lt;/td&gt;
&lt;td&gt;&lt;a href="https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/" rel="noopener noreferrer"&gt;alex000kim.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VentureBeat — Full breakdown + axios RAT warning&lt;/td&gt;
&lt;td&gt;&lt;a href="https://venturebeat.com/technology/claude-codes-source-code-appears-to-have-leaked-heres-what-we-know" rel="noopener noreferrer"&gt;venturebeat.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The Register — Anthropic's official statement&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.theregister.com/2026/03/31/anthropic_claude_code_source_code/" rel="noopener noreferrer"&gt;theregister.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fortune — Strategic analysis + Capybara confirmation&lt;/td&gt;
&lt;td&gt;&lt;a href="https://fortune.com/2026/03/31/anthropic-source-code-claude-code-data-leak-second-security-lapse-days-after-accidentally-revealing-mythos/" rel="noopener noreferrer"&gt;fortune.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Decrypt — DMCA analysis + permanent mirror situation&lt;/td&gt;
&lt;td&gt;&lt;a href="https://decrypt.co/362917/anthropic-accidentally-leaked-claude-code-source-internet-keeping-forever" rel="noopener noreferrer"&gt;decrypt.co&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CNBC — Revenue figures + company response&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.cnbc.com/2026/03/31/anthropic-leak-claude-code-internal-source.html" rel="noopener noreferrer"&gt;cnbc.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Axios — Feature flag breakdown + roadmap analysis&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.axios.com/2026/03/31/anthropic-leaked-source-code-ai" rel="noopener noreferrer"&gt;axios.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DEV.to (Gabriel Anhaia) — Architecture walkthrough&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/gabrielanhaia/claude-codes-entire-source-code-was-just-leaked-via-npm-source-maps-heres-whats-inside-cjo"&gt;dev.to&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kuberwastaken/claude-code GitHub&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/Kuberwastaken/claude-code" rel="noopener noreferrer"&gt;github.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hacker News thread&lt;/td&gt;
&lt;td&gt;&lt;a href="https://news.ycombinator.com/item?id=47584540" rel="noopener noreferrer"&gt;news.ycombinator.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bun bug #28001&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/oven-sh/bun/issues/28001" rel="noopener noreferrer"&gt;github.com/oven-sh/bun&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CyberSecurityNews — Supply chain attack details&lt;/td&gt;
&lt;td&gt;&lt;a href="https://cybersecuritynews.com/claude-code-source-code-leaked/" rel="noopener noreferrer"&gt;cybersecuritynews.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;If this was useful, drop a reaction. If you spot anything I got wrong, leave it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Using Claude Code with Any LLM: Why a Gateway Changes Everything</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Fri, 13 Mar 2026 03:30:00 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/using-claude-code-with-any-llm-why-a-gateway-changes-everything-4a0c</link>
      <guid>https://forem.com/varshithvhegde/using-claude-code-with-any-llm-why-a-gateway-changes-everything-4a0c</guid>
      <description>&lt;p&gt;I've been using Claude Code for a while now, and if you're a developer who has added it to your daily workflow, you probably know the feeling. It's genuinely good. It reads your codebase, runs commands, modifies files, and helps implement features right from your terminal without you having to context-switch constantly.&lt;/p&gt;

&lt;p&gt;But at some point, most developers hit the same wall I did: what if I want to use a different model?&lt;/p&gt;

&lt;p&gt;What if GPT-4o handles your specific codebase better? What if Gemini's larger context window is exactly what you need for that massive legacy project? What if you're spending more on API calls than you should be, and you know some of those simpler tasks could run on a cheaper model just fine?&lt;/p&gt;

&lt;p&gt;Out of the box, Claude Code only talks to Anthropic. That's just how it works. And while Anthropic's models are genuinely strong, being locked into a single provider means you're trading flexibility for convenience. This guide is about getting both.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Friction Points
&lt;/h2&gt;

&lt;p&gt;Before jumping into the solution, it helps to be specific about what problems we're actually solving.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model flexibility.&lt;/strong&gt; Different models have different strengths. Claude Sonnet is excellent for most coding tasks, but you can't know it's the best tool for every job unless you can test alternatives, and without a gateway, experimenting means switching tools entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost management.&lt;/strong&gt; Claude Code burns through tokens quickly during an active session. Complex architectural work and boilerplate generation are not the same job, and pricing them identically doesn't make much sense. Routing simpler requests to a more affordable model can cut costs significantly without affecting output quality where it matters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compliance and data routing.&lt;/strong&gt; If you work in fintech, healthcare, or any regulated industry, you've likely dealt with requirements around where your data goes. Routing all API traffic through your own infrastructure before it reaches any external provider is often non-negotiable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observability.&lt;/strong&gt; This one gets overlooked a lot. How many tokens does a typical Claude Code session consume? What's your actual cost per feature shipped? Without request logging, you're genuinely guessing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Bifrost
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqakfazgxeydkto0p4xn6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqakfazgxeydkto0p4xn6.png" alt="Bifrost" width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.getmaxim.ai/bifrost" rel="noopener noreferrer"&gt;Bifrost&lt;/a&gt; is an open-source LLM gateway built by &lt;a href="https://www.getmaxim.ai" rel="noopener noreferrer"&gt;Maxim AI&lt;/a&gt; to route, manage, and optimize requests between your application and multiple model providers. It's Apache 2.0 licensed, self-hostable, and supports 20+ providers including OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure, Mistral, Cohere, Groq, and more.&lt;/p&gt;

&lt;p&gt;A few things that make it stand out technically:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance that doesn't get in the way.&lt;/strong&gt; At 5,000 requests per second, Bifrost adds less than 15 microseconds of internal overhead per request. At production scale, that's essentially nothing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zero-config startup.&lt;/strong&gt; A single &lt;code&gt;npx&lt;/code&gt; command launches the gateway, and everything else is configurable through a web UI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Built-in fallbacks and load balancing.&lt;/strong&gt; If a provider fails or rate-limits you, Bifrost automatically routes to a backup. Traffic can also be distributed across multiple keys or providers using weighted rules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semantic caching.&lt;/strong&gt; Repeated or semantically similar queries can be served from cache, which reduces both latency and cost for workflows with a lot of repetitive prompting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full observability out of the box.&lt;/strong&gt; Prometheus metrics, request tracing, token usage, latency, and a built-in web dashboard are all included.&lt;/p&gt;

&lt;p&gt;The architecture is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code  --&amp;gt;  Bifrost (localhost:8080)  --&amp;gt;  Any LLM Provider
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code uses an environment variable called &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; to know where to send API requests. Normally it points to &lt;code&gt;https://api.anthropic.com&lt;/code&gt;. You point it at Bifrost instead. Bifrost accepts requests in Anthropic's Messages API format, translates them to whichever provider you've configured, and translates the response back. Claude Code never knows the difference.&lt;/p&gt;

&lt;p&gt;No code changes. No patching. One environment variable.&lt;/p&gt;
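&lt;p&gt;You can verify this translation layer before involving Claude Code at all. Here's a hedged sketch of a direct request to the gateway in Anthropic's Messages format, pointed at a non-Anthropic model (the endpoint path follows the base URL above; the &lt;code&gt;x-api-key&lt;/code&gt; value is arbitrary because Bifrost holds the real provider keys):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:8080/anthropic/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: dummy-key" \
  -d '{
    "model": "openai/gpt-4o",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the response comes back shaped like an Anthropic message even though GPT-4o produced it, the gateway is doing its job.&lt;/p&gt;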




&lt;h2&gt;
  
  
  What We'll Cover
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Setting up and configuring Bifrost with multiple LLM providers&lt;/li&gt;
&lt;li&gt;Integrating Claude Code with the gateway&lt;/li&gt;
&lt;li&gt;Running Claude Code with any model&lt;/li&gt;
&lt;li&gt;Configuring routing rules, fallbacks, and budgets&lt;/li&gt;
&lt;li&gt;Integrating MCP tools&lt;/li&gt;
&lt;li&gt;Using built-in observability and monitoring&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Part 1: Setting Up Bifrost
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Install Bifrost
&lt;/h3&gt;

&lt;p&gt;Create a project folder, open it in your editor, and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @maximhq/bifrost &lt;span class="nt"&gt;-app-dir&lt;/span&gt; ./my-bifrost-data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-app-dir&lt;/code&gt; flag tells Bifrost where to store all its data. Bifrost will start listening on port 8080.&lt;/p&gt;

&lt;p&gt;If you prefer Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull maximhq/bifrost
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 8080:8080 &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/data:/app/data maximhq/bifrost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-v&lt;/code&gt; flag mounts a volume so your configuration persists across container restarts.&lt;/p&gt;
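&lt;p&gt;If you plan to keep the container running long-term, a minimal &lt;code&gt;docker-compose.yml&lt;/code&gt; keeps those flags in one place. This is a sketch: the image name and mount path come from the commands above, and the environment variables are the ones set in Step 3 below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  bifrost:
    image: maximhq/bifrost
    ports:
      - "8080:8080"
    volumes:
      - ./data:/app/data   # persists configuration across restarts
    environment:
      # passed through from your shell; see Step 3
      - OPENAI_API_KEY
      - ANTHROPIC_API_KEY
      - GEMINI_API_KEY
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;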

&lt;h3&gt;
  
  
  Step 2: Create Your Config File
&lt;/h3&gt;

&lt;p&gt;Inside your &lt;code&gt;./my-bifrost-data&lt;/code&gt; folder, create a &lt;code&gt;config.json&lt;/code&gt; file. This defines which providers Bifrost can route to, enables request logging, and sets up database persistence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"$schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.getbifrost.ai/schema"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"client"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enable_logging"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"disable_content_logging"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"drop_excess_requests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"initial_pool_size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"allow_direct_keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"providers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openai-primary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"env.OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"anthropic"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic-primary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"env.ANTHROPIC_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"gemini"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gemini-primary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"env.GEMINI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"config_store"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sqlite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./config.db"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"logs_store"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sqlite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./logs.db"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;"value": "env.OPENAI_API_KEY"&lt;/code&gt; syntax tells Bifrost to read actual keys from environment variables rather than storing them in the file. Your secrets stay out of version control.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Set Your API Keys
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-openai-api-key"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-anthropic-api-key"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-gemini-api-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Start the Gateway
&lt;/h3&gt;

&lt;p&gt;Stop any previously running Bifrost instance so it picks up the new config file on startup, then start it again with the same app directory flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @maximhq/bifrost &lt;span class="nt"&gt;-app-dir&lt;/span&gt; ./my-bifrost-data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:8080&lt;/code&gt; in your browser. You'll see the Bifrost dashboard where all configuration and monitoring lives.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 2: Connecting Claude Code to Bifrost
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Install Claude Code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @anthropic-ai/claude-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Point Claude Code at Bifrost
&lt;/h3&gt;

&lt;p&gt;Set these two environment variables in the same terminal session where you'll run Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:8080/anthropic"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"dummy-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;dummy-key&lt;/code&gt; part is a bit counterintuitive at first. Claude Code requires this variable to be set before it will run, but Bifrost handles actual authentication to providers using the keys you configured earlier. You can put any non-empty string here.&lt;/p&gt;
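&lt;p&gt;If you bounce between terminals a lot, a small launcher script saves retyping the exports. This is a sketch; the script name and default model are arbitrary choices, not part of either tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash
# claude-gw: run Claude Code through the local Bifrost gateway
export ANTHROPIC_BASE_URL="http://localhost:8080/anthropic"
export ANTHROPIC_API_KEY="dummy-key"  # any non-empty string works
exec claude --model "${1:-openai/gpt-4o}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Then &lt;code&gt;./claude-gw gemini/gemini-2.5-pro&lt;/code&gt; picks a different model for that session.&lt;/p&gt;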

&lt;h3&gt;
  
  
  Step 3: Run Claude Code with Any Model
&lt;/h3&gt;

&lt;p&gt;Start Claude Code and specify whichever model you want to use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--model&lt;/span&gt; openai/gpt-4o
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To route to other providers, use the provider prefix pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;openai/gpt-4o&lt;/span&gt;
&lt;span class="s"&gt;openai/gpt-4o-mini&lt;/span&gt;
&lt;span class="s"&gt;gemini/gemini-2.5-pro&lt;/span&gt;
&lt;span class="s"&gt;groq/llama-3.1-70b-versatile&lt;/span&gt;
&lt;span class="s"&gt;mistral/mistral-large-latest&lt;/span&gt;
&lt;span class="s"&gt;anthropic/claude-sonnet-4-20250514&lt;/span&gt;
&lt;span class="s"&gt;ollama/llama3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run a quick sanity check by asking something simple like "Hello there" to confirm requests are flowing through correctly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 3: Routing Rules, Fallbacks, and Budgets
&lt;/h2&gt;

&lt;p&gt;Once Claude Code is connected, you can start using Bifrost's routing features to get more control over how requests are handled.&lt;/p&gt;

&lt;h3&gt;
  
  
  Weighted Routing Across Providers
&lt;/h3&gt;

&lt;p&gt;Virtual Keys in Bifrost let you define routing logic that applies automatically. Navigate to &lt;strong&gt;Governance &amp;gt; Virtual Keys&lt;/strong&gt;, create a key, and configure your routing weights:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dev-routing"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"budget"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"max_budget"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"budget_duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"monthly"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"providers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-sonnet-4-20250514"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This routes 70% of requests to GPT-4o and 30% to Claude Sonnet, with a hard monthly cap of $100. Once the budget is exhausted, Bifrost stops routing automatically. For teams, this replaces a lot of manual cost monitoring.&lt;/p&gt;

&lt;h3&gt;
  
  
  Automatic Fallbacks
&lt;/h3&gt;

&lt;p&gt;When a provider goes down or you hit a rate limit, Bifrost works down a fallback list until a request succeeds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openai/gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fallbacks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-sonnet-4-20250514"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gemini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gemini-2.5-pro"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your coding session continues without any manual intervention when a provider has issues.&lt;/p&gt;
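&lt;p&gt;To see where that JSON lives in practice, here's a sketch of a direct gateway request carrying the fallback list in its body. Treat the field placement as an assumption modeled on the example above rather than a guarantee:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:8080/anthropic/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: dummy-key" \
  -d '{
    "model": "openai/gpt-4o",
    "max_tokens": 256,
    "fallbacks": [
      { "provider": "anthropic", "model": "claude-sonnet-4-20250514" },
      { "provider": "gemini", "model": "gemini-2.5-pro" }
    ],
    "messages": [{"role": "user", "content": "ping"}]
  }'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;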




&lt;h2&gt;
  
  
  Part 4: MCP Tool Integration
&lt;/h2&gt;

&lt;p&gt;If you're using Model Context Protocol servers for filesystem access, web search, database queries, or custom integrations, Bifrost supports those too. Configure them once in Bifrost, and they become available to any model routing through it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Add MCP Configuration to Bifrost
&lt;/h3&gt;

&lt;p&gt;Update your &lt;code&gt;config.json&lt;/code&gt; to include MCP server definitions. Here's an example with filesystem access:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"$schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.getbifrost.ai/schema"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"client"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enable_logging"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"disable_content_logging"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"drop_excess_requests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"initial_pool_size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"allow_direct_keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="nl"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"client_configs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"connection_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"stdio_config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/tmp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"tools_to_execute"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"tools_to_auto_execute"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"read_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"list_directory"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"create_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"delete_file"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tool_manager_config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"max_agent_depth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tool_execution_timeout"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300000000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"code_mode_binding_level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"server"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restart Bifrost and navigate to the MCP catalog page in the web UI to confirm the filesystem server shows as connected.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Add Bifrost as an MCP Server in Claude Code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add &lt;span class="nt"&gt;--transport&lt;/span&gt; http bifrost http://localhost:8080/mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Verify with a Real Task
&lt;/h3&gt;

&lt;p&gt;Restart Claude Code and try a task that exercises the MCP tools. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a simple calculator program in Python.

It should support addition, subtraction, multiplication, and division.
The user should input two numbers and an operation, and the program should print the result.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then follow up with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Analyze this repository and create a README.md explaining how the project works.
Include the project architecture and instructions for running it locally.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the MCP integration is working, Claude Code will read your files, create new ones, and interact with your filesystem through Bifrost's tool injection.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 5: Observability and Monitoring
&lt;/h2&gt;

&lt;p&gt;This is the part that surprised me most when I first set it up.&lt;/p&gt;

&lt;p&gt;Every request that passes through Bifrost is logged with full detail: the input prompt, the response, which model handled it, latency, and cost. The web interface at &lt;code&gt;http://localhost:8080/logs&lt;/code&gt; provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time streaming of requests and responses&lt;/li&gt;
&lt;li&gt;Token usage tracking per request&lt;/li&gt;
&lt;li&gt;Latency measurements&lt;/li&gt;
&lt;li&gt;Filtering by provider, model, or conversation content&lt;/li&gt;
&lt;li&gt;Full request and response inspection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For individual developers, it's useful for understanding your actual usage patterns. For teams, it becomes a proper audit trail. You can see which models are being used most, where the expensive requests are coming from, and whether your routing rules are actually behaving as expected.&lt;/p&gt;

&lt;p&gt;Bifrost also exposes Prometheus metrics for teams that want to integrate this data into existing monitoring pipelines.&lt;/p&gt;
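&lt;p&gt;If you already run Prometheus, pointing it at Bifrost is a one-job scrape config. A minimal sketch, assuming the conventional &lt;code&gt;/metrics&lt;/code&gt; path on the same port as the web UI (check the Bifrost docs for the exact endpoint):&lt;/p&gt;

```yaml
# Hypothetical scrape job for a local Bifrost instance.
# Verify the metrics path and port against the Bifrost documentation.
scrape_configs:
  - job_name: "bifrost"
    static_configs:
      - targets: ["localhost:8080"]
```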




&lt;h2&gt;
  
  
  Is This Worth Setting Up?
&lt;/h2&gt;

&lt;p&gt;If you're a solo developer who uses Claude Code occasionally and doesn't have any compliance or cost concerns, the default setup is probably fine.&lt;/p&gt;

&lt;p&gt;But if any of the following are true, a gateway is worth the time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want to test how different models perform on your specific workload&lt;/li&gt;
&lt;li&gt;You're managing API costs across a team&lt;/li&gt;
&lt;li&gt;Your organization has requirements around data routing or infrastructure control&lt;/li&gt;
&lt;li&gt;You want actual visibility into your AI usage rather than end-of-month billing surprises&lt;/li&gt;
&lt;li&gt;You use MCP tools and want them available across multiple model providers without reconfiguring each time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bifrost being open source and self-hosted means your prompts and responses stay on your own infrastructure. For teams working on proprietary codebases, that's a meaningful difference from routing everything directly to a third-party API.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Get started:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Website: &lt;a href="https://www.getmaxim.ai/bifrost" rel="noopener noreferrer"&gt;getmax.im/bifrost&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://git.new/bifrost" rel="noopener noreferrer"&gt;git.new/bifrost&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Docs: &lt;a href="https://www.getmaxim.ai/bifrost/resources/claude-code" rel="noopener noreferrer"&gt;getmax.im/bifrostdocs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>cli</category>
      <category>llm</category>
      <category>tooling</category>
    </item>
    <item>
      <title>ContractCompass: Your AI Contract Analyst That Actually Speaks Human</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Sun, 08 Feb 2026 12:34:11 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/contractcompass-your-ai-contract-analyst-that-actually-speaks-human-nfo</link>
      <guid>https://forem.com/varshithvhegde/contractcompass-your-ai-contract-analyst-that-actually-speaks-human-nfo</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/algolia"&gt;Algolia Agent Studio Challenge&lt;/a&gt;: Consumer-Facing Conversational Experiences&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F833ehy8i626n22zc4pcn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F833ehy8i626n22zc4pcn.png" alt="FrontPage"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ContractCompass&lt;/strong&gt; is an AI-powered contract analysis tool that turns legal jargon into plain English through natural conversation. Think of it as having a friendly lawyer friend who can review your contract over coffee, except this friend never gets tired, works 24/7, and doesn't charge $400/hour.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;Most people sign contracts they don't fully understand. Employment agreements, rental leases, SaaS terms—they're all written in dense legal language that assumes you went to law school. By the time you realize that "perpetual, irrevocable, worldwide license" means the company owns your weekend projects forever, you've already signed away your rights.&lt;/p&gt;

&lt;p&gt;According to research, over 90% of people don't read terms and conditions before accepting them. It's not laziness. These documents are genuinely incomprehensible to the average person. A typical employment contract might be 15 pages of legal clauses that take hours to parse, assuming you even know what to look for.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution
&lt;/h3&gt;

&lt;p&gt;ContractCompass solves this through dialogue-based AI interaction. Instead of drowning you in legal analysis reports, it lets you have a natural conversation about your contract:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"What are the red flags here?"&lt;/li&gt;
&lt;li&gt;"Can you explain this termination clause like I'm five?"&lt;/li&gt;
&lt;li&gt;"Is this non-compete actually enforceable?"&lt;/li&gt;
&lt;li&gt;"What should I negotiate before signing?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI agent responds in real-time with contextual answers grounded in a curated database of contract clauses, powered by &lt;strong&gt;Algolia Agent Studio's&lt;/strong&gt; semantic search capabilities.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucwnfeemcsvsdrsati7u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucwnfeemcsvsdrsati7u.png" alt="ChatInterface Initial"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Capabilities
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Conversational AI Interface&lt;/strong&gt; - Chat naturally with the agent. No forms, no checkboxes, just questions and answers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Intelligent Risk Detection&lt;/strong&gt; - Every clause gets analyzed and scored on a three-tier system (Low, Medium, High risk) with visual indicators.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plain English Translations&lt;/strong&gt; - Legal jargon becomes "here's what this actually means for you."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Industry Comparisons&lt;/strong&gt; - The agent explains whether clauses are standard practice or unusual outliers worth negotiating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rich Visual Analysis&lt;/strong&gt; - For deep dives, the agent generates structured analysis cards with prevalence bars, red flag lists, and detailed reasoning.&lt;/p&gt;




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Live Demo:&lt;/strong&gt; &lt;a href="https://contractcompass.varshithvhegde.in/" rel="noopener noreferrer"&gt;https://contractcompass.varshithvhegde.in/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No login required. No credit card. Just upload a contract (or try one of the built-in samples) and start asking questions.&lt;/p&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/cWGjpM0eIMc"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;h3&gt;
  
  
  How It Feels to Use
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Upload is effortless&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Drag and drop a PDF, paste text, or click one of the sample contracts. I've included four pre-loaded examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Friendly Startup Offer&lt;/strong&gt; (Low Risk) - A well-balanced employment agreement with fair terms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Red Flag Employment Contract&lt;/strong&gt; (High Risk) - Includes unilateral salary cuts, 24-month lock-in, overbroad IP assignment, 3-year non-compete, and $500K liquidated damages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predatory Rental Agreement&lt;/strong&gt; (High Risk) - Non-refundable deposits, tenant pays for ALL repairs, no-notice landlord entry, uncapped rent increases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasonable SaaS Agreement&lt;/strong&gt; (Low Risk) - Standard business terms with mutual protections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3k1d4mwn40h4qziqk0y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3k1d4mwn40h4qziqk0y.png" alt="Upload"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The interface splits&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your contract appears on the left for reference, chat on the right. You can always scroll back to check what clause the AI is talking about.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fde48nk9tg4pmqhr0n3j4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fde48nk9tg4pmqhr0n3j4.png" alt="Interface"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Suggested prompts guide you&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Six smart buttons help you get started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full Risk Analysis&lt;/li&gt;
&lt;li&gt;Find red flags&lt;/li&gt;
&lt;li&gt;Explain in plain English&lt;/li&gt;
&lt;li&gt;What should I negotiate?&lt;/li&gt;
&lt;li&gt;Compare to standards&lt;/li&gt;
&lt;li&gt;Is this enforceable?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4. Streaming responses feel natural&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The AI types back to you in real-time, token by token, like a real conversation. No waiting for a complete response to load. You see the analysis unfold naturally.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fulhh7i9pelxwngfz5zs4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fulhh7i9pelxwngfz5zs4.png" alt="Suggested Prompts"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How I Used Algolia Agent Studio
&lt;/h2&gt;

&lt;p&gt;Algolia Agent Studio is the intelligence engine that makes ContractCompass possible. Here's how it powers the entire conversational experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Index: A Knowledge Base of Contract Clauses
&lt;/h3&gt;

&lt;p&gt;I created an Algolia index called &lt;code&gt;contract_clauses&lt;/code&gt; containing &lt;strong&gt;50+ curated contract clauses&lt;/strong&gt; across four contract types (employment, rental, SaaS, freelance). Each record includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;clause_text&lt;/strong&gt; - The full text of the clause&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;clause_type&lt;/strong&gt; - Category (termination, compensation, non-compete, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;contract_type&lt;/strong&gt; - Employment, rental, SaaS, or freelance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;industry&lt;/strong&gt; - Tech, real estate, or general&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;prevalence_score&lt;/strong&gt; - A 0-1 score indicating how common this clause is&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;risk_level&lt;/strong&gt; - Low, medium, or high&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;plain_english&lt;/strong&gt; - Simple explanation for non-lawyers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;red_flags&lt;/strong&gt; - List of concerning aspects&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;standard_version&lt;/strong&gt; - What a fair version would look like&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;legal_implications&lt;/strong&gt; - Real-world impact of accepting the clause&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdkqc0yvfy8e811pvjjz2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdkqc0yvfy8e811pvjjz2.png" alt="Algolia index"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, a "predatory non-compete" clause record looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"objectID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"emp-nc-003"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"clause_text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Employee agrees not to work for any competing business..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"clause_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"non_compete"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"contract_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"employment"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"industry"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tech"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"prevalence_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"risk_level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"plain_english"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You can't work for competitors for 3 years across all of North America"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"red_flags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Unreasonably broad geographic scope"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Excessive duration"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"standard_version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Typically 6-12 months within 50 miles of office"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"legal_implications"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"May prevent you from working in your field"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
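&lt;p&gt;For anyone rebuilding this in TypeScript, the record shape above can be captured as an interface. This is a sketch based on the fields listed earlier, not the app's actual type definitions:&lt;/p&gt;

```typescript
// Sketch of the contract_clauses record shape; field names mirror the
// index description above. Not ContractCompass's actual types.
type RiskLevel = "low" | "medium" | "high";

interface ClauseRecord {
  objectID: string;
  clause_text: string;
  clause_type: string;
  contract_type: "employment" | "rental" | "saas" | "freelance";
  industry: string;
  prevalence_score: number; // 0-1: how common the clause is
  risk_level: RiskLevel;
  plain_english: string;
  red_flags: string[];
  standard_version: string;
  legal_implications: string;
}

// Basic sanity check before indexing a record.
function isValidClause(r: ClauseRecord): boolean {
  if (r.prevalence_score >= 0) {
    if (r.prevalence_score > 1) { return false; }
    return r.objectID.length > 0;
  }
  return false;
}
```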

&lt;h3&gt;
  
  
  How Retrieval Powers the Conversation
&lt;/h3&gt;

&lt;p&gt;When a user uploads a contract and starts asking questions, here's what happens behind the scenes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Semantic Clause Matching&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Algolia agent retrieves semantically similar clauses from the index to provide context-aware responses. For example, if someone asks "Is this non-compete fair?", the agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identifies the non-compete clause in the uploaded contract&lt;/li&gt;
&lt;li&gt;Searches the index for similar non-compete clauses&lt;/li&gt;
&lt;li&gt;Compares the uploaded clause against standard versions&lt;/li&gt;
&lt;li&gt;Explains whether it's typical or unusually restrictive&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Contract Type Detection&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent automatically identifies the type of contract (employment, rental, SaaS, etc.) based on the language and clauses present, then adjusts its analysis accordingly. An employment contract gets compared against employment standards, not rental standards.&lt;/p&gt;
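&lt;p&gt;In ContractCompass the agent infers the contract type itself via semantic search, but the idea can be sketched as a simple keyword tally. Everything below (keyword lists, scoring) is illustrative, not the app's actual logic:&lt;/p&gt;

```typescript
// Greatly simplified illustration of contract-type detection:
// count how many type-specific keywords appear and pick the best match.
const TYPE_KEYWORDS: { [contractType: string]: string[] } = {
  employment: ["employee", "salary", "termination", "non-compete"],
  rental: ["tenant", "landlord", "lease", "deposit"],
  saas: ["subscription", "service level", "uptime", "license"],
};

function detectContractType(text: string): string {
  const lower = text.toLowerCase();
  let best = "unknown";
  let bestScore = 0;
  for (const kind of Object.keys(TYPE_KEYWORDS)) {
    let score = 0;
    for (const kw of TYPE_KEYWORDS[kind]) {
      if (lower.includes(kw)) { score = score + 1; }
    }
    if (score > bestScore) { bestScore = score; best = kind; }
  }
  return best;
}
```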

&lt;p&gt;&lt;strong&gt;3. Prevalence-Based Risk Assessment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using the prevalence scores from the indexed data, the agent can say things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"This termination clause is standard. About 95% of tech employment contracts include similar terms"&lt;/li&gt;
&lt;li&gt;"This security deposit policy is unusual. Only 15% of rental agreements make deposits non-refundable"&lt;/li&gt;
&lt;/ul&gt;
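&lt;p&gt;The mapping from a prevalence score to that kind of sentence can be sketched in a few lines. The thresholds below are illustrative, not the app's actual cutoffs:&lt;/p&gt;

```typescript
// Sketch: turn an indexed prevalence_score (0-1) into the kind of
// comparative statement quoted above. Thresholds are illustrative.
function prevalenceMessage(clauseType: string, score: number): string {
  const pct = Math.round(score * 100);
  if (score >= 0.8) {
    return "This " + clauseType + " clause is standard. About " + pct + "% of similar contracts include it.";
  }
  if (score >= 0.4) {
    return "This " + clauseType + " clause is fairly common (" + pct + "% of similar contracts).";
  }
  return "This " + clauseType + " clause is unusual. Only " + pct + "% of similar contracts include it.";
}
```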

&lt;p&gt;&lt;strong&gt;4. Standard Version Recommendations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a clause is problematic, the agent doesn't just say "this is bad." It shows what a fair version would look like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The current non-compete restricts you for 3 years across North America. A standard tech industry non-compete is typically 6-12 months within 50 miles of the office."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Making the Agent Conversational
&lt;/h3&gt;

&lt;p&gt;The key to making ContractCompass feel natural was teaching the agent to think like a helpful friend, not a legal robot. I crafted prompts that guide it to:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Speak like a human&lt;/strong&gt; - Use simple language. Avoid legal jargon unless explaining it. Be conversational but professional.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Be honest about risks&lt;/strong&gt; - If a clause is predatory, say so clearly. Don't sugarcoat problematic terms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ground everything in data&lt;/strong&gt; - Always search the contract_clauses index for similar examples. Compare the user's clause against standard versions and explain how it differs from typical industry practice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Provide actionable advice&lt;/strong&gt; - Don't just identify problems. Suggest what to negotiate and how to approach it.&lt;/p&gt;

&lt;p&gt;This approach ensures every response is both friendly and useful, backed by real contract data rather than generic advice.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4j4w7017pmxu4o8eqhr7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4j4w7017pmxu4o8eqhr7.png" alt="Contract Agent Response"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Why Fast Retrieval Matters
&lt;/h2&gt;

&lt;p&gt;Algolia's speed and semantic search capabilities are critical to making ContractCompass feel like a real conversation rather than a clunky Q&amp;amp;A bot.&lt;/p&gt;
&lt;h3&gt;
  
  
  Speed Creates Natural Dialogue
&lt;/h3&gt;

&lt;p&gt;When someone asks "What are the red flags in this contract?", they expect an answer within seconds, not minutes. Algolia's sub-50ms search latency means the agent can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Retrieve relevant clause examples instantly&lt;/li&gt;
&lt;li&gt;Stream responses token-by-token without lag&lt;/li&gt;
&lt;li&gt;Handle follow-up questions in the same conversation thread&lt;/li&gt;
&lt;li&gt;Maintain context across multiple queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If retrieval took 5-10 seconds per query, users would lose patience. The conversation would feel broken. Fast retrieval makes the experience feel fluid and natural.&lt;/p&gt;
&lt;h3&gt;
  
  
  Contextual Retrieval Enables Nuanced Analysis
&lt;/h3&gt;

&lt;p&gt;Algolia's semantic search doesn't just match keywords. It understands meaning. This is crucial for contract analysis because:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Legal language varies widely&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A non-compete clause might say "Employee shall not engage in competitive activities" or "You agree not to work for rival companies." These are semantically similar but textually different. Algolia's vector-based search matches them both.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Users ask in natural language&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Someone might ask "Can they really fire me for any reason?" which should match clauses about "at-will employment" or "termination without cause." Semantic search bridges this gap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A liability cap of $100K might be reasonable in a $10K/year SaaS contract but predatory in a $500K enterprise agreement. By retrieving similar contracts in the same industry and price range, the agent provides context-aware analysis.&lt;/p&gt;
&lt;h3&gt;
  
  
  Retrieval Grounds Responses in Real Data
&lt;/h3&gt;

&lt;p&gt;One of the biggest risks with AI agents is hallucination: making up plausible-sounding but incorrect information. By grounding every response in retrieved data from the curated index, ContractCompass avoids this problem.&lt;/p&gt;

&lt;p&gt;When the agent says "This non-compete is unusually restrictive," it's not guessing. It's comparing the uploaded clause against the prevalence scores and standard versions in the index. When it explains what a fair clause looks like, it's showing you actual examples from the database.&lt;/p&gt;

&lt;p&gt;This retrieval-augmented generation (RAG) approach makes the agent both reliable and trustworthy.&lt;/p&gt;
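&lt;p&gt;The grounding step can be sketched as simple prompt assembly: the retrieved clause records become the only context the model is allowed to answer from. In ContractCompass this happens inside Agent Studio; the function below is just an illustration of the idea:&lt;/p&gt;

```typescript
// Illustrative RAG prompt assembly. The retrieved clauses are numbered
// and prepended to the user's question as the only permitted context.
interface RetrievedClause {
  clause_text: string;
  plain_english: string;
  risk_level: string;
}

function buildGroundedPrompt(question: string, hits: RetrievedClause[]): string {
  const context = hits
    .map((h, i) => "[" + (i + 1) + "] (" + h.risk_level + " risk) " + h.clause_text + " -- " + h.plain_english)
    .join("\n");
  return "Answer using ONLY the clauses below.\n\nClauses:\n" + context + "\n\nQuestion: " + question;
}
```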
&lt;h3&gt;
  
  
  The Impact on User Experience
&lt;/h3&gt;

&lt;p&gt;From a user perspective, fast contextual retrieval translates to:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Confidence in the analysis&lt;/strong&gt; - "This isn't just an AI's opinion, it's based on real contract data"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Immediate answers&lt;/strong&gt; - "I can get my questions answered in real-time without waiting"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conversational flow&lt;/strong&gt; - "It feels like talking to a human expert who knows contract law"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Actionable insights&lt;/strong&gt; - "I now know exactly what to negotiate before signing"&lt;/p&gt;

&lt;p&gt;Without Algolia's speed and semantic capabilities, ContractCompass would be a generic chatbot that gives vague, unhelpful advice. With them, it's a genuinely useful tool that empowers people to understand and negotiate their contracts.&lt;/p&gt;


&lt;h2&gt;
  
  
  Technical Architecture
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Frontend (React + TypeScript)
&lt;/h3&gt;

&lt;p&gt;The interface is built with React 18 and TypeScript for type safety. I chose a modern stack that prioritizes performance and developer experience:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;UI Library:&lt;/strong&gt; Tailwind CSS + shadcn/ui components for a clean, professional look&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State Management:&lt;/strong&gt; React hooks for local state (no complex state library needed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Markdown Rendering:&lt;/strong&gt; react-markdown for rich text in chat responses&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  AI Agent (Algolia Agent Studio)
&lt;/h3&gt;

&lt;p&gt;The chat interface calls Algolia Agent Studio directly from the frontend. This direct integration means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time streaming responses that appear token-by-token&lt;/li&gt;
&lt;li&gt;No backend proxy needed for chat, which reduces latency&lt;/li&gt;
&lt;li&gt;Full conversation history sent with each request for contextual follow-ups&lt;/li&gt;
&lt;/ul&gt;
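&lt;p&gt;Consuming a streamed response is standard fetch plumbing. The sketch below uses the ReadableStream API (available in browsers and Node 18+) and deliberately omits the Agent Studio endpoint and payload, which are specific to its API:&lt;/p&gt;

```typescript
// Sketch: read an HTTP response body chunk-by-chunk so the UI can
// render tokens as they arrive instead of waiting for the full reply.
async function streamToCallback(response: Response, onChunk: (text: string) => void) {
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) { break; }
    // Decode each chunk incrementally and hand it to the renderer.
    onChunk(decoder.decode(value, { stream: true }));
  }
}
```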
&lt;h3&gt;
  
  
  Search Index (Algolia)
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;contract_clauses&lt;/code&gt; index contains the 50+ curated clauses described earlier. Each clause is enriched with metadata (prevalence scores, risk levels, plain English explanations) that the agent uses to provide contextual analysis.&lt;/p&gt;
&lt;h3&gt;
  
  
  PDF Processing
&lt;/h3&gt;

&lt;p&gt;When users upload PDFs, the text extraction happens server-side using Google Gemini 2.5 Flash. The flow is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User uploads PDF via drag-and-drop&lt;/li&gt;
&lt;li&gt;PDF converts to base64 on the client&lt;/li&gt;
&lt;li&gt;Base64 data sent to serverless function&lt;/li&gt;
&lt;li&gt;Function calls Gemini API for text extraction&lt;/li&gt;
&lt;li&gt;Extracted text returns to the frontend&lt;/li&gt;
&lt;li&gt;Text loads into chat interface for analysis&lt;/li&gt;
&lt;/ol&gt;
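&lt;p&gt;The client-side half of that flow (steps 2 and 3) can be sketched as follows. The &lt;code&gt;/api/extract-pdf&lt;/code&gt; endpoint name and payload shape are hypothetical, used here only for illustration:&lt;/p&gt;

```typescript
// Convert uploaded PDF bytes to base64 (step 2 above).
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (const b of bytes) { binary = binary + String.fromCharCode(b); }
  return btoa(binary);
}

// POST the base64 data to the extraction function (step 3 above).
// "/api/extract-pdf" and "pdf_base64" are illustrative names, not
// the app's actual endpoint or schema.
async function extractPdfText(bytes: Uint8Array) {
  const res = await fetch("/api/extract-pdf", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ pdf_base64: bytesToBase64(bytes) }),
  });
  const data = await res.json();
  return data.text as string;
}
```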
&lt;h3&gt;
  
  
  Backend (Serverless Functions)
&lt;/h3&gt;

&lt;p&gt;Four serverless functions handle specific tasks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;extract-pdf&lt;/strong&gt; - PDF text extraction using Gemini&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;analyze-contract&lt;/strong&gt; - Clause parsing and analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;search-clauses&lt;/strong&gt; - Direct Algolia index queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;seed-algolia&lt;/strong&gt; - Index population with curated data&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  Design Decisions
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Split-Screen Layout
&lt;/h3&gt;

&lt;p&gt;I chose a split-screen design (contract on left, chat on right) because users need to reference the original text while discussing it. It feels more collaborative, like reviewing a document with someone. Mobile users get a stacked layout that still works well.&lt;/p&gt;
&lt;h3&gt;
  
  
  Color-Coded Risk Levels
&lt;/h3&gt;

&lt;p&gt;Risk levels use universal color psychology:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Green&lt;/strong&gt; - Safe, standard terms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amber&lt;/strong&gt; - Caution, worth discussing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Red&lt;/strong&gt; - Danger, likely problematic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These colors are consistent across risk badges, prevalence bars, and analysis cards. You can glance at a clause and immediately understand its risk level.&lt;/p&gt;
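&lt;p&gt;Keeping that consistency is easiest with a single shared mapping used by every component. A sketch with illustrative Tailwind-style class names (not the app's actual styles):&lt;/p&gt;

```typescript
// One shared risk-to-color mapping keeps badges, prevalence bars,
// and analysis cards consistent. Class names are illustrative.
type Risk = "low" | "medium" | "high";

const RISK_COLORS: { [r in Risk]: string } = {
  low: "text-green-600",
  medium: "text-amber-500",
  high: "text-red-600",
};

function riskColor(risk: Risk): string {
  return RISK_COLORS[risk];
}
```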
&lt;h3&gt;
  
  
  Suggested Prompts
&lt;/h3&gt;

&lt;p&gt;Not everyone knows what questions to ask about a contract. The six suggested prompts serve as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding&lt;/strong&gt; - Showing users what's possible&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficiency&lt;/strong&gt; - Common questions answered with one click&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discovery&lt;/strong&gt; - Revealing features users might not know about&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Streaming Responses
&lt;/h3&gt;

&lt;p&gt;Token-by-token streaming makes the AI feel more human and less like a loading bar. It also provides immediate feedback that the system is working. Users don't stare at a blank screen wondering if anything is happening.&lt;/p&gt;


&lt;h2&gt;
  
  
  Challenges and Learnings
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Challenge 1: Balancing Legal Accuracy with Accessibility
&lt;/h3&gt;

&lt;p&gt;Legal language exists for precision. Simplifying it risks losing important nuances. I solved this by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Providing both the original clause text and plain English side-by-side&lt;/li&gt;
&lt;li&gt;Including detailed "legal implications" sections for those who want depth&lt;/li&gt;
&lt;li&gt;Being honest about limitations (the disclaimer reminds users this isn't legal advice)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Challenge 2: Handling Diverse Contract Formats
&lt;/h3&gt;

&lt;p&gt;Contracts vary wildly in structure. Some are 2 pages, others are 50. Some use headers, others are wall-to-wall text. The PDF extraction with Gemini handles this by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Preserving structure where possible&lt;/li&gt;
&lt;li&gt;Extracting text even from scanned/image PDFs&lt;/li&gt;
&lt;li&gt;Cleaning up formatting artifacts&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Challenge 3: Preventing AI Hallucination
&lt;/h3&gt;

&lt;p&gt;Early versions sometimes invented red flags that didn't exist. The solution was retrieval-augmented generation. Every analysis is now grounded in retrieved clause data from the index. The agent can only reference what it finds in the search results.&lt;/p&gt;
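&lt;p&gt;A minimal sketch of that grounding idea, with a simple stand-in for the Algolia retrieval call (the function names here are hypothetical):&lt;/p&gt;

```python
# Illustrative RAG flow: the model only sees clauses actually retrieved
# from the index, so it has nothing to invent red flags from.
def retrieve_clauses(query, index):
    # Stand-in for the real semantic search over the clause index.
    return [c for c in index if query.lower() in c["text"].lower()]

def build_grounded_prompt(question, retrieved):
    context = "\n".join(f"- {c['text']}" for c in retrieved)
    return (
        "Answer using ONLY the clauses below. "
        "If the answer is not in them, say so.\n"
        f"Clauses:\n{context}\nQuestion: {question}"
    )

index = [{"text": "Either party may terminate with 30 days notice."}]
prompt = build_grounded_prompt(
    "Can I terminate early?", retrieve_clauses("terminate", index)
)
print(prompt)
```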
&lt;h3&gt;
  
  
  Challenge 4: Making Risk Scores Meaningful
&lt;/h3&gt;

&lt;p&gt;A simple "high risk" label isn't actionable. I added:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prevalence scores&lt;/strong&gt; - "Only 20% of contracts include this"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard versions&lt;/strong&gt; - "Here's what fair looks like"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specific red flags&lt;/strong&gt; - "This clause is concerning because..."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These additions turn a vague warning into specific, actionable information.&lt;/p&gt;
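&lt;p&gt;One way to carry that richer output around is a structured record per clause. The field names below are illustrative, not the project's actual schema:&lt;/p&gt;

```python
from dataclasses import dataclass, field

# Hypothetical shape of one analyzed clause: a bare risk label plus the
# context that makes it actionable (prevalence, a fair version, red flags).
@dataclass
class ClauseAnalysis:
    risk_level: str               # "green" / "amber" / "red"
    prevalence: int               # % of comparable contracts with this term
    standard_version: str         # what a fair version of the clause says
    red_flags: list = field(default_factory=list)

    def summary(self) -> str:
        return (f"{self.risk_level.upper()} risk. "
                f"Only {self.prevalence}% of contracts include this. "
                f"Flags: {len(self.red_flags)}")

clause = ClauseAnalysis("red", 20, "Mutual termination with 30 days notice",
                        ["Unilateral termination without notice"])
print(clause.summary())
```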


&lt;h3&gt;
  
  
  Export Analysis as PDF
&lt;/h3&gt;

&lt;p&gt;Let users download a full risk report they can share with lawyers or keep for their records. Make it official and presentable.&lt;/p&gt;


&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Building ContractCompass taught me that the best AI tools don't feel like AI tools. They feel like helpful conversations with knowledgeable friends. The key is combining:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fast, semantic search&lt;/strong&gt; that finds the right information instantly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thoughtful prompting&lt;/strong&gt; that guides the AI to be helpful, not robotic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real data&lt;/strong&gt; that grounds responses in facts, not hallucinations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clear design&lt;/strong&gt; that makes complex information accessible&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Algolia Agent Studio made the first part possible. The rest was about understanding what people actually need when facing a contract: clarity, confidence, and actionable advice.&lt;/p&gt;


&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;ContractCompass demonstrates how conversational AI powered by fast, semantic search can democratize access to legal understanding. By combining Algolia Agent Studio's retrieval capabilities with a thoughtfully designed user experience, it transforms contract analysis from an intimidating expert task into an accessible conversation.&lt;/p&gt;

&lt;p&gt;The key insight: people don't need to become lawyers to understand their contracts. They just need the right questions answered in language they can understand, backed by real data about what's standard and what's not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try it yourself:&lt;/strong&gt; &lt;a href="https://contractcompass.varshithvhegde.in/" rel="noopener noreferrer"&gt;https://contractcompass.varshithvhegde.in/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feok2trmgk5oi4zv24klm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feok2trmgk5oi4zv24klm.png" alt="Landing Page"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Built With
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Powered by:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.algolia.com/doc/guides/ai/agent-studio/" rel="noopener noreferrer"&gt;Algolia Agent Studio&lt;/a&gt; - Conversational AI with semantic search&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://react.dev" rel="noopener noreferrer"&gt;React&lt;/a&gt; + &lt;a href="https://www.typescriptlang.org/" rel="noopener noreferrer"&gt;TypeScript&lt;/a&gt; - Frontend framework&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://tailwindcss.com/" rel="noopener noreferrer"&gt;Tailwind CSS&lt;/a&gt; + &lt;a href="https://ui.shadcn.com/" rel="noopener noreferrer"&gt;shadcn/ui&lt;/a&gt; - UI components&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://deepmind.google/technologies/gemini/" rel="noopener noreferrer"&gt;Google Gemini&lt;/a&gt; - PDF text extraction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Varshithvhegde" rel="noopener noreferrer"&gt;
        Varshithvhegde
      &lt;/a&gt; / &lt;a href="https://github.com/Varshithvhegde/contract-compass" rel="noopener noreferrer"&gt;
        contract-compass
      &lt;/a&gt;
    &lt;/h2&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;ContractCompass 🧭&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;AI-Powered Contract Analysis for Non-Lawyers&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Chat with AI to understand your contract. Identify risks, get plain-English explanations, and learn what to negotiate — powered by &lt;strong&gt;Algolia Agent Studio&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;📋 Table of Contents&lt;/h2&gt;
&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#overview" rel="noopener noreferrer"&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#live-demo" rel="noopener noreferrer"&gt;Live Demo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#key-features" rel="noopener noreferrer"&gt;Key Features&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#architecture" rel="noopener noreferrer"&gt;Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#technology-stack" rel="noopener noreferrer"&gt;Technology Stack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#algolia-agent-studio-integration" rel="noopener noreferrer"&gt;Algolia Agent Studio Integration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#how-it-works" rel="noopener noreferrer"&gt;How It Works&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#contract-types-supported" rel="noopener noreferrer"&gt;Contract Types Supported&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#sample-contracts" rel="noopener noreferrer"&gt;Sample Contracts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#risk-assessment-system" rel="noopener noreferrer"&gt;Risk Assessment System&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#conversational-ai-capabilities" rel="noopener noreferrer"&gt;Conversational AI Capabilities&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#structured-risk-analysis" rel="noopener noreferrer"&gt;Structured Risk Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#pdf-extraction" rel="noopener noreferrer"&gt;PDF Extraction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#algolia-search-index" rel="noopener noreferrer"&gt;Algolia Search Index&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#uiux-design" rel="noopener noreferrer"&gt;UI/UX Design&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#edge-functions" rel="noopener noreferrer"&gt;Edge Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#security" rel="noopener noreferrer"&gt;Security&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#getting-started" rel="noopener noreferrer"&gt;Getting Started&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Overview&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;ContractCompass&lt;/strong&gt; is an intelligent contract analysis tool designed to help everyday people — not lawyers — understand legal documents before they sign. Users upload or paste a contract, then have a real-time conversation with an AI agent that identifies risky clauses, explains legal jargon in plain English, and compares terms against industry standards.&lt;/p&gt;

&lt;p&gt;The AI agent is powered by &lt;strong&gt;Algolia Agent Studio&lt;/strong&gt;, which provides semantic search and retrieval of similar contract clauses…&lt;/p&gt;
&lt;/div&gt;


&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Varshithvhegde/contract-compass" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;








&lt;p&gt;&lt;em&gt;ContractCompass is not a substitute for professional legal advice. Always consult a qualified attorney for legal matters.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>algoliachallenge</category>
      <category>ai</category>
      <category>agents</category>
    </item>
    <item>
      <title>Why Your AI Gateway Needs MCP Integration in 2026</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Mon, 02 Feb 2026 10:19:30 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/why-your-ai-gateway-needs-mcp-integration-in-2026-3dcf</link>
      <guid>https://forem.com/varshithvhegde/why-your-ai-gateway-needs-mcp-integration-in-2026-3dcf</guid>
      <description>&lt;p&gt;You know that feeling when you've spent three hours debugging why your AI agent can't access your database for the third time this week? &lt;/p&gt;

&lt;p&gt;I was there last month. Five different tool integrations, each with its own authentication flow, error handling, and connection management. Want to add Slack notifications? Write another integration. Need file system access? Another one. Every integration was basically the same boilerplate with different endpoints.&lt;/p&gt;

&lt;p&gt;Then I found the Model Context Protocol and Bifrost. It sounded too good to be true: one gateway, one protocol, unlimited tools. But it actually works, and it's probably the most practical shift in AI infrastructure you'll deal with this year.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's an AI Gateway and Why Should You Care?
&lt;/h2&gt;

&lt;p&gt;Think of an AI gateway as the central hub between your apps and multiple AI providers. Instead of writing separate code for OpenAI, Anthropic, Google, and others, you connect once to the gateway, and it handles the rest.&lt;/p&gt;

&lt;p&gt;The benefits are immediate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automatic failover&lt;/strong&gt;: If one AI provider goes down, requests switch to another&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load balancing&lt;/strong&gt;: Distribute requests across multiple API keys to avoid rate limits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching&lt;/strong&gt;: Reduce costs and improve response times&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unified monitoring&lt;/strong&gt;: One place to track all your AI interactions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bifrost is an AI gateway built in Go that adds only 11 microseconds of latency while handling 5,000 requests per second. When you're running production AI systems, those microseconds matter.&lt;/p&gt;
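&lt;p&gt;To make the failover benefit concrete, here's a toy sketch of the pattern a gateway applies on your behalf. This is the idea, not Bifrost's actual implementation:&lt;/p&gt;

```python
# Illustrative failover: try providers in order and return the first
# successful response, instead of hard-coding one provider in your app.
def call_with_failover(prompt, providers):
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # provider down or rate-limited
            errors.append(exc)
    raise RuntimeError(f"All providers failed: {errors}")

def flaky(prompt):
    # Stand-in for a provider that is currently down.
    raise TimeoutError("primary provider is down")

def backup(prompt):
    # Stand-in for a healthy secondary provider.
    return f"answer to: {prompt}"

print(call_with_failover("hello", [flaky, backup]))  # answer to: hello
```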

&lt;h2&gt;
  
  
  The Model Context Protocol: USB-C for AI
&lt;/h2&gt;

&lt;p&gt;Anthropic introduced MCP in November 2024. Within a year, it became the industry standard. OpenAI adopted it in March 2025. Google DeepMind followed. By December 2025, it was donated to the Linux Foundation with backing from major tech companies.&lt;/p&gt;

&lt;p&gt;Here's why it matters: Before MCP, connecting an AI model to a new tool meant writing custom integration code. Every. Single. Time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI needs to search files? Custom code.&lt;/li&gt;
&lt;li&gt;Access a database? More custom code.&lt;/li&gt;
&lt;li&gt;Connect to Slack? Yet another integration.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This created what Anthropic called the "N×M problem": N models needing M different integrations, so the number of integrations grows multiplicatively with every model and tool you add.&lt;/p&gt;

&lt;p&gt;MCP solved this with a standardized protocol. Write an MCP server once for a tool, and any MCP-compatible AI client can use it. It's like USB-C for AI systems: one standard connection instead of a different cable for every device.&lt;/p&gt;
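&lt;p&gt;The arithmetic behind the N×M problem is easy to check:&lt;/p&gt;

```python
# Without MCP, every model needs its own integration per tool (N x M).
# With MCP, each model and each tool implements the protocol once (N + M).
def integrations_without_mcp(models, tools):
    return models * tools

def integrations_with_mcp(models, tools):
    return models + tools

print(integrations_without_mcp(4, 25))  # 100 custom integrations
print(integrations_with_mcp(4, 25))    # 29 protocol implementations
```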

&lt;h2&gt;
  
  
  The Problem with Direct MCP Connections
&lt;/h2&gt;

&lt;p&gt;When you connect AI models directly to MCP servers, you run into scaling problems. Every request from the AI includes all available tool definitions in its context window. Connect to five MCP servers with 100 total tools, and every single request carries those 100 tool definitions even for simple queries that don't need tools.&lt;/p&gt;

&lt;p&gt;This creates three issues:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Wasted tokens&lt;/strong&gt;: Most of your context budget goes to tool catalogs instead of actual work. A six-turn conversation with 100 tools re-sends all 100 definitions on every turn, so you pay the full catalog cost six times over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Security gaps&lt;/strong&gt;: Tools can execute without validation or approval. No audit trail, no safety checks before destructive operations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Coordination overhead&lt;/strong&gt;: Each tool call requires a separate round trip to the AI model.&lt;/p&gt;
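&lt;p&gt;A back-of-the-envelope calculation shows how fast the catalog cost grows. The 100 tokens per definition here is an assumed average, not a measured figure:&lt;/p&gt;

```python
# Rough cost of shipping the full tool catalog with every request.
TOKENS_PER_DEFINITION = 100  # assumed average size of one tool definition

def catalog_tokens(num_tools, turns):
    # Every turn re-sends every definition, so cost scales with both.
    return num_tools * TOKENS_PER_DEFINITION * turns

print(catalog_tokens(100, 6))  # 60000 tokens spent on definitions alone
```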

&lt;h2&gt;
  
  
  How Bifrost Solves This
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8rervwbsltoiwyp1fdlf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8rervwbsltoiwyp1fdlf.png" alt="Bifrost" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bifrost integrates MCP natively into the gateway itself. You get both AI provider management and tool orchestration through a single interface.&lt;/p&gt;

&lt;p&gt;It supports four connection types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In-process tools&lt;/strong&gt;: Run directly in Bifrost's memory with zero network overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local MCP servers via STDIO&lt;/strong&gt;: For filesystem operations or database queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP connections&lt;/strong&gt;: For remote microservices&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server-Sent Events&lt;/strong&gt;: For real-time data streams&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The killer feature is &lt;strong&gt;Code Mode&lt;/strong&gt;. Instead of including hundreds of tool definitions in every request, Bifrost exposes just four meta-tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;listToolFiles()&lt;/code&gt; - Discover available servers&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;readToolFile(fileName)&lt;/code&gt; - Get tool signatures&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;getToolDocs(server, tool)&lt;/code&gt; - Get detailed documentation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;executeToolCode(code)&lt;/code&gt; - Run Starlark (Python-like) code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI writes Starlark code that orchestrates tools inside a sandboxed environment, and tool definitions load only when needed. This reduces token usage by 50%+ when using multiple MCP servers (3+). With 8-10 MCP servers (150+ tools), you avoid wasting context on massive tool catalogs.&lt;/p&gt;
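&lt;p&gt;Here's a toy simulation of that lazy-loading flow, with the four meta-tools as snake_case Python stand-ins and a made-up catalog. It only illustrates the shape of the interaction, not Bifrost's internals:&lt;/p&gt;

```python
# Hypothetical catalog: in Bifrost this would come from connected MCP
# servers; the servers and tools below are invented for illustration.
CATALOG = {
    "weather": {"get_forecast": "get_forecast(city) - 3-day forecast"},
    "jokes": {"get_joke": "get_joke() - random programming joke"},
}

def list_tool_files():
    # Discover available servers without loading any definitions.
    return sorted(CATALOG)

def read_tool_file(server):
    # Fetch just the tool names for one server.
    return sorted(CATALOG[server])

def get_tool_docs(server, tool):
    # Full documentation loads only for the one tool the task needs.
    return CATALOG[server][tool]

servers = list_tool_files()
docs = get_tool_docs("jokes", "get_joke")
print(servers, "-", docs)
```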

&lt;h2&gt;
  
  
  Getting Started: A Real Example
&lt;/h2&gt;

&lt;p&gt;Let me show you how this works in practice. I'll walk through building a simple MCP server and connecting it to Bifrost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Start Bifrost
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @maximhq/bifrost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Bifrost starts with zero configuration and opens at &lt;code&gt;localhost:8080&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Build a Simple MCP Server
&lt;/h3&gt;

&lt;p&gt;I created a Flask server with three tools: getting programming jokes, inspirational quotes, and basic calculations. Here's the core:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask_cors&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CORS&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nc"&gt;CORS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;jokes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Why do programmers prefer dark mode? Because light attracts bugs!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Why did the developer go broke? Because he used up all his cache!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/sse&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;methods&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_message&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;
    &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;method&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;initialize&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jsonrpc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;protocolVersion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2024-11-05&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;capabilities&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}},&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;serverInfo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;example-server&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tools/list&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jsonrpc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get_joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Returns a random programming joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputSchema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}}&lt;/span&gt;
                    &lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;calculate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Performs basic arithmetic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputSchema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;operation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;add&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;multiply&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
                                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;number&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;number&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
                            &lt;span class="p"&gt;}&lt;/span&gt;
                        &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tools/call&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;params&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;params&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arguments&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;get_joke&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jokes&lt;/span&gt;&lt;span class="p"&gt;)}]}&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;calculate&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;operation&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;add&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Result: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jsonrpc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it with: &lt;code&gt;python mcp_server.py&lt;/code&gt;&lt;/p&gt;
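&lt;p&gt;Before wiring the server into Bifrost, you can smoke-test it by hand with a raw JSON-RPC request. This is an illustrative sketch: the endpoint path (&lt;code&gt;/&lt;/code&gt;) is an assumption, so point it at whatever route your &lt;code&gt;@app.route&lt;/code&gt; decorator actually declares.&lt;/p&gt;

```python
import json

# JSON-RPC 2.0 envelope for calling the "calculate" tool defined above.
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "calculate",
        "arguments": {"operation": "add", "a": 2, "b": 3},
    },
}

body = json.dumps(payload)
print(body)

# With the server running, send it (the "/" path is an assumption):
# requests.post("http://localhost:5000/", json=payload).json()
```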

&lt;h3&gt;
  
  
  Step 3: Configure Model Providers and Connect to Bifrost
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Setting Up Model Providers
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs16176xtk78e6xwxr1rp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs16176xtk78e6xwxr1rp.png" alt="Bifrost UI" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the Bifrost UI at &lt;code&gt;localhost:8080&lt;/code&gt;, navigate to &lt;strong&gt;Model Providers&lt;/strong&gt; in the left sidebar. You'll see a comprehensive list of supported providers including OpenAI, Anthropic, Google, AWS Bedrock, Azure, and many others.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fext19hjby77arx632h7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fext19hjby77arx632h7w.png" alt="Model provider UI" width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click on &lt;strong&gt;OpenAI&lt;/strong&gt; from the list, then click &lt;strong&gt;"+ Add new key"&lt;/strong&gt; in the top-right corner.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmsd2kdus8yg57cwp04ow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmsd2kdus8yg57cwp04ow.png" alt="Model Providers" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Fill in the key configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name&lt;/strong&gt;: Give it a descriptive name like "Production Key"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Key&lt;/strong&gt;: Enter your actual API key (e.g., &lt;code&gt;sk-proj-...&lt;/code&gt;) or use an environment variable like &lt;code&gt;env.OPENAI_KEY&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Models&lt;/strong&gt;: Click to select which models this key can access (e.g., &lt;code&gt;gpt-4o&lt;/code&gt;, &lt;code&gt;gpt-4o-mini&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weight&lt;/strong&gt;: Set to &lt;code&gt;1&lt;/code&gt; for load balancing (higher weights receive proportionally more traffic)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use for Batch APIs&lt;/strong&gt;: Toggle this on if you want to use this key for batch operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Click &lt;strong&gt;Save&lt;/strong&gt; to add the key. You'll see it appear in your configured keys list with its weight and enabled status.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; For production setups, add multiple API keys for the same provider. Bifrost automatically distributes requests across them to avoid rate limits. You can also add keys from different providers (e.g., OpenAI and Google) for automatic failover.&lt;/p&gt;
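&lt;p&gt;To see what weighted distribution means in practice, here's a toy sketch of weighted key selection. It shows the general technique, not Bifrost's actual internals: a key with weight 3 should receive roughly three times the traffic of a key with weight 1. The key names and weights are illustrative only.&lt;/p&gt;

```python
import random

# Hypothetical key pool -- names and weights are placeholders.
keys = [
    {"name": "primary", "weight": 3},
    {"name": "backup", "weight": 1},
]

def pick_key(pool):
    # random.choices draws one item using per-item weights
    weights = [k["weight"] for k in pool]
    return random.choices(pool, weights=weights, k=1)[0]

counts = {"primary": 0, "backup": 0}
for _ in range(10_000):
    counts[pick_key(keys)["name"]] += 1
print(counts)  # "primary" lands near 7500, "backup" near 2500
```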

&lt;h4&gt;
  
  
  Connecting Your MCP Server
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiz87yvrikvn4z0bzazli.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiz87yvrikvn4z0bzazli.png" alt="MCP server" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now go to &lt;strong&gt;MCP Gateway&lt;/strong&gt; in the left sidebar and click &lt;strong&gt;"New MCP Server"&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;Configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name&lt;/strong&gt;: &lt;code&gt;localmcp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection Type&lt;/strong&gt;: &lt;code&gt;HTTP (Streamable)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection URL&lt;/strong&gt;: &lt;code&gt;http://localhost:5000/sse&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ping Available for Health Check&lt;/strong&gt;: Enable this&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bifrost immediately connects, discovers your tools, and shows them in "Available Tools."&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Use It
&lt;/h3&gt;

&lt;p&gt;Here's a Python client that ties everything together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;BIFROST_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask_ai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;👤 You: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Send to AI via Bifrost
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BIFROST_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/v1/chat/completions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;assistant_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;choices&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Handle tool calls
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_calls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔧 AI is using &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tool_calls&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; tools...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_calls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="c1"&gt;# Bifrost executes the tool on your MCP server
&lt;/span&gt;            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BIFROST_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/v1/mcp/tool/execute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_call&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

            &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_call_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="c1"&gt;# Get final response
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BIFROST_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/v1/chat/completions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;assistant_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;choices&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🤖 AI: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;

&lt;span class="c1"&gt;# Try it
&lt;/span&gt;&lt;span class="nf"&gt;ask_ai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tell me a programming joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;ask_ai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is 25 times 4?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffoxx8438zteefyrx54dx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffoxx8438zteefyrx54dx.png" alt="Agent output" width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What Just Happened?
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Your script sends "What is 25 times 4?" to Bifrost&lt;/li&gt;
&lt;li&gt;Bifrost adds your MCP tools to the AI's context&lt;/li&gt;
&lt;li&gt;GPT-4o decides to use the &lt;code&gt;calculate&lt;/code&gt; tool&lt;/li&gt;
&lt;li&gt;Your script calls Bifrost's tool execution endpoint&lt;/li&gt;
&lt;li&gt;Bifrost sends a JSON-RPC request to your Flask server&lt;/li&gt;
&lt;li&gt;Your server calculates 25 × 4 = 100 and returns it&lt;/li&gt;
&lt;li&gt;The result goes back to GPT-4o&lt;/li&gt;
&lt;li&gt;GPT-4o responds: "25 times 4 equals 100"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The beautiful part? Clean separation of concerns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your client doesn't know MCP protocol details&lt;/li&gt;
&lt;li&gt;Bifrost handles all MCP communication&lt;/li&gt;
&lt;li&gt;The AI doesn't know your server implementation&lt;/li&gt;
&lt;li&gt;Your MCP server doesn't know which AI is calling it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the power of standardization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Matters
&lt;/h2&gt;

&lt;p&gt;In April 2025, researchers identified MCP security issues: prompt injection, permission combinations that could exfiltrate data, and lookalike tools.&lt;/p&gt;

&lt;p&gt;Bifrost addresses this with a "suggest, don't execute" model by default. When an AI proposes a tool call, nothing runs automatically. Your code reviews and approves each execution. You get full audit trails for compliance.&lt;/p&gt;

&lt;p&gt;You can configure Agent Mode for specific tools. Safe operations like reading files can auto-execute, while destructive operations require approval.&lt;/p&gt;
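&lt;p&gt;The approval side of that model is easy to sketch in client code. The tool names and safe-list below are hypothetical, and in Bifrost itself Agent Mode policies are configured through its UI rather than written like this; this is just the shape of the pattern.&lt;/p&gt;

```python
# "Suggest, don't execute": auto-approve known-safe tools, gate the rest.
AUTO_APPROVED = {"get_joke", "calculate"}  # read-only / harmless tools

def review_tool_call(tool_call, ask=lambda name: False):
    """Return True if the proposed tool call may run.

    Safe tools pass automatically; anything else is deferred to `ask`
    (e.g. a human prompt or a policy check), denying by default.
    """
    name = tool_call["function"]["name"]
    return name in AUTO_APPROVED or ask(name)

safe = {"function": {"name": "calculate"}}
risky = {"function": {"name": "delete_repo"}}  # hypothetical destructive tool
print(review_tool_call(safe))   # True: auto-approved
print(review_tool_call(risky))  # False: denied unless explicitly approved
```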

&lt;p&gt;For scenarios with many MCP servers (3+), you can enable Code Mode to reduce token usage. With Code Mode on, Bifrost exposes its four meta-tools instead of injecting every tool definition into the model's context directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters Now
&lt;/h2&gt;

&lt;p&gt;If you're building AI systems without MCP integration in 2026, you're solving yesterday's problems. The standardization is here. The ecosystem is mature. The question isn't whether to adopt MCP, but how quickly.&lt;/p&gt;

&lt;p&gt;Bifrost makes adoption straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Setup takes less than a minute&lt;/li&gt;
&lt;li&gt;Web UI makes configuration visual&lt;/li&gt;
&lt;li&gt;Open-source means you can examine and customize&lt;/li&gt;
&lt;li&gt;Native support for multiple connection types&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is infrastructure that matters. Not because it's flashy, but because it solves real problems every organization faces when building AI systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Get started with Bifrost:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://git.new/bifrost" rel="noopener noreferrer"&gt;https://git.new/bifrost&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Documentation: &lt;a href="https://docs.getbifrost.ai" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Quick Start: &lt;a href="https://docs.getbifrost.ai/quickstart/gateway/setting-up" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai/quickstart/gateway/setting-up&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Code Mode: &lt;a href="https://docs.getbifrost.ai/mcp/code-mode" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai/mcp/code-mode&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Agent Mode: &lt;a href="https://docs.getbifrost.ai/mcp/agent-mode" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai/mcp/agent-mode&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;MCP Overview: &lt;a href="https://docs.getbifrost.ai/mcp/overview" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai/mcp/overview&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>Top 5 LLM Gateways in 2026: A Deep-Dive Comparison for Production Teams</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Thu, 22 Jan 2026 01:41:56 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/top-5-llm-gateways-in-2026-a-deep-dive-comparison-for-production-teams-34d2</link>
      <guid>https://forem.com/varshithvhegde/top-5-llm-gateways-in-2026-a-deep-dive-comparison-for-production-teams-34d2</guid>
      <description>&lt;p&gt;I spent the last few weeks researching LLM gateway solutions for production teams. Here's what I found after testing five different options, talking to engineering teams running them at scale, and breaking things in my staging environment.&lt;/p&gt;

&lt;p&gt;I didn't test every edge case. I focused on REST APIs with streaming responses, didn't test batch processing extensively, and my traffic patterns might be different from yours. But here's what I learned.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Production Teams Need LLM Gateways
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkpcwz6l0dqag77jkxgw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkpcwz6l0dqag77jkxgw.png" alt="LLM Gateway" width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's what happened when we didn't use one:&lt;/p&gt;

&lt;p&gt;Our application relied solely on OpenAI. When OpenAI had an outage last month, our entire product went down with it, leaving customers stuck waiting for support.&lt;/p&gt;

&lt;p&gt;Then there's cost. We were using GPT-4 for simple tasks that Claude Haiku could handle for one-tenth the price. One weekend of refactoring our routing logic saved us $3,000 per month.&lt;/p&gt;

&lt;p&gt;But managing multiple providers yourself creates its own problems. You end up writing custom code for each API, normalizing their different error formats, managing API keys, building retry logic from scratch, and spending hours debugging why Anthropic's rate limit response looks different from OpenAI's.&lt;/p&gt;

&lt;p&gt;LLM gateways solve this. One API for all providers. Automatic fallbacks. Cost tracking that works. And your application won't crash because one provider is having issues.&lt;/p&gt;
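&lt;p&gt;For contrast, here's roughly what the hand-rolled version of "automatic fallbacks" looks like, the boilerplate a gateway absorbs for you. The provider list and payload shape are placeholders, not working endpoints.&lt;/p&gt;

```python
# Placeholder provider endpoints -- illustrative, not drop-in URLs.
PROVIDERS = [
    ("openai", "https://api.openai.com/v1/chat/completions"),
    ("anthropic", "https://api.anthropic.com/v1/messages"),
]

def chat_with_fallback(payload, providers, post):
    """Try each provider in order; return (name, json) from the first success."""
    last_error = None
    for name, url in providers:
        try:
            resp = post(url, json=payload, timeout=30)
            resp.raise_for_status()
            return name, resp.json()
        except Exception as exc:  # timeout, 429, 5xx, parse error...
            last_error = exc      # fall through to the next provider
    raise RuntimeError(f"all providers failed: {last_error}")

# Usage with a real HTTP client would look like:
# chat_with_fallback(payload, PROVIDERS, requests.post)
```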

&lt;p&gt;Here are the five gateways that impressed me.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Bifrost (by Maxim AI)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vz8eu5bal1yc64hwzqn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vz8eu5bal1yc64hwzqn.png" alt="Bifrost" width="800" height="414"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: A high-performance LLM gateway built in Go. It's designed for speed and reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: Customer-facing applications where latency matters. Real-time chat, high-traffic APIs, anything where users will notice if responses are slow.&lt;/p&gt;

&lt;p&gt;The performance numbers caught my attention first. In our synthetic load tests, Bifrost added about 11 microseconds of latency at 5,000 requests per second. When I ran the same test with LiteLLM (which is Python-based), it added around 50 microseconds.&lt;/p&gt;

&lt;p&gt;What really sold me was the P99 latency test. At 1,000 concurrent users, LiteLLM's slowest responses hit 28 seconds. Bifrost stayed under 50 milliseconds. If you're building a chatbot, that's the difference between users staying on your application and immediately leaving.&lt;/p&gt;

&lt;p&gt;Now, I didn't test this with burst traffic or serverless deployments - our setup is traditional Kubernetes. Your results might differ depending on your infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What makes it different&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;Smart load balancing that actually works. Bifrost was the first gateway I found that automatically routes requests based on real-time performance. It monitors which providers are healthy, routes around failures, and prevents you from hitting rate limits. Most gateways claim to do this, but Bifrost's implementation is noticeably better.&lt;/p&gt;
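&lt;p&gt;As an illustration (this is not Bifrost's actual algorithm), latency-aware routing boils down to picking the healthy provider with the best recent numbers:&lt;/p&gt;

```python
# Illustrative sketch of latency-aware provider routing: prefer the
# healthy provider with the lowest recent average latency.
# Provider names and stats below are made-up example data.
def pick_provider(stats):
    healthy = [(name, s) for name, s in stats.items() if s["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy providers")
    name, _ = min(healthy, key=lambda item: item[1]["avg_latency_ms"])
    return name

stats = {
    "openai":    {"healthy": True,  "avg_latency_ms": 420},
    "anthropic": {"healthy": True,  "avg_latency_ms": 380},
    "groq":      {"healthy": False, "avg_latency_ms": 90},  # down, skipped
}
```

&lt;p&gt;The hard part in production is keeping those stats fresh per provider and per key, which is exactly what the gateway does continuously.&lt;/p&gt;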

&lt;p&gt;It also has cluster mode built in, so you can run multiple instances without complicated setup. And here's what surprised me - it includes SSO, audit logs, team budgets, and role-based access control without adding latency. Most gateways make you choose between features and speed. Bifrost somehow does both.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setup&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @maximhq/bifrost

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In 30 seconds you have a gateway running with a web UI. Since it uses OpenAI's API format, integrating it is just changing your base URL. I had our staging environment switched over in under 10 minutes.&lt;/p&gt;
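&lt;p&gt;To illustrate why this is essentially a one-line change: the request body stays identical whether you call the provider directly or go through an OpenAI-compatible gateway - only the base URL moves. The localhost port below is an assumption for a local instance, not a documented default:&lt;/p&gt;

```python
# Sketch: with an OpenAI-compatible gateway, only the base URL changes.
# The localhost:8080 address is an assumption for a locally running gateway.
def chat_request(base_url, model, content):
    return {
        "url": base_url + "/v1/chat/completions",
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": content}],
        },
    }

direct = chat_request("https://api.openai.com", "gpt-4", "Hello")
via_gateway = chat_request("http://localhost:8080", "gpt-4", "Hello")
# The payloads are identical; only the URL differs.
```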

&lt;p&gt;Bifrost covers all the major providers - OpenAI, Anthropic, Google Vertex AI, AWS Bedrock, Azure OpenAI, Cohere, Mistral, Groq, Together AI, and Replicate. Plus they added support for any OpenAI-compatible endpoint, which means you can actually use custom or self-hosted models too.&lt;/p&gt;

&lt;p&gt;For most production use cases, you're using one of these major providers anyway. LiteLLM does have broader coverage and a more mature open-source community - it has been around longer, with more contributors. If that ecosystem and maximum provider choice matter more to you than raw performance, LiteLLM is a solid pick. But for our needs, Bifrost's speed and provider coverage were enough.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why we chose it&lt;/strong&gt;: For our use case (high-scale, customer-facing chat), the 11 microsecond overhead was too good to pass up. The enterprise features were a bonus we didn't expect at this performance level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;: Open-source and free to self-host. Enterprise support is available.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. LiteLLM
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat7humnt3ehpd4ilipql.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat7humnt3ehpd4ilipql.png" alt="LiteLLM" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is probably the most popular open-source LLM gateway. Python-based, with both an SDK and proxy server.&lt;/p&gt;

&lt;p&gt;If you're in a Python environment or need access to niche models, this is the default choice. The provider coverage is unmatched - over 100 providers including all the major ones (OpenAI, Anthropic, Google, Azure, AWS) plus specialized options like HuggingFace, Ollama, Replicate, Anyscale, and Perplexity.&lt;/p&gt;

&lt;p&gt;For Python developers, setup is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;litellm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Switch to Claude without changing code
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-4-sonnet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configuration uses YAML. The documentation is thorough, and there's a strong community.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it breaks down&lt;/strong&gt;: Performance at scale. LiteLLM is written in Python using FastAPI. At low to moderate traffic, it performs well. But in our load tests, the limitations showed clearly.&lt;/p&gt;

&lt;p&gt;At 500 requests per second, P99 latency hit 28 seconds. At 1,000 requests per second, it crashed - ran out of memory and started failing requests. The Python GIL and async overhead become real bottlenecks when handling thousands of concurrent requests.&lt;/p&gt;

&lt;p&gt;I saw this in our staging environment. At 200 requests per second, everything ran smoothly. When I simulated higher traffic (around 2,000 requests per second), LiteLLM started timing out. Memory usage increased to over 8GB, and we got cascading failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use it&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Development and testing environments&lt;/li&gt;
&lt;li&gt;Prototyping and trying different models&lt;/li&gt;
&lt;li&gt;Internal tools with moderate traffic (under 500 RPS)&lt;/li&gt;
&lt;li&gt;When you need access to 100+ providers&lt;/li&gt;
&lt;li&gt;Python-first teams where ecosystem fit matters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to avoid it&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer-facing applications at scale&lt;/li&gt;
&lt;li&gt;Real-time features where every millisecond counts&lt;/li&gt;
&lt;li&gt;Production workloads requiring 99.9%+ uptime&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The ecosystem is mature with active development, but if you're planning to handle thousands of requests per second in production, you'll likely hit performance issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;: Fully open-source and free. You pay for hosting it yourself.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Portkey
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjqckdxeyshj78dshmap.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjqckdxeyshj78dshmap.png" alt="PortKey" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Portkey is more than just a gateway - it's a full AI control plane with routing, observability, guardrails, and governance.&lt;/p&gt;

&lt;p&gt;The observability depth is what sets it apart. Every request gets full traces showing you which user made the call, which models were tried, why they failed, which fallback was used, how long each step took, and the exact cost. This isn't just logging - it's distributed tracing for AI.&lt;/p&gt;

&lt;p&gt;When our staging environment started using too many tokens, Portkey's traces showed us exactly which user and which prompt were causing it. That level of detail is valuable when debugging production issues.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;portkey_ai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Portkey&lt;/span&gt;

&lt;span class="n"&gt;portkey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Portkey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-portkey-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;virtual_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-provider-virtual-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;portkey&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Enterprise features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PII detection, content filtering, prompt injection detection&lt;/li&gt;
&lt;li&gt;SOC 2, HIPAA, GDPR compliance with full audit trails&lt;/li&gt;
&lt;li&gt;SSO/SAML, team permissions, role-based access&lt;/li&gt;
&lt;li&gt;Data residency controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;According to their team, they handle over 10 billion requests monthly with 99.9999% uptime. I couldn't independently verify this, but the platform felt stable during our testing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The tradeoff&lt;/strong&gt;: I measured latency overhead of 20-40 milliseconds when using advanced features like guardrails and detailed tracing. For a small team that just needs basic routing, Portkey is probably more than necessary. The learning curve is also steeper than simpler gateways.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why we didn't choose it&lt;/strong&gt;: For our use case, the added latency and complexity weren't worth the governance features we didn't need yet. But I talked to a healthcare company using Portkey specifically for PII detection. Every LLM request gets scanned for protected health information, logged with full audit trails, and only routed to HIPAA-compliant providers. For them, the compliance features justified the cost.&lt;/p&gt;

&lt;p&gt;If you're in a regulated industry or managing AI across multiple teams with governance requirements, Portkey's observability is among the best available.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;: Free tier for development | Starts at $49/month | Enterprise custom pricing&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Kong AI Gateway
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwj4rop15l4h17f123cm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwj4rop15l4h17f123cm.png" alt="Kong AI" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Kong's API Gateway with AI-specific features added. If you're already using Kong, this is worth looking at.&lt;/p&gt;

&lt;p&gt;Kong brings more than a decade of API gateway experience to LLM routing - authentication, rate limiting, security, and observability at large scale. All the infrastructure pieces that matter when running production workloads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install AI Proxy plugin&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8001/services/ai-service/plugins &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"name=ai-proxy"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"config.route_type=llm/v1/chat"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"config.auth.header_name=Authorization"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"config.model.provider=openai"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"config.model.name=gpt-4"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AI-specific capabilities&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unified API across OpenAI, Anthropic, AWS Bedrock, Azure AI, Google Vertex&lt;/li&gt;
&lt;li&gt;RAG pipelines built in&lt;/li&gt;
&lt;li&gt;PII removal across 12 languages&lt;/li&gt;
&lt;li&gt;Content filtering and safety controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Where this makes sense&lt;/strong&gt;: You're already using Kong for API management. That's the primary reason to choose this. The integration with existing Kong infrastructure is seamless, and you get unified observability across all your APIs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it doesn't&lt;/strong&gt;: If you're not already on Kong, the learning curve is significant. It's built for large enterprises, not small teams needing quick deployment. We evaluated this briefly but decided it was more complexity than we needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;: Available through Kong Konnect (managed) or self-hosted | Enterprise custom pricing&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Helicone AI Gateway
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwluintjiw362u6ff363.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwluintjiw362u6ff363.png" alt="HeliconeAI" width="800" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Started as an observability platform, recently launched a Rust-based gateway. Lightweight and fast.&lt;/p&gt;

&lt;p&gt;Built in Rust, Helicone achieves around 8ms P50 latency with sub-5ms overhead even under load, based on what their team shared with me. The gateway ships as a single 15MB binary that runs anywhere.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run with npx&lt;/span&gt;
npx @helicone/ai-gateway

&lt;span class="c"&gt;# Or with Docker&lt;/span&gt;
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 8787:8787 helicone/ai-gateway

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The observability is their core strength - request-level tracing, user tracking, cost forecasting, performance analytics, and real-time alerts. It's as comprehensive as Portkey's but with less complexity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flexible deployment&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cloud-hosted (managed service)&lt;/li&gt;
&lt;li&gt;Self-hosted (full control)&lt;/li&gt;
&lt;li&gt;Hybrid (self-host gateway, use cloud observability)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The consideration&lt;/strong&gt;: The gateway is newer (launched mid-2024). Core routing is solid, but some advanced enterprise features are still developing. For most teams this isn't a problem, but large enterprises might want to validate specific requirements first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gateway: Open-source and free to self-host&lt;/li&gt;
&lt;li&gt;Observability: Starts free, then $20/month for 100,000 requests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The separation is smart - you can self-host for free and only pay for observability if you want it.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Choose
&lt;/h2&gt;

&lt;p&gt;After evaluating these gateways, here's what I learned:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Bifrost if&lt;/strong&gt;: Performance is critical. You're handling 5,000+ requests per second, serving customer-facing features, or building real-time applications where latency matters. The 11 microsecond overhead is hard to beat.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose LiteLLM if&lt;/strong&gt;: You're in a Python environment with moderate traffic (under 500 RPS). The provider coverage is unmatched - over 100 providers, including specialized ones. Great for development, prototyping, and internal tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Portkey if&lt;/strong&gt;: You're in a regulated industry needing compliance controls (HIPAA, SOC 2) or managing AI across multiple teams. The observability and governance features are excellent, but you'll pay for it in latency (20-40ms overhead).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Kong if&lt;/strong&gt;: You're already using Kong for API management. Otherwise, the learning curve probably isn't worth it unless you're a large enterprise needing infrastructure-level control.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Helicone if&lt;/strong&gt;: You want performance and observability without enterprise complexity. Good for teams with data residency requirements who want self-hosted infrastructure with cloud monitoring.&lt;/p&gt;




&lt;h2&gt;
  
  
  Questions?
&lt;/h2&gt;

&lt;p&gt;Have you deployed LLM gateways in production? What did you choose and why? What surprised you?&lt;/p&gt;

&lt;p&gt;Still evaluating options? I can help with specific questions about performance, integration, or cost modeling at your scale. Leave a comment below.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Daily Echo - Your Life in Motion 🎥</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Sat, 03 Jan 2026 16:23:33 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/daily-echo-your-life-in-motion-2938</link>
      <guid>https://forem.com/varshithvhegde/daily-echo-your-life-in-motion-2938</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/mux-2025-12-03"&gt;DEV's Worldwide Show and Tell Challenge Presented by Mux&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Daily Echo is a private video journaling app where you record 1-minute daily video diaries. It's like having a conversation with your future self. The app helps you track your mood, reflect on your experiences, and create a visual archive of your life that you can revisit anytime.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Pitch Video
&lt;/h2&gt;

&lt;p&gt;

&lt;iframe src="https://player.mux.com/01jDdJgPd01TNb2NfuOkBkPc027sNGvuzokCpE7g01Qcb4c" width="710" height="399"&gt;
&lt;/iframe&gt;



&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Live App&lt;/strong&gt;: &lt;a href="https://dailyecho.varshithvhegde.in/" rel="noopener noreferrer"&gt;https://dailyecho.varshithvhegde.in/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Varshithvhegde/dailyecho" rel="noopener noreferrer"&gt;https://github.com/Varshithvhegde/dailyecho&lt;/a&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Varshithvhegde" rel="noopener noreferrer"&gt;
        Varshithvhegde
      &lt;/a&gt; / &lt;a href="https://github.com/Varshithvhegde/dailyecho" rel="noopener noreferrer"&gt;
        dailyecho
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      DailyEcho - A beautiful, private video journaling app that lets you record daily video diary entries, track your mood over time, and relive your memories through immersive story modes and interactive visual walls.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;🎥 Daily Echo&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;A beautiful, private video journaling app that lets you record daily video diary entries, track your mood over time, and relive your memories through immersive story modes and interactive visual walls.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;✨ Features&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;🎬 Immersive Story Modes (New!)&lt;/h3&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Memory Stories&lt;/strong&gt; - Watch your entries in a sequential, story-like format similar to social media.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-Curated Playlists&lt;/strong&gt; - Choose from "Recent Moments", "Moments of Joy" (happy/excited/grateful), or "Flashback" (random picks from the past).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smooth Navigation&lt;/strong&gt; - Interactive progress bars, auto-advance, and gesture/keyboard support.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;🧱 Echo Wall (Mosaic Mode)&lt;/h3&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Living Visual History&lt;/strong&gt; - A dynamic masonry grid of your life in motion.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Living Video Tiles&lt;/strong&gt; - Each tile plays a Mux-generated animated GIF preview simultaneously for a "Harry Potter" newspaper effect.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive Previews&lt;/strong&gt; - Retro CRT scanline overlays and cinematic hover effects.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;📹 Video Recording &amp;amp; Playback&lt;/h3&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mux-powered streaming&lt;/strong&gt; - Professional-grade video processing and playback with adaptive streaming.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mux GIFs&lt;/strong&gt;…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Varshithvhegde/dailyecho" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Testing Credentials:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Email: &lt;code&gt;test@gmail.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Password: &lt;code&gt;devtest&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Detailed Explanation
&lt;/h2&gt;

&lt;p&gt;

&lt;iframe src="https://player.mux.com/CCj4qM26bpO6r6Zlx37CFqh01dDZNAaYmt9FaJXDPkEY" width="710" height="399"&gt;
&lt;/iframe&gt;



&lt;/p&gt;

&lt;h2&gt;
  
  
  The Story Behind It
&lt;/h2&gt;

&lt;p&gt;I have a terrible memory. Seriously. Ask me what I did last Tuesday and I'll draw a blank. But I've always been fascinated by the idea of looking back at my life, especially when the end of the year rolls around and everyone's doing their "year in review" thing.&lt;/p&gt;

&lt;p&gt;I wanted to create something that would help me remember the small moments - not just the big events, but the everyday stuff. What was I thinking about on a random Wednesday in March? How did I feel when that thing happened at work? What was going through my mind during that phase of my life?&lt;/p&gt;

&lt;p&gt;The idea was simple: record a 1-minute video every day. Just sit down, talk to the camera like you're talking to a friend, and capture whatever's on your mind. But I didn't want it to feel like a chore. I wanted it to be something I'd actually look forward to doing.&lt;/p&gt;

&lt;p&gt;So I built Daily Echo with features that make revisiting your memories feel magical. The Echo Wall shows all your entries as living video tiles playing simultaneously (like those moving newspapers in Harry Potter). Memory Stories let you watch your entries in sequence, almost like watching a documentary about your own life. And the Time Capsule feature shows you what you were up to exactly one month or one year ago.&lt;/p&gt;

&lt;p&gt;It's been incredibly powerful for me personally. There's something about being able to go back and watch yourself from months ago, seeing how you've grown or changed, or just remembering moments you'd completely forgotten.&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Highlights
&lt;/h2&gt;

&lt;p&gt;Daily Echo is built with React 18, TypeScript, and Vite on the frontend, with Tailwind CSS and shadcn/ui for the design. The backend runs on Supabase, handling PostgreSQL database, authentication, and edge functions.&lt;/p&gt;

&lt;p&gt;What makes the app special technically:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Living Video Previews Everywhere&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every entry card in the timeline shows an animated GIF preview that plays automatically. When you hover over the Echo Wall (our mosaic view), you see all your memories playing at once. It creates this incredible "living history" effect that static thumbnails just can't match.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. AI-Powered Insights&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using OpenAI's GPT-4o-mini, the app automatically analyzes your video transcripts to generate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Two-sentence summaries of each entry&lt;/li&gt;
&lt;li&gt;Emotional sentiment detection&lt;/li&gt;
&lt;li&gt;Personalized daily advice based on what you talked about&lt;/li&gt;
&lt;li&gt;Mood tracking over time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Immersive Story Modes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can watch your entries in different ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Recent Moments&lt;/strong&gt;: Your latest recordings in sequence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Moments of Joy&lt;/strong&gt;: Auto-curated playlist of happy entries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flashback&lt;/strong&gt;: Random picks from your past&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each mode has interactive progress bars, auto-advance, and keyboard controls for a cinematic experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Gamification That Actually Matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Achievement badges like "Zen Master" (recorded before 6 AM), "Night Owl" (recorded after 10 PM), and "Weekend Warrior" (weekend recordings) make the habit more engaging. You can track your recording streaks and see your mood variety over time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use of Mux
&lt;/h3&gt;

&lt;p&gt;Mux is the heart and soul of Daily Echo. Here's how I'm using it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Professional Video Infrastructure&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you record a video, it goes through Mux's direct upload API. No dealing with complicated encoding pipelines or storage headaches. Mux handles everything: transcoding, optimization, and adaptive streaming. The result? Your videos play smoothly on any device, any connection speed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Automatic Transcription&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This was a game-changer. By enabling Mux's transcription feature during upload, I get accurate text transcripts of every video entry automatically. These transcripts power the AI analysis, search functionality, and accessibility features. I didn't have to integrate a separate transcription service or worry about accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Animated GIF Previews&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of static thumbnails, every entry shows a living preview using Mux's GIF generation API. You can watch all your memories playing simultaneously in the Echo Wall view. It's like having a magical photo album where every picture moves. Mux generates these GIFs automatically from your video without any extra work on my end.&lt;/p&gt;
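&lt;p&gt;For reference, Mux serves these previews from its image service at &lt;code&gt;image.mux.com/{PLAYBACK_ID}/animated.gif&lt;/code&gt;. The helper below sketches the URL construction with a placeholder playback ID; the width parameter is one of several options Mux supports:&lt;/p&gt;

```python
# Sketch of assembling a Mux animated GIF preview URL.
# image.mux.com/{playback_id}/animated.gif is Mux's image service endpoint;
# "abc123" below is a placeholder playback ID, not a real asset.
def gif_preview_url(playback_id, width=320):
    return ("https://image.mux.com/" + playback_id
            + "/animated.gif?width=" + str(width))

tile_url = gif_preview_url("abc123")
```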

&lt;p&gt;&lt;strong&gt;4. Reliable Streaming with Mux Player&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The integrated Mux Player component handles playback with built-in caption support. It just works - no buffering issues, no format compatibility problems, no manual quality switching needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Webhook Integration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Mux's webhook system notifies my Supabase edge function when videos are ready, when transcripts are available, and if anything goes wrong. This lets me update the UI in real-time and handle the entire video lifecycle automatically.&lt;/p&gt;
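&lt;p&gt;A simplified sketch of that dispatch - the event types are real Mux webhook events, while the return values are placeholders for the app's actual database updates:&lt;/p&gt;

```python
# Hedged sketch of routing Mux webhook events to app-side actions.
# "video.asset.ready" etc. are Mux event types; the returned strings
# stand in for the real Supabase updates.
def handle_mux_webhook(event):
    event_type = event.get("type", "")
    if event_type == "video.asset.ready":
        return "mark entry playable"
    if event_type == "video.asset.track.ready":
        return "attach transcript"
    if event_type == "video.asset.errored":
        return "flag upload error"
    return "ignore"
```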

&lt;p&gt;The developer experience with Mux has been fantastic. The documentation is clear, the API is intuitive, and features like automatic transcription and GIF generation saved me weeks of development time. Instead of building video infrastructure, I could focus on making the journaling experience special.&lt;/p&gt;

&lt;p&gt;What really impressed me: I initially thought I'd need separate services for video hosting, transcription, and preview generation. Mux does all of this out of the box, and it scales effortlessly. When a user records their 100th video, it works just as smoothly as their first.&lt;/p&gt;




&lt;p&gt;I hope Daily Echo inspires others to start capturing their daily thoughts. Life moves fast, and our memories fade faster. Having a video archive of your own life is like having a superpower - you can literally go back in time and remember who you were and what mattered to you at any moment.&lt;/p&gt;

&lt;p&gt;Give it a try with the test credentials above, and maybe start your own daily echo habit!&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>muxchallenge</category>
      <category>showandtell</category>
      <category>video</category>
    </item>
    <item>
      <title>My 2025 wrap</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Wed, 31 Dec 2025 15:16:32 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/my-2025-wrap-ek0</link>
      <guid>https://forem.com/varshithvhegde/my-2025-wrap-ek0</guid>
      <description>&lt;p&gt;2025 was a rollercoaster for me. Looking back, I can clearly divide it into two distinct halves. One that tested me, and another that transformed me.&lt;/p&gt;

&lt;h2&gt;
  
  
  The First Half: Climbing the Corporate Ladder
&lt;/h2&gt;

&lt;p&gt;January started strong. I got promoted at work, which was amazing! Though it was mostly a position bump, I was actually leading a project as an Associate Engineer. The best part? We went from 5 days in the office to just 2 days a week. But honestly, I still chose to work from the office most days because I was so invested in the project.&lt;/p&gt;

&lt;p&gt;Then came February, the "love month," and let's just say things didn't go as planned. I hit one of the lowest points in my life.&lt;/p&gt;

&lt;p&gt;But you know what kept me going? I dove into spirituality and writing poems. These became my anchors during the tough times.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Turning Point: When Everything Changed
&lt;/h2&gt;

&lt;p&gt;And then came the moment that changed everything.&lt;/p&gt;

&lt;p&gt;I had this massive challenge at work. Our tool was taking 8 minutes just to load an MF4 (ASAM MDF 4) file. EIGHT MINUTES. And that's before any computation! We were using the asammdf package in Python, which is good, but painfully slow.&lt;/p&gt;

&lt;p&gt;I became obsessed with solving this. I researched everything. Tried JIT compilation, which improved computation time but not the loading. Then I had this wild idea: what if I rewrote the entire package in Rust?&lt;/p&gt;

&lt;p&gt;This wasn't just a work task anymore. This was MY mission. I worked my regular job during the day and coded this project at night. I went deep into understanding how MDF files work at the byte level, implementing a custom package specifically for our project. Countless all-nighters, endless debugging sessions, but when it finally worked? Pure magic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;10 seconds.&lt;/strong&gt; That's all it took to load a 4GB file AND do the computation (which I also replaced with Rust). Only the UI remained in Python.&lt;/p&gt;

&lt;p&gt;This earned me so much respect at work. Due to NDA, I can't share the code or methods (corporate life, you know), but the fact that I pulled this off still makes me proud. This was my turning point.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reigniting the Developer Within
&lt;/h2&gt;

&lt;p&gt;Huge shoutout to Jess and Ben for their weekly "What was your win this week?" posts. Reading those comments and seeing everyone's achievements? That reignited my inner developer. I wanted to be part of that energy again.&lt;/p&gt;

&lt;p&gt;I restarted my dev.to journey, but I was rusty. I couldn't figure out what to write about.&lt;/p&gt;

&lt;p&gt;Then I discovered &lt;strong&gt;DEV Challenges&lt;/strong&gt;, and wow, what a gold mine! I could build, showcase, learn, and enjoy all at once. Every weekend became about tackling a new challenge. This was exactly what I needed to grow and fall in love with coding all over again.&lt;/p&gt;

&lt;h2&gt;
  
  
  Discovering New Heights (Literally!)
&lt;/h2&gt;

&lt;p&gt;In the second half of the year, I found another passion. &lt;strong&gt;Trekking&lt;/strong&gt;. I climbed Asia's 1st and 2nd largest monolithic rocks! Out of 9 total treks, 8 happened in the second half alone.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2vgqmpc6omlrsica7sb.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2vgqmpc6omlrsica7sb.jpg" alt="Trekking with friends"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I also did my first solo travel to &lt;strong&gt;Hampi&lt;/strong&gt;. I know it's not far, but for me, it was a huge achievement. Plus, I met some amazing people along the way!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5r7r2awbzwofsheovsc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5r7r2awbzwofsheovsc.jpg" alt="Hampi with friends"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Projects That Keep On Giving
&lt;/h2&gt;

&lt;p&gt;Some of my older projects surprised me this year:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;

&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;a href="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;&lt;/a&gt;
      &lt;a href="https://github.com/Varshithvhegde" rel="noopener noreferrer"&gt;
        Varshithvhegde
      &lt;/a&gt; / &lt;a href="https://github.com/Varshithvhegde/FreeShare" rel="noopener noreferrer"&gt;
        FreeShare
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      FreeShare is a free online file sharing platform designed to simplify the process of sharing files without the need for any sign-up or verification.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;FreeShare: File Sharing Platform&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/Varshithvhegde/FreeShare/./public/assets/landingPage-8e480441-9785-40ca-9f26-d2b48cecc688"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2FVarshithvhegde%2FFreeShare%2F.%2Fpublic%2Fassets%2FlandingPage-8e480441-9785-40ca-9f26-d2b48cecc688" alt="thumbnail"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;🗂️ Description&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;FreeShare is a file sharing platform built with React, Firebase, and Cloud Functions. This project allows users to share files easily and efficiently. It's designed for individuals and teams who need a simple and secure way to share files.&lt;/p&gt;
&lt;p&gt;The platform provides a user-friendly interface for uploading, sharing, and managing files. With FreeShare, you can share files with others by generating a unique link, and recipients can access the files without needing to create an account.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;✨ Key Features&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;&lt;strong&gt;File Sharing&lt;/strong&gt;&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Upload and share files with others via a unique link&lt;/li&gt;
&lt;li&gt;Supports various file types, including documents, images, and videos&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;&lt;strong&gt;Security and Authentication&lt;/strong&gt;&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Secure file storage using Firebase Storage&lt;/li&gt;
&lt;li&gt;Authentication and authorization using Firebase Authentication&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;&lt;strong&gt;User Interface&lt;/strong&gt;&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Responsive and user-friendly interface built with React and Material-UI&lt;/li&gt;
&lt;li&gt;Easy navigation and file management&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;🗂️ Folder Structure&lt;/h2&gt;

&lt;/div&gt;

  &lt;div class="js-render-enrichment-target"&gt;
    &lt;div class="render-plaintext-hidden"&gt;
      &lt;pre&gt;graph TD
src--&amp;gt;components
src--&amp;gt;App.test.js;
src--&amp;gt;index.js;
src--&amp;gt;reportWebVitals.js;
src--&amp;gt;setupTests.js;
public--&amp;gt;index.html;
public--&amp;gt;manifest.json;
public--&amp;gt;robots.txt;&lt;/pre&gt;…&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Varshithvhegde/FreeShare" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
FreeShare is a free online file-sharing platform that needs no sign-up or verification. I built it 2 years ago in college when I was just starting out. I honestly thought it was dead, but when I checked Firebase recently... 10K total users! People are still using it, and my blog post about it still gets views. Sure, it has flaws, but hey, we all start somewhere.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;

&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;a href="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;&lt;/a&gt;
      &lt;a href="https://github.com/Varshithvhegde" rel="noopener noreferrer"&gt;
        Varshithvhegde
      &lt;/a&gt; / &lt;a href="https://github.com/Varshithvhegde/notepage" rel="noopener noreferrer"&gt;
        notepage
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      NotePage is a web application that allows you to easily share code, text, or any content using a unique link.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;&lt;a href="https://notepage.vercel.app" rel="nofollow noopener noreferrer"&gt;NotePage&lt;/a&gt;&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://private-user-images.githubusercontent.com/80502833/277675360-a94e6729-3305-4380-94ea-7f2ac01c81be.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NzQ2NDE5NjUsIm5iZiI6MTc3NDY0MTY2NSwicGF0aCI6Ii84MDUwMjgzMy8yNzc2NzUzNjAtYTk0ZTY3MjktMzMwNS00MzgwLTk0ZWEtN2YyYWMwMWM4MWJlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNjAzMjclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjYwMzI3VDIwMDEwNVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTAyMmQ4MTYxNDA5NDRjMTIxYmJmZWVkODc3Nzg3MjZlOWU3Y2VmMzEwNTE0YmIyNTNiZDdiMDI2NTFjMWQ5MzkmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.Rb2RNP2TOLX327786kGMDNbcapWPnv2JhV4tZY3BbfA"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fprivate-user-images.githubusercontent.com%2F80502833%2F277675360-a94e6729-3305-4380-94ea-7f2ac01c81be.png%3Fjwt%3DeyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NzQ2NDE5NjUsIm5iZiI6MTc3NDY0MTY2NSwicGF0aCI6Ii84MDUwMjgzMy8yNzc2NzUzNjAtYTk0ZTY3MjktMzMwNS00MzgwLTk0ZWEtN2YyYWMwMWM4MWJlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNjAzMjclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjYwMzI3VDIwMDEwNVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTAyMmQ4MTYxNDA5NDRjMTIxYmJmZWVkODc3Nzg3MjZlOWU3Y2VmMzEwNTE0YmIyNTNiZDdiMDI2NTFjMWQ5MzkmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.Rb2RNP2TOLX327786kGMDNbcapWPnv2JhV4tZY3BbfA" alt="frame_generic_light"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NotePage&lt;/strong&gt; is a web application that allows you to easily share code, text, or any content using a unique link. You can create new note pages by simply visiting &lt;code&gt;https://notepage.vercel.app&lt;/code&gt;.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Features&lt;/h3&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Custom Pages&lt;/strong&gt;: Create your own custom pages to share content with others. Just use &lt;code&gt;https://notepage.vercel.app/&amp;lt;your-page-name&amp;gt;&lt;/code&gt; and start sharing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Password Protection&lt;/strong&gt;: Optionally protect your pages with a password, ensuring that only authorized users can access your content.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Real-time Collaboration&lt;/strong&gt;: Collaborate with others in real-time. When multiple users access the same link, any changes made by one user are instantly visible to others, without requiring a page refresh.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Shareable Links&lt;/strong&gt;: Share your pages with others by sending them the unique link.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Tech Stack&lt;/h3&gt;

&lt;/div&gt;
&lt;p&gt;NotePage is built using the following technologies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Angular&lt;/strong&gt;: A powerful and popular front-end framework.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Firebase&lt;/strong&gt;: A real-time cloud database, authentication, and hosting platform.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Angular Material&lt;/strong&gt;: A UI component library…&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Varshithvhegde/notepage" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
This was just a side project I built while learning Angular. The UI wasn't great, and it's full of bugs. But here's the thing: my office has a strict NO ChatGPT policy (we can only use a company AI), and copying/sharing text was difficult. When a friend needed to share something, I suggested notepage, and it blew up within the office! I tried improving the UI, but people love the old version, so I kept it. Now I'm almost hitting free tier limits, but I'll keep it free because projects like these taught me so much and drove me to write for the DEV community.&lt;/p&gt;
&lt;h2&gt;
  
  
  The DEV Community Love
&lt;/h2&gt;

&lt;p&gt;And then came the end of the year. Oh wow.&lt;/p&gt;

&lt;p&gt;I finally reached &lt;strong&gt;10K followers&lt;/strong&gt;! I remember celebrating my first 1K like it was yesterday (it was actually 2 years ago).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/varshithvhegde/achieving-1k-followers-on-devto-my-journey-to-success-201n"&gt;Achieving 1K Followers on dev.to: My Journey to Success&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;I was also a &lt;strong&gt;Top Weekly Author twice&lt;/strong&gt;! My DEV profile is practically part of my resume now. Whenever I'm in an interview, I proudly show it and talk about my journey. Why not? I've worked hard for this.&lt;/p&gt;

&lt;p&gt;I participated in 5 DEV Challenges and &lt;strong&gt;won 2 of them&lt;/strong&gt;. These challenges helped me grow immensely. A huge thank you to the entire DEV Team for creating such an amazing initiative!&lt;/p&gt;

&lt;h2&gt;
  
  
  Shoutouts
&lt;/h2&gt;

&lt;p&gt;Some amazing devs/writers whose content I absolutely loved this year:&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__user ltag__user__id__3226798"&gt;
    &lt;a href="/axrisi" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3226798%2F0c0a8594-658c-4146-a639-8068ede85f67.jpg" alt="axrisi image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/axrisi"&gt;Nikoloz Turazashvili (@axrisi)&lt;/a&gt;Follow
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/axrisi"&gt;Founder &amp;amp; CTO at Vexrail (www. vexrail.com), Axrisi (www.axrisi.com). Opened Chicos restaurant in Tbilisi, Georgia.&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;br&gt;


&lt;div class="ltag__user ltag__user__id__941720"&gt;
    &lt;a href="/dumebii" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F941720%2Ff316bf93-ef0b-4bc5-aee2-5e062255d5f0.jpg" alt="dumebii image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/dumebii"&gt;Dumebi Okolo&lt;/a&gt;Follow
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/dumebii"&gt;Confident technical writer with frontend developer skills, marketing skills and developer relations skills. 
I am also a very fun person to hang around with. &lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;br&gt;


&lt;div class="ltag__user ltag__user__id__965723"&gt;
    &lt;a href="/arindam_1729" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F965723%2Fe0982512-4de1-4154-b3c3-1869d19e9ecc.png" alt="arindam_1729 image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/arindam_1729"&gt;Arindam Majumder &lt;/a&gt;Follow
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/arindam_1729"&gt;Developer Advocate | Technical Writer | 600k+ Reads | Mail for Collabs&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;br&gt;


&lt;div class="ltag__user ltag__user__id__889475"&gt;
    &lt;a href="/divyasinghdev" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F889475%2F1165be61-6903-4b59-af67-c262acfb1c94.webp" alt="divyasinghdev image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/divyasinghdev"&gt;Divya&lt;/a&gt;Follow
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/divyasinghdev"&gt;A curious lifelong learner, currently a full-time Masters student persuing Computer Science stream. Enthusiastic about development.&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;




&lt;h2&gt;
  
  
  Looking Ahead to 2026
&lt;/h2&gt;

&lt;p&gt;So yeah, the second half of 2025 was my redemption arc. I really loved this year, and my resolutions are crystal clear:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get all my friends to participate in DEV Challenges&lt;/li&gt;
&lt;li&gt;Level up my skills in AI and Agentic development&lt;/li&gt;
&lt;li&gt;Travel more (maybe even an international trip!)&lt;/li&gt;
&lt;li&gt;Blog about my journey more consistently&lt;/li&gt;
&lt;li&gt;Start learning rock climbing (need to get in shape first 😅)&lt;/li&gt;
&lt;li&gt;Finally get my driver's license 😭 (It was my 2025 resolution too, but I still haven't done it!)&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;That's all for this year, folks!&lt;/p&gt;

&lt;p&gt;Thank you everyone for the support and love. Here's to an even better 2026!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Happy New Year to ALL!&lt;/strong&gt; 🎉&lt;/p&gt;

</description>
      <category>programming</category>
      <category>beginners</category>
      <category>career</category>
      <category>webdev</category>
    </item>
    <item>
      <title>I Built a Form Backend in a Weekend Because Paying $20/Month for Contact Forms is Stupid</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Tue, 30 Dec 2025 15:59:39 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/i-built-a-form-backend-in-a-weekend-because-paying-20month-for-contact-forms-is-stupid-1o34</link>
      <guid>https://forem.com/varshithvhegde/i-built-a-form-backend-in-a-weekend-because-paying-20month-for-contact-forms-is-stupid-1o34</guid>
      <description>&lt;p&gt;So here's the thing - I was helping my friend set up his portfolio site last weekend. Everything was going smooth. Nice design, fast site, Vercel hosting on the free tier. Perfect.&lt;/p&gt;

&lt;p&gt;Then he goes "I need a contact form."&lt;/p&gt;

&lt;p&gt;Cool, I say. Just use one of those form backend services. Easy.&lt;/p&gt;

&lt;p&gt;He checks the pricing. "$20 a month?! Just to save some text?"&lt;/p&gt;

&lt;p&gt;And honestly? He's right. When did we all just accept this?&lt;/p&gt;

&lt;h2&gt;
  
  
  This got me thinking
&lt;/h2&gt;

&lt;p&gt;We're paying Netflix money for what's basically a database insert and an email. That's it. Store some text, send a notification. &lt;/p&gt;

&lt;p&gt;I spent more time being annoyed about this than I'd like to admit. Then I figured - you know what, I can probably build this myself. How hard can it be?&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgbxv3sc4hq6fgdistm06.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgbxv3sc4hq6fgdistm06.png" alt="FormRelay" width="800" height="573"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://formrelay.varshithvhegde.in" rel="noopener noreferrer"&gt;FormRelay&lt;/a&gt; is pretty straightforward. You point your HTML form at it, it saves the data, sends you an email, and shows everything in a dashboard. That's the whole thing.&lt;/p&gt;

&lt;p&gt;The difference? You host it yourself. Your Supabase database. Your Vercel deployment. Your data.&lt;/p&gt;

&lt;p&gt;And the best part? Supabase's free tier gives you 50k monthly active users. Vercel's hobby plan is free. Resend gives you 3k emails free per month.&lt;/p&gt;

&lt;p&gt;So yeah. $0/month vs $20/month. You do the math.&lt;/p&gt;

&lt;h2&gt;
  
  
  "But isn't self-hosting complicated?"
&lt;/h2&gt;

&lt;p&gt;This is what everyone says. And look, 10 years ago? Sure. You needed to know server management, deal with security updates, all that stuff.&lt;/p&gt;

&lt;p&gt;But now? With Vercel and Supabase?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fork the repo&lt;/li&gt;
&lt;li&gt;Click deploy&lt;/li&gt;
&lt;li&gt;Copy-paste some environment variables&lt;/li&gt;
&lt;li&gt;You're done&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Took me longer to write the README than it takes to deploy this thing.&lt;/p&gt;

&lt;p&gt;Compare that to creating yet another account, entering your credit card, dealing with their dashboard, hitting some arbitrary limit, and then having to migrate everything when they raise prices next year.&lt;/p&gt;

&lt;p&gt;Which one sounds more complicated?&lt;/p&gt;

&lt;h2&gt;
  
  
  Here's where I might lose some of you
&lt;/h2&gt;

&lt;p&gt;I think we've gotten too comfortable renting everything.&lt;/p&gt;

&lt;p&gt;The whole indie web thing used to be about actually owning your stuff. Yeah, it was messier. Yeah, you had to learn things. But your website was &lt;em&gt;yours&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Now? We just subscribe to everything. And sure, time is money, I get that. Not everyone wants to deal with infrastructure.&lt;/p&gt;

&lt;p&gt;But there's this huge gap between "run your own server rack" and "pay someone $300/year to store contact form entries."&lt;/p&gt;

&lt;p&gt;That's the gap I'm trying to fill here.&lt;/p&gt;

&lt;h2&gt;
  
  
  Random things I learned
&lt;/h2&gt;

&lt;p&gt;Next.js 15 is actually good now. After all the App Router drama, it finally feels right. Server actions just work. No more fighting with it.&lt;/p&gt;

&lt;p&gt;Supabase is wild. Real-time updates, auth, and actual good documentation? Sign me up.&lt;/p&gt;

&lt;p&gt;The hardest part wasn't the code. It was making the setup instructions clear enough that anyone could follow them. Spent way too long on that.&lt;/p&gt;

&lt;h2&gt;
  
  
  About the email thing
&lt;/h2&gt;

&lt;p&gt;I used Resend because my domain didn't come with email when I bought it, and I wasn't planning to buy email hosting separately. Resend's free tier (3k emails/month) was perfect for this.&lt;/p&gt;

&lt;p&gt;But here's the thing: if you already have email with your domain, you can swap Resend for plain SMTP. It's actually simpler in some ways. Just plug in your SMTP credentials and you're good to go.&lt;/p&gt;
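&lt;p&gt;&lt;em&gt;As a sketch of what that swap amounts to: SMTP transport options are just a few fields derived from env vars. This is a hypothetical helper, not FormRelay's actual code — the env var names are assumptions:&lt;/em&gt;&lt;/p&gt;

```javascript
// Hypothetical helper: build SMTP transport options (the shape
// nodemailer's createTransport expects) from environment variables.
// Env var names are assumptions, not FormRelay's actual config.
function smtpOptionsFromEnv(env) {
  const port = Number(env.SMTP_PORT ?? 587);
  return {
    host: env.SMTP_HOST,
    port,
    secure: port === 465, // implicit TLS on 465; STARTTLS otherwise
    auth: { user: env.SMTP_USER, pass: env.SMTP_PASS },
  };
}
```

&lt;p&gt;&lt;em&gt;With options like these, sending becomes one transporter call instead of a Resend API call — same email, no third-party quota.&lt;/em&gt;&lt;/p&gt;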

&lt;p&gt;So even that "dependency" isn't really a dependency. Use what you've got.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech stuff if you care
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Next.js 15 (app router)&lt;/li&gt;
&lt;li&gt;Supabase (postgres + realtime + auth)&lt;/li&gt;
&lt;li&gt;Tailwind CSS&lt;/li&gt;
&lt;li&gt;Radix UI&lt;/li&gt;
&lt;li&gt;Resend for emails (or just use SMTP if you have it)&lt;/li&gt;
&lt;li&gt;Lucide icons&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nothing fancy. Just stuff that works and doesn't break.&lt;/p&gt;

&lt;h2&gt;
  
  
  You can use it
&lt;/h2&gt;

&lt;p&gt;Whole thing's on GitHub: &lt;a href="https://github.com/Varshithvhegde/formrelay" rel="noopener noreferrer"&gt;github.com/Varshithvhegde/formrelay&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Live demo: &lt;a href="https://formrelay.varshithvhegde.in" rel="noopener noreferrer"&gt;formrelay.varshithvhegde.in&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's MIT licensed. Do whatever you want with it. If you find bugs, let me know. If you want to add features, PRs are open.&lt;/p&gt;

&lt;h2&gt;
  
  
  The actual point
&lt;/h2&gt;

&lt;p&gt;This isn't really about forms or saving money.&lt;/p&gt;

&lt;p&gt;It's about remembering that we can actually build stuff ourselves. We don't need a SaaS product for every little thing. The tools are there. The platforms are free. We know how to code.&lt;/p&gt;

&lt;p&gt;A lot of problems that cost $20/month are actually just weekend projects in disguise.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Quick note: Yes, I know SaaS companies provide value. Support, maintenance, features, etc. But for something as basic as form handling? Come on.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>saas</category>
      <category>webdev</category>
      <category>beginners</category>
    </item>
    <item>
      <title>If you are creating any Multi Agent or AI apps you need to check this out!!!</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Sun, 21 Dec 2025 15:47:12 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/if-you-are-creating-any-multi-agent-or-ai-apps-you-need-to-check-this-out-gd6</link>
      <guid>https://forem.com/varshithvhegde/if-you-are-creating-any-multi-agent-or-ai-apps-you-need-to-check-this-out-gd6</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763" class="crayons-story__hidden-navigation-link"&gt;Bifrost: The LLM Gateway That's 40x Faster Than LiteLLM&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/varshithvhegde" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F885064%2F4ab304f4-a3f3-409c-8217-9ce130e57c18.jpeg" alt="varshithvhegde profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/varshithvhegde" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Varshith V Hegde
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Varshith V Hegde
                &lt;a href="/++"&gt;&lt;img alt="Subscriber" class="subscription-icon" src="https://assets.dev.to/assets/subscription-icon-805dfa7ac7dd660f07ed8d654877270825b07a92a03841aa99a1093bd00431b2.png"&gt;&lt;/a&gt;
              
              &lt;div id="story-author-preview-content-3105139" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/varshithvhegde" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F885064%2F4ab304f4-a3f3-409c-8217-9ce130e57c18.jpeg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Varshith V Hegde&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Dec 18 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763" id="article-link-3105139"&gt;
          Bifrost: The LLM Gateway That's 40x Faster Than LiteLLM
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/webdev"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;webdev&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/programming"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;programming&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/agents"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;agents&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;49&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              2&lt;span class="hidden s:inline"&gt; comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            10 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
