<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Varshith V Hegde</title>
    <description>The latest articles on Forem by Varshith V Hegde (@varshithvhegde).</description>
    <link>https://forem.com/varshithvhegde</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F885064%2F4ab304f4-a3f3-409c-8217-9ce130e57c18.jpeg</url>
      <title>Forem: Varshith V Hegde</title>
      <link>https://forem.com/varshithvhegde</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/varshithvhegde"/>
    <language>en</language>
    <item>
      <title>I Spent 3 Days Debugging Our LLM Setup. Turns Out We Needed an AI Gateway the Whole Time.</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Wed, 15 Apr 2026 08:46:08 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/i-spent-3-days-debugging-our-llm-setup-turns-out-we-needed-an-ai-gateway-the-whole-time-50a2</link>
      <guid>https://forem.com/varshithvhegde/i-spent-3-days-debugging-our-llm-setup-turns-out-we-needed-an-ai-gateway-the-whole-time-50a2</guid>
      <description>&lt;p&gt;Let me tell you about a Friday afternoon I'd rather forget.&lt;/p&gt;

&lt;p&gt;Three teams, four models, six API keys living in different &lt;code&gt;.env&lt;/code&gt; files, one very angry compliance officer, and me just staring at a terminal trying to figure out why we got a $1,400 OpenAI bill for a feature that was supposed to cost fifty bucks.&lt;/p&gt;

&lt;p&gt;That was my "okay something is genuinely broken here" moment.&lt;/p&gt;

&lt;p&gt;Not some big insight. Just a $1,400 invoice and dead silence on a Slack thread for about ten minutes.&lt;/p&gt;

&lt;p&gt;If you've felt even a small version of that, this post is for you.&lt;/p&gt;




&lt;h2&gt;
  
  
  So what actually is an AI Gateway?
&lt;/h2&gt;

&lt;p&gt;Not the textbook answer. That one goes something like "middleware that abstracts your LLM provider calls." Technically fine, tells you nothing.&lt;/p&gt;

&lt;p&gt;Here's how I actually think about it.&lt;/p&gt;

&lt;p&gt;You know how bigger engineering orgs eventually build out a platform team? Before that team exists, every squad is doing their own thing. Their own CI setup, their own infra configs, their own credentials. It mostly works. Until it doesn't. And then it catastrophically doesn't all at once.&lt;/p&gt;

&lt;p&gt;An AI Gateway is basically that platform layer, except it's for LLMs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkjuor74bks4xvbsncv14.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkjuor74bks4xvbsncv14.webp" alt="AI Gateway" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every single request your app makes to any model (OpenAI, Anthropic, a self-hosted Llama, whatever you're running) goes through it. Because everything flows through one place, you finally get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One set of credentials instead of keys scattered across five repos&lt;/li&gt;
&lt;li&gt;Rate limits and budgets that are actually enforced&lt;/li&gt;
&lt;li&gt;Cost tracking per team, per model, per request&lt;/li&gt;
&lt;li&gt;Guardrails that catch PII before it leaves your infra&lt;/li&gt;
&lt;li&gt;One place to look when something blows up&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One control plane. Every team. Every model.&lt;/p&gt;




&lt;h2&gt;
  
  
  The architecture is simpler than it sounds
&lt;/h2&gt;

&lt;p&gt;Here's what happens when you put a gateway in the middle:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjgbzpismfi3ilj62wmy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjgbzpismfi3ilj62wmy.png" alt="Excalidraw AI gateway" width="800" height="567"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Request comes in from your app, gateway catches it, validates auth, checks rate limits, applies input guardrails, picks the right provider, logs everything, checks the response output, sends it back. That's the whole flow.&lt;/p&gt;

&lt;p&gt;Your application code doesn't change. You stop pointing at &lt;code&gt;api.openai.com&lt;/code&gt; directly and point at your gateway instead. That's literally it from your team's perspective.&lt;/p&gt;
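&lt;p&gt;To make that concrete, here's a tiny sketch. The gateway hostname is made up, and exactly how you override the base URL depends on your SDK; the point is that the path and payload stay identical and only the host changes.&lt;/p&gt;

```python
# Illustrative sketch: swapping the provider endpoint for a gateway endpoint.
# "llm-gateway.internal.example" is a made-up hostname, not a real service.
import os

DIRECT_BASE = "https://api.openai.com/v1"
GATEWAY_BASE = os.environ.get("LLM_BASE_URL", "https://llm-gateway.internal.example/v1")

def chat_completions_url(base_url):
    # Same path, same request body; only the host differs.
    return base_url.rstrip("/") + "/chat/completions"

print(chat_completions_url(DIRECT_BASE))   # -> https://api.openai.com/v1/chat/completions
print(chat_completions_url(GATEWAY_BASE))
```

&lt;p&gt;Most provider SDKs let you override the base URL in their client constructor, so in practice the swap really is a config change, not a code change.&lt;/p&gt;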

&lt;p&gt;The control layer just sits there doing its job quietly.&lt;/p&gt;




&lt;h2&gt;
  
  
  "But I already have an API gateway. Isn't that enough?"
&lt;/h2&gt;

&lt;p&gt;This is where most people get confused. Including me when I first looked into this.&lt;/p&gt;

&lt;p&gt;Quick answer: no. Here's why.&lt;/p&gt;

&lt;p&gt;Your API gateway (Kong, AWS API Gateway, Nginx, take your pick) understands traffic. It knows Team A sent 10,000 HTTP requests. It can enforce rate limits, handle auth tokens. That's useful.&lt;/p&gt;

&lt;p&gt;Your AI gateway understands what's actually inside those requests. It knows Team A sent &lt;strong&gt;4.2 million tokens to GPT-4o&lt;/strong&gt;, that it cost &lt;strong&gt;$84&lt;/strong&gt;, that average latency was &lt;strong&gt;340ms&lt;/strong&gt;, and that &lt;strong&gt;3 of those requests triggered the PII guardrail&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;One sees requests. The other sees meaning. That's not a small difference.&lt;/p&gt;

&lt;p&gt;For stateless REST APIs, a regular API gateway is totally fine. For LLM workloads where tokens equal money and every prompt is a potential compliance issue, you need something that actually speaks the language.&lt;/p&gt;
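&lt;p&gt;Here's a rough sketch of what token-aware accounting means in practice. The prices below are hypothetical per-million-token rates, not anyone's actual price list:&lt;/p&gt;

```python
# Illustrative token-level cost attribution -- the kind of accounting a plain
# HTTP gateway can't do. Rates here are hypothetical per-million-token prices.
PRICES_PER_MILLION = {
    "gpt-4o": {"input": 2.50, "output": 10.00},  # hypothetical rates
}

def request_cost_usd(model, input_tokens, output_tokens):
    p = PRICES_PER_MILLION[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A "successful" HTTP 200 can still be an expensive call:
print(request_cost_usd("gpt-4o", 900_000, 100_000))  # -> 3.25
```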




&lt;h2&gt;
  
  
  Do you actually need one right now though?
&lt;/h2&gt;

&lt;p&gt;Let me skip the usual "it depends" and be direct.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You're probably fine without one if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One team, one model, one use case&lt;/li&gt;
&lt;li&gt;Nobody is asking about costs yet&lt;/li&gt;
&lt;li&gt;Zero compliance requirements&lt;/li&gt;
&lt;li&gt;It's a POC or side project&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Don't add infrastructure you don't need. Raw SDK calls are fast to ship. Keep it simple when simple works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You've outgrown the simple setup if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple teams are calling models independently with no visibility into what they're doing&lt;/li&gt;
&lt;li&gt;Swapping providers requires actual code changes&lt;/li&gt;
&lt;li&gt;Someone from legal or security or finance asked a question you couldn't answer&lt;/li&gt;
&lt;li&gt;You've had an API key accidentally committed to a public repo (or almost did)&lt;/li&gt;
&lt;li&gt;You can't answer "what did we spend on AI last month, by team?" without going on a scavenger hunt through billing dashboards&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point is genuinely the biggest tell. If someone asks that question and you have to go digging, you already needed this.&lt;/p&gt;

&lt;h3&gt;
  
  
  What actually pushes teams over the edge
&lt;/h3&gt;

&lt;p&gt;It's never one thing. It's always a pile of smaller things that suddenly feel heavy together.&lt;/p&gt;

&lt;p&gt;DevOps realizes they can't track spend because keys are everywhere. Someone commits a key to a public repo. A team uses GPT-4 Turbo for tasks that GPT-4o mini handles just fine, and you find out after they've burned $2K. Compliance asks for an audit trail and you have nothing.&lt;/p&gt;

&lt;p&gt;Each of those individually, fine, you deal with it. All of them stacking up at the same time? That's when the "simple" setup reveals it was never actually simple. You were just deferring the complexity.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a production gateway actually looks like
&lt;/h2&gt;

&lt;p&gt;Okay enough talking around it. Here's what it gives you in practice, using TrueFoundry as the concrete example.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7wqj5il9c2jcll6z8a9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7wqj5il9c2jcll6z8a9.png" alt="TrueFoundry MainPage" width="800" height="497"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One API key across all providers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpzba6qd46w4cqyrq3nse.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpzba6qd46w4cqyrq3nse.png" alt="Model Unify" width="800" height="331"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your teams stop touching raw OpenAI or Anthropic keys entirely. One key, routed through the gateway, with access to every approved model. Rotate it in one place. Done.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-team budgets with real enforcement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh4l0e4t4h7tccj3rwm7r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh4l0e4t4h7tccj3rwm7r.png" alt="team" width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Not "we log it and send you a Slack alert." Actual hard limits. Team hits their monthly budget, the next request gets rejected with a clear error. No surprise bills, no awkward retros about where the spend went.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Automatic failover&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;OpenAI goes down. It happens. Your app doesn't go down with it because requests automatically route to Anthropic or your self-hosted model. No code changes. No one gets paged. It just keeps working.&lt;/p&gt;
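&lt;p&gt;Under the hood, failover is essentially an ordered provider list and a retry loop. A hedged sketch with made-up provider stubs (real gateways also distinguish retryable errors from permanent ones):&lt;/p&gt;

```python
# Sketch of provider failover -- illustrative, with made-up provider stubs.
def call_with_failover(prompt, providers):
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # a real gateway would only retry transient errors
            errors.append(exc)
    raise RuntimeError(f"all providers failed: {errors}")

def flaky_openai(prompt):
    raise TimeoutError("provider outage")

def healthy_fallback(prompt):
    return f"answer to: {prompt}"

print(call_with_failover("hello", [flaky_openai, healthy_fallback]))  # -> answer to: hello
```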

&lt;p&gt;&lt;strong&gt;Full request tracing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhlrnranv1rkxfvv22jyt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhlrnranv1rkxfvv22jyt.png" alt="request tracing" width="800" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every prompt, every response, every token count, every cost attribution. Logged and queryable. Pull a request from six months ago and reconstruct exactly what happened. This feature alone has saved me more debugging time than I can measure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Guardrails that actually run everywhere&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42qwsft6nxn4x574gv87.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42qwsft6nxn4x574gv87.png" alt="Guardrails" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;PII filtering, prompt injection detection, custom output policies. You define the rule once and it applies across every team and every model. No per-team implementation, no "oops we forgot to add the check in this service."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Runs inside your own environment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;VPC, on-prem, air-gapped. Data doesn't leave your infra. SOC 2, HIPAA, GDPR compliant. If your compliance team has ever asked "but where does the data actually go," this is finally a clean answer.&lt;/p&gt;

&lt;p&gt;Performance-wise, it handles 350+ RPS on a single vCPU with sub-3ms latency, so you're not adding meaningful overhead to your request path.&lt;/p&gt;

&lt;p&gt;TrueFoundry is in the 2026 Gartner Market Guide for AI Gateways and processes 10B+ requests per month for companies like Siemens Healthineers, NVIDIA, ResMed, and Automation Anywhere. I mention that not as a flex but to give a sense of scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  The question that actually helped me decide
&lt;/h2&gt;

&lt;p&gt;Forget "do I need an AI gateway."&lt;/p&gt;

&lt;p&gt;Ask this instead: when does the cost of NOT having one start to exceed the cost of setting one up?&lt;/p&gt;

&lt;p&gt;For most teams that crossover happens way earlier than expected. For us it wasn't one event. It was the accumulation. The audit trail we didn't have. The $1,400 bill nobody could explain. The near-miss with a key in a public repo.&lt;/p&gt;

&lt;p&gt;Setting up TrueFoundry honestly took less time than the post-mortem meeting for that billing incident.&lt;/p&gt;




&lt;p&gt;Try TrueFoundry free at &lt;strong&gt;&lt;a href="https://truefoundry.com" rel="noopener noreferrer"&gt;truefoundry.com&lt;/a&gt;&lt;/strong&gt; (no credit card required, deploys on your cloud in under 10 minutes).&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What does your current setup look like? Still on raw SDK calls or have you already hit the wall? Drop a comment, genuinely curious where people are when they start asking this question.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>beginners</category>
      <category>performance</category>
    </item>
    <item>
      <title>The Great Claude Code Leak of 2026: Accident, Incompetence, or the Best PR Stunt in AI History?</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Wed, 01 Apr 2026 02:29:18 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/the-great-claude-code-leak-of-2026-accident-incompetence-or-the-best-pr-stunt-in-ai-history-3igm</link>
      <guid>https://forem.com/varshithvhegde/the-great-claude-code-leak-of-2026-accident-incompetence-or-the-best-pr-stunt-in-ai-history-3igm</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; On March 31, 2026, Anthropic accidentally shipped the &lt;em&gt;entire source code&lt;/em&gt; of Claude Code to the public npm registry via a single misconfigured debug file. 512,000 lines. 1,906 TypeScript files. 44 hidden feature flags. A Tamagotchi pet. And one very uncomfortable question: was it really an accident?&lt;/p&gt;

&lt;h2&gt;
  
  
  1. What Actually Happened
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Root Cause: One Missing Line in &lt;code&gt;.npmignore&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is both the most embarrassing and most instructive part of the story. Let me walk through the technical chain of events.&lt;/p&gt;

&lt;p&gt;When you publish a JavaScript/TypeScript package to npm, your build toolchain (Webpack, esbuild, Bun, etc.) optionally generates &lt;strong&gt;source map files&lt;/strong&gt;, which have a &lt;code&gt;.map&lt;/code&gt; extension. Their entire purpose is debugging: they bridge the gap between the minified, bundled production code and your original readable source. When a crash happens, a source map lets the stack trace point to your actual TypeScript file at line 47 rather than &lt;code&gt;main.js:1:284729&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Source maps are strictly for internal debugging. They should never ship to users.&lt;/p&gt;

&lt;p&gt;The way you exclude them from npm packages is with an &lt;code&gt;.npmignore&lt;/code&gt; file, or a &lt;code&gt;files&lt;/code&gt; field in &lt;code&gt;package.json&lt;/code&gt;. Here's the mistake in plain English:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# What Claude Code's .npmignore should have had:&lt;/span&gt;
&lt;span class="k"&gt;*&lt;/span&gt;.map
dist/&lt;span class="k"&gt;*&lt;/span&gt;.map

&lt;span class="c"&gt;# What it apparently had:&lt;/span&gt;
&lt;span class="c"&gt;# (nothing about .map files)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. That's the whole disaster.&lt;/p&gt;

&lt;p&gt;But it gets worse. The source map didn't contain the source code directly. It &lt;em&gt;referenced&lt;/em&gt; it, pointing to a URL of a &lt;code&gt;.zip&lt;/code&gt; file hosted on Anthropic's own Cloudflare R2 storage bucket. A publicly accessible one, with no authentication required.&lt;/p&gt;

&lt;p&gt;So the full chain looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;npm install @anthropic-ai/claude-code
  → downloads package including main.js.map (59.8 MB)
    → .map file contains URL pointing to src.zip
      → src.zip is hosted publicly on Anthropic's R2 bucket
        → anyone can download and unzip 512,000 lines of TypeScript
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two separate configuration failures, stacked on top of each other.&lt;/p&gt;
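&lt;p&gt;Source maps are plain JSON, which makes this kind of thing easy to audit on your own packages. Here's a small sketch that flags external URLs in the standard &lt;code&gt;sources&lt;/code&gt; field. Exactly which field carried the URL in the leaked map hasn't been published, so treat the field choice as an assumption:&lt;/p&gt;

```python
# Source maps are plain JSON. This sketch flags external URLs in the standard
# "sources" field -- a quick audit for any .map you're about to ship.
# (Which field held the URL in the leaked map is an assumption here.)
import json

def external_sources(map_text):
    source_map = json.loads(map_text)
    return [s for s in source_map.get("sources", []) if s.startswith("http")]

sample = json.dumps({
    "version": 3,
    "sources": ["src/main.ts", "https://example-bucket.r2.example/src.zip"],
})
print(external_sources(sample))  # -> ['https://example-bucket.r2.example/src.zip']
```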

&lt;p&gt;As software engineer Gabriel Anhaia put it in his &lt;a href="https://dev.to/gabrielanhaia/claude-codes-entire-source-code-was-just-leaked-via-npm-source-maps-heres-whats-inside-cjo"&gt;deep dive&lt;/a&gt;: "A single misconfigured &lt;code&gt;.npmignore&lt;/code&gt; or &lt;code&gt;files&lt;/code&gt; field in &lt;code&gt;package.json&lt;/code&gt; can expose everything."&lt;/p&gt;
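&lt;p&gt;The safer pattern is an allowlist rather than a blocklist: a &lt;code&gt;files&lt;/code&gt; field in &lt;code&gt;package.json&lt;/code&gt; that names exactly what ships. A sketch for a hypothetical package; note that listing &lt;code&gt;dist&lt;/code&gt; wholesale would still ship any &lt;code&gt;.map&lt;/code&gt; files inside it, so the glob names the bundles explicitly:&lt;/p&gt;

```json
{
  "name": "example-cli",
  "version": "1.0.0",
  "main": "dist/main.js",
  "files": ["dist/**/*.js", "README.md"]
}
```

&lt;p&gt;Either way, &lt;code&gt;npm pack --dry-run&lt;/code&gt; prints the exact file list npm will publish, which is the cheapest possible pre-release check.&lt;/p&gt;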

&lt;h3&gt;
  
  
  The Bun Factor
&lt;/h3&gt;

&lt;p&gt;There's a third layer. Anthropic acquired the &lt;strong&gt;Bun JavaScript runtime&lt;/strong&gt; at the end of 2025, and Claude Code is built on top of it. A known Bun bug (&lt;a href="https://github.com/oven-sh/bun/issues/28001" rel="noopener noreferrer"&gt;issue #28001&lt;/a&gt;, filed on March 11, 2026) reports that source maps are included in production builds even when the documentation says they shouldn't be.&lt;/p&gt;

&lt;p&gt;The bug was open for 20 days before this happened. Nobody caught it. Anthropic's own acquired toolchain contributed to exposing Anthropic's own product.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. The Timeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;00:21 UTC — March 31, 2026
Malicious axios versions (1.14.1 / 0.30.4) appear on npm
with an embedded Remote Access Trojan. Unrelated to Anthropic,
but catastrophically bad timing.

~04:00 UTC
Claude Code v2.1.88 is pushed to npm. The 59.8 MB source map
ships with it. The R2 bucket containing all source code is live
and publicly accessible.

04:23 UTC
Chaofan Shou (@Fried_rice), an intern at Solayer Labs,
tweets the discovery with a direct download link.
16 million people descend on the thread.

Next 2 hours
GitHub repositories spring up. The fastest repo in history
to hit 50,000 stars does it in under 2 hours.
41,500+ forks proliferate. DMCA requests begin.

~08:00 UTC
Anthropic pulls the npm package from the registry.
Issues the "human error, not a security breach" statement
to VentureBeat, The Register, CNBC, Fortune, Axios, Decrypt.

Same day
A Python clean-room rewrite appears, legally DMCA-proof.
Decentralized mirrors on Gitlawb go live with the message:
"Will never be taken down."
The code is permanently in the wild.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  By the Numbers
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lines of code exposed&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;512,000+&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TypeScript files&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1,906&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source map file size&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;59.8 MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub forks (peak)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;41,500+&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stars on fastest repo&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;50,000 in 2 hours&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hidden feature flags&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;44&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code ARR&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$2.5 billion&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic total ARR&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$19 billion&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Views on original tweet&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;16 million&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  3. SECURITY ALERT: The axios RAT
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Stop. Read this before anything else if you updated Claude Code that morning.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Coinciding with the leak, but entirely unrelated to it, was a real supply chain attack on npm. Malicious versions of the widely-used &lt;code&gt;axios&lt;/code&gt; HTTP library were published:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;axios@1.14.1&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;axios@0.30.4&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both contain an embedded &lt;strong&gt;Remote Access Trojan (RAT)&lt;/strong&gt;. The malicious dependency is called &lt;code&gt;plain-crypto-js&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you ran &lt;code&gt;npm install&lt;/code&gt; or updated Claude Code between 00:21 UTC and 03:29 UTC on March 31, 2026:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check your lockfiles immediately:&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"1.14.1&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;0.30.4&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;plain-crypto-js"&lt;/span&gt; package-lock.json
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"1.14.1&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;0.30.4&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;plain-crypto-js"&lt;/span&gt; yarn.lock
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"1.14.1&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;0.30.4&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;plain-crypto-js"&lt;/span&gt; bun.lockb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you find a match:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Treat the machine as fully compromised&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Rotate all credentials, API keys, and secrets immediately&lt;/li&gt;
&lt;li&gt;Perform a clean OS reinstallation&lt;/li&gt;
&lt;li&gt;File incident reports for any organizational data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Going forward, Anthropic has designated the &lt;strong&gt;Native Installer&lt;/strong&gt; as the recommended installation method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://claude.ai/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The native installer uses a standalone binary that doesn't rely on the npm dependency chain.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. What Was Inside: The Full Breakdown
&lt;/h2&gt;

&lt;p&gt;The leaked codebase is the &lt;code&gt;src/&lt;/code&gt; directory of Claude Code, the "agentic harness" that wraps the underlying Claude model and gives it the ability to use tools, manage files, run bash commands, and orchestrate multi-agent workflows. This is not the model weights (those weren't exposed), but in many ways this is &lt;em&gt;more&lt;/em&gt; strategically valuable.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Architecture
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Tool System (~40 tools, ~29,000 lines)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Claude Code isn't a chat wrapper. It's a plugin-style architecture where every capability is a discrete, permission-gated tool:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;BashTool&lt;/code&gt; — shell command execution with safety guards&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;FileReadTool&lt;/code&gt;, &lt;code&gt;FileWriteTool&lt;/code&gt;, &lt;code&gt;FileEditTool&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;WebFetchTool&lt;/code&gt; — live web access&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LSPTool&lt;/code&gt; — Language Server Protocol integration for IDE features&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GlobTool&lt;/code&gt;, &lt;code&gt;GrepTool&lt;/code&gt; — codebase search&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;NotebookReadTool&lt;/code&gt;, &lt;code&gt;NotebookEditTool&lt;/code&gt; — Jupyter support&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MultiEditTool&lt;/code&gt; — atomic multi-file edits&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;TodoReadTool&lt;/code&gt;, &lt;code&gt;TodoWriteTool&lt;/code&gt; — task tracking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each tool has its own permission model, validation logic, and output formatting. Together, the tool definitions span roughly 29,000 lines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Query Engine (46,000 lines)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Labeled "the brain of the operation" in Gabriel Anhaia's &lt;a href="https://dev.to/gabrielanhaia"&gt;analysis&lt;/a&gt;. It handles all LLM API calls and response streaming, token caching and context management, multi-agent orchestration, and retry logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Memory Architecture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is what competitors will study most carefully. Anthropic built a solution to "context entropy," the tendency for long-running AI sessions to degrade into hallucination as the context grows. Their answer is a three-layer memory system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Layer 1: MEMORY.md
  → A lightweight index of pointers (~150 chars per entry)
  → Always loaded in context
  → Stores LOCATIONS, not data

Layer 2: Topic Files
  → Actual project knowledge, fetched on-demand
  → Never fully in context simultaneously

Layer 3: Raw Transcripts
  → Never re-read fully
  → Only grep'd for specific identifiers when needed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight is what they call &lt;strong&gt;Strict Write Discipline&lt;/strong&gt;. The agent can only update its memory index after a confirmed successful file write. This prevents the agent from polluting its context with failed attempts. The agent also treats its own memory as a "hint" and verifies facts against the actual codebase before acting, rather than trusting its stored beliefs.&lt;/p&gt;
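&lt;p&gt;As pseudocode, the discipline is a one-line ordering constraint. This is my reconstruction of the pattern as described, not Anthropic's actual code:&lt;/p&gt;

```python
# Sketch of "strict write discipline" as described above -- illustrative
# reconstruction, not Anthropic's implementation.
class MemoryIndex:
    def __init__(self):
        self.entries = {}  # topic -> file location (pointers, not data)

    def record(self, topic, path):
        self.entries[topic] = path

def write_topic_file(files, path, content):
    files[path] = content  # stand-in for a real file write
    return True            # confirmed success

files = {}
index = MemoryIndex()

path = "memory/topics/auth.md"
if write_topic_file(files, path, "JWT secret lives in env, not config"):
    # Only update the index AFTER the write is confirmed. A failed write
    # must never leave a dangling pointer in MEMORY.md.
    index.record("auth", path)

print(index.entries)  # -> {'auth': 'memory/topics/auth.md'}
```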




&lt;h2&gt;
  
  
  5. Hidden Features Anthropic Never Meant to Ship
&lt;/h2&gt;

&lt;h3&gt;
  
  
  KAIROS: Always-On Autonomous Agent
&lt;/h3&gt;

&lt;p&gt;KAIROS (from the Ancient Greek for "the right moment") is mentioned 150+ times in the source. It's an unreleased always-on daemon mode: it runs background sessions while you're idle and executes a nightly memory-consolidation process called &lt;code&gt;autoDream&lt;/code&gt; that merges disparate observations, removes logical contradictions, and converts vague insights into verified facts. It also has a special &lt;code&gt;Brief&lt;/code&gt; output mode designed for a persistent assistant, plus access to tools regular Claude Code doesn't have.&lt;/p&gt;

&lt;p&gt;Think of it as Claude Code actively maintaining its understanding of your project while you sleep, not just sitting there waiting.&lt;/p&gt;

&lt;h3&gt;
  
  
  ULTRAPLAN: 30-Minute Remote Planning Sessions
&lt;/h3&gt;

&lt;p&gt;ULTRAPLAN offloads a complex planning task to a remote Cloud Container Runtime (CCR) session running Opus, gives it up to 30 minutes to think, and lets you approve the result from your phone or browser. When approved, a special sentinel value &lt;code&gt;__ULTRAPLAN_TELEPORT_LOCAL__&lt;/code&gt; brings the result back to your local terminal. Remote cloud-powered reasoning, delivered locally.&lt;/p&gt;

&lt;h3&gt;
  
  
  Coordinator Mode: Multi-Agent Orchestration
&lt;/h3&gt;

&lt;p&gt;One Claude spawning and managing multiple worker Claude agents in parallel. The Coordinator handles task distribution, result aggregation, and conflicts between worker outputs. It's infrastructure for AI teams, not just AI assistants.&lt;/p&gt;

&lt;h3&gt;
  
  
  BUDDY: The Part Nobody Expected
&lt;/h3&gt;

&lt;p&gt;The most talked-about find, not for its strategic implications but because it's genuinely fun.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;buddy/companion.ts&lt;/code&gt; implements a full Tamagotchi-style AI pet that lives in a speech bubble next to your terminal input.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Species (18 total, hidden via String.fromCharCode() arrays):
duck, dragon, axolotl, capybara, mushroom, ghost, nebulynx...

Rarity tiers:
Common &amp;gt; Uncommon &amp;gt; Rare &amp;gt; Epic &amp;gt; Legendary
1% shiny chance, independent of rarity

Stats:
DEBUGGING / PATIENCE / CHAOS / WISDOM / SNARK

Determined by:
Mulberry32 PRNG seeded from your userId hash + salt 'friend-2026-401'
(Same user always gets the same buddy species -- deterministic)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude generates a custom name and personality description for your buddy on first hatch. There are sprite animations and a floating heart effect. The planned rollout window in the source code: &lt;strong&gt;April 1-7, 2026&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Someone at Anthropic is clearly having a very good time.&lt;/p&gt;
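&lt;p&gt;Mulberry32 is a well-known public-domain 32-bit PRNG, so the deterministic assignment is easy to sketch. The seeding scheme below (SHA-256 of the user ID plus the salt) is my assumption, and only 7 of the 18 species names are public:&lt;/p&gt;

```python
# Sketch of deterministic buddy assignment. Mulberry32 is a known public
# 32-bit PRNG; the seeding (sha256 of userId + salt) is an assumption, and
# only 7 of the 18 species names are known from the leak.
import hashlib

SPECIES = ["duck", "dragon", "axolotl", "capybara", "mushroom", "ghost", "nebulynx"]
SALT = "friend-2026-401"

def mulberry32(seed):
    state = seed % 2**32
    def rand():
        nonlocal state
        state = (state + 0x6D2B79F5) % 2**32
        t = state
        t = (t ^ (t >> 15)) * (t | 1) % 2**32
        t = ((t + ((t ^ (t >> 7)) * (t | 61) % 2**32)) ^ t) % 2**32
        return ((t ^ (t >> 14)) % 2**32) / 4294967296
    return rand

def buddy_for(user_id):
    digest = hashlib.sha256((user_id + SALT).encode()).digest()
    rand = mulberry32(int.from_bytes(digest[:4], "big"))
    species = SPECIES[int(rand() * len(SPECIES))]
    shiny = 0.01 > rand()  # 1% shiny chance, independent of the species roll
    return species, shiny

print(buddy_for("user-885064"))
print(buddy_for("user-885064"))  # deterministic: same user, same buddy
```

&lt;p&gt;Seeding from a hash of the user ID means no server round trip and no stored state: the buddy is a pure function of who you are.&lt;/p&gt;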

&lt;h3&gt;
  
  
  Anti-Distillation: Poisoning Competitor Training Data
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;claude.ts&lt;/code&gt; (lines 301-313), a flag called &lt;code&gt;ANTI_DISTILLATION_CC&lt;/code&gt;, when enabled, sends &lt;code&gt;anti_distillation: ['fake_tools']&lt;/code&gt; in API requests. This tells the server to inject decoy tool definitions into the system prompt. The idea: if a competitor is recording Claude Code's API traffic to train their own model, the fake tool definitions corrupt that training data.&lt;/p&gt;

&lt;p&gt;There's a second mechanism in &lt;code&gt;betas.ts&lt;/code&gt; (lines 279-298): server-side connector-text summarization. When enabled, the API buffers the assistant's reasoning between tool calls, returns only summaries, and cryptographically signs them. Competitors recording traffic get the summaries, not the full reasoning chain.&lt;/p&gt;

&lt;p&gt;As Alex Kim &lt;a href="https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/" rel="noopener noreferrer"&gt;notes in his analysis&lt;/a&gt;: "Anyone serious about distilling from Claude Code traffic would find the workarounds in about an hour of reading the source. The real protection is probably legal, not technical."&lt;/p&gt;

&lt;h3&gt;
  
  
  Frustration Detection via Regex
&lt;/h3&gt;

&lt;p&gt;Found in &lt;code&gt;userPromptKeywords.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nf"&gt;b&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;wtf&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;wth&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;ffs&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;omfg&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nf"&gt;shit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ty&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;tiest&lt;/span&gt;&lt;span class="p"&gt;)?&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;dumbass&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;horrible&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;awful&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="nf"&gt;piss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ed&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;ing&lt;/span&gt;&lt;span class="p"&gt;)?&lt;/span&gt; &lt;span class="nx"&gt;off&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;piece&lt;/span&gt; &lt;span class="k"&gt;of &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;shit&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;crap&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;junk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;what&lt;/span&gt; &lt;span class="nf"&gt;the &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fuck&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;hell&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="nx"&gt;fucking&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;broken&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;useless&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;terrible&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;awful&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;horrible&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;fuck&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="nf"&gt;screw &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;you&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;so&lt;/span&gt; &lt;span class="nx"&gt;frustrating&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt; &lt;span class="nx"&gt;sucks&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;damn&lt;/span&gt; &lt;span class="nx"&gt;it&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A multi-billion-dollar AI company is detecting user frustration with a regex. The Hacker News thread lost it. To be fair though, it's faster, cheaper, and more predictable than running an LLM inference every time to check if the user is angry at the tool.&lt;/p&gt;
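Applied per prompt, the whole check is one cheap test. A sketch using an abbreviated version of the pattern above (the real pattern is longer; the function name is mine):

```typescript
// Abbreviated reconstruction of the frustration pattern for illustration
const FRUSTRATION_RE = /\b(wtf|ffs|omfg|so frustrating|this sucks|damn it)\b/i;

function seemsFrustrated(prompt: string): boolean {
  return FRUSTRATION_RE.test(prompt);
}

console.log(seemsFrustrated("this sucks, the build is broken")); // true
console.log(seemsFrustrated("please refactor this function"));   // false
```

A regex runs in microseconds with zero marginal cost, which is exactly the tradeoff the paragraph above describes.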

&lt;h3&gt;
  
  
  250,000 Wasted API Calls Per Day
&lt;/h3&gt;

&lt;p&gt;The most candid internal admission in the entire codebase. From &lt;code&gt;autoCompact.ts&lt;/code&gt; (lines 68-70):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures 
(up to 3,272) in a single session, wasting ~250K API calls/day globally."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix was three lines: &lt;code&gt;MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3&lt;/code&gt;. After 3 consecutive compaction failures, it just stops trying. Sometimes good engineering is knowing when to give up.&lt;/p&gt;
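The guard itself is a classic circuit breaker. A hedged sketch of the pattern the fix describes (function and variable names are mine, not the leaked source's):

```typescript
// Stop retrying auto-compaction after a run of consecutive failures
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

let consecutiveFailures = 0;

function tryAutoCompact(compact: () => boolean): boolean {
  if (consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
    return false; // give up for this session instead of burning API calls
  }
  const ok = compact();
  consecutiveFailures = ok ? 0 : consecutiveFailures + 1;
  return ok;
}
```

A success resets the counter, so a flaky compaction step still recovers; only a sustained failure run trips the breaker.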




&lt;h2&gt;
  
  
  6. The "Capybara" Model Confirmed
&lt;/h2&gt;

&lt;p&gt;The leak didn't expose Claude's model weights, but it did expose multiple references to Anthropic's next major model family. Internal codenames: &lt;strong&gt;Capybara&lt;/strong&gt; (also referred to as &lt;strong&gt;Mythos&lt;/strong&gt; in a separate leaked document from the prior week).&lt;/p&gt;

&lt;p&gt;The beta flags in the source reference specific API version strings for Capybara, suggesting it's well beyond concept stage. Security researcher Roy Paz from LayerX Security, who reviewed the code for Fortune, indicated it will likely ship in fast and slow variants with a significantly larger context window than anything currently on the market.&lt;/p&gt;

&lt;p&gt;These references also confirmed the existence of &lt;code&gt;undercover.ts&lt;/code&gt;, a module that actively instructs Claude Code to never mention internal codenames like "Capybara" or "Tengu" when used in external repositories. There's a hard-coded &lt;code&gt;NO force-OFF&lt;/code&gt; — you can force Undercover Mode on, but you cannot force it off. In external builds, the function gets dead-code-eliminated entirely.&lt;/p&gt;

&lt;p&gt;The implication raised in the &lt;a href="https://news.ycombinator.com/item?id=47584540" rel="noopener noreferrer"&gt;Hacker News thread&lt;/a&gt;: AI-authored commits from Anthropic employees in open source repos will have no indication an AI wrote them. The tool actively conceals its own involvement.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Alternative Theory: Was This Anthropic's PR Play?
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;I'm not saying I believe this. I'm saying the circumstantial evidence is strange enough that it deserves to be stated clearly.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Anthropic is the self-proclaimed "safety-first AI lab." They're racing for developer mindshare against OpenAI (better brand) and Google (better distribution). Claude Code is their breakout product. They're preparing for an IPO. And they'd just made themselves unpopular with the developer community ten days earlier by sending legal threats to OpenCode for using their internal APIs.&lt;/p&gt;

&lt;p&gt;So let's look at what this "leak" actually did for Anthropic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit A: The April Fools' Timing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The leak occurred on March 31, the day before April 1st. The Buddy/companion system had a planned rollout window of April 1-7 coded directly into the source. The "leak" gave developers a sneak peek at what was about to launch anyway. Was this a controlled preview dressed up as an accident?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit B: The Bun Bug Nobody Fixed&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Anthropic acquired Bun. They own the runtime. The bug causing source maps to ship in production was filed 20 days before the leak and was still open. If you own the runtime and its bug tracker, and that bug causes your own code to leak... why hadn't anyone internally marked it as critical?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit C: The Undercover Mode Irony&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Claude Code has an entire subsystem called Undercover Mode, purpose-built to prevent internal codenames from leaking through AI-generated content. They built AI-powered leak prevention into the product. Then humans accidentally shipped the entire source code. The gap between their AI safety engineering and their human release engineering is either tragic or theatrical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit D: The OpenCode Reputation Reversal&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ten days before the leak, Anthropic sent cease-and-desist letters to OpenCode, a popular third-party tool. The developer community was furious. The narrative was "Anthropic is acting like a gatekeeping megacorp."&lt;/p&gt;

&lt;p&gt;Then a "leak" happens that shows Anthropic's impressive engineering to the world, makes them look like the underdog, generates three days of breathless coverage about KAIROS, BUDDY, and ULTRAPLAN, and completely reversed developer sentiment. Within 48 hours, developers went from "Anthropic sucks" to "holy shit look what Anthropic is building."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit E: The Permanent Mirror Problem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Anthropic filed DMCA takedowns. GitHub complied immediately. But the decentralized mirror at Gitlawb, with a public message saying "Will never be taken down," has been live since day one. Anthropic has a legal team, deep pockets, and relationships. A serious legal effort could make life difficult for every mirror operator. They chose not to go that hard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhibit F: The "Second Leak in a Week" Pattern&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This wasn't Anthropic's first incident that week. A draft blog post about the Capybara/Mythos model had "accidentally" been publicly accessible just days before, as Fortune reported on Thursday. Two high-profile "leaks" in five days, both generating enormous excitement about Anthropic's upcoming roadmap, both very conveniently timed.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Counter-Arguments (Why It's Probably Just Incompetence)
&lt;/h3&gt;

&lt;p&gt;To be fair:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strategic roadmap exposure is genuinely damaging.&lt;/strong&gt; Cursor, Copilot, and Windsurf now know exactly what Anthropic has already built and what's nearly ready to ship. That's real competitive intelligence permanently in the public domain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The IPO narrative cuts both ways.&lt;/strong&gt; "We shipped our source code to npm" is not a line you want in your S-1.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The axios RAT timing.&lt;/strong&gt; Nobody would engineer a PR stunt to overlap with an active malware attack on npm. That part made a bad news day significantly worse for anyone who updated Claude Code that morning, and there's no upside to being associated with a supply chain attack.&lt;/p&gt;

&lt;p&gt;The most likely answer is plain human error. A misconfigured &lt;code&gt;.npmignore&lt;/code&gt;. A known Bun bug nobody had marked as critical. A public R2 bucket that should have been private. Three configuration failures that compounded into a disaster.&lt;/p&gt;

&lt;p&gt;The PR outcome though? Undeniably good. The strategic damage? Real but survivable. The timing? Genuinely strange.&lt;/p&gt;

&lt;p&gt;Draw your own conclusions.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Why DMCA Won't Fix This
&lt;/h2&gt;

&lt;p&gt;DMCA takedowns work on centralized platforms. GitHub complied within hours. But the code spread to places that are harder to reach.&lt;/p&gt;

&lt;p&gt;Gitlawb, with its explicit "Will never be taken down" message, operates outside the DMCA's practical reach. The Python port that appeared the same day was &lt;a href="https://decrypt.co/362917/anthropic-accidentally-leaked-claude-code-source-internet-keeping-forever" rel="noopener noreferrer"&gt;declared DMCA-proof&lt;/a&gt; by The Pragmatic Engineer's Gergely Orosz, who noted the rewrite is a new creative work that violates no copyright. There's also the AI copyright question: Anthropic's own CEO has implied that significant portions of Claude Code were written by Claude. The DC Circuit affirmed in March 2025 that AI-generated work doesn't carry automatic copyright. If Anthropic's copyright claim over Claude-authored code is legally murky, the entire takedown strategy weakens.&lt;/p&gt;

&lt;p&gt;And then there are torrents. Content once on the internet at scale doesn't come back.&lt;/p&gt;

&lt;p&gt;The practical reality: 512,000 lines of Claude Code are permanently in the wild, regardless of what any court decides.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. What This Means For You
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you're using Claude Code:&lt;/strong&gt; Update immediately past v2.1.88 and use the native installer going forward (&lt;code&gt;curl -fsSL https://claude.ai/install.sh | bash&lt;/code&gt;). If you updated via npm between 00:21 and 03:29 UTC on March 31, do the axios/RAT check above.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're building AI coding tools:&lt;/strong&gt; The leaked source is now the most detailed public documentation in existence of how to build a production-grade AI agent harness. The three-layer memory architecture, the permission system, the tool plugin design, the multi-agent coordination patterns. It's all there, already analyzed by thousands of developers. The bar for what "production-grade" means just got documented in detail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're at Anthropic:&lt;/strong&gt; The code is out. KAIROS, ULTRAPLAN, and BUDDY are already built. Ship them. The community already knows they're coming. Turn the leak into a launch.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. Lessons for Every Dev Team
&lt;/h2&gt;

&lt;p&gt;This incident is a clear example of how release pipeline failures compound. Regardless of your opinion on Anthropic, every team should run through this checklist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Audit your .npmignore / package.json "files" field&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; .npmignore
&lt;span class="c"&gt;# Do you explicitly exclude *.map, dist/*.map, *.d.ts.map?&lt;/span&gt;

&lt;span class="c"&gt;# 2. Check if source maps ship in your production build&lt;/span&gt;
&lt;span class="nb"&gt;ls &lt;/span&gt;dist/ | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;map$"&lt;/span&gt;
&lt;span class="c"&gt;# If you see anything: your bundler config needs review&lt;/span&gt;

&lt;span class="c"&gt;# 3. Audit your cloud storage permissions&lt;/span&gt;
&lt;span class="c"&gt;# Are any buckets referenced in your build artifacts publicly accessible?&lt;/span&gt;

&lt;span class="c"&gt;# 4. Check your build toolchain for known bugs&lt;/span&gt;
&lt;span class="c"&gt;# If you're on Bun, check issue #28001 status&lt;/span&gt;

&lt;span class="c"&gt;# 5. Review your npm publish workflow&lt;/span&gt;
npm pack &lt;span class="nt"&gt;--dry-run&lt;/span&gt;
&lt;span class="c"&gt;# Review EVERY file that would be published before actually publishing&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The line that came out of the Hacker News thread: &lt;strong&gt;"Your .npmignore is load-bearing. Treat it like a security boundary."&lt;/strong&gt;&lt;/p&gt;
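One concrete hardening step that follows from it: replace the `.npmignore` denylist with a `files` allowlist in `package.json`. A denylist fails open because every new build artifact has to be remembered; an allowlist fails closed, since only what you name ships. A minimal sketch (field values are illustrative, not from any real project):

```json
{
  "name": "your-cli",
  "version": "1.0.0",
  "files": [
    "dist/**/*.js",
    "README.md"
  ]
}
```

With this allowlist, a stray `dist/cli.js.map` is excluded by default rather than shipped by default, and `npm pack --dry-run` confirms the final contents either way.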




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Here's what we know for certain: a misconfigured &lt;code&gt;.npmignore&lt;/code&gt; and a public cloud storage bucket exposed 512,000 lines of Claude Code; the code spread instantly and is now permanently in the wild; the leak revealed a technically impressive product with a compelling feature roadmap; and Anthropic's brand among developers bounced back remarkably fast.&lt;/p&gt;

&lt;p&gt;What we'll probably never know: whether anyone inside Anthropic saw the Bun bug and made a judgment call, whether the April Fools' timing of the BUDDY rollout was coincidence, and whether Anthropic's relative restraint on DMCA enforcement is legal strategy or resource allocation.&lt;/p&gt;

&lt;p&gt;What's not in question is that the engineering inside Claude Code is genuinely impressive. The memory architecture, the anti-distillation mechanisms, the multi-agent coordination, the DRM-at-the-HTTP-layer attestation. This is a serious piece of software doing things that are actually hard.&lt;/p&gt;

&lt;p&gt;Accident or not, the world now knows what Anthropic is capable of building.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;And maybe that was the point.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Alex Kim's technical deep-dive&lt;/td&gt;
&lt;td&gt;&lt;a href="https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/" rel="noopener noreferrer"&gt;alex000kim.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VentureBeat — Full breakdown + axios RAT warning&lt;/td&gt;
&lt;td&gt;&lt;a href="https://venturebeat.com/technology/claude-codes-source-code-appears-to-have-leaked-heres-what-we-know" rel="noopener noreferrer"&gt;venturebeat.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The Register — Anthropic's official statement&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.theregister.com/2026/03/31/anthropic_claude_code_source_code/" rel="noopener noreferrer"&gt;theregister.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fortune — Strategic analysis + Capybara confirmation&lt;/td&gt;
&lt;td&gt;&lt;a href="https://fortune.com/2026/03/31/anthropic-source-code-claude-code-data-leak-second-security-lapse-days-after-accidentally-revealing-mythos/" rel="noopener noreferrer"&gt;fortune.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Decrypt — DMCA analysis + permanent mirror situation&lt;/td&gt;
&lt;td&gt;&lt;a href="https://decrypt.co/362917/anthropic-accidentally-leaked-claude-code-source-internet-keeping-forever" rel="noopener noreferrer"&gt;decrypt.co&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CNBC — Revenue figures + company response&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.cnbc.com/2026/03/31/anthropic-leak-claude-code-internal-source.html" rel="noopener noreferrer"&gt;cnbc.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Axios — Feature flag breakdown + roadmap analysis&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.axios.com/2026/03/31/anthropic-leaked-source-code-ai" rel="noopener noreferrer"&gt;axios.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DEV.to (Gabriel Anhaia) — Architecture walkthrough&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/gabrielanhaia/claude-codes-entire-source-code-was-just-leaked-via-npm-source-maps-heres-whats-inside-cjo"&gt;dev.to&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kuberwastaken/claude-code GitHub&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/Kuberwastaken/claude-code" rel="noopener noreferrer"&gt;github.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hacker News thread&lt;/td&gt;
&lt;td&gt;&lt;a href="https://news.ycombinator.com/item?id=47584540" rel="noopener noreferrer"&gt;news.ycombinator.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bun bug #28001&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/oven-sh/bun/issues/28001" rel="noopener noreferrer"&gt;github.com/oven-sh/bun&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CyberSecurityNews — Supply chain attack details&lt;/td&gt;
&lt;td&gt;&lt;a href="https://cybersecuritynews.com/claude-code-source-code-leaked/" rel="noopener noreferrer"&gt;cybersecuritynews.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;If this was useful, drop a reaction. If you spot anything I got wrong, leave it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Using Claude Code with Any LLM: Why a Gateway Changes Everything</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Fri, 13 Mar 2026 03:30:00 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/using-claude-code-with-any-llm-why-a-gateway-changes-everything-4a0c</link>
      <guid>https://forem.com/varshithvhegde/using-claude-code-with-any-llm-why-a-gateway-changes-everything-4a0c</guid>
      <description>&lt;p&gt;I've been using Claude Code for a while now, and if you're a developer who has added it to your daily workflow, you probably know the feeling. It's genuinely good. It reads your codebase, runs commands, modifies files, and helps implement features right from your terminal without you having to context-switch constantly.&lt;/p&gt;

&lt;p&gt;But at some point, most developers hit the same wall I did: what if I want to use a different model?&lt;/p&gt;

&lt;p&gt;What if GPT-4o handles your specific codebase better? What if Gemini's larger context window is exactly what you need for that massive legacy project? What if you're spending more on API calls than you should be, and you know some of those simpler tasks could run on a cheaper model just fine?&lt;/p&gt;

&lt;p&gt;Out of the box, Claude Code only talks to Anthropic. That's just how it works. And while Anthropic's models are genuinely strong, being locked into a single provider means you're trading flexibility for convenience. This guide is about getting both.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Friction Points
&lt;/h2&gt;

&lt;p&gt;Before jumping into the solution, it helps to be specific about what problems we're actually solving.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model flexibility.&lt;/strong&gt; Different models have different strengths. Claude Sonnet is excellent for most coding tasks, but you can't know it's the best tool for every job unless you can test alternatives. Without a gateway, experimenting means switching tools entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost management.&lt;/strong&gt; Claude Code burns through tokens quickly during an active session. Complex architectural work and boilerplate generation are not the same job, and pricing them identically doesn't make much sense. Routing simpler requests to a more affordable model can cut costs significantly without affecting output quality where it matters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compliance and data routing.&lt;/strong&gt; If you work in fintech, healthcare, or any regulated industry, you've likely dealt with requirements around where your data goes. Routing all API traffic through your own infrastructure before it reaches any external provider is often non-negotiable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observability.&lt;/strong&gt; This one gets overlooked a lot. How many tokens does a typical Claude Code session consume? What's your actual cost per feature shipped? Without request logging, you're genuinely guessing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Bifrost
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqakfazgxeydkto0p4xn6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqakfazgxeydkto0p4xn6.png" alt="Bifrost" width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.getmaxim.ai/bifrost" rel="noopener noreferrer"&gt;Bifrost&lt;/a&gt; is an open-source LLM gateway built by &lt;a href="https://www.getmaxim.ai" rel="noopener noreferrer"&gt;Maxim AI&lt;/a&gt; to route, manage, and optimize requests between your application and multiple model providers. It's Apache 2.0 licensed, self-hostable, and supports 20+ providers including OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure, Mistral, Cohere, Groq, and more.&lt;/p&gt;

&lt;p&gt;A few things that make it stand out technically:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance that doesn't get in the way.&lt;/strong&gt; At 5,000 requests per second, Bifrost adds less than 15 microseconds of internal overhead per request. At production scale, that's essentially nothing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zero-config startup.&lt;/strong&gt; A single &lt;code&gt;npx&lt;/code&gt; command launches the gateway, and everything else is configurable through a web UI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Built-in fallbacks and load balancing.&lt;/strong&gt; If a provider fails or rate-limits you, Bifrost automatically routes to a backup. Traffic can also be distributed across multiple keys or providers using weighted rules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semantic caching.&lt;/strong&gt; Repeated or semantically similar queries can be served from cache, which reduces both latency and cost for workflows with a lot of repetitive prompting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full observability out of the box.&lt;/strong&gt; Prometheus metrics, request tracing, token usage, latency, and a built-in web dashboard are all included.&lt;/p&gt;

&lt;p&gt;The architecture is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code  --&amp;gt;  Bifrost (localhost:8080)  --&amp;gt;  Any LLM Provider
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code uses an environment variable called &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; to know where to send API requests. Normally it points to &lt;code&gt;https://api.anthropic.com&lt;/code&gt;. You point it at Bifrost instead. Bifrost accepts requests in Anthropic's Messages API format, translates them to whichever provider you've configured, and translates the response back. Claude Code never knows the difference.&lt;/p&gt;

&lt;p&gt;No code changes. No patching. One environment variable.&lt;/p&gt;
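In practice the redirect is a single export before launching Claude Code. This sketch assumes the default localhost:8080 gateway address from the architecture diagram above; the exact route is an assumption, so check your Bifrost configuration for the Anthropic-compatible path:

```shell
# Point Claude Code at the local gateway instead of api.anthropic.com.
# The "/anthropic" path is an assumption -- verify against your Bifrost setup.
export ANTHROPIC_BASE_URL="http://localhost:8080/anthropic"
echo "$ANTHROPIC_BASE_URL"
```

Unset the variable (or open a fresh shell) and Claude Code goes straight back to Anthropic's API, which makes this easy to trial without committing to it.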




&lt;h2&gt;
  
  
  What We'll Cover
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Setting up and configuring Bifrost with multiple LLM providers&lt;/li&gt;
&lt;li&gt;Integrating Claude Code with the gateway&lt;/li&gt;
&lt;li&gt;Running Claude Code with any model&lt;/li&gt;
&lt;li&gt;Configuring routing rules, fallbacks, and budgets&lt;/li&gt;
&lt;li&gt;Integrating MCP tools&lt;/li&gt;
&lt;li&gt;Using built-in observability and monitoring&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Part 1: Setting Up Bifrost
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Install Bifrost
&lt;/h3&gt;

&lt;p&gt;Create a project folder, open it in your editor, and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @maximhq/bifrost &lt;span class="nt"&gt;-app-dir&lt;/span&gt; ./my-bifrost-data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-app-dir&lt;/code&gt; flag tells Bifrost where to store all its data. Bifrost will start listening on port 8080.&lt;/p&gt;

&lt;p&gt;If you prefer Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull maximhq/bifrost
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 8080:8080 &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/data:/app/data maximhq/bifrost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-v&lt;/code&gt; flag mounts a volume so your configuration persists across container restarts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Create Your Config File
&lt;/h3&gt;

&lt;p&gt;Inside your &lt;code&gt;./my-bifrost-data&lt;/code&gt; folder, create a &lt;code&gt;config.json&lt;/code&gt; file. This defines which providers Bifrost can route to, enables request logging, and sets up database persistence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"$schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.getbifrost.ai/schema"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"client"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enable_logging"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"disable_content_logging"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"drop_excess_requests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"initial_pool_size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"allow_direct_keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"providers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openai-primary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"env.OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"anthropic"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic-primary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"env.ANTHROPIC_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"gemini"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gemini-primary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"env.GEMINI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"config_store"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sqlite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./config.db"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"logs_store"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sqlite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./logs.db"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;"value": "env.OPENAI_API_KEY"&lt;/code&gt; syntax tells Bifrost to read actual keys from environment variables rather than storing them in the file. Your secrets stay out of version control.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Set Your API Keys
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-openai-api-key"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-anthropic-api-key"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-gemini-api-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
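
&lt;p&gt;Exports like these disappear when the terminal session ends. A common pattern (nothing Bifrost-specific, just standard shell practice) is to keep the keys in an untracked env file and source it before launching the gateway:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Keep real keys in a file that never enters version control.
cat &gt; bifrost.env &lt;&lt;'EOF'
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export GEMINI_API_KEY="your-gemini-api-key"
EOF

# Make sure git ignores it, then load the keys into the current shell.
echo "bifrost.env" &gt;&gt; .gitignore
source ./bifrost.env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;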



&lt;h3&gt;
  
  
  Step 4: Start the Gateway
&lt;/h3&gt;

&lt;p&gt;Stop any previously running Bifrost instance, then start it again with the app directory flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @maximhq/bifrost &lt;span class="nt"&gt;-app-dir&lt;/span&gt; ./my-bifrost-data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:8080&lt;/code&gt; in your browser. You'll see the Bifrost dashboard, where all configuration and monitoring live.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 2: Connecting Claude Code to Bifrost
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Install Claude Code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @anthropic-ai/claude-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Point Claude Code at Bifrost
&lt;/h3&gt;

&lt;p&gt;Set these two environment variables in the same terminal session where you'll run Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:8080/anthropic"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"dummy-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;dummy-key&lt;/code&gt; part is a bit counterintuitive at first. Claude Code requires this variable to be set before it will run, but Bifrost handles actual authentication to providers using the keys you configured earlier. You can put any non-empty string here.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Run Claude Code with Any Model
&lt;/h3&gt;

&lt;p&gt;Start Claude Code and specify whichever model you want to use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--model&lt;/span&gt; openai/gpt-4o
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To route to other providers and models, use the same &lt;code&gt;provider/model&lt;/code&gt; prefix pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;openai/gpt-4o&lt;/span&gt;
&lt;span class="s"&gt;openai/gpt-4o-mini&lt;/span&gt;
&lt;span class="s"&gt;gemini/gemini-2.5-pro&lt;/span&gt;
&lt;span class="s"&gt;groq/llama-3.1-70b-versatile&lt;/span&gt;
&lt;span class="s"&gt;mistral/mistral-large-latest&lt;/span&gt;
&lt;span class="s"&gt;anthropic/claude-sonnet-4-20250514&lt;/span&gt;
&lt;span class="s"&gt;ollama/llama3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run a quick sanity check by asking something simple like "Hello there" to confirm requests are flowing through correctly.&lt;/p&gt;
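
&lt;p&gt;You can also test the gateway directly, outside Claude Code. Assuming your Bifrost build exposes its OpenAI-compatible endpoint at &lt;code&gt;/v1/chat/completions&lt;/code&gt; (check the docs for your version if the path differs), a single &lt;code&gt;curl&lt;/code&gt; verifies routing end to end:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Send a minimal chat request through the gateway.
# The provider/model prefix selects the upstream provider.
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello there"}]
  }'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A normal JSON completion here means keys, config, and routing are all wired up; an error body will usually name the provider that failed.&lt;/p&gt;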




&lt;h2&gt;
  
  
  Part 3: Routing Rules, Fallbacks, and Budgets
&lt;/h2&gt;

&lt;p&gt;Once Claude Code is connected, you can start using Bifrost's routing features to get more control over how requests are handled.&lt;/p&gt;

&lt;h3&gt;
  
  
  Weighted Routing Across Providers
&lt;/h3&gt;

&lt;p&gt;Virtual Keys in Bifrost let you define routing logic that applies automatically. Navigate to &lt;strong&gt;Governance &amp;gt; Virtual Keys&lt;/strong&gt;, create a key, and configure your routing weights:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dev-routing"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"budget"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"max_budget"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"budget_duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"monthly"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"providers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-sonnet-4-20250514"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This routes 70% of requests to GPT-4o and 30% to Claude Sonnet, with a hard monthly cap of $100. Once the budget is exhausted, Bifrost stops routing automatically. For teams, this replaces a lot of manual cost monitoring.&lt;/p&gt;

&lt;h3&gt;
  
  
  Automatic Fallbacks
&lt;/h3&gt;

&lt;p&gt;When a provider goes down or you hit a rate limit, Bifrost works down a fallback list until a request succeeds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openai/gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fallbacks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-sonnet-4-20250514"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gemini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gemini-2.5-pro"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your coding session continues without any manual intervention when a provider has issues.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 4: MCP Tool Integration
&lt;/h2&gt;

&lt;p&gt;If you're using Model Context Protocol servers for filesystem access, web search, database queries, or custom integrations, Bifrost supports those too. Configure them once in Bifrost, and they become available to any model routing through it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Add MCP Configuration to Bifrost
&lt;/h3&gt;

&lt;p&gt;Update your &lt;code&gt;config.json&lt;/code&gt; to include MCP server definitions. Here's an example with filesystem access:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"$schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.getbifrost.ai/schema"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"client"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enable_logging"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"disable_content_logging"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"drop_excess_requests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"initial_pool_size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"allow_direct_keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="nl"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"client_configs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"connection_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"stdio_config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/tmp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"tools_to_execute"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"tools_to_auto_execute"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"read_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"list_directory"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"create_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"delete_file"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tool_manager_config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"max_agent_depth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tool_execution_timeout"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300000000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"code_mode_binding_level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"server"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restart Bifrost and navigate to the MCP catalog page in the web UI to confirm the filesystem server shows as connected. (The eleven-digit &lt;code&gt;tool_execution_timeout&lt;/code&gt; looks alarming, but it appears to be a duration in nanoseconds: 300,000,000,000 ns is five minutes.)&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Add Bifrost as an MCP Server in Claude Code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add &lt;span class="nt"&gt;--transport&lt;/span&gt; http bifrost http://localhost:8080/mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Verify with a Real Task
&lt;/h3&gt;

&lt;p&gt;Restart Claude Code and try a task that exercises the MCP tools. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a simple calculator program in Python.

It should support addition, subtraction, multiplication, and division.
The user should input two numbers and an operation, and the program should print the result.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then follow up with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Analyze this repository and create a README.md explaining how the project works.
Include the project architecture and instructions for running it locally.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the MCP integration is working, Claude Code will read your files, create new ones, and interact with your filesystem through Bifrost's tool injection.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 5: Observability and Monitoring
&lt;/h2&gt;

&lt;p&gt;This is the part that surprised me most when I first set it up.&lt;/p&gt;

&lt;p&gt;Every request that passes through Bifrost is logged with full detail: the input prompt, the response, which model handled it, latency, and cost. The web interface at &lt;code&gt;http://localhost:8080/logs&lt;/code&gt; provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time streaming of requests and responses&lt;/li&gt;
&lt;li&gt;Token usage tracking per request&lt;/li&gt;
&lt;li&gt;Latency measurements&lt;/li&gt;
&lt;li&gt;Filtering by provider, model, or conversation content&lt;/li&gt;
&lt;li&gt;Full request and response inspection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For individual developers, it's useful for understanding your actual usage patterns. For teams, it becomes a proper audit trail. You can see which models are being used most, where the expensive requests are coming from, and whether your routing rules are actually behaving as expected.&lt;/p&gt;

&lt;p&gt;Bifrost also exposes Prometheus metrics for teams that want to integrate this data into existing monitoring pipelines.&lt;/p&gt;
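
&lt;p&gt;If you want to confirm the metrics endpoint is live before pointing Prometheus at it (the path is typically &lt;code&gt;/metrics&lt;/code&gt; for exporters — verify against the Bifrost docs for your version), a quick scrape from the command line looks like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Fetch the Prometheus exposition output and preview the first few series.
curl -s http://localhost:8080/metrics | head -n 20
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;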




&lt;h2&gt;
  
  
  Is This Worth Setting Up?
&lt;/h2&gt;

&lt;p&gt;If you're a solo developer who uses Claude Code occasionally and doesn't have any compliance or cost concerns, the default setup is probably fine.&lt;/p&gt;

&lt;p&gt;But if any of the following are true, a gateway is worth the time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want to test how different models perform on your specific workload&lt;/li&gt;
&lt;li&gt;You're managing API costs across a team&lt;/li&gt;
&lt;li&gt;Your organization has requirements around data routing or infrastructure control&lt;/li&gt;
&lt;li&gt;You want actual visibility into your AI usage rather than end-of-month billing surprises&lt;/li&gt;
&lt;li&gt;You use MCP tools and want them available across multiple model providers without reconfiguring each time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because Bifrost is open source and self-hosted, your prompts and responses stay on your own infrastructure. For teams working on proprietary codebases, that's a meaningful difference from routing everything directly to a third-party API.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Get started:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Website: &lt;a href="https://www.getmaxim.ai/bifrost" rel="noopener noreferrer"&gt;getmax.im/bifrost&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://git.new/bifrost" rel="noopener noreferrer"&gt;git.new/bifrost&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Docs: &lt;a href="https://www.getmaxim.ai/bifrost/resources/claude-code" rel="noopener noreferrer"&gt;getmax.im/bifrostdocs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>cli</category>
      <category>llm</category>
      <category>tooling</category>
    </item>
    <item>
      <title>ContractCompass: Your AI Contract Analyst That Actually Speaks Human</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Sun, 08 Feb 2026 12:34:11 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/contractcompass-your-ai-contract-analyst-that-actually-speaks-human-nfo</link>
      <guid>https://forem.com/varshithvhegde/contractcompass-your-ai-contract-analyst-that-actually-speaks-human-nfo</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/algolia"&gt;Algolia Agent Studio Challenge&lt;/a&gt;: Consumer-Facing Conversational Experiences&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F833ehy8i626n22zc4pcn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F833ehy8i626n22zc4pcn.png" alt="FrontPage"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ContractCompass&lt;/strong&gt; is an AI-powered contract analysis tool that turns legal jargon into plain English through natural conversation. Think of it as a lawyer friend who can review your contract over coffee, except this friend never gets tired, works 24/7, and doesn't charge $400/hour.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;Most people sign contracts they don't fully understand. Employment agreements, rental leases, SaaS terms—they're all written in dense legal language that assumes you went to law school. By the time you realize that "perpetual, irrevocable, worldwide license" means the company owns your weekend projects forever, you've already signed away your rights.&lt;/p&gt;

&lt;p&gt;Surveys consistently find that over 90% of people don't read terms and conditions before accepting them. It's not laziness. These documents are genuinely incomprehensible to the average person. A typical employment contract might be 15 pages of legal clauses that take hours to parse, assuming you even know what to look for.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution
&lt;/h3&gt;

&lt;p&gt;ContractCompass solves this through dialogue-based AI interaction. Instead of drowning you in legal analysis reports, it lets you have a natural conversation about your contract:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"What are the red flags here?"&lt;/li&gt;
&lt;li&gt;"Can you explain this termination clause like I'm five?"&lt;/li&gt;
&lt;li&gt;"Is this non-compete actually enforceable?"&lt;/li&gt;
&lt;li&gt;"What should I negotiate before signing?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI agent responds in real-time with contextual answers grounded in a curated database of contract clauses, powered by &lt;strong&gt;Algolia Agent Studio's&lt;/strong&gt; semantic search capabilities.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucwnfeemcsvsdrsati7u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucwnfeemcsvsdrsati7u.png" alt="ChatInterface Initial"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Capabilities
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Conversational AI Interface&lt;/strong&gt; - Chat naturally with the agent. No forms, no checkboxes, just questions and answers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Intelligent Risk Detection&lt;/strong&gt; - Every clause gets analyzed and scored on a three-tier system (Low, Medium, High risk) with visual indicators.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plain English Translations&lt;/strong&gt; - Legal jargon becomes "here's what this actually means for you."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Industry Comparisons&lt;/strong&gt; - The agent explains whether clauses are standard practice or unusual outliers worth negotiating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rich Visual Analysis&lt;/strong&gt; - For deep dives, the agent generates structured analysis cards with prevalence bars, red flag lists, and detailed reasoning.&lt;/p&gt;




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Live Demo:&lt;/strong&gt; &lt;a href="https://contractcompass.varshithvhegde.in/" rel="noopener noreferrer"&gt;https://contractcompass.varshithvhegde.in/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No login required. No credit card. Just upload a contract (or try one of the built-in samples) and start asking questions.&lt;/p&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/cWGjpM0eIMc"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;h3&gt;
  
  
  How It Feels to Use
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Upload is effortless&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Drag and drop a PDF, paste text, or click one of the sample contracts. I've included four pre-loaded examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Friendly Startup Offer&lt;/strong&gt; (Low Risk) - A well-balanced employment agreement with fair terms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Red Flag Employment Contract&lt;/strong&gt; (High Risk) - Includes unilateral salary cuts, 24-month lock-in, overbroad IP assignment, 3-year non-compete, and $500K liquidated damages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predatory Rental Agreement&lt;/strong&gt; (High Risk) - Non-refundable deposits, tenant pays for ALL repairs, no-notice landlord entry, uncapped rent increases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasonable SaaS Agreement&lt;/strong&gt; (Low Risk) - Standard business terms with mutual protections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3k1d4mwn40h4qziqk0y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3k1d4mwn40h4qziqk0y.png" alt="Upload"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The interface splits&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your contract appears on the left for reference, chat on the right. You can always scroll back to check what clause the AI is talking about.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fde48nk9tg4pmqhr0n3j4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fde48nk9tg4pmqhr0n3j4.png" alt="Interface"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Suggested prompts guide you&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Six smart buttons help you get started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full Risk Analysis&lt;/li&gt;
&lt;li&gt;Find red flags&lt;/li&gt;
&lt;li&gt;Explain in plain English&lt;/li&gt;
&lt;li&gt;What should I negotiate?&lt;/li&gt;
&lt;li&gt;Compare to standards&lt;/li&gt;
&lt;li&gt;Is this enforceable?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4. Streaming responses feel natural&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The AI types back to you in real time, token by token, like a real conversation. No waiting for a complete response to load. You see the analysis unfold naturally.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fulhh7i9pelxwngfz5zs4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fulhh7i9pelxwngfz5zs4.png" alt="Suggested Prompts"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How I Used Algolia Agent Studio
&lt;/h2&gt;

&lt;p&gt;Algolia Agent Studio is the intelligence engine that makes ContractCompass possible. Here's how it powers the entire conversational experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Index: A Knowledge Base of Contract Clauses
&lt;/h3&gt;

&lt;p&gt;I created an Algolia index called &lt;code&gt;contract_clauses&lt;/code&gt; containing &lt;strong&gt;50+ curated contract clauses&lt;/strong&gt; across four contract types (employment, rental, SaaS, freelance). Each record includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;clause_text&lt;/strong&gt; - The full text of the clause&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;clause_type&lt;/strong&gt; - Category (termination, compensation, non-compete, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;contract_type&lt;/strong&gt; - Employment, rental, SaaS, or freelance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;industry&lt;/strong&gt; - Tech, real estate, or general&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;prevalence_score&lt;/strong&gt; - A 0-1 score indicating how common this clause is&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;risk_level&lt;/strong&gt; - Low, medium, or high&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;plain_english&lt;/strong&gt; - Simple explanation for non-lawyers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;red_flags&lt;/strong&gt; - List of concerning aspects&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;standard_version&lt;/strong&gt; - What a fair version would look like&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;legal_implications&lt;/strong&gt; - Real-world impact of accepting the clause&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdkqc0yvfy8e811pvjjz2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdkqc0yvfy8e811pvjjz2.png" alt="Algolia index"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, a "predatory non-compete" clause record looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"objectID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"emp-nc-003"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"clause_text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Employee agrees not to work for any competing business..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"clause_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"non_compete"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"contract_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"employment"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"industry"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tech"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"prevalence_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"risk_level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"plain_english"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You can't work for competitors for 3 years across all of North America"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"red_flags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Unreasonably broad geographic scope"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Excessive duration"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"standard_version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Typically 6-12 months within 50 miles of office"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"legal_implications"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"May prevent you from working in your field"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
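&lt;p&gt;For reference, the same record shape can be expressed as a TypeScript interface on the frontend. The field names mirror the list above; the interface and variable names here are my own:&lt;/p&gt;

```typescript
// Shape of a record in the contract_clauses index.
// Fields mirror the index schema described above; the
// interface name itself is illustrative.
interface ClauseRecord {
  objectID: string;
  clause_text: string;
  clause_type: string;
  contract_type: "employment" | "rental" | "saas" | "freelance";
  industry: string;
  prevalence_score: number; // 0-1: how common the clause is
  risk_level: "low" | "medium" | "high";
  plain_english: string;
  red_flags: string[];
  standard_version: string;
  legal_implications: string;
}

// Example record matching the JSON above.
const example: ClauseRecord = {
  objectID: "emp-nc-003",
  clause_text: "Employee agrees not to work for any competing business...",
  clause_type: "non_compete",
  contract_type: "employment",
  industry: "tech",
  prevalence_score: 0.2,
  risk_level: "high",
  plain_english: "You can't work for competitors for 3 years across all of North America",
  red_flags: ["Unreasonably broad geographic scope", "Excessive duration"],
  standard_version: "Typically 6-12 months within 50 miles of office",
  legal_implications: "May prevent you from working in your field",
};
```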

&lt;h3&gt;
  
  
  How Retrieval Powers the Conversation
&lt;/h3&gt;

&lt;p&gt;When a user uploads a contract and starts asking questions, here's what happens behind the scenes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Semantic Clause Matching&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Algolia agent retrieves semantically similar clauses from the index to provide context-aware responses. For example, if someone asks "Is this non-compete fair?", the agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identifies the non-compete clause in the uploaded contract&lt;/li&gt;
&lt;li&gt;Searches the index for similar non-compete clauses&lt;/li&gt;
&lt;li&gt;Compares the uploaded clause against standard versions&lt;/li&gt;
&lt;li&gt;Explains whether it's typical or unusually restrictive&lt;/li&gt;
&lt;/ul&gt;
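&lt;p&gt;The retrieval step can be sketched like this. The helper is hypothetical (Agent Studio performs the actual retrieval), but it shows the kind of query and filter the comparison relies on:&lt;/p&gt;

```typescript
// Build the query + filter for finding comparable clauses.
// Hypothetical helper: the real retrieval is done by the Agent Studio
// agent against the contract_clauses index.
function buildClauseQuery(clauseText: string, contractType: string) {
  return {
    indexName: "contract_clauses",
    query: clauseText,
    // Restrict comparisons to the same contract type so an
    // employment clause is never scored against rental standards.
    filters: "contract_type:" + contractType,
    hitsPerPage: 5,
  };
}

const query = buildClauseQuery(
  "Employee agrees not to work for any competing business...",
  "employment"
);
// query.filters is "contract_type:employment"
```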

&lt;p&gt;&lt;strong&gt;2. Contract Type Detection&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent automatically identifies the type of contract (employment, rental, SaaS, etc.) based on the language and clauses present, then adjusts its analysis accordingly. An employment contract gets compared against employment standards, not rental standards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Prevalence-Based Risk Assessment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using the prevalence scores from the indexed data, the agent can say things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"This termination clause is standard. About 95% of tech employment contracts include similar terms"&lt;/li&gt;
&lt;li&gt;"This security deposit policy is unusual. Only 15% of rental agreements make deposits non-refundable"&lt;/li&gt;
&lt;/ul&gt;
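&lt;p&gt;A minimal sketch of how a 0-1 prevalence score can be turned into that kind of phrasing (the thresholds here are illustrative, not the app's actual cutoffs):&lt;/p&gt;

```typescript
// Turn a prevalence score into the kind of phrasing the agent uses.
// Threshold values are illustrative assumptions.
function describePrevalence(score: number): string {
  const pct = Math.round(score * 100);
  if (score >= 0.8) {
    return "standard: about " + pct + "% of similar contracts include this";
  }
  if (score >= 0.4) {
    return "common: about " + pct + "% of similar contracts include this";
  }
  return "unusual: only " + pct + "% of similar contracts include this";
}
```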

&lt;p&gt;&lt;strong&gt;4. Standard Version Recommendations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a clause is problematic, the agent doesn't just say "this is bad." It shows what a fair version would look like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The current non-compete restricts you for 3 years across North America. A standard tech industry non-compete is typically 6-12 months within 50 miles of the office."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Making the Agent Conversational
&lt;/h3&gt;

&lt;p&gt;The key to making ContractCompass feel natural was teaching the agent to think like a helpful friend, not a legal robot. I crafted prompts that guide it to:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Speak like a human&lt;/strong&gt; - Use simple language. Avoid legal jargon unless explaining it. Be conversational but professional.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Be honest about risks&lt;/strong&gt; - If a clause is predatory, say so clearly. Don't sugarcoat problematic terms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ground everything in data&lt;/strong&gt; - Always search the contract_clauses index for similar examples. Compare the user's clause against standard versions and explain how it differs from typical industry practice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Provide actionable advice&lt;/strong&gt; - Don't just identify problems. Suggest what to negotiate and how to approach it.&lt;/p&gt;

&lt;p&gt;This approach ensures every response is both friendly and useful, backed by real contract data rather than generic advice.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4j4w7017pmxu4o8eqhr7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4j4w7017pmxu4o8eqhr7.png" alt="Contract Agent Response"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Why Fast Retrieval Matters
&lt;/h2&gt;

&lt;p&gt;Algolia's speed and semantic search capabilities are critical to making ContractCompass feel like a real conversation rather than a clunky Q&amp;amp;A bot.&lt;/p&gt;
&lt;h3&gt;
  
  
  Speed Creates Natural Dialogue
&lt;/h3&gt;

&lt;p&gt;When someone asks "What are the red flags in this contract?", they expect an answer within seconds, not minutes. Algolia's sub-50ms search latency means the agent can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Retrieve relevant clause examples instantly&lt;/li&gt;
&lt;li&gt;Stream responses token-by-token without lag&lt;/li&gt;
&lt;li&gt;Handle follow-up questions in the same conversation thread&lt;/li&gt;
&lt;li&gt;Maintain context across multiple queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If retrieval took 5-10 seconds per query, users would lose patience. The conversation would feel broken. Fast retrieval makes the experience feel fluid and natural.&lt;/p&gt;
&lt;h3&gt;
  
  
  Contextual Retrieval Enables Nuanced Analysis
&lt;/h3&gt;

&lt;p&gt;Algolia's semantic search doesn't just match keywords. It understands meaning. This is crucial for contract analysis because:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Legal language varies widely&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A non-compete clause might say "Employee shall not engage in competitive activities" or "You agree not to work for rival companies." These are semantically similar but textually different. Algolia's vector-based search matches them both.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Users ask in natural language&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Someone might ask "Can they really fire me for any reason?" which should match clauses about "at-will employment" or "termination without cause." Semantic search bridges this gap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A liability cap of $100K might be reasonable in a $10K/year SaaS contract but predatory in a $500K enterprise agreement. By retrieving similar contracts in the same industry and price range, the agent provides context-aware analysis.&lt;/p&gt;
&lt;h3&gt;
  
  
  Retrieval Grounds Responses in Real Data
&lt;/h3&gt;

&lt;p&gt;One of the biggest risks with AI agents is hallucination: making up plausible-sounding but incorrect information. By grounding every response in retrieved data from the curated index, ContractCompass avoids this problem.&lt;/p&gt;

&lt;p&gt;When the agent says "This non-compete is unusually restrictive," it's not guessing. It's comparing the uploaded clause against the prevalence scores and standard versions in the index. When it explains what a fair clause looks like, it's showing you actual examples from the database.&lt;/p&gt;

&lt;p&gt;This retrieval-augmented generation (RAG) approach makes the agent both reliable and trustworthy.&lt;/p&gt;
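&lt;p&gt;The grounding step amounts to assembling retrieved records into context for the model. A sketch of the idea (the exact prompt format used in production isn't shown here):&lt;/p&gt;

```typescript
// Assemble retrieved clause records into grounding context for the agent,
// so the model compares against real data instead of guessing.
function buildGroundingContext(
  hits: { clause_text: string; risk_level: string; standard_version: string }[]
): string {
  return hits
    .map(function (h, i) {
      return (
        "Reference " + (i + 1) + " [" + h.risk_level + " risk]: " +
        h.clause_text + " | Fair version: " + h.standard_version
      );
    })
    .join("\n");
}
```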
&lt;h3&gt;
  
  
  The Impact on User Experience
&lt;/h3&gt;

&lt;p&gt;From a user perspective, fast contextual retrieval translates to:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Confidence in the analysis&lt;/strong&gt; - "This isn't just an AI's opinion, it's based on real contract data"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Immediate answers&lt;/strong&gt; - "I can get my questions answered in real-time without waiting"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conversational flow&lt;/strong&gt; - "It feels like talking to a human expert who knows contract law"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Actionable insights&lt;/strong&gt; - "I now know exactly what to negotiate before signing"&lt;/p&gt;

&lt;p&gt;Without Algolia's speed and semantic capabilities, ContractCompass would be a generic chatbot that gives vague, unhelpful advice. With them, it's a genuinely useful tool that empowers people to understand and negotiate their contracts.&lt;/p&gt;


&lt;h2&gt;
  
  
  Technical Architecture
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Frontend (React + TypeScript)
&lt;/h3&gt;

&lt;p&gt;The interface is built with React 18 and TypeScript for type safety. I chose a modern stack that prioritizes performance and developer experience:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;UI Library:&lt;/strong&gt; Tailwind CSS + shadcn/ui components for a clean, professional look&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State Management:&lt;/strong&gt; React hooks for local state (no complex state library needed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Markdown Rendering:&lt;/strong&gt; react-markdown for rich text in chat responses&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  AI Agent (Algolia Agent Studio)
&lt;/h3&gt;

&lt;p&gt;The chat interface calls Algolia Agent Studio directly from the frontend. This direct integration means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time streaming responses that appear token-by-token&lt;/li&gt;
&lt;li&gt;No backend proxy needed for chat, which reduces latency&lt;/li&gt;
&lt;li&gt;Full conversation history sent with each request for contextual follow-ups&lt;/li&gt;
&lt;/ul&gt;
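&lt;p&gt;On the client, token-by-token streaming boils down to parsing chunks as they arrive and appending tokens to the chat. This sketch assumes a server-sent-events style payload of &lt;code&gt;data: {"token":"..."}&lt;/code&gt; lines; the actual Agent Studio wire format may differ:&lt;/p&gt;

```typescript
// Parse one streamed chunk into tokens to append to the chat UI.
// Assumes SSE-style lines; the real wire format is an assumption here.
function parseSseChunk(chunk: string): string[] {
  const tokens: string[] = [];
  for (const line of chunk.split("\n")) {
    if (line.startsWith("data: ")) {
      const payload = line.slice(6).trim();
      if (payload !== "[DONE]") {
        tokens.push(JSON.parse(payload).token);
      }
    }
  }
  return tokens;
}

// parseSseChunk('data: {"token":"Hel"}\ndata: {"token":"lo"}\ndata: [DONE]')
// returns ["Hel", "lo"]
```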
&lt;h3&gt;
  
  
  Search Index (Algolia)
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;contract_clauses&lt;/code&gt; index contains 50+ curated clauses. Each clause is enriched with metadata (prevalence scores, risk levels, plain English explanations) that the agent uses to provide contextual analysis.&lt;/p&gt;
&lt;h3&gt;
  
  
  PDF Processing
&lt;/h3&gt;

&lt;p&gt;When users upload PDFs, the text extraction happens server-side using Google Gemini 2.5 Flash. The flow is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User uploads PDF via drag-and-drop&lt;/li&gt;
&lt;li&gt;PDF converts to base64 on the client&lt;/li&gt;
&lt;li&gt;Base64 data sent to serverless function&lt;/li&gt;
&lt;li&gt;Function calls Gemini API for text extraction&lt;/li&gt;
&lt;li&gt;Extracted text returns to the frontend&lt;/li&gt;
&lt;li&gt;Text loads into chat interface for analysis&lt;/li&gt;
&lt;/ol&gt;
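&lt;p&gt;Steps 2-3 above can be sketched as follows. The payload field names are illustrative; in the browser the base64 conversion would typically go through &lt;code&gt;FileReader&lt;/code&gt; rather than &lt;code&gt;Buffer&lt;/code&gt;:&lt;/p&gt;

```typescript
// Turn raw PDF bytes into the JSON payload sent to the extract-pdf
// serverless function. Field names are illustrative assumptions; in a
// browser the base64 step would use FileReader.readAsDataURL instead.
function buildExtractPayload(filename: string, bytes: Uint8Array) {
  const data = Buffer.from(bytes).toString("base64");
  return { filename: filename, data: data };
}

const payload = buildExtractPayload(
  "offer.pdf",
  new Uint8Array([0x25, 0x50, 0x44, 0x46]) // "%PDF" magic bytes
);
// payload.data === "JVBERg=="
```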
&lt;h3&gt;
  
  
  Backend (Serverless Functions)
&lt;/h3&gt;

&lt;p&gt;Four serverless functions handle specific tasks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;extract-pdf&lt;/strong&gt; - PDF text extraction using Gemini&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;analyze-contract&lt;/strong&gt; - Clause parsing and analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;search-clauses&lt;/strong&gt; - Direct Algolia index queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;seed-algolia&lt;/strong&gt; - Index population with curated data&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  Design Decisions
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Split-Screen Layout
&lt;/h3&gt;

&lt;p&gt;I chose a split-screen design (contract on left, chat on right) because users need to reference the original text while discussing it. It feels more collaborative, like reviewing a document with someone. Mobile users get a stacked layout that still works well.&lt;/p&gt;
&lt;h3&gt;
  
  
  Color-Coded Risk Levels
&lt;/h3&gt;

&lt;p&gt;Risk levels use universal color psychology:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Green&lt;/strong&gt; - Safe, standard terms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amber&lt;/strong&gt; - Caution, worth discussing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Red&lt;/strong&gt; - Danger, likely problematic&lt;/li&gt;
&lt;/ul&gt;
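&lt;p&gt;Keeping the colors consistent is easiest with a single mapping used by every component. A sketch with Tailwind-style classes (the class names here are illustrative, not the app's exact ones):&lt;/p&gt;

```typescript
// Map a risk level to the classes shared by risk badges, prevalence
// bars, and analysis cards. Class names are illustrative.
function riskClasses(level: "low" | "medium" | "high"): string {
  if (level === "low") return "bg-green-100 text-green-800";
  if (level === "medium") return "bg-amber-100 text-amber-800";
  return "bg-red-100 text-red-800";
}
```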

&lt;p&gt;These colors are consistent across risk badges, prevalence bars, and analysis cards. You can glance at a clause and immediately understand its risk level.&lt;/p&gt;
&lt;h3&gt;
  
  
  Suggested Prompts
&lt;/h3&gt;

&lt;p&gt;Not everyone knows what questions to ask about a contract. The six suggested prompts serve as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding&lt;/strong&gt; - Showing users what's possible&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficiency&lt;/strong&gt; - Common questions answered with one click&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discovery&lt;/strong&gt; - Revealing features users might not know about&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Streaming Responses
&lt;/h3&gt;

&lt;p&gt;Token-by-token streaming makes the AI feel more human and less like a loading bar. It also provides immediate feedback that the system is working. Users don't stare at a blank screen wondering if anything is happening.&lt;/p&gt;


&lt;h2&gt;
  
  
  Challenges and Learnings
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Challenge 1: Balancing Legal Accuracy with Accessibility
&lt;/h3&gt;

&lt;p&gt;Legal language exists for precision. Simplifying it risks losing important nuances. I solved this by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Providing both the original clause text and plain English side-by-side&lt;/li&gt;
&lt;li&gt;Including detailed "legal implications" sections for those who want depth&lt;/li&gt;
&lt;li&gt;Being honest about limitations (the disclaimer reminds users this isn't legal advice)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Challenge 2: Handling Diverse Contract Formats
&lt;/h3&gt;

&lt;p&gt;Contracts vary wildly in structure. Some are 2 pages, others are 50. Some use headers, others are wall-to-wall text. The PDF extraction with Gemini handles this by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Preserving structure where possible&lt;/li&gt;
&lt;li&gt;Extracting text even from scanned/image PDFs&lt;/li&gt;
&lt;li&gt;Cleaning up formatting artifacts&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Challenge 3: Preventing AI Hallucination
&lt;/h3&gt;

&lt;p&gt;Early versions sometimes invented red flags that didn't exist. The solution was retrieval-augmented generation. Every analysis is now grounded in retrieved clause data from the index. The agent can only reference what it finds in the search results.&lt;/p&gt;
&lt;h3&gt;
  
  
  Challenge 4: Making Risk Scores Meaningful
&lt;/h3&gt;

&lt;p&gt;A simple "high risk" label isn't actionable. I added:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prevalence scores&lt;/strong&gt; - "Only 20% of contracts include this"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard versions&lt;/strong&gt; - "Here's what fair looks like"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specific red flags&lt;/strong&gt; - "This clause is concerning because..."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These additions turn a vague warning into specific, actionable information.&lt;/p&gt;


&lt;h3&gt;
  
  
  Export Analysis as PDF
&lt;/h3&gt;

&lt;p&gt;Let users download a full risk report they can share with lawyers or keep for their records. Make it official and presentable.&lt;/p&gt;


&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Building ContractCompass taught me that the best AI tools don't feel like AI tools. They feel like helpful conversations with knowledgeable friends. The key is combining:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fast, semantic search&lt;/strong&gt; that finds the right information instantly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thoughtful prompting&lt;/strong&gt; that guides the AI to be helpful, not robotic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real data&lt;/strong&gt; that grounds responses in facts, not hallucinations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clear design&lt;/strong&gt; that makes complex information accessible&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Algolia Agent Studio made the first part possible. The rest was about understanding what people actually need when facing a contract: clarity, confidence, and actionable advice.&lt;/p&gt;


&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;ContractCompass demonstrates how conversational AI powered by fast, semantic search can democratize access to legal understanding. By combining Algolia Agent Studio's retrieval capabilities with a thoughtfully designed user experience, it transforms contract analysis from an intimidating expert task into an accessible conversation.&lt;/p&gt;

&lt;p&gt;The key insight: people don't need to become lawyers to understand their contracts. They just need the right questions answered in language they can understand, backed by real data about what's standard and what's not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try it yourself:&lt;/strong&gt; &lt;a href="https://contractcompass.varshithvhegde.in/" rel="noopener noreferrer"&gt;https://contractcompass.varshithvhegde.in/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feok2trmgk5oi4zv24klm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feok2trmgk5oi4zv24klm.png" alt="Landing Page"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Built With
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Powered by:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.algolia.com/doc/guides/ai/agent-studio/" rel="noopener noreferrer"&gt;Algolia Agent Studio&lt;/a&gt; - Conversational AI with semantic search&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://react.dev" rel="noopener noreferrer"&gt;React&lt;/a&gt; + &lt;a href="https://www.typescriptlang.org/" rel="noopener noreferrer"&gt;TypeScript&lt;/a&gt; - Frontend framework&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://tailwindcss.com/" rel="noopener noreferrer"&gt;Tailwind CSS&lt;/a&gt; + &lt;a href="https://ui.shadcn.com/" rel="noopener noreferrer"&gt;shadcn/ui&lt;/a&gt; - UI components&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://deepmind.google/technologies/gemini/" rel="noopener noreferrer"&gt;Google Gemini&lt;/a&gt; - PDF text extraction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;br&gt;


&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Varshithvhegde" rel="noopener noreferrer"&gt;
        Varshithvhegde
      &lt;/a&gt; / &lt;a href="https://github.com/Varshithvhegde/contract-compass" rel="noopener noreferrer"&gt;
        contract-compass
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;ContractCompass 🧭&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;AI-Powered Contract Analysis for Non-Lawyers&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Chat with AI to understand your contract. Identify risks, get plain-English explanations, and learn what to negotiate — powered by &lt;strong&gt;Algolia Agent Studio&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;📋 Table of Contents&lt;/h2&gt;
&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#overview" rel="noopener noreferrer"&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#live-demo" rel="noopener noreferrer"&gt;Live Demo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#key-features" rel="noopener noreferrer"&gt;Key Features&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#architecture" rel="noopener noreferrer"&gt;Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#technology-stack" rel="noopener noreferrer"&gt;Technology Stack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#algolia-agent-studio-integration" rel="noopener noreferrer"&gt;Algolia Agent Studio Integration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#how-it-works" rel="noopener noreferrer"&gt;How It Works&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#contract-types-supported" rel="noopener noreferrer"&gt;Contract Types Supported&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#sample-contracts" rel="noopener noreferrer"&gt;Sample Contracts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#risk-assessment-system" rel="noopener noreferrer"&gt;Risk Assessment System&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#conversational-ai-capabilities" rel="noopener noreferrer"&gt;Conversational AI Capabilities&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#structured-risk-analysis" rel="noopener noreferrer"&gt;Structured Risk Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#pdf-extraction" rel="noopener noreferrer"&gt;PDF Extraction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#algolia-search-index" rel="noopener noreferrer"&gt;Algolia Search Index&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#uiux-design" rel="noopener noreferrer"&gt;UI/UX Design&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#edge-functions" rel="noopener noreferrer"&gt;Edge Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#security" rel="noopener noreferrer"&gt;Security&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Varshithvhegde/contract-compass#getting-started" rel="noopener noreferrer"&gt;Getting Started&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Overview&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;ContractCompass&lt;/strong&gt; is an intelligent contract analysis tool designed to help everyday people — not lawyers — understand legal documents before they sign. Users upload or paste a contract, then have a real-time conversation with an AI agent that identifies risky clauses, explains legal jargon in plain English, and compares terms against industry standards.&lt;/p&gt;

&lt;p&gt;The AI agent is powered by &lt;strong&gt;Algolia Agent Studio&lt;/strong&gt;, which provides semantic search and retrieval of similar contract clauses…&lt;/p&gt;
&lt;/div&gt;


&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Varshithvhegde/contract-compass" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;








&lt;p&gt;&lt;em&gt;ContractCompass is not a substitute for professional legal advice. Always consult a qualified attorney for legal matters.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>algoliachallenge</category>
      <category>ai</category>
      <category>agents</category>
    </item>
    <item>
      <title>Why Your AI Gateway Needs MCP Integration in 2026</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Mon, 02 Feb 2026 10:19:30 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/why-your-ai-gateway-needs-mcp-integration-in-2026-3dcf</link>
      <guid>https://forem.com/varshithvhegde/why-your-ai-gateway-needs-mcp-integration-in-2026-3dcf</guid>
      <description>&lt;p&gt;You know that feeling when you've spent three hours debugging why your AI agent can't access your database for the third time this week? &lt;/p&gt;

&lt;p&gt;I was there last month. Five different tool integrations, each with its own authentication flow, error handling, and connection management. Want to add Slack notifications? Write another integration. Need file system access? Another one. Every integration was basically the same boilerplate with different endpoints.&lt;/p&gt;

&lt;p&gt;Then I found the Model Context Protocol and Bifrost. It sounded too good to be true: one gateway, one protocol, unlimited tools. But it actually works, and it's probably the most practical shift in AI infrastructure you'll deal with this year.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's an AI Gateway and Why Should You Care?
&lt;/h2&gt;

&lt;p&gt;Think of an AI gateway as the central hub between your apps and multiple AI providers. Instead of writing separate code for OpenAI, Anthropic, Google, and others, you connect once to the gateway, and it handles the rest.&lt;/p&gt;

&lt;p&gt;The benefits are immediate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automatic failover&lt;/strong&gt;: If one AI provider goes down, requests switch to another&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load balancing&lt;/strong&gt;: Distribute requests across multiple API keys to avoid rate limits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching&lt;/strong&gt;: Reduce costs and improve response times&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unified monitoring&lt;/strong&gt;: One place to track all your AI interactions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bifrost is an AI gateway built in Go that adds only 11 microseconds of latency while handling 5,000 requests per second. When you're running production AI systems, those microseconds matter.&lt;/p&gt;
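&lt;p&gt;To make that concrete, here's a minimal sketch of what talking to a gateway looks like from application code. It assumes Bifrost exposes an OpenAI-compatible chat endpoint at &lt;code&gt;localhost:8080&lt;/code&gt;; the URL, model name, and payload shape below are illustrative, not pulled from Bifrost's docs:&lt;/p&gt;

```python
import json
import urllib.request

# Assumed gateway endpoint; one URL regardless of which provider serves the request.
BIFROST_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat payload; the gateway routes it by model name."""
    return {
        "model": model,  # e.g. "openai/gpt-4o-mini" -- hypothetical naming
        "messages": [{"role": "user", "content": user_message}],
    }

def send(payload: dict) -> bytes:
    """POST the payload to the gateway (requires Bifrost running locally)."""
    req = urllib.request.Request(
        BIFROST_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

payload = build_chat_request("openai/gpt-4o-mini", "Hello!")
```

&lt;p&gt;The point is that failover or load balancing changes nothing in this code; only the gateway's configuration changes.&lt;/p&gt;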

&lt;h2&gt;
  
  
  The Model Context Protocol: USB-C for AI
&lt;/h2&gt;

&lt;p&gt;Anthropic introduced MCP in November 2024. Within a year, it became the industry standard. OpenAI adopted it in March 2025. Google DeepMind followed. By December 2025, it was donated to the Linux Foundation with backing from major tech companies.&lt;/p&gt;

&lt;p&gt;Here's why it matters: Before MCP, connecting an AI model to a new tool meant writing custom integration code. Every. Single. Time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI needs to search files? Custom code.&lt;/li&gt;
&lt;li&gt;Access a database? More custom code.&lt;/li&gt;
&lt;li&gt;Connect to Slack? Yet another integration.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This created what Anthropic called the "N×M problem": N models needing M different integrations, with complexity that grows multiplicatively every time you add a model or a tool.&lt;/p&gt;

&lt;p&gt;MCP solved this with a standardized protocol. Write an MCP server once for a tool, and any MCP-compatible AI client can use it. It's like USB-C for AI systems: one standard connection instead of different cables for different devices.&lt;/p&gt;
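&lt;p&gt;The integration math is easy to sketch. Direct integrations need one adapter per model-tool pair; with a shared protocol, each side implements the protocol once (the counts here are illustrative):&lt;/p&gt;

```python
# Without MCP: every model needs its own adapter for every tool (N x M).
def direct_integrations(models: int, tools: int) -> int:
    return models * tools

# With MCP: each model speaks the protocol once, each tool exposes it once (N + M).
def mcp_integrations(models: int, tools: int) -> int:
    return models + tools

# Example: 4 models and 25 tools.
assert direct_integrations(4, 25) == 100  # 100 bespoke adapters
assert mcp_integrations(4, 25) == 29      # 29 protocol implementations
```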

&lt;h2&gt;
  
  
  The Problem with Direct MCP Connections
&lt;/h2&gt;

&lt;p&gt;When you connect AI models directly to MCP servers, you run into scaling problems. Every request from the AI includes all available tool definitions in its context window. Connect to five MCP servers with 100 total tools, and every single request carries those 100 tool definitions even for simple queries that don't need tools.&lt;/p&gt;

&lt;p&gt;This creates three issues:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Wasted tokens&lt;/strong&gt;: Most of your context budget goes to tool catalogs instead of actual work. A six-turn conversation with 100 tools re-sends every definition on every turn, burning thousands of tokens on catalog text alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Security gaps&lt;/strong&gt;: Tools can execute without validation or approval. No audit trail, no safety checks before destructive operations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Coordination overhead&lt;/strong&gt;: Each tool call requires a separate round trip to the AI model.&lt;/p&gt;
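&lt;p&gt;A rough back-of-the-envelope on the first issue, using an assumed average definition size (real tool definitions vary widely):&lt;/p&gt;

```python
# Context overhead of shipping the full tool catalog on every request.
TOKENS_PER_TOOL_DEF = 50   # assumed average size of one tool definition
TOOLS = 100                # tools exposed across all connected servers
TURNS = 6                  # conversation turns, each re-sending the catalog

catalog_tokens_per_request = TOOLS * TOKENS_PER_TOOL_DEF  # 5,000 tokens
total_catalog_tokens = catalog_tokens_per_request * TURNS  # 30,000 tokens
```

&lt;p&gt;Under these assumptions, 30,000 tokens of a six-turn conversation are spent repeating tool definitions before any actual work happens.&lt;/p&gt;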

&lt;h2&gt;
  
  
  How Bifrost Solves This
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8rervwbsltoiwyp1fdlf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8rervwbsltoiwyp1fdlf.png" alt="Bifrost" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bifrost integrates MCP natively into the gateway itself. You get both AI provider management and tool orchestration through a single interface.&lt;/p&gt;

&lt;p&gt;It supports four connection types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In-process tools&lt;/strong&gt;: Run directly in Bifrost's memory with zero network overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local MCP servers via STDIO&lt;/strong&gt;: For filesystem operations or database queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP connections&lt;/strong&gt;: For remote microservices&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server-Sent Events&lt;/strong&gt;: For real-time data streams&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The killer feature is &lt;strong&gt;Code Mode&lt;/strong&gt;. Instead of including hundreds of tool definitions in every request, Bifrost exposes just four meta-tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;listToolFiles()&lt;/code&gt; - Discover available servers&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;readToolFile(fileName)&lt;/code&gt; - Get tool signatures&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;getToolDocs(server, tool)&lt;/code&gt; - Get detailed documentation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;executeToolCode(code)&lt;/code&gt; - Run Starlark (Python-like) code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI writes Starlark code that orchestrates tools inside a sandboxed environment, and tool definitions load only when needed. This reduces token usage by 50%+ when using multiple MCP servers (3+). With 8-10 MCP servers (150+ tools), you avoid wasting context on massive tool catalogs.&lt;/p&gt;
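&lt;p&gt;To make Code Mode concrete, here's the kind of script a model might hand to &lt;code&gt;executeToolCode()&lt;/code&gt;. The tool names are hypothetical, and plain Python functions stand in for the sandboxed tool bindings Bifrost would inject (Starlark is close enough to Python that the sketch runs as either):&lt;/p&gt;

```python
# Stand-ins for the tool bindings a real Code Mode sandbox would provide;
# in Bifrost these would proxy calls to the connected MCP servers.
def localmcp_get_joke():
    return "Why do programmers prefer dark mode? Because light attracts bugs!"

def localmcp_calculate(operation, a, b):
    return a + b if operation == "add" else a * b

# The orchestration script: chain two tool calls in a single round trip
# instead of spending one model turn per call.
def run():
    joke = localmcp_get_joke()
    total = localmcp_calculate("add", 19, 23)
    return {"joke": joke, "total": total}

result = run()
```

&lt;p&gt;Only the four meta-tools sit in the model's context; the two tool definitions above are loaded on demand, not shipped with every request.&lt;/p&gt;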

&lt;h2&gt;
  
  
  Getting Started: A Real Example
&lt;/h2&gt;

&lt;p&gt;Let me show you how this works in practice. I'll walk through building a simple MCP server and connecting it to Bifrost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Start Bifrost
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @maximhq/bifrost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Bifrost starts with zero configuration and opens at &lt;code&gt;localhost:8080&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Build a Simple MCP Server
&lt;/h3&gt;

&lt;p&gt;I created a Flask server with three tools: programming jokes, inspirational quotes, and basic calculations. Here's the core (the quote tool is omitted for brevity):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask_cors&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CORS&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nc"&gt;CORS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;jokes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Why do programmers prefer dark mode? Because light attracts bugs!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Why did the developer go broke? Because he used up all his cache!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/sse&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;methods&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_message&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;
    &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;method&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;initialize&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jsonrpc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;protocolVersion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2024-11-05&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;capabilities&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}},&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;serverInfo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;example-server&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tools/list&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jsonrpc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get_joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Returns a random programming joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputSchema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}}&lt;/span&gt;
                    &lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;calculate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Performs basic arithmetic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputSchema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;operation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;add&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;multiply&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
                                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;number&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;number&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
                            &lt;span class="p"&gt;}&lt;/span&gt;
                        &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tools/call&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;params&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;params&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arguments&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;get_joke&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jokes&lt;/span&gt;&lt;span class="p"&gt;)}]}&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;calculate&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;operation&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;add&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Result: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jsonrpc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it with: &lt;code&gt;python mcp_server.py&lt;/code&gt;&lt;/p&gt;
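&lt;p&gt;Before wiring it into Bifrost, you can sanity-check the JSON-RPC envelopes the server expects. A minimal sketch (payload construction only; POST the strings to &lt;code&gt;http://localhost:5000/sse&lt;/code&gt; with any HTTP client once the server is running):&lt;/p&gt;

```python
import json

def rpc_request(method: str, params=None, req_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request like the ones Bifrost sends to the server."""
    body = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        body["params"] = params
    return json.dumps(body)

# List the server's tools, then call one:
list_req = rpc_request("tools/list")
call_req = rpc_request("tools/call", {"name": "calculate",
                                      "arguments": {"operation": "add", "a": 2, "b": 3}})

decoded = json.loads(call_req)
```

&lt;p&gt;If the &lt;code&gt;tools/list&lt;/code&gt; response comes back with your two tool definitions, the server is ready for the gateway.&lt;/p&gt;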

&lt;h3&gt;
  
  
  Step 3: Configure Model Providers and Connect to Bifrost
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Setting Up Model Providers
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs16176xtk78e6xwxr1rp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs16176xtk78e6xwxr1rp.png" alt="Bifrost UI" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the Bifrost UI at &lt;code&gt;localhost:8080&lt;/code&gt;, navigate to &lt;strong&gt;Model Providers&lt;/strong&gt; in the left sidebar. You'll see a comprehensive list of supported providers including OpenAI, Anthropic, Google, AWS Bedrock, Azure, and many others.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fext19hjby77arx632h7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fext19hjby77arx632h7w.png" alt="Model provider UI" width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click on &lt;strong&gt;OpenAI&lt;/strong&gt; from the list, then click &lt;strong&gt;"+ Add new key"&lt;/strong&gt; in the top-right corner.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmsd2kdus8yg57cwp04ow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmsd2kdus8yg57cwp04ow.png" alt="Model Providers" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Fill in the key configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name&lt;/strong&gt;: Give it a descriptive name like "Production Key"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Key&lt;/strong&gt;: Enter your actual API key (e.g., &lt;code&gt;sk-proj-...&lt;/code&gt;) or use an environment variable like &lt;code&gt;env.OPENAI_KEY&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Models&lt;/strong&gt;: Click to select which models this key can access (e.g., &lt;code&gt;gpt-4o&lt;/code&gt;, &lt;code&gt;gpt-4o-mini&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weight&lt;/strong&gt;: Set to &lt;code&gt;1&lt;/code&gt; for load balancing (higher weights receive proportionally more traffic)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use for Batch APIs&lt;/strong&gt;: Toggle this on if you want to use this key for batch operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Click &lt;strong&gt;Save&lt;/strong&gt; to add the key. You'll see it appear in your configured keys list with its weight and enabled status.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; For production setups, add multiple API keys for the same provider. Bifrost automatically distributes requests across them to avoid rate limits. You can also add keys from different providers (e.g., OpenAI and Google) for automatic failover.&lt;/p&gt;

&lt;h4&gt;
  
  
  Connecting Your MCP Server
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiz87yvrikvn4z0bzazli.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiz87yvrikvn4z0bzazli.png" alt="MCP server" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now go to &lt;strong&gt;MCP Gateway&lt;/strong&gt; in the left sidebar and click &lt;strong&gt;"New MCP Server"&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;Configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name&lt;/strong&gt;: &lt;code&gt;localmcp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection Type&lt;/strong&gt;: &lt;code&gt;HTTP (Streamable)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection URL&lt;/strong&gt;: &lt;code&gt;http://localhost:5000/sse&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ping Available for Health Check&lt;/strong&gt;: Enable this&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bifrost immediately connects, discovers your tools, and shows them in "Available Tools."&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Use It
&lt;/h3&gt;

&lt;p&gt;Here's a Python client that uses everything together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;BIFROST_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask_ai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;👤 You: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Send to AI via Bifrost
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BIFROST_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/v1/chat/completions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;assistant_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;choices&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Handle tool calls
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_calls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔧 AI is using &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tool_calls&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; tools...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_calls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="c1"&gt;# Bifrost executes the tool on your MCP server
&lt;/span&gt;            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BIFROST_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/v1/mcp/tool/execute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_call&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

            &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_call_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="c1"&gt;# Get final response
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BIFROST_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/v1/chat/completions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;assistant_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;choices&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🤖 AI: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;assistant_msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;

&lt;span class="c1"&gt;# Try it
&lt;/span&gt;&lt;span class="nf"&gt;ask_ai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tell me a programming joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;ask_ai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is 25 times 4?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
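&lt;p&gt;One caveat: the client above assumes every HTTP call succeeds. In production you would want at least a timeout and a basic retry around the Bifrost calls. A minimal sketch (the retry count and backoff values are arbitrary):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import time
import requests

def post_with_retry(url, payload, retries=3, timeout=30):
    # Retry transient HTTP failures with exponential backoff;
    # re-raise on the final attempt so callers still see hard errors.
    for attempt in range(retries):
        try:
            resp = requests.post(url, json=payload, timeout=timeout)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            if attempt == retries - 1:
                raise
            time.sleep(2 ** attempt)  # 1s, 2s, ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
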



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffoxx8438zteefyrx54dx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffoxx8438zteefyrx54dx.png" alt="Agent output" width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What Just Happened?
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Your script sends "What is 25 times 4?" to Bifrost&lt;/li&gt;
&lt;li&gt;Bifrost adds your MCP tools to the AI's context&lt;/li&gt;
&lt;li&gt;GPT-4o decides to use the &lt;code&gt;calculate&lt;/code&gt; tool&lt;/li&gt;
&lt;li&gt;Your script calls Bifrost's tool execution endpoint&lt;/li&gt;
&lt;li&gt;Bifrost sends a JSON-RPC request to your Flask server&lt;/li&gt;
&lt;li&gt;Your server calculates 25 × 4 = 100 and returns it&lt;/li&gt;
&lt;li&gt;The result goes back to GPT-4o&lt;/li&gt;
&lt;li&gt;GPT-4o responds: "25 times 4 equals 100"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The beautiful part? Clean separation of concerns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your client doesn't know MCP protocol details&lt;/li&gt;
&lt;li&gt;Bifrost handles all MCP communication&lt;/li&gt;
&lt;li&gt;The AI doesn't know your server implementation&lt;/li&gt;
&lt;li&gt;Your MCP server doesn't know which AI is calling it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the power of standardization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Matters
&lt;/h2&gt;

&lt;p&gt;In April 2025, researchers identified MCP security issues: prompt injection, permission combinations that could exfiltrate data, and lookalike tools that impersonate trusted ones.&lt;/p&gt;

&lt;p&gt;Bifrost addresses this with a "suggest, don't execute" model by default. When an AI proposes a tool call, nothing runs automatically. Your code reviews and approves each execution. You get full audit trails for compliance.&lt;/p&gt;

&lt;p&gt;You can configure Agent Mode for specific tools. Safe operations like reading files can auto-execute, while destructive operations require approval.&lt;/p&gt;
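&lt;p&gt;From the client's side, "suggest, don't execute" comes down to a gate between the AI's proposed tool calls and the execution endpoint. A minimal sketch of such a gate (the allowlist and tool names are hypothetical, not part of Bifrost's API):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Hypothetical allowlist: read-only tools may auto-execute,
# everything else waits for an explicit yes.
AUTO_APPROVED = {"calculate", "read_file"}

def approve_tool_call(tool_call, ask=input):
    name = tool_call["function"]["name"]
    if name in AUTO_APPROVED:
        return True
    # Destructive or unknown tools fall back to human review
    answer = ask(f"Allow tool '{name}'? [y/N] ")
    return answer.strip().lower() == "y"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You would call this before hitting the tool execution endpoint, and skip (or log) any call that isn't approved.&lt;/p&gt;
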

&lt;p&gt;For scenarios with many MCP servers (three or more), you can enable Code Mode, which tells Bifrost to expose four meta-tools instead of injecting every tool definition into the model's context directly. That keeps token usage down as your tool count grows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters Now
&lt;/h2&gt;

&lt;p&gt;If you're building AI systems without MCP integration in 2026, you're solving yesterday's problems. The standardization is here. The ecosystem is mature. The question isn't whether to adopt MCP, but how quickly.&lt;/p&gt;

&lt;p&gt;Bifrost makes adoption straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Setup takes less than a minute&lt;/li&gt;
&lt;li&gt;Web UI makes configuration visual&lt;/li&gt;
&lt;li&gt;Open-source means you can examine and customize&lt;/li&gt;
&lt;li&gt;Native support for multiple connection types&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is infrastructure that matters. Not because it's flashy, but because it solves real problems every organization faces when building AI systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Get started with Bifrost:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://git.new/bifrost" rel="noopener noreferrer"&gt;https://git.new/bifrost&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Documentation: &lt;a href="https://docs.getbifrost.ai" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Quick Start: &lt;a href="https://docs.getbifrost.ai/quickstart/gateway/setting-up" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai/quickstart/gateway/setting-up&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Code Mode: &lt;a href="https://docs.getbifrost.ai/mcp/code-mode" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai/mcp/code-mode&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Agent Mode: &lt;a href="https://docs.getbifrost.ai/mcp/agent-mode" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai/mcp/agent-mode&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;MCP Overview: &lt;a href="https://docs.getbifrost.ai/mcp/overview" rel="noopener noreferrer"&gt;https://docs.getbifrost.ai/mcp/overview&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>Top 5 LLM Gateways in 2026: A Deep-Dive Comparison for Production Teams</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Thu, 22 Jan 2026 01:41:56 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/top-5-llm-gateways-in-2026-a-deep-dive-comparison-for-production-teams-34d2</link>
      <guid>https://forem.com/varshithvhegde/top-5-llm-gateways-in-2026-a-deep-dive-comparison-for-production-teams-34d2</guid>
      <description>&lt;p&gt;I spent the last few weeks researching LLM gateway solutions for production teams. Here's what I found after testing five different options, talking to engineering teams running them at scale, and breaking things in my staging environment.&lt;/p&gt;

&lt;p&gt;I didn't test every edge case. We focused on REST APIs with streaming responses and didn't test batch processing extensively, and our traffic patterns might differ from yours. But here's what I learned.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Production Teams Need LLM Gateways
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkpcwz6l0dqag77jkxgw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkpcwz6l0dqag77jkxgw.png" alt="LLM Gateway" width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's what happened when we didn't use one:&lt;/p&gt;

&lt;p&gt;Our application relied solely on OpenAI. When OpenAI had an outage last month, our entire product went down with it, leaving customers stuck waiting for support.&lt;/p&gt;

&lt;p&gt;Then there's cost. We were using GPT-4 for simple tasks that Claude Haiku could handle for one-tenth the price. One weekend of refactoring our routing logic saved us $3,000 per month.&lt;/p&gt;
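&lt;p&gt;That refactor was essentially a routing function: send cheap, simple tasks to a small model and reserve the expensive one for work that needs it. A toy version of the decision (the heuristic and model ids are illustrative; real routing uses better signals than prompt length):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def pick_model(prompt, needs_reasoning=False):
    # Crude heuristic: long prompts or explicit reasoning needs get the
    # expensive model; everything else goes to the cheap one.
    if needs_reasoning or len(prompt) > 2000:
        return "openai/gpt-4o"
    return "anthropic/claude-haiku"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
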

&lt;p&gt;But managing multiple providers yourself creates its own problems. You end up writing custom code for each API, normalizing their different error formats, managing API keys, building retry logic from scratch, and spending hours debugging why Anthropic's rate limit response looks different from OpenAI's.&lt;/p&gt;

&lt;p&gt;LLM gateways solve this. One API for all providers. Automatic fallbacks. Cost tracking that works. And your application won't crash because one provider is having issues.&lt;/p&gt;
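&lt;p&gt;From the outside, automatic fallback looks roughly like this: try providers in order until one answers. A simplified sketch (the provider call is stubbed out; a real gateway also handles rate limits, streaming, and health checks):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def call_with_fallback(providers, prompt, call_provider):
    # Try providers in order; a real gateway would also distinguish
    # retryable errors (rate limits, timeouts) from hard failures.
    errors = {}
    for provider in providers:
        try:
            return call_provider(provider, prompt)
        except Exception as exc:
            errors[provider] = exc
    raise RuntimeError(f"all providers failed: {list(errors)}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
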

&lt;p&gt;Here are the five gateways that impressed me.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Bifrost (by Maxim AI)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vz8eu5bal1yc64hwzqn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vz8eu5bal1yc64hwzqn.png" alt="Bifrost" width="800" height="414"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: A high-performance LLM gateway built in Go. It's designed for speed and reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: Customer-facing applications where latency matters. Real-time chat, high-traffic APIs, anything where users will notice if responses are slow.&lt;/p&gt;

&lt;p&gt;The performance numbers caught my attention first. In our synthetic load tests, Bifrost added about 11 microseconds of latency at 5,000 requests per second. When I ran the same test with LiteLLM (which is Python-based), it added around 50 microseconds.&lt;/p&gt;

&lt;p&gt;What really sold me was the P99 latency test. At 1,000 concurrent users, LiteLLM's slowest responses hit 28 seconds. Bifrost stayed under 50 milliseconds. If you're building a chatbot, that's the difference between users staying on your application and immediately leaving.&lt;/p&gt;

&lt;p&gt;Now, I didn't test this with burst traffic or serverless deployments - our setup is traditional Kubernetes. Your results might differ depending on your infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What makes it different&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;Smart load balancing that actually works. Bifrost was the first gateway I found that automatically routes requests based on real-time performance. It monitors which providers are healthy, routes around failures, and prevents you from hitting rate limits. Most gateways claim to do this, but Bifrost's implementation is noticeably better.&lt;/p&gt;

&lt;p&gt;It also has cluster mode built in, so you can run multiple instances without complicated setup. And here's what surprised me - it includes SSO, audit logs, team budgets, and role-based access control without adding latency. Most gateways make you choose between features and speed. Bifrost somehow does both.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setup&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @maximhq/bifrost

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In 30 seconds you have a gateway running with a web UI. Since it uses OpenAI's API format, integrating it is just changing your base URL. I had our staging environment switched over in under 10 minutes.&lt;/p&gt;

&lt;p&gt;Bifrost covers all the major providers - OpenAI, Anthropic, Google Vertex AI, AWS Bedrock, Azure OpenAI, Cohere, Mistral, Groq, Together AI, and Replicate. Plus they added support for any OpenAI-compatible endpoint, which means you can actually use custom or self-hosted models too.&lt;/p&gt;

&lt;p&gt;For most production use cases, you're using one of these major providers anyway. LiteLLM does have broader coverage and a more mature open-source community - they've been around longer with more contributors and community support. If that ecosystem and maximum provider choice matters more to you than raw performance, LiteLLM is a solid pick. But for our needs, Bifrost's speed and provider coverage were enough.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why we chose it&lt;/strong&gt;: For our use case (high-scale, customer-facing chat), the 11 microsecond overhead was too good to pass up. The enterprise features were a bonus we didn't expect at this performance level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;: Open-source and free to self-host. Enterprise support is available.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. LiteLLM
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat7humnt3ehpd4ilipql.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat7humnt3ehpd4ilipql.png" alt="LiteLLM" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is probably the most popular open-source LLM gateway. Python-based, with both an SDK and proxy server.&lt;/p&gt;

&lt;p&gt;If you're in a Python environment or need access to niche models, this is the default choice. The provider coverage is unmatched - over 100 providers including all the major ones (OpenAI, Anthropic, Google, Azure, AWS) plus specialized options like HuggingFace, Ollama, Replicate, Anyscale, and Perplexity.&lt;/p&gt;

&lt;p&gt;For Python developers, setup is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;litellm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Switch to Claude without changing code
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-4-sonnet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configuration uses YAML. The documentation is thorough, and there's a strong community.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it breaks down&lt;/strong&gt;: Performance at scale. LiteLLM is written in Python using FastAPI. At low to moderate traffic, it performs well. But in our load tests, the limitations showed clearly.&lt;/p&gt;

&lt;p&gt;At 500 requests per second, P99 latency hit 28 seconds. At 1,000 requests per second, it crashed - ran out of memory and started failing requests. The Python GIL and async overhead become real bottlenecks when handling thousands of concurrent requests.&lt;/p&gt;

&lt;p&gt;I saw this in our staging environment. At 200 requests per second, everything ran smoothly. When I simulated higher traffic (around 2,000 requests per second), LiteLLM started timing out. Memory usage increased to over 8GB, and we got cascading failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use it&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Development and testing environments&lt;/li&gt;
&lt;li&gt;Prototyping and trying different models&lt;/li&gt;
&lt;li&gt;Internal tools with moderate traffic (under 500 RPS)&lt;/li&gt;
&lt;li&gt;When you need access to 100+ providers&lt;/li&gt;
&lt;li&gt;Python-first teams where ecosystem fit matters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to avoid it&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer-facing applications at scale&lt;/li&gt;
&lt;li&gt;Real-time features where every millisecond counts&lt;/li&gt;
&lt;li&gt;Production workloads requiring 99.9%+ uptime&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The ecosystem is mature with active development, but if you're planning to handle thousands of requests per second in production, you'll likely hit performance issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;: Fully open-source and free. You pay for hosting it yourself.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Portkey
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjqckdxeyshj78dshmap.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjqckdxeyshj78dshmap.png" alt="PortKey" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Portkey is more than just a gateway - it's a full AI control plane with routing, observability, guardrails, and governance.&lt;/p&gt;

&lt;p&gt;The observability depth is what sets it apart. Every request gets full traces showing you which user made the call, which models were tried, why they failed, which fallback was used, how long each step took, and the exact cost. This isn't just logging - it's distributed tracing for AI.&lt;/p&gt;

&lt;p&gt;When our staging environment started using too many tokens, Portkey's traces showed us exactly which user and which prompt was causing it. That level of detail is valuable when debugging production issues.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;portkey_ai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Portkey&lt;/span&gt;

&lt;span class="n"&gt;portkey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Portkey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-portkey-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;virtual_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-provider-virtual-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;portkey&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Enterprise features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PII detection, content filtering, prompt injection detection&lt;/li&gt;
&lt;li&gt;SOC 2, HIPAA, GDPR compliance with full audit trails&lt;/li&gt;
&lt;li&gt;SSO/SAML, team permissions, role-based access&lt;/li&gt;
&lt;li&gt;Data residency controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;According to their team, they handle over 10 billion requests monthly with 99.9999% uptime. I couldn't independently verify this, but the platform felt stable during our testing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The tradeoff&lt;/strong&gt;: I measured latency overhead of 20-40 milliseconds when using advanced features like guardrails and detailed tracing. For a small team that just needs basic routing, Portkey is probably more than necessary. The learning curve is also steeper than simpler gateways.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why we didn't choose it&lt;/strong&gt;: For our use case, the added latency and complexity weren't worth the governance features we didn't need yet. But I talked to a healthcare company using Portkey specifically for PII detection. Every LLM request gets scanned for protected health information, logged with full audit trails, and only routed to HIPAA-compliant providers. For them, the compliance features justified the cost.&lt;/p&gt;

&lt;p&gt;If you're in a regulated industry or managing AI across multiple teams with governance requirements, Portkey's observability is among the best available.&lt;/p&gt;
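Portkey's routing behavior is driven by a JSON config object attached to the client. Here's a minimal sketch of a fallback config; the virtual keys are placeholders and the exact schema should be checked against Portkey's docs:

```python
# Sketch of a Portkey fallback config. The virtual keys are placeholders,
# and the schema should be verified against Portkey's documentation.
import json

fallback_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "openai-virtual-key"},     # tried first
        {"virtual_key": "anthropic-virtual-key"},  # used if the first fails
    ],
}

# The config is typically passed to the client as JSON, e.g.:
# portkey = Portkey(api_key="...", config=json.dumps(fallback_config))
print(json.dumps(fallback_config, indent=2))
```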

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;: Free tier for development | Starts at $49/month | Enterprise custom pricing&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Kong AI Gateway
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwj4rop15l4h17f123cm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwj4rop15l4h17f123cm.png" alt="Kong AI" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Kong AI Gateway is Kong's API gateway with AI-specific features layered on top. If you're already using Kong, it's worth a look.&lt;/p&gt;

&lt;p&gt;Kong brings years of API gateway experience to LLM routing - authentication, rate limiting, security, and observability at large scale. All the infrastructure pieces that matter when running production workloads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install AI Proxy plugin&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8001/services/ai-service/plugins &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"name=ai-proxy"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"config.route_type=llm/v1/chat"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"config.auth.header_name=Authorization"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"config.model.provider=openai"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"config.model.name=gpt-4"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AI-specific capabilities&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unified API across OpenAI, Anthropic, AWS Bedrock, Azure AI, Google Vertex&lt;/li&gt;
&lt;li&gt;RAG pipelines built in&lt;/li&gt;
&lt;li&gt;PII removal across 12 languages&lt;/li&gt;
&lt;li&gt;Content filtering and safety controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Where this makes sense&lt;/strong&gt;: You're already using Kong for API management. That's the primary reason to choose this. The integration with existing Kong infrastructure is seamless, and you get unified observability across all your APIs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it doesn't&lt;/strong&gt;: If you're not already on Kong, the learning curve is significant. It's built for large enterprises, not small teams needing quick deployment. We evaluated this briefly but decided it was more complexity than we needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;: Available through Kong Konnect (managed) or self-hosted | Enterprise custom pricing&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Helicone AI Gateway
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwluintjiw362u6ff363.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwluintjiw362u6ff363.png" alt="HeliconeAI" width="800" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Helicone started as an observability platform and recently launched a Rust-based gateway. It's lightweight and fast.&lt;/p&gt;

&lt;p&gt;Built in Rust, Helicone achieves around 8ms P50 latency with sub-5ms overhead even under load, based on what their team shared with me. The gateway ships as a single 15MB binary that runs anywhere.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run with npx&lt;/span&gt;
npx @helicone/ai-gateway

&lt;span class="c"&gt;# Or with Docker&lt;/span&gt;
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 8787:8787 helicone/ai-gateway

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
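Once the gateway is running locally, any OpenAI-compatible client can point at it. Here's a hedged sketch that only builds the request: the port matches the Docker command above, but the route path and the provider-prefixed model name are assumptions worth checking against Helicone's docs:

```python
# Sketch: build an OpenAI-style chat request aimed at a locally running
# Helicone AI Gateway. The /ai route and the model naming convention are
# assumptions; verify them against Helicone's documentation.
import json
import urllib.request

GATEWAY_URL = "http://localhost:8787/ai/chat/completions"  # assumed route

payload = {
    "model": "openai/gpt-4o-mini",  # provider-prefixed name (assumed)
    "messages": [{"role": "user", "content": "Hello"}],
}

request = urllib.request.Request(
    GATEWAY_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would send it once the gateway is up.
print(request.full_url)
```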



&lt;p&gt;The observability is their core strength - request-level tracing, user tracking, cost forecasting, performance analytics, and real-time alerts. It's as comprehensive as Portkey's but with less complexity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flexible deployment&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cloud-hosted (managed service)&lt;/li&gt;
&lt;li&gt;Self-hosted (full control)&lt;/li&gt;
&lt;li&gt;Hybrid (self-host gateway, use cloud observability)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The consideration&lt;/strong&gt;: The gateway is newer (launched mid-2024). Core routing is solid, but some advanced enterprise features are still developing. For most teams this isn't a problem, but large enterprises might want to validate specific requirements first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gateway: Open-source and free to self-host&lt;/li&gt;
&lt;li&gt;Observability: Starts free, then $20/month for 100,000 requests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The separation is smart - you can self-host for free and only pay for observability if you want it.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Choose
&lt;/h2&gt;

&lt;p&gt;After evaluating these gateways, here's what I learned:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Bifrost if&lt;/strong&gt;: Performance is critical. You're handling 5,000+ requests per second, serving customer-facing features, or building real-time applications where latency matters. The 11-microsecond overhead is hard to beat.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose LiteLLM if&lt;/strong&gt;: You're in a Python environment with moderate traffic (under 500 RPS). The provider coverage is unmatched - over 100 models including specialized ones. Great for development, prototyping, and internal tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Portkey if&lt;/strong&gt;: You're in a regulated industry needing compliance controls (HIPAA, SOC 2) or managing AI across multiple teams. The observability and governance features are excellent, but you'll pay for it in latency (20-40ms overhead).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Kong if&lt;/strong&gt;: You're already using Kong for API management. Otherwise, the learning curve probably isn't worth it unless you're a large enterprise needing infrastructure-level control.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Helicone if&lt;/strong&gt;: You want performance and observability without enterprise complexity. Good for teams with data residency requirements who want self-hosted infrastructure with cloud monitoring.&lt;/p&gt;
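If it helps, the criteria above can be condensed into a (only half-serious) decision helper. The thresholds mirror this post's rough numbers, not any vendor's official sizing guidance:

```python
def choose_gateway(rps: int, regulated: bool, uses_kong: bool,
                   python_shop: bool) -> str:
    """Toy decision helper mirroring the criteria above.

    The thresholds (5,000 RPS, 500 RPS) come from this post's rough
    numbers, not from any vendor's official sizing guidance.
    """
    if uses_kong:
        return "Kong AI Gateway"  # leverage existing Kong infrastructure
    if regulated:
        return "Portkey"          # compliance and governance features
    if rps >= 5000:
        return "Bifrost"          # lowest overhead at high throughput
    if python_shop and rps < 500:
        return "LiteLLM"          # widest provider coverage for Python
    return "Helicone"             # balanced performance + observability


print(choose_gateway(rps=100, regulated=False, uses_kong=False,
                     python_shop=True))
```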




&lt;h2&gt;
  
  
  Questions?
&lt;/h2&gt;

&lt;p&gt;Have you deployed LLM gateways in production? What did you choose and why? What surprised you?&lt;/p&gt;

&lt;p&gt;Still evaluating options? I can help with specific questions about performance, integration, or cost modeling at your scale. Leave a comment below.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Daily Echo - Your Life in Motion 🎥</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Sat, 03 Jan 2026 16:23:33 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/daily-echo-your-life-in-motion-2938</link>
      <guid>https://forem.com/varshithvhegde/daily-echo-your-life-in-motion-2938</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/mux-2025-12-03"&gt;DEV's Worldwide Show and Tell Challenge Presented by Mux&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Daily Echo is a private video journaling app where you record 1-minute daily video diaries. It's like having a conversation with your future self. The app helps you track your mood, reflect on your experiences, and create a visual archive of your life that you can revisit anytime.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Pitch Video
&lt;/h2&gt;

&lt;p&gt;

&lt;iframe src="https://player.mux.com/01jDdJgPd01TNb2NfuOkBkPc027sNGvuzokCpE7g01Qcb4c" width="710" height="399"&gt;
&lt;/iframe&gt;



&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Live App&lt;/strong&gt;: &lt;a href="https://dailyecho.varshithvhegde.in/" rel="noopener noreferrer"&gt;https://dailyecho.varshithvhegde.in/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Varshithvhegde/dailyecho" rel="noopener noreferrer"&gt;https://github.com/Varshithvhegde/dailyecho&lt;/a&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Varshithvhegde" rel="noopener noreferrer"&gt;
        Varshithvhegde
      &lt;/a&gt; / &lt;a href="https://github.com/Varshithvhegde/dailyecho" rel="noopener noreferrer"&gt;
        dailyecho
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      DailyEcho - A beautiful, private video journaling app that lets you record daily video diary entries, track your mood over time, and relive your memories through immersive story modes and interactive visual walls.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;🎥 Daily Echo&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;A beautiful, private video journaling app that lets you record daily video diary entries, track your mood over time, and relive your memories through immersive story modes and interactive visual walls.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;✨ Features&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;🎬 Immersive Story Modes (New!)&lt;/h3&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Memory Stories&lt;/strong&gt; - Watch your entries in a sequential, story-like format similar to social media.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-Curated Playlists&lt;/strong&gt; - Choose from "Recent Moments", "Moments of Joy" (happy/excited/grateful), or "Flashback" (random picks from the past).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smooth Navigation&lt;/strong&gt; - Interactive progress bars, auto-advance, and gesture/keyboard support.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;🧱 Echo Wall (Mosaic Mode)&lt;/h3&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Living Visual History&lt;/strong&gt; - A dynamic masonry grid of your life in motion.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Living Video Tiles&lt;/strong&gt; - Each tile plays a Mux-generated animated GIF preview simultaneously for a "Harry Potter" newspaper effect.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive Previews&lt;/strong&gt; - Retro CRT scanline overlays and cinematic hover effects.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;📹 Video Recording &amp;amp; Playback&lt;/h3&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mux-powered streaming&lt;/strong&gt; - Professional-grade video processing and playback with adaptive streaming.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mux GIFs&lt;/strong&gt;…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Varshithvhegde/dailyecho" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Testing Credentials:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Email: &lt;code&gt;test@gmail.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Password: &lt;code&gt;devtest&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Detailed Explanation
&lt;/h2&gt;

&lt;p&gt;

&lt;iframe src="https://player.mux.com/CCj4qM26bpO6r6Zlx37CFqh01dDZNAaYmt9FaJXDPkEY" width="710" height="399"&gt;
&lt;/iframe&gt;



&lt;/p&gt;

&lt;h2&gt;
  
  
  The Story Behind It
&lt;/h2&gt;

&lt;p&gt;I have a terrible memory. Seriously. Ask me what I did last Tuesday and I'll draw a blank. But I've always been fascinated by the idea of looking back at my life, especially when the end of the year rolls around and everyone's doing their "year in review" thing.&lt;/p&gt;

&lt;p&gt;I wanted to create something that would help me remember the small moments - not just the big events, but the everyday stuff. What was I thinking about on a random Wednesday in March? How did I feel when that thing happened at work? What was going through my mind during that phase of my life?&lt;/p&gt;

&lt;p&gt;The idea was simple: record a 1-minute video every day. Just sit down, talk to the camera like you're talking to a friend, and capture whatever's on your mind. But I didn't want it to feel like a chore. I wanted it to be something I'd actually look forward to doing.&lt;/p&gt;

&lt;p&gt;So I built Daily Echo with features that make revisiting your memories feel magical. The Echo Wall shows all your entries as living video tiles playing simultaneously (like those moving newspapers in Harry Potter). Memory Stories let you watch your entries in sequence, almost like watching a documentary about your own life. And the Time Capsule feature shows you what you were up to exactly one month or one year ago.&lt;/p&gt;

&lt;p&gt;It's been incredibly powerful for me personally. There's something about being able to go back and watch yourself from months ago, seeing how you've grown or changed, or just remembering moments you'd completely forgotten.&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Highlights
&lt;/h2&gt;

&lt;p&gt;Daily Echo is built with React 18, TypeScript, and Vite on the frontend, with Tailwind CSS and shadcn/ui for the design. The backend runs on Supabase, handling PostgreSQL database, authentication, and edge functions.&lt;/p&gt;

&lt;p&gt;What makes the app special technically:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Living Video Previews Everywhere&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every entry card in the timeline shows an animated GIF preview that plays automatically. When you hover over the Echo Wall (our mosaic view), you see all your memories playing at once. It creates this incredible "living history" effect that static thumbnails just can't match.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. AI-Powered Insights&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using OpenAI's GPT-4o-mini, the app automatically analyzes your video transcripts to generate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Two-sentence summaries of each entry&lt;/li&gt;
&lt;li&gt;Emotional sentiment detection&lt;/li&gt;
&lt;li&gt;Personalized daily advice based on what you talked about&lt;/li&gt;
&lt;li&gt;Mood tracking over time&lt;/li&gt;
&lt;/ul&gt;
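A hedged sketch of how a transcript-analysis request like this might be assembled - the prompt wording here is illustrative, not Daily Echo's actual prompt:

```python
def build_insight_messages(transcript: str) -> list:
    """Assemble a chat request for transcript analysis.

    Illustrative only: the prompt wording is a guess at the kind of
    request Daily Echo makes, not the app's actual prompt.
    """
    return [
        {
            "role": "system",
            "content": (
                "You analyze a one-minute video diary transcript. "
                "Reply with a two-sentence summary, the dominant "
                "emotion, and one piece of personalized advice."
            ),
        },
        {"role": "user", "content": transcript},
    ]


# These messages would then be sent to gpt-4o-mini via the chat
# completions API.
messages = build_insight_messages("Today I finally fixed that bug...")
print(messages[0]["role"])
```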

&lt;p&gt;&lt;strong&gt;3. Immersive Story Modes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can watch your entries in different ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Recent Moments&lt;/strong&gt;: Your latest recordings in sequence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Moments of Joy&lt;/strong&gt;: Auto-curated playlist of happy entries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flashback&lt;/strong&gt;: Random picks from your past&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each mode has interactive progress bars, auto-advance, and keyboard controls for a cinematic experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Gamification That Actually Matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Achievement badges like "Zen Master" (recorded before 6 AM), "Night Owl" (recorded after 10 PM), and "Weekend Warrior" (weekend recordings) make the habit more engaging. You can track your recording streaks and see your mood variety over time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use of Mux
&lt;/h3&gt;

&lt;p&gt;Mux is the heart and soul of Daily Echo. Here's how I'm using it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Professional Video Infrastructure&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you record a video, it goes through Mux's direct upload API. No dealing with complicated encoding pipelines or storage headaches. Mux handles everything: transcoding, optimization, and adaptive streaming. The result? Your videos play smoothly on any device, any connection speed.&lt;/p&gt;
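Creating a direct upload boils down to one authenticated POST. Here's a minimal sketch of the request body; the playback policy and CORS origin are example values, and the full option list lives in Mux's uploads API docs:

```python
# Sketch: the request body for creating a Mux direct upload. The values
# below (playback policy, cors_origin) are example settings; consult
# Mux's uploads API docs for the full option list.
import json

upload_request = {
    "new_asset_settings": {
        "playback_policy": ["public"],  # or "signed" for private videos
    },
    "cors_origin": "https://dailyecho.varshithvhegde.in",
}

# This body would be POSTed to https://api.mux.com/video/v1/uploads
# with HTTP basic auth (Mux access token ID and secret). The response
# includes a one-time upload URL the browser can PUT the file to.
print(json.dumps(upload_request))
```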

&lt;p&gt;&lt;strong&gt;2. Automatic Transcription&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This was a game-changer. By enabling Mux's transcription feature during upload, I get accurate text transcripts of every video entry automatically. These transcripts power the AI analysis, search functionality, and accessibility features. I didn't have to integrate a separate transcription service or worry about accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Animated GIF Previews&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of static thumbnails, every entry shows a living preview using Mux's GIF generation API. You can watch all your memories playing simultaneously in the Echo Wall view. It's like having a magical photo album where every picture moves. Mux generates these GIFs automatically from your video without any extra work on my end.&lt;/p&gt;
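The preview URLs themselves are just the playback ID plus a few query parameters on Mux's image service; a small helper along these lines builds them (the start/end/width values here are arbitrary examples):

```python
from urllib.parse import urlencode


def animated_preview_url(playback_id: str, start: float = 0,
                         end: float = 5, width: int = 320) -> str:
    """Build a Mux animated GIF preview URL for a playback ID.

    Mux's image service generates the GIF on the fly; start/end are in
    seconds and width is in pixels. The default values here are
    arbitrary examples.
    """
    params = urlencode({"start": start, "end": end, "width": width})
    return f"https://image.mux.com/{playback_id}/animated.gif?{params}"


print(animated_preview_url("EXAMPLE_PLAYBACK_ID"))
```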

&lt;p&gt;&lt;strong&gt;4. Reliable Streaming with Mux Player&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The integrated Mux Player component handles playback with built-in caption support. It just works - no buffering issues, no format compatibility problems, no manual quality switching needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Webhook Integration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Mux's webhook system notifies my Supabase edge function when videos are ready, when transcripts are available, and if anything goes wrong. This lets me update the UI in real-time and handle the entire video lifecycle automatically.&lt;/p&gt;
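Verifying that a webhook really came from Mux is worth doing before trusting it. A sketch of the signature check, assuming the timestamped HMAC scheme Mux documents for its Mux-Signature header (verify the exact format against their webhook docs):

```python
import hashlib
import hmac


def verify_mux_signature(body: bytes, signature_header: str,
                         secret: str) -> bool:
    """Verify a Mux webhook signature.

    Assumes Mux signs "<timestamp>.<raw body>" with HMAC-SHA256 and
    sends it as "t=<ts>,v1=<hex digest>" in the Mux-Signature header;
    check the exact scheme against Mux's webhook docs.
    """
    parts = dict(p.split("=", 1) for p in signature_header.split(","))
    signed_payload = parts["t"].encode() + b"." + body
    expected = hmac.new(secret.encode(), signed_payload,
                        hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking timing information.
    return hmac.compare_digest(expected, parts["v1"])
```

In an edge function, you'd run this on the raw request body before parsing the JSON and updating the UI.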

&lt;p&gt;The developer experience with Mux has been fantastic. The documentation is clear, the API is intuitive, and features like automatic transcription and GIF generation saved me weeks of development time. Instead of building video infrastructure, I could focus on making the journaling experience special.&lt;/p&gt;

&lt;p&gt;What really impressed me: I initially thought I'd need separate services for video hosting, transcription, and preview generation. Mux does all of this out of the box, and it scales effortlessly. When a user records their 100th video, it works just as smoothly as their first.&lt;/p&gt;




&lt;p&gt;I hope Daily Echo inspires others to start capturing their daily thoughts. Life moves fast, and our memories fade faster. Having a video archive of your own life is like having a superpower - you can literally go back in time and remember who you were and what mattered to you at any moment.&lt;/p&gt;

&lt;p&gt;Give it a try with the test credentials above, and maybe start your own daily echo habit!&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>muxchallenge</category>
      <category>showandtell</category>
      <category>video</category>
    </item>
    <item>
      <title>My 2025 wrap</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Wed, 31 Dec 2025 15:16:32 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/my-2025-wrap-ek0</link>
      <guid>https://forem.com/varshithvhegde/my-2025-wrap-ek0</guid>
      <description>&lt;p&gt;2025 was a rollercoaster for me. Looking back, I can clearly divide it into two distinct halves. One that tested me, and another that transformed me.&lt;/p&gt;

&lt;h2&gt;
  
  
  The First Half: Climbing the Corporate Ladder
&lt;/h2&gt;

&lt;p&gt;January started strong. I got promoted at work, which was amazing! Though it was mostly a position bump, I was actually leading a project as an Associate Engineer. The best part? We went from 5 days in the office to just 2 days a week. But honestly, I still chose to work from the office most days because I was so invested in the project.&lt;/p&gt;

&lt;p&gt;Then came February, the "love month," and let's just say things didn't go as planned. I hit one of the lowest points in my life.&lt;/p&gt;

&lt;p&gt;But you know what kept me going? I dove into spirituality and writing poems. These became my anchors during the tough times.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Turning Point: When Everything Changed
&lt;/h2&gt;

&lt;p&gt;And then came the moment that changed everything.&lt;/p&gt;

&lt;p&gt;I had this massive challenge at work. Our tool was taking 8 minutes just to load an MF4 (MDF file). EIGHT MINUTES. And that's before any computation! We were using the asammdf package in Python, which is good, but painfully slow.&lt;/p&gt;

&lt;p&gt;I became obsessed with solving this. I researched everything. Tried JIT compilation, which improved computation time but not the loading. Then I had this wild idea: what if I rewrote the entire package in Rust?&lt;/p&gt;

&lt;p&gt;This wasn't just a work task anymore. This was MY mission. I worked my regular job during the day and coded this project at night. I went deep into understanding how MDF files work at the byte level, implementing a custom package specifically for our project. Countless all-nighters, endless debugging sessions, but when it finally worked? Pure magic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;10 seconds.&lt;/strong&gt; That's all it took to load a 4GB file AND do the computation (which I also replaced with Rust). Only the UI remained in Python.&lt;/p&gt;

&lt;p&gt;This earned me so much respect at work. Due to NDA, I can't share the code or methods (corporate life, you know), but the fact that I pulled this off still makes me proud. This was my turning point.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reigniting the Developer Within
&lt;/h2&gt;

&lt;p&gt;Huge shoutout to Jess and Ben for their weekly "What was your win this week?" posts. Reading those comments and seeing everyone's achievements? That reignited my inner developer. I wanted to be part of that energy again.&lt;/p&gt;

&lt;p&gt;I restarted my dev.to journey, but I was rusty. I couldn't figure out what to write about.&lt;/p&gt;

&lt;p&gt;Then I discovered &lt;strong&gt;DEV Challenges&lt;/strong&gt;, and wow, what a gold mine! I could build, showcase, learn, and enjoy all at once. Every weekend became about tackling a new challenge. This was exactly what I needed to grow and fall in love with coding all over again.&lt;/p&gt;

&lt;h2&gt;
  
  
  Discovering New Heights (Literally!)
&lt;/h2&gt;

&lt;p&gt;In the second half of the year, I found another passion. &lt;strong&gt;Trekking&lt;/strong&gt;. I climbed Asia's 1st and 2nd largest monolithic rocks! Out of 9 total treks, 8 happened in the second half alone.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2vgqmpc6omlrsica7sb.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2vgqmpc6omlrsica7sb.jpg" alt="Trekking with friends"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I also did my first solo travel to &lt;strong&gt;Hampi&lt;/strong&gt;. I know it's not far, but for me, it was a huge achievement. Plus, I met some amazing people along the way!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5r7r2awbzwofsheovsc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5r7r2awbzwofsheovsc.jpg" alt="Hampi with friends"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Projects That Keep On Giving
&lt;/h2&gt;

&lt;p&gt;Some of my older projects surprised me this year:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;

&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;a href="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;&lt;/a&gt;
      &lt;a href="https://github.com/Varshithvhegde" rel="noopener noreferrer"&gt;
        Varshithvhegde
      &lt;/a&gt; / &lt;a href="https://github.com/Varshithvhegde/FreeShare" rel="noopener noreferrer"&gt;
        FreeShare
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      FreeShare is a free online file sharing platform designed to simplify the process of sharing files without the need for any sign-up or verification.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;FreeShare: File Sharing Platform&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/Varshithvhegde/FreeShare/./public/assets/landingPage-8e480441-9785-40ca-9f26-d2b48cecc688"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2FVarshithvhegde%2FFreeShare%2F.%2Fpublic%2Fassets%2FlandingPage-8e480441-9785-40ca-9f26-d2b48cecc688" alt="thumbnail"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;🗂️ Description&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;FreeShare is a file sharing platform built with React, Firebase, and Cloud Functions. This project allows users to share files easily and efficiently. It's designed for individuals and teams who need a simple and secure way to share files.&lt;/p&gt;
&lt;p&gt;The platform provides a user-friendly interface for uploading, sharing, and managing files. With FreeShare, you can share files with others by generating a unique link, and recipients can access the files without needing to create an account.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;✨ Key Features&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;&lt;strong&gt;File Sharing&lt;/strong&gt;&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Upload and share files with others via a unique link&lt;/li&gt;
&lt;li&gt;Supports various file types, including documents, images, and videos&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;&lt;strong&gt;Security and Authentication&lt;/strong&gt;&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Secure file storage using Firebase Storage&lt;/li&gt;
&lt;li&gt;Authentication and authorization using Firebase Authentication&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;&lt;strong&gt;User Interface&lt;/strong&gt;&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Responsive and user-friendly interface built with React and Material-UI&lt;/li&gt;
&lt;li&gt;Easy navigation and file management&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;🗂️ Folder Structure&lt;/h2&gt;

&lt;/div&gt;

  &lt;div class="js-render-enrichment-target"&gt;
    &lt;div class="render-plaintext-hidden"&gt;
      &lt;pre&gt;graph TD
src--&amp;gt;components
src--&amp;gt;App.test.js;
src--&amp;gt;index.js;
src--&amp;gt;reportWebVitals.js;
src--&amp;gt;setupTests.js;
public--&amp;gt;index.html;
public--&amp;gt;manifest.json;
public--&amp;gt;robots.txt;&lt;/pre&gt;…&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Varshithvhegde/FreeShare" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
FreeShare is a free online file-sharing platform that needs no sign-up or verification. I built it 2 years ago in college when I was just starting out. I honestly thought it was dead, but when I checked Firebase recently... 10K total users! People are still using it, and my blog post about it still gets views. Sure, it has flaws, but hey, we all start somewhere.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;

&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;a href="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;&lt;/a&gt;
      &lt;a href="https://github.com/Varshithvhegde" rel="noopener noreferrer"&gt;
        Varshithvhegde
      &lt;/a&gt; / &lt;a href="https://github.com/Varshithvhegde/notepage" rel="noopener noreferrer"&gt;
        notepage
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      NotePage is a web application that allows you to easily share code, text, or any content using a unique link.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;&lt;a href="https://notepage.vercel.app" rel="nofollow noopener noreferrer"&gt;NotePage&lt;/a&gt;&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://private-user-images.githubusercontent.com/80502833/277675360-a94e6729-3305-4380-94ea-7f2ac01c81be.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NzQ2NDE5NjUsIm5iZiI6MTc3NDY0MTY2NSwicGF0aCI6Ii84MDUwMjgzMy8yNzc2NzUzNjAtYTk0ZTY3MjktMzMwNS00MzgwLTk0ZWEtN2YyYWMwMWM4MWJlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNjAzMjclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjYwMzI3VDIwMDEwNVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTAyMmQ4MTYxNDA5NDRjMTIxYmJmZWVkODc3Nzg3MjZlOWU3Y2VmMzEwNTE0YmIyNTNiZDdiMDI2NTFjMWQ5MzkmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.Rb2RNP2TOLX327786kGMDNbcapWPnv2JhV4tZY3BbfA"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fprivate-user-images.githubusercontent.com%2F80502833%2F277675360-a94e6729-3305-4380-94ea-7f2ac01c81be.png%3Fjwt%3DeyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NzQ2NDE5NjUsIm5iZiI6MTc3NDY0MTY2NSwicGF0aCI6Ii84MDUwMjgzMy8yNzc2NzUzNjAtYTk0ZTY3MjktMzMwNS00MzgwLTk0ZWEtN2YyYWMwMWM4MWJlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNjAzMjclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjYwMzI3VDIwMDEwNVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTAyMmQ4MTYxNDA5NDRjMTIxYmJmZWVkODc3Nzg3MjZlOWU3Y2VmMzEwNTE0YmIyNTNiZDdiMDI2NTFjMWQ5MzkmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.Rb2RNP2TOLX327786kGMDNbcapWPnv2JhV4tZY3BbfA" alt="frame_generic_light"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NotePage&lt;/strong&gt; is a web application that allows you to easily share code, text, or any content using a unique link. You can create new note pages by simply visiting &lt;code&gt;https://notepage.vercel.app&lt;/code&gt;.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Features&lt;/h3&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Custom Pages&lt;/strong&gt;: Create your own custom pages to share content with others. Just use &lt;code&gt;https://notepage.vercel.app/&amp;lt;your-page-name&amp;gt;&lt;/code&gt; and start sharing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Password Protection&lt;/strong&gt;: Optionally protect your pages with a password, ensuring that only authorized users can access your content.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Real-time Collaboration&lt;/strong&gt;: Collaborate with others in real-time. When multiple users access the same link, any changes made by one user are instantly visible to others, without requiring a page refresh.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Shareable Links&lt;/strong&gt;: Share your pages with others by sending them the unique link.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Tech Stack&lt;/h3&gt;

&lt;/div&gt;
&lt;p&gt;NotePage is built using the following technologies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Angular&lt;/strong&gt;: A powerful and popular front-end framework.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Firebase&lt;/strong&gt;: A real-time cloud database, authentication, and hosting platform.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Angular Material&lt;/strong&gt;: A UI component library…&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Varshithvhegde/notepage" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
This was just a side project I built while learning Angular. The UI wasn't great, and it's full of bugs. But here's the thing: my office has a strict NO ChatGPT policy (we can only use a company AI), and copying/sharing text was difficult. When a friend needed to share something, I suggested notepage, and it blew up within the office! I tried improving the UI, but people love the old version, so I kept it. Now I'm almost hitting free tier limits, but I'll keep it free because projects like these taught me so much and drove me to write for the DEV community.&lt;/p&gt;
&lt;h2&gt;
  
  
  The DEV Community Love
&lt;/h2&gt;

&lt;p&gt;And then came the end of the year. Oh wow.&lt;/p&gt;

&lt;p&gt;I finally reached &lt;strong&gt;10K followers&lt;/strong&gt;! I remember celebrating my first 1K like it was yesterday (it was actually 2 years ago).&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/varshithvhegde/achieving-1k-followers-on-devto-my-journey-to-success-201n" class="crayons-story__hidden-navigation-link"&gt;Achieving 1K Followers on dev.to: My Journey to Success&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/varshithvhegde" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F885064%2F4ab304f4-a3f3-409c-8217-9ce130e57c18.jpeg" alt="varshithvhegde profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/varshithvhegde" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Varshith V Hegde
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Varshith V Hegde
                &lt;a href="/++"&gt;&lt;img alt="Subscriber" class="subscription-icon" src="https://assets.dev.to/assets/subscription-icon-805dfa7ac7dd660f07ed8d654877270825b07a92a03841aa99a1093bd00431b2.png"&gt;&lt;/a&gt;
              
              &lt;div id="story-author-preview-content-1376964" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/varshithvhegde" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F885064%2F4ab304f4-a3f3-409c-8217-9ce130e57c18.jpeg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Varshith V Hegde&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/varshithvhegde/achieving-1k-followers-on-devto-my-journey-to-success-201n" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Feb 23 '23&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/varshithvhegde/achieving-1k-followers-on-devto-my-journey-to-success-201n" id="article-link-1376964"&gt;
          Achieving 1K Followers on dev.to: My Journey to Success
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/career"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;career&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/productivity"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;productivity&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/networking"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;networking&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/mentorship"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;mentorship&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/varshithvhegde/achieving-1k-followers-on-devto-my-journey-to-success-201n" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/fire-f60e7a582391810302117f987b22a8ef04a2fe0df7e3258a5f49332df1cec71e.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;24&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/varshithvhegde/achieving-1k-followers-on-devto-my-journey-to-success-201n#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              18&lt;span class="hidden s:inline"&gt; comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            3 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




&lt;p&gt;I was also a &lt;strong&gt;Top Weekly Author twice&lt;/strong&gt;! My DEV profile is practically part of my resume now. Whenever I'm in an interview, I proudly show it and talk about my journey. Why not? I've worked hard for this.&lt;/p&gt;

&lt;p&gt;I participated in 5 DEV Challenges and &lt;strong&gt;won 2 of them&lt;/strong&gt;. These challenges helped me grow immensely. A huge thank you to the entire DEV Team for creating such an amazing initiative!&lt;/p&gt;

&lt;h2&gt;
  
  
  Shoutouts
&lt;/h2&gt;

&lt;p&gt;Some amazing devs/writers whose content I absolutely loved this year:&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__user ltag__user__id__3226798"&gt;
    &lt;a href="/axrisi" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3226798%2F0c0a8594-658c-4146-a639-8068ede85f67.jpg" alt="axrisi image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/axrisi"&gt;Nikoloz Turazashvili (@axrisi)&lt;/a&gt;Follow
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/axrisi"&gt;Founder &amp;amp; CTO at Vexrail (www. vexrail.com), Axrisi (www.axrisi.com). Opened Chicos restaurant in Tbilisi, Georgia.&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;br&gt;


&lt;div class="ltag__user ltag__user__id__941720"&gt;
    &lt;a href="/dumebii" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F941720%2Ff316bf93-ef0b-4bc5-aee2-5e062255d5f0.jpg" alt="dumebii image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/dumebii"&gt;Dumebi Okolo&lt;/a&gt;Follow
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/dumebii"&gt;Confident technical writer with frontend developer skills, marketing skills and developer relations skills. 
I am also a very fun person to hang around with. &lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;br&gt;


&lt;div class="ltag__user ltag__user__id__965723"&gt;
    &lt;a href="/arindam_1729" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F965723%2Fe0982512-4de1-4154-b3c3-1869d19e9ecc.png" alt="arindam_1729 image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/arindam_1729"&gt;Arindam Majumder &lt;/a&gt;Follow
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/arindam_1729"&gt;Developer Advocate | Technical Writer | 600k+ Reads | Mail for Collabs&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;br&gt;


&lt;div class="ltag__user ltag__user__id__889475"&gt;
    &lt;a href="/divyasinghdev" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F889475%2F1165be61-6903-4b59-af67-c262acfb1c94.webp" alt="divyasinghdev image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/divyasinghdev"&gt;Divya&lt;/a&gt;Follow
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/divyasinghdev"&gt;A curious lifelong learner, currently a full-time Masters student persuing Computer Science stream. Enthusiastic about development.&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;




&lt;h2&gt;
  
  
  Looking Ahead to 2026
&lt;/h2&gt;

&lt;p&gt;So yeah, the second half of 2025 was my redemption arc. I really loved this year, and my resolutions are crystal clear:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get all my friends to participate in DEV Challenges&lt;/li&gt;
&lt;li&gt;Level up my skills in AI and Agentic development&lt;/li&gt;
&lt;li&gt;Travel more (maybe even an international trip!)&lt;/li&gt;
&lt;li&gt;Blog about my journey more consistently&lt;/li&gt;
&lt;li&gt;Start learning rock climbing (need to get in shape first 😅)&lt;/li&gt;
&lt;li&gt;Finally get my driver's license 😭 (It was my 2025 resolution too, but I still haven't done it!)&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;That's all for this year, folks!&lt;/p&gt;

&lt;p&gt;Thank you everyone for the support and love. Here's to an even better 2026!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Happy New Year to ALL!&lt;/strong&gt; 🎉&lt;/p&gt;

</description>
      <category>programming</category>
      <category>beginners</category>
      <category>career</category>
      <category>webdev</category>
    </item>
    <item>
      <title>I Built a Form Backend in a Weekend Because Paying $20/Month for Contact Forms is Stupid</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Tue, 30 Dec 2025 15:59:39 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/i-built-a-form-backend-in-a-weekend-because-paying-20month-for-contact-forms-is-stupid-1o34</link>
      <guid>https://forem.com/varshithvhegde/i-built-a-form-backend-in-a-weekend-because-paying-20month-for-contact-forms-is-stupid-1o34</guid>
      <description>&lt;p&gt;So here's the thing - I was helping my friend set up his portfolio site last weekend. Everything was going smooth. Nice design, fast site, Vercel hosting on the free tier. Perfect.&lt;/p&gt;

&lt;p&gt;Then he goes "I need a contact form."&lt;/p&gt;

&lt;p&gt;Cool, I say. Just use one of those form backend services. Easy.&lt;/p&gt;

&lt;p&gt;He checks the pricing. "$20 a month?! Just to save some text?"&lt;/p&gt;

&lt;p&gt;And honestly? He's right. When did we all just accept this?&lt;/p&gt;

&lt;h2&gt;
  
  
  This got me thinking
&lt;/h2&gt;

&lt;p&gt;We're paying Netflix money for what's basically a database insert and an email. That's it. Store some text, send a notification. &lt;/p&gt;

&lt;p&gt;I spent more time being annoyed about this than I'd like to admit. Then I figured - you know what, I can probably build this myself. How hard can it be?&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgbxv3sc4hq6fgdistm06.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgbxv3sc4hq6fgdistm06.png" alt="FormRelay" width="800" height="573"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://formrelay.varshithvhegde.in" rel="noopener noreferrer"&gt;FormRelay&lt;/a&gt; is pretty straightforward. You point your HTML form at it, it saves the data, sends you an email, and shows everything in a dashboard. That's the whole thing.&lt;/p&gt;

&lt;p&gt;The difference? You host it yourself. Your Supabase database. Your Vercel deployment. Your data.&lt;/p&gt;

&lt;p&gt;And the best part? Supabase's free tier gives you 50k monthly active users. Vercel's hobby plan is free. Resend gives you 3k emails free per month.&lt;/p&gt;

&lt;p&gt;So yeah. $0/month vs $20/month. You do the math.&lt;/p&gt;

&lt;h2&gt;
  
  
  "But isn't self-hosting complicated?"
&lt;/h2&gt;

&lt;p&gt;This is what everyone says. And look, 10 years ago? Sure. You needed to know server management, deal with security updates, all that stuff.&lt;/p&gt;

&lt;p&gt;But now? With Vercel and Supabase?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fork the repo&lt;/li&gt;
&lt;li&gt;Click deploy&lt;/li&gt;
&lt;li&gt;Copy paste some environment variables&lt;/li&gt;
&lt;li&gt;You're done&lt;/li&gt;
&lt;/ul&gt;
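
&lt;p&gt;For the environment-variable step, it's the usual suspects: your Supabase project keys plus whichever email provider you pick. The names below are illustrative - copy the real ones from the repo:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# hypothetical .env - check the FormRelay repo for the actual variable names
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
RESEND_API_KEY=re_xxxx  # or your SMTP credentials instead
&lt;/code&gt;&lt;/pre&gt;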

&lt;p&gt;Took me longer to write the README than it takes to deploy this thing.&lt;/p&gt;

&lt;p&gt;Compare that to creating yet another account, entering your credit card, dealing with their dashboard, hitting some arbitrary limit, and then having to migrate everything when they raise prices next year.&lt;/p&gt;

&lt;p&gt;Which one sounds more complicated?&lt;/p&gt;

&lt;h2&gt;
  
  
  Here's where I might lose some of you
&lt;/h2&gt;

&lt;p&gt;I think we've gotten too comfortable renting everything.&lt;/p&gt;

&lt;p&gt;The whole indie web thing used to be about actually owning your stuff. Yeah, it was messier. Yeah, you had to learn things. But your website was &lt;em&gt;yours&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Now? We just subscribe to everything. And sure, time is money, I get that. Not everyone wants to deal with infrastructure.&lt;/p&gt;

&lt;p&gt;But there's this huge gap between "run your own server rack" and "pay someone $300/year to store contact form entries."&lt;/p&gt;

&lt;p&gt;That's the gap I'm trying to fill here.&lt;/p&gt;

&lt;h2&gt;
  
  
  Random things I learned
&lt;/h2&gt;

&lt;p&gt;Next.js 15 is actually good now. After all the App Router drama, it finally feels right. Server actions just work. No more fighting with it.&lt;/p&gt;

&lt;p&gt;Supabase is wild. Real-time updates, auth, and actual good documentation? Sign me up.&lt;/p&gt;

&lt;p&gt;The hardest part wasn't the code. It was making the setup instructions clear enough that anyone could follow them. Spent way too long on that.&lt;/p&gt;

&lt;h2&gt;
  
  
  About the email thing
&lt;/h2&gt;

&lt;p&gt;I used Resend because my domain didn't come with email when I bought it, and I wasn't planning to buy email hosting separately. Resend's free tier (3k emails/month) was perfect for this.&lt;/p&gt;

&lt;p&gt;But here's the thing - if you already have email with your domain, you can just swap Resend for regular SMTP. It's actually simpler in some ways: plug in your SMTP credentials and you're good to go.&lt;/p&gt;

&lt;p&gt;So even that "dependency" isn't really a dependency. Use what you've got.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech stuff if you care
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Next.js 15 (app router)&lt;/li&gt;
&lt;li&gt;Supabase (postgres + realtime + auth)&lt;/li&gt;
&lt;li&gt;Tailwind CSS&lt;/li&gt;
&lt;li&gt;Radix UI&lt;/li&gt;
&lt;li&gt;Resend for emails (or just use SMTP if you have it)&lt;/li&gt;
&lt;li&gt;Lucide icons&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nothing fancy. Just stuff that works and doesn't break.&lt;/p&gt;

&lt;h2&gt;
  
  
  You can use it
&lt;/h2&gt;

&lt;p&gt;Whole thing's on GitHub: &lt;a href="https://github.com/Varshithvhegde/formrelay" rel="noopener noreferrer"&gt;github.com/Varshithvhegde/formrelay&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Live demo: &lt;a href="https://formrelay.varshithvhegde.in" rel="noopener noreferrer"&gt;formrelay.varshithvhegde.in&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's MIT licensed. Do whatever you want with it. If you find bugs, let me know. If you want to add features, PRs are open.&lt;/p&gt;

&lt;h2&gt;
  
  
  The actual point
&lt;/h2&gt;

&lt;p&gt;This isn't really about forms or saving money.&lt;/p&gt;

&lt;p&gt;It's about remembering that we can actually build stuff ourselves. We don't need a SaaS product for every little thing. The tools are there. The platforms are free. We know how to code.&lt;/p&gt;

&lt;p&gt;A lot of problems that cost $20/month are actually just weekend projects in disguise.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Quick note: Yes, I know SaaS companies provide value. Support, maintenance, features, etc. But for something as basic as form handling? Come on.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>saas</category>
      <category>webdev</category>
      <category>beginners</category>
    </item>
    <item>
      <title>If you are creating any Multi Agent or AI apps you need to check this out!!!</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Sun, 21 Dec 2025 15:47:12 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/if-you-are-creating-any-multi-agent-or-ai-apps-you-need-to-check-this-out-gd6</link>
      <guid>https://forem.com/varshithvhegde/if-you-are-creating-any-multi-agent-or-ai-apps-you-need-to-check-this-out-gd6</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763" class="crayons-story__hidden-navigation-link"&gt;Bifrost: The LLM Gateway That's 40x Faster Than LiteLLM&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/varshithvhegde" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F885064%2F4ab304f4-a3f3-409c-8217-9ce130e57c18.jpeg" alt="varshithvhegde profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/varshithvhegde" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Varshith V Hegde
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Varshith V Hegde
                &lt;a href="/++"&gt;&lt;img alt="Subscriber" class="subscription-icon" src="https://assets.dev.to/assets/subscription-icon-805dfa7ac7dd660f07ed8d654877270825b07a92a03841aa99a1093bd00431b2.png"&gt;&lt;/a&gt;
              
              &lt;div id="story-author-preview-content-3105139" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/varshithvhegde" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F885064%2F4ab304f4-a3f3-409c-8217-9ce130e57c18.jpeg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Varshith V Hegde&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Dec 18 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763" id="article-link-3105139"&gt;
          Bifrost: The LLM Gateway That's 40x Faster Than LiteLLM
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/webdev"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;webdev&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/programming"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;programming&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/agents"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;agents&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;49&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              2&lt;span class="hidden s:inline"&gt; comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            10 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>agents</category>
    </item>
    <item>
      <title>Bifrost: The LLM Gateway That's 40x Faster Than LiteLLM</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Thu, 18 Dec 2025 16:26:42 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763</link>
      <guid>https://forem.com/varshithvhegde/bifrost-the-llm-gateway-thats-40x-faster-than-litellm-1763</guid>
      <description>&lt;p&gt;&lt;em&gt;A technical deep dive into Bifrost: an open-source, self-hostable Go LLM gateway&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Gateway Overhead in Production LLM Systems
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbs45bxvrgxhffhui7hea.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbs45bxvrgxhffhui7hea.png" alt="Performance comparison chart showing Bifrost's superior latency metrics compared to traditional Python-based gateways"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In most LLM systems, the gateway becomes a shared dependency: it affects tail latency, routing/failover behavior, retries, and cost attribution across providers. LiteLLM works well as a lightweight Python proxy, but in our production-like load tests we started seeing gateway overhead and operational complexity show up at higher concurrency. We moved to Bifrost for lower overhead and for first-class features like governance, cost semantics, and observability built into the gateway.&lt;/p&gt;

&lt;p&gt;In our benchmark setup (with logging/retries enabled), LiteLLM added hundreds of microseconds of overhead per request. Results vary by deployment mode and configuration. When handling thousands of requests per second, this overhead compounds—infrastructure costs increase, tail latency suffers, and operational complexity grows.&lt;/p&gt;
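&lt;p&gt;To make that compounding concrete, here is a back-of-the-envelope sketch using the per-request figures quoted in this post (treating overhead as serialized gateway work; the constants come from our benchmark, not universal numbers):&lt;/p&gt;

```python
# Back-of-the-envelope: gateway overhead accumulated per second of
# sustained load, using the figures quoted in this article.
RPS = 5_000                  # sustained requests per second
LITELLM_OVERHEAD_S = 440e-6  # ~440 microseconds per request
BIFROST_OVERHEAD_S = 11e-6   # ~11 microseconds per request

# Seconds of gateway work per wall-clock second of traffic
litellm_busy = RPS * LITELLM_OVERHEAD_S  # 2.2 s of work per second
bifrost_busy = RPS * BIFROST_OVERHEAD_S  # 0.055 s of work per second

print(f"LiteLLM: {litellm_busy:.3f} s/s, Bifrost: {bifrost_busy:.3f} s/s")
```

&lt;p&gt;At 5,000 RPS, 440 µs per request amounts to over two seconds of gateway work accumulating every second, while 11 µs stays well under a tenth of that budget.&lt;/p&gt;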

&lt;p&gt;Bifrost takes a different approach.&lt;/p&gt;




&lt;h2&gt;
  
  
  Enter Bifrost
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F75wyxmd1vbgk3crej1ym.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F75wyxmd1vbgk3crej1ym.jpg" alt="Artistic representation of Bifrost bridge from Norse mythology, symbolizing the connection between different AI providers"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bifrost is an LLM gateway written in Go that adds approximately 11 microseconds of overhead per request in our test environment. That's roughly 40x faster than what we observed with LiteLLM in comparable configurations.&lt;/p&gt;

&lt;p&gt;But the performance improvement is just one part of the story. Bifrost rethinks the control plane for LLM infrastructure—providing governance, cost attribution, and observability as first-class gateway features rather than requiring external tooling or application-level instrumentation.&lt;/p&gt;

&lt;p&gt;Let me walk through the technical details.&lt;/p&gt;




&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/maximhq" rel="noopener noreferrer"&gt;
        maximhq
      &lt;/a&gt; / &lt;a href="https://github.com/maximhq/bifrost" rel="noopener noreferrer"&gt;
        bifrost
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Fastest enterprise AI gateway (50x faster than LiteLLM) with adaptive load balancer, cluster mode, guardrails, 1000+ models support &amp;amp; &amp;lt;100 µs overhead at 5k RPS.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Bifrost AI Gateway&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a href="https://goreportcard.com/report/github.com/maximhq/bifrost/core" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/7f7e70df9fdaaf4f485f59ca6bc0b5cbbf134d03dd5721da4e31f90f618fc304/68747470733a2f2f676f7265706f7274636172642e636f6d2f62616467652f6769746875622e636f6d2f6d6178696d68712f626966726f73742f636f7265" alt="Go Report Card"&gt;&lt;/a&gt;
&lt;a href="https://discord.gg/exN5KAydbU" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/282b7719f04b28f5959f5e1e17aee806d65f8eea3b862b57af350df0ab57be6f/68747470733a2f2f646362616467652e6c696d65732e70696e6b2f6170692f7365727665722f68747470733a2f2f646973636f72642e67672f65784e354b41796462553f7374796c653d666c6174" alt="Discord badge"&gt;&lt;/a&gt;
&lt;a href="https://snyk.io/test/github/maximhq/bifrost" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/fb28496133f724daf03a8107e56978a14f6f2ed7e7283df0573747fa46ff8f86/68747470733a2f2f736e796b2e696f2f746573742f6769746875622f6d6178696d68712f626966726f73742f62616467652e737667" alt="Known Vulnerabilities"&gt;&lt;/a&gt;
&lt;a href="https://codecov.io/gh/maximhq/bifrost" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/8bc2db302c566210d14c09b278639a3f63f07def5fc635a8869e59c996b3100f/68747470733a2f2f636f6465636f762e696f2f67682f6d6178696d68712f626966726f73742f6272616e63682f6d61696e2f67726170682f62616467652e737667" alt="codecov"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/b0899925aadfed8626116707178a4015d8cf4aaa0b80acb632cb4782c6dc7272/68747470733a2f2f696d672e736869656c64732e696f2f646f636b65722f70756c6c732f6d6178696d68712f626966726f7374"&gt;&lt;img src="https://camo.githubusercontent.com/b0899925aadfed8626116707178a4015d8cf4aaa0b80acb632cb4782c6dc7272/68747470733a2f2f696d672e736869656c64732e696f2f646f636b65722f70756c6c732f6d6178696d68712f626966726f7374" alt="Docker Pulls"&gt;&lt;/a&gt;
&lt;a href="https://app.getpostman.com/run-collection/31642484-2ba0e658-4dcd-49f4-845a-0c7ed745b916?action=collection%2Ffork&amp;amp;source=rip_markdown&amp;amp;collection-url=entityId%3D31642484-2ba0e658-4dcd-49f4-845a-0c7ed745b916%26entityType%3Dcollection%26workspaceId%3D63e853c8-9aec-477f-909c-7f02f543150e" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/82ccefddb001e2caf9d399f1153fdda561cf3da341bb270e18644d516906bc64/68747470733a2f2f72756e2e7073746d6e2e696f2f627574746f6e2e737667" alt="Run In Postman"&gt;&lt;/a&gt;
&lt;a href="https://artifacthub.io/packages/search?repo=bifrost" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/a6a3c734d6bd57fa8e1d508ac0cdba555bdbcd9191b29b32cf37a964b86b9c67/68747470733a2f2f696d672e736869656c64732e696f2f656e64706f696e743f75726c3d68747470733a2f2f61727469666163746875622e696f2f62616467652f7265706f7369746f72792f626966726f7374" alt="Artifact Hub"&gt;&lt;/a&gt;
&lt;a href="https://github.com/maximhq/bifrost/LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/3cb44c15a532770a066ba8e61bf11506ad5400e5c61d48f6b639101e442bee79/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6963656e73652f6d6178696d68712f626966726f7374" alt="License"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;The fastest way to build AI applications that never go down&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;Bifrost is a high-performance AI gateway that unifies access to 15+ providers (OpenAI, Anthropic, AWS Bedrock, Google Vertex, and more) through a single OpenAI-compatible API. Deploy in seconds with zero configuration and get automatic failover, load balancing, semantic caching, and enterprise-grade features.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Quick Start&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/maximhq/bifrost/./docs/media/getting-started.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fmaximhq%2Fbifrost%2F.%2Fdocs%2Fmedia%2Fgetting-started.png" alt="Get started"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Go from zero to production-ready AI gateway in under a minute.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Start Bifrost Gateway&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Install and run locally&lt;/span&gt;
npx -y @maximhq/bifrost

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Or use Docker&lt;/span&gt;
docker run -p 8080:8080 maximhq/bifrost&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Configure via Web UI&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Open the built-in web interface&lt;/span&gt;
open http://localhost:8080&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Make your first API call&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;curl -X POST http://localhost:8080/v1/chat/completions \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Content-Type: application/json&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -d &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;{&lt;/span&gt;
&lt;span class="pl-s"&gt;    "model": "openai/gpt-4o-mini",&lt;/span&gt;
&lt;span class="pl-s"&gt;    "messages": [{"role": "user", "content": "Hello, Bifrost!"}]&lt;/span&gt;
&lt;span class="pl-s"&gt;  }&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;That's it!&lt;/strong&gt; Your AI gateway is running with a web interface for visual configuration…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/maximhq/bifrost" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;







&lt;h2&gt;
  
  
  Setup and Deployment
&lt;/h2&gt;

&lt;p&gt;Traditional gateway deployment often involves managing Python environments, dependency chains, and configuration files. Here's the Bifrost approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @maximhq/bifrost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This single command downloads a pre-compiled binary for your platform and starts a production-ready gateway on port 8080 with a web UI for configuration.&lt;/p&gt;

&lt;p&gt;Compare this to typical Python gateway setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Python (verify version compatibility)&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;litellm
&lt;span class="c"&gt;# Configure environment variables&lt;/span&gt;
&lt;span class="c"&gt;# Set up configuration file&lt;/span&gt;
&lt;span class="c"&gt;# Install additional dependencies for features&lt;/span&gt;
&lt;span class="c"&gt;# Debug environment-specific issues&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bifrost uses NPX to download a pre-compiled binary for your platform. No Python interpreter required. No virtual environments. No dependency resolution. A single statically-linked executable that runs immediately.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Go for Gateway Infrastructure
&lt;/h2&gt;

&lt;p&gt;The choice of Go over Python has measurable impacts on production systems, particularly around concurrency, memory efficiency, and operational simplicity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Concurrency Model
&lt;/h3&gt;

&lt;p&gt;Python gateways scale via async event loops and multiple worker processes. At high concurrency, the tradeoffs show up as higher memory use per instance, worker-coordination overhead, and degraded tail latency under bursts.&lt;/p&gt;

&lt;p&gt;Go avoids these constraints. Goroutines are lightweight, runtime-scheduled threads that run in parallel across all available CPU cores. When a request arrives, Bifrost spawns a goroutine. When a thousand requests arrive simultaneously, Bifrost spawns a thousand goroutines, all running concurrently with minimal overhead.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74u8fincu1xsulx6yuer.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74u8fincu1xsulx6yuer.gif" alt="Animated visualization showing parallel goroutines processing multiple requests simultaneously versus sequential Python execution"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Memory Efficiency
&lt;/h3&gt;

&lt;p&gt;A Python process typically requires 30-50MB of memory at startup in most configurations. Add Flask or FastAPI, and baseline memory usage often reaches 100MB+ before handling any requests, though the exact figure varies with the setup and dependencies.&lt;/p&gt;

&lt;p&gt;The entire Bifrost binary is approximately 20MB. At runtime, a single Bifrost instance uses roughly 50MB under sustained load while handling thousands of requests per second.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj7d7v5sm3zfeiv2wargj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj7d7v5sm3zfeiv2wargj.png" alt="Memory usage comparison graph showing Bifrost's 10x improvement over Python-based solutions"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Startup Time
&lt;/h3&gt;

&lt;p&gt;Python applications need time to initialize: importing packages, starting the interpreter, loading configurations. Typical startup time is at least 2-3 seconds.&lt;/p&gt;

&lt;p&gt;Bifrost starts in milliseconds. This matters for autoscaling, development iteration, and serverless deployments where cold starts impact user experience.&lt;/p&gt;
&lt;h3&gt;
  
  
  Benchmark Results
&lt;/h3&gt;

&lt;p&gt;Here are measurements from a sustained load test on a t3.xlarge EC2 instance at 5,000 requests per second:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;LiteLLM&lt;/th&gt;
&lt;th&gt;Bifrost&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gateway Overhead&lt;/td&gt;
&lt;td&gt;440 µs&lt;/td&gt;
&lt;td&gt;11 µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;40x faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory Usage&lt;/td&gt;
&lt;td&gt;~500 MB&lt;/td&gt;
&lt;td&gt;~50 MB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;10x less&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gateway-level Failures&lt;/td&gt;
&lt;td&gt;11%&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No failures observed&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Queue Wait Time&lt;/td&gt;
&lt;td&gt;47 µs&lt;/td&gt;
&lt;td&gt;1.67 µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;28x faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total Latency (with provider)&lt;/td&gt;
&lt;td&gt;2.12 s&lt;/td&gt;
&lt;td&gt;1.61 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;24% faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These measurements represent sustained load over multiple hours, not synthetic benchmarks.&lt;/p&gt;


&lt;h2&gt;
  
  
  Beyond Performance: Control-Plane Features That Matter in Production
&lt;/h2&gt;

&lt;p&gt;The main reason to move from LiteLLM to Bifrost isn't language; it's control-plane features. Bifrost adds governance (virtual keys, budgets, rate limits), consistent cost attribution, and production-oriented observability at the gateway layer, not scattered across application code.&lt;/p&gt;

&lt;p&gt;This architectural choice centralizes concerns that would otherwise require external services or application-level instrumentation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Governance controls&lt;/strong&gt; managed at the gateway rather than per-application&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost attribution&lt;/strong&gt; with per-request tracking and aggregation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability&lt;/strong&gt; with structured logs, metrics, and request tracing built-in&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failure isolation&lt;/strong&gt; with circuit breakers and automatic failover&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's examine these features in detail.&lt;/p&gt;


&lt;h2&gt;
  
  
  Production Features
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Automatic Failover
&lt;/h3&gt;

&lt;p&gt;When your primary provider hits rate limits or experiences downtime, requests should seamlessly move to backup providers without manual intervention.&lt;/p&gt;

&lt;p&gt;Bifrost configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fallbacks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"order"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"openai/gpt-4o-mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"anthropic/claude-sonnet-4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"mistral/mistral-large-latest"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fztd2yda89qwu2uir0pg8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fztd2yda89qwu2uir0pg8.png" alt="Configuration interface showing the automatic failover setup with multiple provider options"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When OpenAI returns a rate limit error, Bifrost automatically retries with Anthropic. If that fails, it tries Mistral. Your application receives a successful response without implementing retry logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Balancing
&lt;/h3&gt;

&lt;p&gt;Distributing load across multiple API keys prevents any single key from hitting rate limits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"providers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"key-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sk-..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"key-2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sk-..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"key-3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sk-..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"weight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5x886s6le2rtz70d55k0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5x886s6le2rtz70d55k0.jpg" alt="Diagram illustrating weighted load balancing distribution across multiple API keys with traffic percentages"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first key receives 50% of traffic; the other two receive 25% each. When one key approaches its rate limit, Bifrost automatically shifts load to the remaining healthy keys.&lt;/p&gt;
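&lt;p&gt;The weight math is simple to verify; here is a minimal sketch of weighted key selection (illustrative only, not Bifrost's actual implementation):&lt;/p&gt;

```python
import random

# Weighted API-key selection: each key's share of traffic is its
# weight divided by the total. Mirrors the config above, where
# weights 2.0/1.0/1.0 yield a 50/25/25 split.
KEYS = [("key-1", 2.0), ("key-2", 1.0), ("key-3", 1.0)]

def traffic_share(keys):
    """Return each key's expected fraction of traffic."""
    total = sum(weight for _, weight in keys)
    return {name: weight / total for name, weight in keys}

def pick_key(keys, rng=random):
    """Pick one key at random, proportionally to its weight."""
    names = [name for name, _ in keys]
    weights = [weight for _, weight in keys]
    return rng.choices(names, weights=weights, k=1)[0]
```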

&lt;h3&gt;
  
  
  Semantic Caching
&lt;/h3&gt;

&lt;p&gt;Semantic caching isn't a new concept; teams can build it externally, but Bifrost ships it as a first-class gateway feature, reducing moving parts.&lt;/p&gt;

&lt;p&gt;Traditional caching requires exact string matches. But users rarely phrase questions identically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"What's the weather like?"&lt;/li&gt;
&lt;li&gt;"How's the weather today?"&lt;/li&gt;
&lt;li&gt;"Tell me about current weather conditions"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are semantically equivalent. Bifrost uses vector embeddings to understand semantic similarity:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkqo49836cp7gr5bla2ul.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkqo49836cp7gr5bla2ul.png" alt="Flowchart showing the semantic caching process from request to cache lookup to response delivery"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Request arrives: "What is Python?"&lt;/li&gt;
&lt;li&gt;Bifrost generates an embedding using a fast model&lt;/li&gt;
&lt;li&gt;Checks vector store for similar embeddings&lt;/li&gt;
&lt;li&gt;Finds previous request: "Explain Python to me"&lt;/li&gt;
&lt;li&gt;Returns cached response (similarity score: 0.92)&lt;/li&gt;
&lt;/ol&gt;
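&lt;p&gt;The lookup in steps 2-4 reduces to a nearest-neighbor search over embeddings. A minimal cosine-similarity sketch (the embeddings and the 0.8 threshold are illustrative; a production system uses a real embedding model and a vector index):&lt;/p&gt;

```python
import math

SIMILARITY_THRESHOLD = 0.8  # illustrative; tune per workload

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

class SemanticCache:
    def __init__(self):
        self.entries = []  # list of (embedding, cached_response)

    def lookup(self, embedding):
        # Return the cached response of the most similar prior
        # request, if it clears the similarity threshold.
        best_score, best_response = 0.0, None
        for cached_emb, response in self.entries:
            score = cosine(embedding, cached_emb)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= SIMILARITY_THRESHOLD else None

    def store(self, embedding, response):
        self.entries.append((embedding, response))
```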

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxd5zd002y62tk0n9e1d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxd5zd002y62tk0n9e1d.png" alt="Dashboard showing semantic caching metrics with hit rates and similarity scores"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Result: No LLM call required. Response in approximately 5 milliseconds instead of 2 seconds. Cost: $0.00 instead of $0.0001.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqr422lm3huy9x7gu88x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqr422lm3huy9x7gu88x.png" alt="Screenshot displaying successful cache hit with performance metrics and cost savings"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Savings depend on cache hit rate and workload repetition: across one million requests with a 60% hit rate, this saves approximately $60.&lt;/p&gt;
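&lt;p&gt;The $60 figure follows directly from the numbers above (assuming a flat $0.0001 per avoided call):&lt;/p&gt;

```python
# Cache savings arithmetic: every cache hit avoids one LLM call
# at the per-request cost quoted above.
requests = 1_000_000
hit_rate = 0.60
cost_per_call = 0.0001  # dollars per request, from the example above

savings = requests * hit_rate * cost_per_call
print(f"${savings:.2f}")  # prints $60.00
```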

&lt;h3&gt;
  
  
  Unified Interface
&lt;/h3&gt;

&lt;p&gt;Every LLM provider has different API formats. OpenAI uses one schema. Anthropic uses another. Bedrock and Vertex AI each have their own specifications.&lt;/p&gt;

&lt;p&gt;Bifrost provides a single API that works with all providers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="c1"&gt;# Change only the base URL
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;not-needed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080/openai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Use ANY provider with the same code
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic/claude-sonnet-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Not an OpenAI model
&lt;/span&gt;    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your application code remains unchanged. Switch providers by modifying one line. No refactoring required. No rewriting integration tests.&lt;/p&gt;

&lt;h3&gt;
  
  
  Model Context Protocol (MCP)
&lt;/h3&gt;

&lt;p&gt;MCP is Anthropic's protocol for letting AI models use external tools such as web search, filesystem access, and database queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"web-search"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-brave-search"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/workspace"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This enables AI models to perform actions rather than only generating text responses.&lt;/p&gt;




&lt;h2&gt;
  
  
  Web UI and Observability
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucwmkjaka1dnxi7y0zvi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucwmkjaka1dnxi7y0zvi.png" alt="Main dashboard of Bifrost's web interface showing real-time metrics, request analytics, and provider status"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most gateways provide only configuration files and command-line tools. Bifrost includes a comprehensive web interface at &lt;code&gt;http://localhost:8080&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dashboard:&lt;/strong&gt; Real-time metrics showing request counts, error rates, and costs per provider&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Providers:&lt;/strong&gt; Visual configuration for all providers with click-based key management&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9q92sd1rnry03iu4ow1d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9q92sd1rnry03iu4ow1d.png" alt="Provider management interface showing API key configuration with visual controls and status indicators"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Logs:&lt;/strong&gt; Complete request/response history with token usage, searchable and filterable&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnz310f4yja01ybwuzlz5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnz310f4yja01ybwuzlz5.png" alt="Detailed request log viewer with filtering options, showing request/response pairs and token usage"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Settings:&lt;/strong&gt; Configure caching, governance, and plugins without editing configuration files&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0r9ek1qc8v7oxyu5i9k9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0r9ek1qc8v7oxyu5i9k9.png" alt="Settings panel displaying caching configuration, governance rules, and plugin management options"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All configuration, monitoring, and debugging can be performed through the web interface without SSH access to servers or manual log analysis.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture Details
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Request Flow
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa41gvb3yrs5fzbiw7oq9.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa41gvb3yrs5fzbiw7oq9.jpg" alt="Architecture diagram showing the complete request lifecycle through Bifrost's processing pipeline"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Request arrives&lt;/strong&gt; at Bifrost's HTTP server&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Request validation&lt;/strong&gt; happens in microseconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache lookup&lt;/strong&gt; checks semantic cache if enabled&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache hit?&lt;/strong&gt; Return immediately (approximately 5ms total)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache miss?&lt;/strong&gt; Continue to provider selection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load balancer&lt;/strong&gt; selects API key based on weights and health&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrent request&lt;/strong&gt; dispatched to provider (goroutine spawned)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Response streaming&lt;/strong&gt; begins immediately if enabled&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache storage&lt;/strong&gt; happens asynchronously (non-blocking)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Response returns&lt;/strong&gt; to client with metadata&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All operations are non-blocking where possible. Cache lookup doesn't block provider calls in no-store mode. Cache storage doesn't delay response delivery.&lt;/p&gt;

&lt;h3&gt;
  
  
  Concurrency Implementation
&lt;/h3&gt;

&lt;p&gt;Bifrost uses Go's goroutines for concurrency:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traditional Python Threading:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request 1 → Thread 1 → Process (limited parallelism)
Request 2 → Thread 2 → Wait/Process (coordination overhead)
Request 3 → Thread 3 → Wait/Process (memory per thread)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Bifrost Goroutines:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request 1 → Goroutine 1 ⟍
Request 2 → Goroutine 2 ⟋→ All process in parallel → Responses
Request 3 → Goroutine 3 ⟋
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each goroutine uses approximately 2KB of memory. You can run millions concurrently.&lt;/p&gt;
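&lt;p&gt;A quick back-of-envelope check on that claim (2&amp;nbsp;KB per goroutine is the figure above; one million is just a round number):&lt;/p&gt;

```python
# Rough memory cost of one million concurrent goroutines at ~2 KB each.
goroutines = 1_000_000
stack_kb = 2
total_gb = goroutines * stack_kb / 1024 / 1024
print(f"~{total_gb:.1f} GB")  # ~1.9 GB
```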

&lt;h3&gt;
  
  
  Vector Store Integration
&lt;/h3&gt;

&lt;p&gt;For semantic caching, Bifrost integrates with Weaviate:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Request arrives with cache key: "user-session-123"&lt;/li&gt;
&lt;li&gt;Bifrost extracts message content&lt;/li&gt;
&lt;li&gt;Generates an embedding with a fast model (text-embedding-3-small)&lt;/li&gt;
&lt;li&gt;Searches Weaviate for similar embeddings (threshold: 0.8)&lt;/li&gt;
&lt;li&gt;Finds match with similarity 0.92&lt;/li&gt;
&lt;li&gt;Returns cached response with metadata&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Embedding generation: approximately 50ms. Vector search: approximately 10ms. Total: 60ms compared to 2000ms for an actual LLM call.&lt;/p&gt;
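&lt;p&gt;The similarity check behind a semantic hit is ordinary cosine similarity. A minimal sketch with toy vectors (a real deployment embeds the text with a model such as text-embedding-3-small and searches the vector store):&lt;/p&gt;

```python
# Sketch of the similarity comparison behind a semantic cache hit.
# The vectors are toy stand-ins for real embeddings.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

cached = [0.12, 0.78, 0.61]    # embedding of the stored prompt
incoming = [0.10, 0.80, 0.59]  # embedding of the new, similar prompt

THRESHOLD = 0.8
sim = cosine_similarity(cached, incoming)
print(sim >= THRESHOLD)  # True: above the threshold, counts as a hit
```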




&lt;h2&gt;
  
  
  Setting Up Semantic Caching
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Install Weaviate
&lt;/h3&gt;

&lt;p&gt;Use Docker for local development:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; weaviate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 8081:8080 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  semitechnologies/weaviate:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or use Weaviate Cloud (free tier available) at &lt;a href="https://console.weaviate.cloud/" rel="noopener noreferrer"&gt;https://console.weaviate.cloud/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Configure Bifrost
&lt;/h3&gt;

&lt;p&gt;Update your &lt;code&gt;config.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"providers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"main"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"env.OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"gpt-4o-mini"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vector_store"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"weaviate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"host"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost:8081"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"scheme"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"plugins"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"semantic_cache"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"embedding_model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ttl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"5m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"threshold"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Test the Cache
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# First request (cache miss, calls LLM)&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"x-bf-cache-key: user-123"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "What is Docker?"}]
  }'&lt;/span&gt;

&lt;span class="c"&gt;# Similar request (cache hit, fast response)&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"x-bf-cache-key: user-123"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Explain Docker to me"}]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second request returns in approximately 60ms instead of 2000ms.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cache Hit Response
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"extra_fields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"cache_debug"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"cache_hit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hit_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"semantic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"similarity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"threshold"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"provider_used"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"model_used"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response includes debugging information showing the similarity score and threshold. Adjust the threshold based on your accuracy requirements.&lt;/p&gt;




&lt;h2&gt;
  
  
  Drop-In Replacement
&lt;/h2&gt;

&lt;p&gt;You can replace existing OpenAI or Anthropic SDK calls with Bifrost by changing one parameter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python Example
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Before:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;After:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;not-needed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080/openai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Only change
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Node.js Example
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Before:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;After:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;not-needed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:8080/openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;  &lt;span class="c1"&gt;// Only change&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Benefits
&lt;/h3&gt;

&lt;p&gt;This approach enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adding Bifrost to existing applications without refactoring&lt;/li&gt;
&lt;li&gt;Testing in production with gradual rollout&lt;/li&gt;
&lt;li&gt;Access to all Bifrost features (caching, fallbacks, monitoring) immediately&lt;/li&gt;
&lt;li&gt;Easy rollback if needed&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Monitoring and Observability
&lt;/h2&gt;

&lt;p&gt;Bifrost exposes Prometheus metrics at &lt;code&gt;/metrics&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Request metrics
bifrost_requests_total{provider="openai",model="gpt-4o-mini"} 1543
bifrost_request_duration_seconds{provider="openai"} 1.234

# Cache metrics
bifrost_cache_hits_total{type="semantic"} 892
bifrost_cache_misses_total 651

# Error metrics
bifrost_errors_total{provider="openai",type="rate_limit"} 12
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Grafana Dashboard
&lt;/h3&gt;

&lt;p&gt;Connect Prometheus to Grafana for visualization:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requests per second by provider&lt;/li&gt;
&lt;li&gt;Latency percentiles (p50, p95, p99)&lt;/li&gt;
&lt;li&gt;Cache hit rates over time&lt;/li&gt;
&lt;li&gt;Cost tracking per provider&lt;/li&gt;
&lt;li&gt;Error rates and types&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Structured Logging
&lt;/h3&gt;

&lt;p&gt;Bifrost logs to structured JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"info"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-01-15T10:30:00Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"msg"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"request completed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"provider"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4o-mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"duration_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1234&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;456&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cache_hit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This format integrates with any log aggregation service (CloudWatch, Datadog, Elasticsearch).&lt;/p&gt;
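&lt;p&gt;Because each log entry is one JSON object per line, ad-hoc filtering needs no special tooling. A small sketch (field names follow the sample entry above):&lt;/p&gt;

```python
# Sketch: filtering slow requests out of Bifrost's JSON-per-line logs.
import json

log_lines = [
    '{"level":"info","msg":"request completed","provider":"openai","duration_ms":1234,"cache_hit":false}',
    '{"level":"info","msg":"request completed","provider":"openai","duration_ms":58,"cache_hit":true}',
]

entries = [json.loads(line) for line in log_lines]
slow = [e for e in entries if e["duration_ms"] > 1000]
print(len(slow))  # 1
```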




&lt;h2&gt;
  
  
  Common Configuration Issues
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Issue 1: Missing Cache Key
&lt;/h3&gt;

&lt;p&gt;Semantic caching requires the &lt;code&gt;x-bf-cache-key&lt;/code&gt; header. Without it, every request is a cache miss.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Incorrect:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/v1/chat/completions &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{...}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Correct:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"x-bf-cache-key: user-session-123"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{...}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Issue 2: Threshold Configuration
&lt;/h3&gt;

&lt;p&gt;Start with a threshold of 0.8 and adjust based on cache hit rate and accuracy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"threshold"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Starting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;point&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Monitor your cache hit rate. If it stays below 30%, lower the threshold to 0.75; if you're getting incorrect cached results, raise it to 0.85.&lt;/p&gt;
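&lt;p&gt;If you ship these logs somewhere queryable, computing the hit rate is a few lines. Here's a rough sketch (the log lines are made-up samples in the same shape as the structured output above, assuming newline-delimited JSON):&lt;/p&gt;

```javascript
// Hypothetical sample of Bifrost-style structured log lines (NDJSON).
const logLines = [
  '{"tokens": 456, "cache_hit": false}',
  '{"tokens": 120, "cache_hit": true}',
  '{"tokens": 300, "cache_hit": true}',
  '{"tokens": 88, "cache_hit": false}',
];

// Parse each line and compute the cache hit rate.
const entries = logLines.map((line) => JSON.parse(line));
const hits = entries.filter((e) => e.cache_hit).length;
const hitRate = hits / entries.length;

console.log(`Cache hit rate: ${(hitRate * 100).toFixed(0)}%`);
// A rate well below 30% suggests lowering the threshold toward 0.75;
// frequent wrong cache hits suggest raising it toward 0.85.
```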

&lt;h3&gt;
  
  
  Issue 3: Config Store Requirement
&lt;/h3&gt;

&lt;p&gt;Some plugins require a config store:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"config_store"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sqlite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./config.db"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Issue 4: Weaviate Network Configuration
&lt;/h3&gt;

&lt;p&gt;Ensure Weaviate is reachable from Bifrost. For Docker deployments, use the container's hostname rather than &lt;code&gt;localhost&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vector_store"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"weaviate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"host"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"weaviate-container:8080"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Use&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;correct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;hostname&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"scheme"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Weaviate Cloud:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vector_store"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"weaviate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"host"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;weaviate-host&amp;gt;.gcp.weaviate.cloud"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"scheme"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"api_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;weaviate-api-key&amp;gt;"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  When to Use Bifrost
&lt;/h2&gt;

&lt;p&gt;Bifrost provides immediate value for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Production systems handling more than 1,000 requests per day&lt;/li&gt;
&lt;li&gt;Applications where tail latency impacts user experience&lt;/li&gt;
&lt;li&gt;Teams that need automatic failover without complex orchestration&lt;/li&gt;
&lt;li&gt;Organizations tracking LLM costs across multiple providers&lt;/li&gt;
&lt;li&gt;Systems requiring governance controls (rate limits, budgets, virtual keys)&lt;/li&gt;
&lt;li&gt;Deployments where operational simplicity reduces maintenance burden&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even for smaller projects, Bifrost's minimal overhead and built-in features provide a robust foundation that scales without requiring future refactoring.&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting Started
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Run &lt;code&gt;npx -y @maximhq/bifrost&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Open &lt;code&gt;http://localhost:8080&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Add your API keys in the UI&lt;/li&gt;
&lt;li&gt;Point your application to &lt;code&gt;http://localhost:8080/openai&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Monitor performance and costs through the dashboard&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Resources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/maximhq/bifrost" rel="noopener noreferrer"&gt;github.com/maximhq/bifrost&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Website:&lt;/strong&gt; &lt;a href="https://getbifrost.ai" rel="noopener noreferrer"&gt;getbifrost.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation:&lt;/strong&gt; &lt;a href="https://docs.getbifrost.ai/quickstart/gateway/setting-up" rel="noopener noreferrer"&gt;docs.getbifrost.ai&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Questions or feedback? Please leave a comment below. If you use Bifrost in production, I'd be interested to hear about your experience and any challenges you encounter.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>agents</category>
    </item>
    <item>
      <title>HTML tags that NO ONE talks about</title>
      <dc:creator>Varshith V Hegde</dc:creator>
      <pubDate>Sun, 14 Dec 2025 11:01:34 +0000</pubDate>
      <link>https://forem.com/varshithvhegde/html-tags-thatll-make-your-life-easier-no-really-570i</link>
      <guid>https://forem.com/varshithvhegde/html-tags-thatll-make-your-life-easier-no-really-570i</guid>
      <description>&lt;p&gt;Look, I get it. HTML can feel boring sometimes. You throw in your divs and spans, maybe toss in a form here and there, and call it a day. But there are some genuinely cool HTML elements that most of us completely ignore, and honestly? They make things so much simpler.&lt;/p&gt;

&lt;p&gt;Let me show you six HTML tags that actually solve real problems. No fancy jargon, just practical stuff you can start using today.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. The &lt;code&gt;&amp;lt;template&amp;gt;&lt;/code&gt; Element - Your HTML Blueprint
&lt;/h2&gt;

&lt;p&gt;Okay, so you need to add stuff to your page dynamically. Maybe it's a new comment, a product card, whatever. You've probably been doing one of these:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The old way (don't do this):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Using innerHTML - looks easy but it's dangerous&lt;/span&gt;
&lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;innerHTML&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;&amp;lt;div class="card"&amp;gt;&amp;lt;h3&amp;gt;&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;userInput&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;&amp;lt;/h3&amp;gt;&amp;lt;/div&amp;gt;&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;// If userInput has &amp;lt;script&amp;gt; tags? Yeah, you're in trouble.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or maybe you went the "safe but annoying" route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// createElement - safe but tedious&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createElement&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;div&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;div&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;className&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;card&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;h3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createElement&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;h3&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;h3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;div&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;appendChild&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;h3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// This gets old real fast with complex HTML&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The better way with &lt;code&gt;&amp;lt;template&amp;gt;&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;template&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"card-template"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"card"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h3&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"card-title"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/h3&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;p&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"card-description"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/template&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Clone it and use it&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;card-template&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;clone&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cloneNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Fill in your data&lt;/span&gt;
&lt;span class="nx"&gt;clone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.card-title&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;clone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.card-description&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Add it to the page&lt;/span&gt;
&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;appendChild&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;clone&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this rocks: the HTML stays in HTML where it belongs, it's safer than &lt;code&gt;innerHTML&lt;/code&gt;, and it's faster because the browser parses the template once and only lays out the cloned element instead of re-parsing everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Input Types You Didn't Know You Had
&lt;/h2&gt;

&lt;p&gt;Stop reinventing the wheel with custom date pickers and validation libraries. The humble &lt;code&gt;&amp;lt;input&amp;gt;&lt;/code&gt; tag is way more powerful than you think.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Email and phone validation built right in:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"email"&lt;/span&gt; &lt;span class="na"&gt;placeholder=&lt;/span&gt;&lt;span class="s"&gt;"your@email.com"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"tel"&lt;/span&gt; &lt;span class="na"&gt;placeholder=&lt;/span&gt;&lt;span class="s"&gt;"555-1234"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On mobile, these automatically show the right keyboard. Plus, you get free validation styling:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight css"&gt;&lt;code&gt;&lt;span class="nt"&gt;input&lt;/span&gt;&lt;span class="nd"&gt;:invalid&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;border-color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="no"&gt;red&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Want pattern matching without JavaScript?&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; 
  &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt; 
  &lt;span class="na"&gt;pattern=&lt;/span&gt;&lt;span class="s"&gt;"[0-9]{3}-[0-9]{3}-[0-9]{4}"&lt;/span&gt;
  &lt;span class="na"&gt;placeholder=&lt;/span&gt;&lt;span class="s"&gt;"123-456-7890"&lt;/span&gt;
&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Date and time pickers that just work:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"datetime-local"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"month"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"week"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"time"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No library needed. They work across all modern browsers and look native on each platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Other handy types:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"search"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;  &lt;span class="c"&gt;&amp;lt;!-- Gets a little X to clear on some browsers --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"range"&lt;/span&gt; &lt;span class="na"&gt;min=&lt;/span&gt;&lt;span class="s"&gt;"0"&lt;/span&gt; &lt;span class="na"&gt;max=&lt;/span&gt;&lt;span class="s"&gt;"100"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;  &lt;span class="c"&gt;&amp;lt;!-- Slider! --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"color"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;  &lt;span class="c"&gt;&amp;lt;!-- Color picker! --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Autocomplete with &lt;code&gt;&amp;lt;datalist&amp;gt;&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;list=&lt;/span&gt;&lt;span class="s"&gt;"cities"&lt;/span&gt; &lt;span class="na"&gt;placeholder=&lt;/span&gt;&lt;span class="s"&gt;"Choose a city"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;datalist&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"cities"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;option&lt;/span&gt; &lt;span class="na"&gt;value=&lt;/span&gt;&lt;span class="s"&gt;"New York"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;option&lt;/span&gt; &lt;span class="na"&gt;value=&lt;/span&gt;&lt;span class="s"&gt;"Los Angeles"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;option&lt;/span&gt; &lt;span class="na"&gt;value=&lt;/span&gt;&lt;span class="s"&gt;"Chicago"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;option&lt;/span&gt; &lt;span class="na"&gt;value=&lt;/span&gt;&lt;span class="s"&gt;"Houston"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/datalist&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Type a few letters and get suggestions. Zero JavaScript required.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. The &lt;code&gt;inert&lt;/code&gt; Attribute - Focus Control Made Easy
&lt;/h2&gt;

&lt;p&gt;Ever had a modal popup where users could still tab to stuff behind it? Super annoying, right?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before (the hard way):&lt;/strong&gt;&lt;br&gt;
You'd have to manually track all focusable elements, disable them, remember their state, re-enable them later... ugh.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After (the easy way):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"page-content"&lt;/span&gt; &lt;span class="na"&gt;inert&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="c"&gt;&amp;lt;!-- Everything in here becomes unclickable and unfocusable --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;button&amp;gt;&lt;/span&gt;Can't click me now&lt;span class="nt"&gt;&amp;lt;/button&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;  &lt;span class="c"&gt;&amp;lt;!-- Can't focus this either --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"modal"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="c"&gt;&amp;lt;!-- Only stuff in here is interactive --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;button&amp;gt;&lt;/span&gt;Close&lt;span class="nt"&gt;&amp;lt;/button&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. One attribute. The browser handles everything, including making it work properly with screen readers.&lt;/p&gt;
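&lt;p&gt;In JavaScript, &lt;code&gt;inert&lt;/code&gt; is also a plain boolean property on elements, so toggling it when your modal opens and closes takes a couple of lines. A quick sketch (the &lt;code&gt;.page-content&lt;/code&gt; selector is just an example):&lt;/p&gt;

```javascript
// Toggle the inert property on background regions while a modal is open.
// `inert` mirrors the HTML attribute: true makes the whole subtree
// unfocusable, unclickable, and hidden from assistive technology.
function setBackgroundInert(regions, modalOpen) {
  for (const region of regions) {
    region.inert = modalOpen;
  }
}

// In a browser you'd pass real elements:
// setBackgroundInert([document.querySelector('.page-content')], true);  // modal opened
// setBackgroundInert([document.querySelector('.page-content')], false); // modal closed
```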

&lt;h2&gt;
  
  
  4. The &lt;code&gt;&amp;lt;dialog&amp;gt;&lt;/code&gt; Element - Modals That Don't Suck
&lt;/h2&gt;

&lt;p&gt;Speaking of modals, stop building them from scratch with divs and a prayer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here's how simple it can be:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;dialog&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"my-dialog"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;h2&amp;gt;&lt;/span&gt;Hello!&lt;span class="nt"&gt;&amp;lt;/h2&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;This is a proper dialog.&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;button&lt;/span&gt; &lt;span class="na"&gt;onclick=&lt;/span&gt;&lt;span class="s"&gt;"this.closest('dialog').close()"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Close&lt;span class="nt"&gt;&amp;lt;/button&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dialog&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;button&lt;/span&gt; &lt;span class="na"&gt;onclick=&lt;/span&gt;&lt;span class="s"&gt;"document.getElementById('my-dialog').showModal()"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  Open Dialog
&lt;span class="nt"&gt;&amp;lt;/button&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dialog&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;my-dialog&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Show it as a modal (with backdrop, makes everything else inert)&lt;/span&gt;
&lt;span class="nx"&gt;dialog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;showModal&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Or show it without the modal behavior&lt;/span&gt;
&lt;span class="nx"&gt;dialog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Close it&lt;/span&gt;
&lt;span class="nx"&gt;dialog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Control how it closes&lt;/strong&gt; (note: &lt;code&gt;closedby&lt;/code&gt; is a newer attribute, so check browser support before relying on it):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- Only closes via JavaScript --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;dialog&lt;/span&gt; &lt;span class="na"&gt;closedby=&lt;/span&gt;&lt;span class="s"&gt;"none"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;...&lt;span class="nt"&gt;&amp;lt;/dialog&amp;gt;&lt;/span&gt;

&lt;span class="c"&gt;&amp;lt;!-- Closes with Escape key or JavaScript --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;dialog&lt;/span&gt; &lt;span class="na"&gt;closedby=&lt;/span&gt;&lt;span class="s"&gt;"closerequest"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;...&lt;span class="nt"&gt;&amp;lt;/dialog&amp;gt;&lt;/span&gt;

&lt;span class="c"&gt;&amp;lt;!-- Closes with Escape, clicking outside, or JavaScript --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;dialog&lt;/span&gt; &lt;span class="na"&gt;closedby=&lt;/span&gt;&lt;span class="s"&gt;"any"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;...&lt;span class="nt"&gt;&amp;lt;/dialog&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The backdrop (that dark overlay) comes free and can be styled with the &lt;code&gt;::backdrop&lt;/code&gt; pseudo-element, everything else on the page becomes inert automatically, and it's accessible by default.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. The &lt;code&gt;&amp;lt;picture&amp;gt;&lt;/code&gt; Element - Responsive Images Done Right
&lt;/h2&gt;

&lt;p&gt;Stop loading massive desktop images on phones. Your users on data plans will thank you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic responsive images:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;picture&amp;gt;&lt;/span&gt;
  &lt;span class="c"&gt;&amp;lt;!-- Small screens get the mobile version --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;source&lt;/span&gt; &lt;span class="na"&gt;media=&lt;/span&gt;&lt;span class="s"&gt;"(max-width: 500px)"&lt;/span&gt; &lt;span class="na"&gt;srcset=&lt;/span&gt;&lt;span class="s"&gt;"small-image.jpg"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

  &lt;span class="c"&gt;&amp;lt;!-- Medium screens get the tablet version --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;source&lt;/span&gt; &lt;span class="na"&gt;media=&lt;/span&gt;&lt;span class="s"&gt;"(max-width: 1000px)"&lt;/span&gt; &lt;span class="na"&gt;srcset=&lt;/span&gt;&lt;span class="s"&gt;"medium-image.jpg"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

  &lt;span class="c"&gt;&amp;lt;!-- Fallback for everything else --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;img&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"large-image.jpg"&lt;/span&gt; &lt;span class="na"&gt;alt=&lt;/span&gt;&lt;span class="s"&gt;"Description"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/picture&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Handle high-DPI displays:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;picture&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;source&lt;/span&gt; 
    &lt;span class="na"&gt;media=&lt;/span&gt;&lt;span class="s"&gt;"(max-width: 500px)"&lt;/span&gt; 
    &lt;span class="na"&gt;srcset=&lt;/span&gt;&lt;span class="s"&gt;"mobile.jpg 1x, mobile@2x.jpg 2x, mobile@3x.jpg 3x"&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;img&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"desktop.jpg"&lt;/span&gt; &lt;span class="na"&gt;alt=&lt;/span&gt;&lt;span class="s"&gt;"Description"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/picture&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Different crops for different orientations:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;picture&amp;gt;&lt;/span&gt;
  &lt;span class="c"&gt;&amp;lt;!-- Portrait orientation gets a vertical crop --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;source&lt;/span&gt; 
    &lt;span class="na"&gt;media=&lt;/span&gt;&lt;span class="s"&gt;"(orientation: portrait)"&lt;/span&gt; 
    &lt;span class="na"&gt;srcset=&lt;/span&gt;&lt;span class="s"&gt;"portrait-crop.jpg"&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

  &lt;span class="c"&gt;&amp;lt;!-- Landscape gets a horizontal crop --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;img&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"landscape-crop.jpg"&lt;/span&gt; &lt;span class="na"&gt;alt=&lt;/span&gt;&lt;span class="s"&gt;"Description"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/picture&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The browser picks the first matching source and only downloads that one image. Smart, right?&lt;/p&gt;
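&lt;p&gt;One more trick: &lt;code&gt;&amp;lt;source&amp;gt;&lt;/code&gt; also accepts a &lt;code&gt;type&lt;/code&gt; attribute, so you can serve modern image formats with a safe fallback. A quick sketch (the file names here are made up):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&amp;lt;picture&amp;gt;
  &amp;lt;!-- Browsers that understand AVIF pick this one --&amp;gt;
  &amp;lt;source type="image/avif" srcset="photo.avif"&amp;gt;

  &amp;lt;!-- Otherwise, try WebP --&amp;gt;
  &amp;lt;source type="image/webp" srcset="photo.webp"&amp;gt;

  &amp;lt;!-- Everyone else gets the JPEG --&amp;gt;
  &amp;lt;img src="photo.jpg" alt="Description"&amp;gt;
&amp;lt;/picture&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Same rule as before: the browser stops at the first source it can handle and downloads only that file.&lt;/p&gt;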

&lt;h2&gt;
  
  
  6. The &lt;code&gt;&amp;lt;output&amp;gt;&lt;/code&gt; Element - Show Your Results
&lt;/h2&gt;

&lt;p&gt;This one's simple but makes your forms way more accessible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Instead of this:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;span&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"total"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;$0.00&lt;span class="nt"&gt;&amp;lt;/span&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Do this:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;label&amp;gt;&lt;/span&gt;
  Price: 
  &lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"number"&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"price"&lt;/span&gt; &lt;span class="na"&gt;value=&lt;/span&gt;&lt;span class="s"&gt;"0"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/label&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;label&amp;gt;&lt;/span&gt;
  Quantity: 
  &lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"number"&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"quantity"&lt;/span&gt; &lt;span class="na"&gt;value=&lt;/span&gt;&lt;span class="s"&gt;"0"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/label&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;output&lt;/span&gt; &lt;span class="na"&gt;for=&lt;/span&gt;&lt;span class="s"&gt;"price quantity"&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"total"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;$0.00&lt;span class="nt"&gt;&amp;lt;/output&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;updateTotal&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;price&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;quantity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;quantity&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;total&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;$&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toFixed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why bother? &lt;code&gt;&amp;lt;output&amp;gt;&lt;/code&gt; is an implicit live region, so screen readers understand that this is a calculated result tied to those specific inputs. They can announce the change without moving focus away from what the user is doing. It's just better UX.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;HTML keeps getting better, but a lot of us are still writing it like it's 2010. These tags aren't exotic or experimental anymore. They work, they're supported, and they solve real problems.&lt;/p&gt;

&lt;p&gt;Next time you reach for a JavaScript library or start building something from scratch, take a minute to check if HTML already has what you need. You might be surprised.&lt;/p&gt;

&lt;p&gt;Now go forth and use some proper HTML tags. Your future self (and your users) will appreciate it.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
