<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Abid Ali</title>
    <description>The latest articles on Forem by Abid Ali (@buildwithabid).</description>
    <link>https://forem.com/buildwithabid</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3862016%2F0f5c37f0-d375-4e95-8528-1762f0dba065.jpeg</url>
      <title>Forem: Abid Ali</title>
      <link>https://forem.com/buildwithabid</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/buildwithabid"/>
    <language>en</language>
    <item>
      <title>I Got Tired of Writing Documentation. So I Built a Tool to Do It For Me.</title>
      <dc:creator>Abid Ali</dc:creator>
      <pubDate>Thu, 09 Apr 2026 05:41:58 +0000</pubDate>
      <link>https://forem.com/buildwithabid/i-got-tired-of-writing-documentation-so-i-built-a-tool-to-do-it-for-me-5hkf</link>
      <guid>https://forem.com/buildwithabid/i-got-tired-of-writing-documentation-so-i-built-a-tool-to-do-it-for-me-5hkf</guid>
      <description>&lt;p&gt;Every project I ship has the same problem at the end.&lt;/p&gt;

&lt;p&gt;The code works. The tests pass. And then I have to write the README.&lt;/p&gt;

&lt;p&gt;Not a bad README — a real one. Architecture decisions, API endpoints, setup instructions, module breakdown. The kind of documentation that makes someone else's first hour with your codebase not a nightmare.&lt;/p&gt;

&lt;p&gt;I kept putting it off. Then I'd come back to a project two weeks later and spend 20 minutes remembering how my own code worked.&lt;/p&gt;

&lt;p&gt;So I built a tool to generate it automatically.&lt;/p&gt;




&lt;h2&gt;What it does&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;repo2docs&lt;/code&gt; points at a GitHub repo or a local directory and generates three documents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;README.md&lt;/code&gt; — setup, usage, what the project does&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ARCHITECTURE.md&lt;/code&gt; — how the codebase is structured, entry points, module breakdown&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;API.md&lt;/code&gt; — HTTP endpoints, routes, request/response shapes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One command. Three documents. Done.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Point at a GitHub repo&lt;/span&gt;
repo2docs https://github.com/owner/repository

&lt;span class="c"&gt;# Or a local directory&lt;/span&gt;
repo2docs &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Custom output folder&lt;/span&gt;
repo2docs ../my-service &lt;span class="nt"&gt;--output&lt;/span&gt; ./docs-output/my-service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output goes to &lt;code&gt;repo2docs-output/&amp;lt;repo-name&amp;gt;/&lt;/code&gt; by default. No flags required, no config files, no setup beyond install.&lt;/p&gt;




&lt;h2&gt;What it actually detects&lt;/h2&gt;

&lt;p&gt;The part I'm most proud of is that it doesn't generate generic documentation. It reads the actual codebase.&lt;/p&gt;

&lt;p&gt;It picks up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Entry points and important modules&lt;/li&gt;
&lt;li&gt;Package manager, build tools, test setup, linting, CI signals&lt;/li&gt;
&lt;li&gt;Framework detection — Express routes, with mounted router prefixes composed into full paths like &lt;code&gt;/api/users&lt;/code&gt;, not just raw router-local paths&lt;/li&gt;
&lt;li&gt;Environment files and notable repository patterns&lt;/li&gt;
&lt;li&gt;Language distribution across the project&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last one matters. A generic documentation generator will tell you "this project uses JavaScript." This one tells you which files are entry points, which modules are doing the heavy work, and how the HTTP layer is structured.&lt;/p&gt;




&lt;h2&gt;The problem it solves&lt;/h2&gt;

&lt;p&gt;Documentation debt is one of those things that compounds silently. You skip the README on Monday because you're shipping. You skip the architecture doc on Tuesday because the code is obvious. By Friday you have a codebase that works perfectly and is completely opaque to anyone who didn't write it — including you in three weeks.&lt;/p&gt;

&lt;p&gt;The real cost isn't the time it takes to write docs. It's the time every future reader spends reconstructing understanding that already existed in your head when you wrote the code.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;repo2docs&lt;/code&gt; captures that understanding at the moment it's cheapest — right after you've shipped — and turns it into documents that stay with the codebase.&lt;/p&gt;




&lt;h2&gt;How to try it&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install
&lt;/span&gt;npm run build
repo2docs https://github.com/your-repo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It runs against any public GitHub repo, so you can try it on something you already know well and see how accurately it captures the architecture.&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/BuildWithAbid/repo2docs" rel="noopener noreferrer"&gt;github.com/BuildWithAbid/repo2docs&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm curious what the output looks like on your codebase; drop a comment if you try it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Claude Code. Part of a suite of open-source developer tools at github.com/BuildWithAbid.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>typescript</category>
      <category>opensource</category>
      <category>todayilearned</category>
    </item>
    <item>
      <title>How I Found $1,240/Month in Wasted LLM API Costs (And Built a Tool to Find Yours)</title>
      <dc:creator>Abid Ali</dc:creator>
      <pubDate>Sun, 05 Apr 2026 08:48:00 +0000</pubDate>
      <link>https://forem.com/buildwithabid/how-i-found-1240month-in-wasted-llm-api-costs-and-built-a-tool-to-find-yours-3041</link>
      <guid>https://forem.com/buildwithabid/how-i-found-1240month-in-wasted-llm-api-costs-and-built-a-tool-to-find-yours-3041</guid>
      <description>&lt;p&gt;I was spending about $2,000/month on OpenAI and Anthropic APIs across a few projects.&lt;/p&gt;

&lt;p&gt;I knew some of it was wasteful. I just couldn't prove it. The provider dashboards show you one number — your total bill. That's like getting an electricity bill with no breakdown. Is it the AC? The lights? The server room? No idea.&lt;/p&gt;

&lt;p&gt;So I built a tool to find out. What it discovered was honestly embarrassing.&lt;/p&gt;

&lt;h2&gt;What I found&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;34% of my summarizer calls were retries.&lt;/strong&gt; The prompt asked for JSON, but the model kept wrapping it in markdown code blocks. My parser rejected it. The retry loop ran the same call again. And again. Each retry cost money. Total waste: about $140/month — from a six-word fix I could have made months ago.&lt;/p&gt;
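
&lt;p&gt;The fix on my side was prompt wording, but you can also make the parser tolerant so fence-wrapped JSON never triggers a retry. A minimal sketch — &lt;code&gt;parse_model_json&lt;/code&gt; is a hypothetical helper, not part of any library:&lt;/p&gt;

```python
import json

def parse_model_json(raw: str):
    """Parse JSON from a model response, tolerating markdown code fences.

    Models often wrap JSON in a fenced code block even when told not to;
    stripping the fence here avoids paying for a retry.
    """
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence line (with or without a language tag).
        text = text.partition("\n")[2]
        if text.rstrip().endswith("```"):
            # Drop the closing fence.
            text = text.rstrip()[:-3]
    return json.loads(text)
```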

&lt;p&gt;&lt;strong&gt;85% of my classifier calls were duplicates.&lt;/strong&gt; Same input, same output, full price every time. No caching. 723 of 847 weekly calls were completely redundant. A simple cache would have saved $310/month.&lt;/p&gt;
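
&lt;p&gt;How do you even find duplicates like that? One way — a sketch of the idea, not the profiler's actual code — is to hash each request payload and count repeats:&lt;/p&gt;

```python
import hashlib
import json
from collections import Counter

def duplicate_rate(calls):
    """Fraction of calls whose (model, messages) payload is an exact repeat.

    Hash each request payload; any hash seen before marks a call that a
    cache could have served for free.
    """
    seen = Counter()
    repeats = 0
    for call in calls:
        payload = json.dumps([call["model"], call["messages"]], sort_keys=True)
        key = hashlib.sha256(payload.encode()).hexdigest()
        if seen[key]:
            repeats += 1
        seen[key] += 1
    return repeats / len(calls)
```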

&lt;p&gt;&lt;strong&gt;My classifier was using GPT-4o for a yes/no task.&lt;/strong&gt; The output was always under 10 tokens — one of five fixed labels. GPT-4o-mini produces identical results at a fraction of the cost. Savings: $71/month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My chatbot was stuffing the entire conversation history into every call.&lt;/strong&gt; By message 20, the input was 3,200 tokens and growing. Only the last few messages mattered. Truncating to the last 5 saves $155/month.&lt;/p&gt;
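
&lt;p&gt;The truncation itself is a few lines. A sketch, with &lt;code&gt;truncate_history&lt;/code&gt; as a hypothetical helper: keep any system prompt, drop the stale middle.&lt;/p&gt;

```python
def truncate_history(messages, keep_last=5):
    """Keep the system prompt (if any) plus the last keep_last messages.

    Bounds input tokens on long conversations instead of resending the
    whole history on every call.
    """
    system = [m for m in messages if m["role"] == "system"]
    recent = [m for m in messages if m["role"] != "system"][-keep_last:]
    return system + recent
```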

&lt;p&gt;Total: &lt;strong&gt;$1,240/month in waste&lt;/strong&gt; out of a $2,847 monthly spend. That's 43%.&lt;/p&gt;

&lt;h2&gt;The tool: LLM Cost Profiler&lt;/h2&gt;

&lt;p&gt;I packaged all of this into an open-source Python CLI. Here's how it works.&lt;/p&gt;

&lt;h3&gt;Step 1: Install&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;llm-spend-profiler
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Step 2: Wrap your client (2 lines of code)&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llm_cost_profiler&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;wrap&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Your code works exactly as before. Every API call is now silently logged to a local SQLite database. If logging fails for any reason, it fails silently — your app is never affected.&lt;/p&gt;

&lt;p&gt;Works with Anthropic too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Anthropic&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Step 3: See where your money goes&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;llmcost report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LLM Cost Report — Last 7 Days
========================================
Total: $847.32 | 2.4M tokens | 12,847 calls

By Feature:
  summarizer         $412.80  (48.7%)  ████████████████████
  chatbot            $203.11  (24.0%)  ████████████
  classifier          $89.40  (10.5%)  █████
  content_gen         $78.22   (9.2%)  ████
  extraction          $41.50   (4.9%)  ██
  untagged            $22.29   (2.6%)  █

Warnings:
  ⚠ summarizer: 34% of calls are retries ($140.15 wasted)
  ⚠ chatbot: avg 3,200 input tokens but only 180 output tokens (context bloat)
  ⚠ classifier: using gpt-4o but output is always &amp;lt;10 tokens (cheaper model works)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Step 4: Find the waste&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;llmcost optimize
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LLM Cost Optimization Report
========================================
Current monthly spend (projected): $2,847
Potential savings found: $1,240/month (43.5%)

  #1 CACHE — classifier.py:34                        [SAVE $310/mo]
     85% of calls are exact duplicates (723 of 847/week)
     → Add @cache decorator
     Confidence: HIGH

  #2 RETRY FIX — content_gen.py:112                   [SAVE $180/mo]
     28% retry rate from JSON parse errors
     → Fix prompt to return raw JSON
     Confidence: HIGH

  #3 MODEL DOWNGRADE — classifier.py:34               [SAVE $71/mo]
     Output is always &amp;lt;10 tokens, one of 5 fixed labels
     → Switch gpt-4o to gpt-4o-mini
     Confidence: MEDIUM

  #4 CONTEXT BLOAT — chatbot.py:123                   [SAVE $155/mo]
     Avg 3,200 input tokens, growing over conversation
     → Truncate history to last 5 messages
     Confidence: MEDIUM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each recommendation includes the exact file and line number, estimated monthly savings, and a confidence level.&lt;/p&gt;

&lt;h2&gt;Other features worth knowing about&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;llmcost hotspots&lt;/code&gt;&lt;/strong&gt; — ranks your code locations by cost. Auto-detected from the Python call stack, no manual annotation needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Top Cost Hotspots:
  1. features/summarizer.py:47   summarize_doc()    $412.80/week   4,201 calls
  2. api/chat.py:123             handle_message()   $203.11/week   3,892 calls
  3. pipeline/classify.py:34     classify_text()     $89.40/week   2,847 calls
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;llmcost compare&lt;/code&gt;&lt;/strong&gt; — week-over-week comparison to catch sudden spikes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;llmcost dashboard&lt;/code&gt;&lt;/strong&gt; — opens a local web dashboard at localhost:8177 with treemap charts, cost timelines, and an optimization waterfall. Single HTML file, no npm, no build step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tagging&lt;/strong&gt; — group costs by feature, customer, or environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llm_cost_profiler&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;feature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarizer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;acme_corp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Caching decorator&lt;/strong&gt; — stop paying for duplicate calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llm_cost_profiler&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;

&lt;span class="nd"&gt;@cache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;classify_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;How it works under the hood&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Wrapper&lt;/strong&gt;: Transparent proxy pattern — intercepts method calls without monkey-patching.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage&lt;/strong&gt;: SQLite with WAL mode at &lt;code&gt;~/.llmcost/data.db&lt;/code&gt;. Thread-safe.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pricing&lt;/strong&gt;: Built-in lookup table for OpenAI and Anthropic models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Call site detection&lt;/strong&gt;: Walks the Python call stack to auto-detect which function triggered each call.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero dependencies&lt;/strong&gt;: Only uses the Python standard library.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy&lt;/strong&gt;: Everything stays local. Nothing is sent anywhere.&lt;/li&gt;
&lt;/ul&gt;
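
&lt;p&gt;To make the proxy and call-site bullets concrete, here's a minimal sketch of the pattern. This is an illustration of the idea, not the library's actual implementation:&lt;/p&gt;

```python
import inspect

class Profiled:
    """Minimal transparent proxy: forwards attribute access, intercepts calls.

    Each intercepted call is logged with the caller's file and line,
    found by walking the stack. Logging failures are swallowed so the
    wrapped client is never affected.
    """

    def __init__(self, target, log):
        self._target = target
        self._log = log

    def __getattr__(self, name):
        attr = getattr(self._target, name)
        if not callable(attr):
            # Nested namespaces like client.chat.completions get proxied too.
            return Profiled(attr, self._log)

        def intercepted(*args, **kwargs):
            result = attr(*args, **kwargs)
            try:
                caller = inspect.stack()[1]  # the frame that made the call
                self._log(f"{caller.filename}:{caller.lineno} {name}")
            except Exception:
                pass  # fail silently; never break the app
            return result

        return intercepted
```

&lt;p&gt;Wrapping non-callable attributes in fresh proxies is what lets a chained call like &lt;code&gt;client.chat.completions.create(...)&lt;/code&gt; work unchanged.&lt;/p&gt;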

&lt;h2&gt;Try it on your codebase&lt;/h2&gt;

&lt;p&gt;If you're making LLM API calls in any project, I'm genuinely curious what it finds. In my experience, every codebase has at least one surprise — usually duplicate calls that nobody knew about.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/BuildWithAbid/llm-cost-profiler" rel="noopener noreferrer"&gt;https://github.com/BuildWithAbid/llm-cost-profiler&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Install:&lt;/strong&gt; &lt;code&gt;pip install llm-spend-profiler&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;License:&lt;/strong&gt; MIT&lt;/p&gt;

&lt;p&gt;If you find issues or have ideas for what else it should detect, open an issue or drop a comment here. This is my first open-source project and I'd love feedback.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>python</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
