<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: BaoDev Studio</title>
    <description>The latest articles on Forem by BaoDev Studio (@baodev-studio).</description>
    <link>https://forem.com/baodev-studio</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3934831%2F7b5f5f01-c79a-44e6-8605-cd76fde3dd44.jpg</url>
      <title>Forem: BaoDev Studio</title>
      <link>https://forem.com/baodev-studio</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/baodev-studio"/>
    <language>en</language>
    <item>
      <title>Postgres JSONB indexes: GIN vs BTREE on the same column</title>
      <dc:creator>BaoDev Studio</dc:creator>
      <pubDate>Mon, 25 May 2026 13:15:00 +0000</pubDate>
      <link>https://forem.com/baodev-studio/postgres-jsonb-indexes-gin-vs-btree-on-the-same-column-4hb0</link>
      <guid>https://forem.com/baodev-studio/postgres-jsonb-indexes-gin-vs-btree-on-the-same-column-4hb0</guid>
      <description>&lt;p&gt;caught this in production last quarter and the answer is more boring than i expected: GIN and BTREE on the same JSONB column solve different problems, and the right choice depends on the SHAPE of your queries, not the size of the data.&lt;/p&gt;

&lt;p&gt;heres the actual benchmark + when each one wins.&lt;/p&gt;

&lt;h2&gt;
  
  
  the setup
&lt;/h2&gt;

&lt;p&gt;table: &lt;code&gt;users&lt;/code&gt; with 2.3M rows. one column &lt;code&gt;attributes JSONB&lt;/code&gt;. typical row contains:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"plan"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pro"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"country"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"signup_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"organic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"feature_flags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"beta_search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"new_billing"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"preferences"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"theme"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dark"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"lang"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;queries i run regularly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Q1&lt;/strong&gt;: &lt;code&gt;WHERE attributes-&amp;gt;&amp;gt;'plan' = 'pro'&lt;/code&gt; — find all pro users&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Q2&lt;/strong&gt;: &lt;code&gt;WHERE attributes @&amp;gt; '{"feature_flags": ["beta_search"]}'&lt;/code&gt; — find users with a feature flag&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Q3&lt;/strong&gt;: &lt;code&gt;WHERE attributes-&amp;gt;&amp;gt;'country' = 'ID' AND attributes-&amp;gt;&amp;gt;'plan' = 'pro'&lt;/code&gt; — pro users in indonesia&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Q4&lt;/strong&gt;: &lt;code&gt;WHERE attributes ? 'preferences'&lt;/code&gt; — users who have the preferences key at all&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;four queries. different optimal indexes for each.&lt;/p&gt;

&lt;h2&gt;
  
  
  the three index options
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- option A: GIN on whole column&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_attrs_gin&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- option B: GIN with jsonb_path_ops (faster but only for @&amp;gt;)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_attrs_gin_path&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attributes&lt;/span&gt; &lt;span class="n"&gt;jsonb_path_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- option C: BTREE on extracted scalar&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_attrs_plan&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="s1"&gt;'plan'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  actual EXPLAIN ANALYZE results (2.3M rows)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Q1: &lt;code&gt;WHERE attributes-&amp;gt;&amp;gt;'plan' = 'pro'&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no index: 480ms (seq scan)&lt;/li&gt;
&lt;li&gt;GIN (option A): 220ms (bitmap scan, recheck) — GIN is generic, returns false positives, needs the recheck step&lt;/li&gt;
&lt;li&gt;GIN path_ops (option B): doesn't work for &lt;code&gt;-&amp;gt;&amp;gt;&lt;/code&gt; extraction. ignored.&lt;/li&gt;
&lt;li&gt;BTREE on &lt;code&gt;(attributes-&amp;gt;&amp;gt;'plan')&lt;/code&gt; (option C): &lt;strong&gt;8ms&lt;/strong&gt; (index scan, no recheck) — winner by 27x&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Q2: &lt;code&gt;WHERE attributes @&amp;gt; '{"feature_flags": ["beta_search"]}'&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no index: 510ms (seq scan)&lt;/li&gt;
&lt;li&gt;GIN: 14ms (bitmap scan) — solid&lt;/li&gt;
&lt;li&gt;GIN path_ops: &lt;strong&gt;6ms&lt;/strong&gt; — winner. path_ops loses other operators but is 2.3x faster for &lt;code&gt;@&amp;gt;&lt;/code&gt; specifically because it stores hashed paths only.&lt;/li&gt;
&lt;li&gt;BTREE on extracted: cant express this query. n/a.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Q3: &lt;code&gt;WHERE plan='pro' AND country='ID'&lt;/code&gt;&lt;/strong&gt; (compound on JSONB)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;BTREE on plan only: 120ms (index on plan, then filter for country in heap)&lt;/li&gt;
&lt;li&gt;BTREE composite on &lt;code&gt;((attributes-&amp;gt;&amp;gt;'plan'), (attributes-&amp;gt;&amp;gt;'country'))&lt;/code&gt;: &lt;strong&gt;3ms&lt;/strong&gt; — winner.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;if you find yourself running compound queries on JSONB scalars, the COMPOSITE BTREE on extracted columns beats every other option for that exact shape.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q4: &lt;code&gt;WHERE attributes ? 'preferences'&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GIN (option A): 18ms — works because the &lt;code&gt;?&lt;/code&gt; operator is supported&lt;/li&gt;
&lt;li&gt;GIN path_ops: &lt;strong&gt;doesnt work&lt;/strong&gt; for the &lt;code&gt;?&lt;/code&gt; operator. path_ops only indexes &lt;code&gt;@&amp;gt;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;BTREE on extracted: cant express. n/a.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  the actual lesson
&lt;/h2&gt;

&lt;p&gt;GIN with default ops is the &lt;strong&gt;safe default&lt;/strong&gt; if you don't know your query shape yet. handles &lt;code&gt;@&amp;gt;&lt;/code&gt;, &lt;code&gt;?&lt;/code&gt;, &lt;code&gt;?|&lt;/code&gt;, &lt;code&gt;?&amp;amp;&lt;/code&gt;. flexibility tax: ~2x slower than path_ops on &lt;code&gt;@&amp;gt;&lt;/code&gt;, way slower than BTREE on &lt;code&gt;-&amp;gt;&amp;gt;&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;GIN with &lt;code&gt;jsonb_path_ops&lt;/code&gt; is the &lt;strong&gt;specialist&lt;/strong&gt; if you ONLY use &lt;code&gt;@&amp;gt;&lt;/code&gt; (contains). most apps doing feature flag / array-contains queries fit here.&lt;/p&gt;

&lt;p&gt;BTREE on extracted scalar (&lt;code&gt;((attributes-&amp;gt;&amp;gt;'X'))&lt;/code&gt;) is &lt;strong&gt;always the fastest&lt;/strong&gt; for that specific scalar predicate. price: one BTREE index per accessed scalar field.&lt;/p&gt;

&lt;p&gt;compound BTREE on multiple extracted scalars is the &lt;strong&gt;god mode&lt;/strong&gt; for queries shaped like Q3. costs you write-amplification but reads are 40-100x faster than alternatives.&lt;/p&gt;

&lt;h2&gt;
  
  
  what i actually use in production
&lt;/h2&gt;

&lt;p&gt;after benchmarking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;composite BTREE on the 2 high-cardinality scalars (&lt;code&gt;plan&lt;/code&gt;, &lt;code&gt;country&lt;/code&gt;) that show up in 80% of WHERE clauses → 3ms for the hot path&lt;/li&gt;
&lt;li&gt;one GIN path_ops index for the &lt;code&gt;@&amp;gt;&lt;/code&gt; feature_flags queries → 6ms for flag lookups&lt;/li&gt;
&lt;li&gt;no general-purpose GIN. the &lt;code&gt;?&lt;/code&gt; operator queries (Q4) are rare; we accept seq scan for them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;trade-off: 2 BTREE + 1 GIN = 3 indexes on a single JSONB column. write amplification per row: ~30% slower INSERTs vs no indexes. acceptable because writes are 1% of our traffic.&lt;/p&gt;

&lt;h2&gt;
  
  
  the rule of thumb i give other engineers
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;if you know the query shape, use a BTREE on the extracted scalar. if you don't, use a default GIN. only add &lt;code&gt;jsonb_path_ops&lt;/code&gt; if you've measured + you ONLY use &lt;code&gt;@&amp;gt;&lt;/code&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;GIN-on-everything is the lazy answer that costs you 20-50x on simple equality queries. and most JSONB queries in business code are simple equality queries.&lt;/p&gt;

&lt;h2&gt;
  
  
  one more thing
&lt;/h2&gt;

&lt;p&gt;the BTREE on &lt;code&gt;(attributes-&amp;gt;&amp;gt;'plan')&lt;/code&gt; index works ONLY if your query writes the predicate as &lt;code&gt;attributes-&amp;gt;&amp;gt;'plan' = 'pro'&lt;/code&gt;. if you write &lt;code&gt;(attributes-&amp;gt;'plan')::text = '"pro"'&lt;/code&gt; (note the double quote inside the literal — JSONB vs text comparison), it won't use the index. lost an hour to this once. expression on the LEFT side of &lt;code&gt;=&lt;/code&gt; must match the indexed expression exactly.&lt;/p&gt;

</description>
      <category>postgressql</category>
      <category>database</category>
      <category>performance</category>
      <category>json</category>
    </item>
    <item>
      <title>removed 3 vite plugins, my build dropped 4 seconds. heres which</title>
      <dc:creator>BaoDev Studio</dc:creator>
      <pubDate>Thu, 21 May 2026 15:09:10 +0000</pubDate>
      <link>https://forem.com/baodev-studio/removed-3-vite-plugins-my-build-dropped-4-seconds-heres-which-58bp</link>
      <guid>https://forem.com/baodev-studio/removed-3-vite-plugins-my-build-dropped-4-seconds-heres-which-58bp</guid>
      <description>&lt;p&gt;audited my react starter last week and found 8 vite plugins. three of them were doing more harm than good. removed all three. dev server start went from 6.3s to 2.4s. production build from 18s to 14s.&lt;/p&gt;

&lt;h2&gt;
  
  
  vite-plugin-pwa
&lt;/h2&gt;

&lt;p&gt;added by &lt;code&gt;npm create vite@latest&lt;/code&gt; in some templates i copy-pasted. stayed because nobody questioned it. i wasnt shipping a PWA — never. every build generated service worker + manifest artifacts i never deployed.&lt;/p&gt;

&lt;p&gt;cost: ~1.2s added to prod build. also fragmented CSS chunks because the precaching list was rebuilt on every change.&lt;/p&gt;

&lt;p&gt;removed it. nothing replaces it. if i need a PWA later, ill add it back for that project specifically.&lt;/p&gt;

&lt;h2&gt;
  
  
  vite-plugin-eslint
&lt;/h2&gt;

&lt;p&gt;this was the easy cut. runs ESLint on every change during dev. on a 200-file project, each save triggers ~800ms of ESLint processing. the overlay also masked real vite errors when both fired together — twice in one week, i spent 20 minutes hunting an HMR issue that was hidden behind a stale ESLint overlay.&lt;/p&gt;

&lt;p&gt;cost: ~2.1s added to dev start, plus the cumulative drag.&lt;/p&gt;

&lt;p&gt;ESLint now runs in &lt;code&gt;pnpm lint&lt;/code&gt; (separate command), in CI on every PR, and in a pre-commit hook on staged files only. dev server doesnt know ESLint exists. ive lost zero linting coverage.&lt;/p&gt;

&lt;h2&gt;
  
  
  unplugin-auto-import
&lt;/h2&gt;

&lt;p&gt;controversial one. this plugin auto-imports useState, useEffect, ref, etc. without &lt;code&gt;import&lt;/code&gt; statements. saves typing. felt like a quality-of-life win.&lt;/p&gt;

&lt;p&gt;removed it because every imported symbol becomes a magic string. when a new contributor opened a component file, they couldnt grep the file or use IDE Go-to-Definition reliably — the plugin generated a &lt;code&gt;.d.ts&lt;/code&gt; with global declarations that the IDE handled inconsistently.&lt;/p&gt;

&lt;p&gt;cost: only ~0.4s on dev startup. but the readability tax was real.&lt;/p&gt;

&lt;p&gt;explicit imports now. the first 3-5 lines of every file are imports. file gets longer. but anyone — including future-me at 2am — can know exactly where every symbol comes from.&lt;/p&gt;

&lt;h2&gt;
  
  
  audit pattern that worked
&lt;/h2&gt;

&lt;p&gt;vite plugin ecosystem rewards "add this, get convenience for free." each plugin runs on every change. convenience compounds in build time.&lt;/p&gt;

&lt;p&gt;list every plugin in &lt;code&gt;vite.config.js&lt;/code&gt;. ask: "if i removed this today, would i notice within a week?"&lt;/p&gt;

&lt;p&gt;answers split cleanly. plugins still in my config: react, react-router, vite-imagetools. everything else is next candidate.&lt;/p&gt;

</description>
      <category>vite</category>
      <category>performance</category>
      <category>react</category>
      <category>webdev</category>
    </item>
    <item>
      <title>an honest list of what AI agents cant do in 2026</title>
      <dc:creator>BaoDev Studio</dc:creator>
      <pubDate>Thu, 21 May 2026 12:38:53 +0000</pubDate>
      <link>https://forem.com/baodev-studio/an-honest-list-of-what-ai-agents-cant-do-in-2026-1nb0</link>
      <guid>https://forem.com/baodev-studio/an-honest-list-of-what-ai-agents-cant-do-in-2026-1nb0</guid>
      <description>&lt;p&gt;i run 35 specialized claude code agents across my projects. most of whats written about AI agents in 2026 is either marketing (look how much they can do) or doom (look how much theyll replace). both miss the practical layer: where do these agents consistently fail, even with the best prompts, the best context, the best tools?&lt;/p&gt;

&lt;p&gt;this is that list. drawn from running these agents across 3 production codebases for the last 6 months. specific failures, not abstract concerns.&lt;/p&gt;

&lt;h2&gt;
  
  
  judgment under partial information
&lt;/h2&gt;

&lt;p&gt;biggest single category. AI agents fail when the right action requires waiting, choosing not to act, or saying "i need more info."&lt;/p&gt;

&lt;p&gt;client message: "can you make the dashboard faster?" agent reads the request, looks at the dashboard code, identifies three optimization opportunities, starts implementing. senior reads the same message, asks: "faster for whom? on what data volume? slow on initial load or on filter operations? whats the SLA?"&lt;/p&gt;

&lt;p&gt;the agents confident execution costs hours of work that might solve the wrong problem. the seniors pause costs 5 minutes of clarification. the pause is the right move 80% of the time.&lt;/p&gt;

&lt;p&gt;ive tried building this into agent prompts ("ask 3 clarifying questions before starting"). it works sometimes. but agents ask FORMULAIC questions, not THE question that disambiguates. knowing which question to ask is itself judgment under partial information.&lt;/p&gt;

&lt;p&gt;this manifests as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deciding when a feature is done vs needs another iteration&lt;/li&gt;
&lt;li&gt;picking which 1-2 of 5 AI-generated draft replies are worth sending&lt;/li&gt;
&lt;li&gt;STOPPING the addition of flags/options to a configurable system&lt;/li&gt;
&lt;li&gt;subtractive thinking — "remove this rather than build around it"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;concrete failure case: building a multi-tenant data isolation layer for a saas project last quarter. agent kept adding configuration flags for edge cases ("what if a tenant wants flag A but not B?"). by flag #7 the system was unmaintainable. i deleted 5 flags and replaced them with a default-secure single mode. config space went from 128 combinations to 4. senior judgment was "stop adding, start removing."&lt;/p&gt;

&lt;p&gt;common thread: right move is restraint. agents are calibrated for action.&lt;/p&gt;

&lt;h2&gt;
  
  
  reading the codebase context thats not in the prompt
&lt;/h2&gt;

&lt;p&gt;agents are good at search. agents are bad at synthesis from large context.&lt;/p&gt;

&lt;p&gt;concrete: asked an agent to refactor a slow 40-line function. the rewrite was technically correct. but the original contained &lt;code&gt;try/catch&lt;/code&gt; with comment &lt;code&gt;// don't remove — handles malformed JSON from legacy webhook v1&lt;/code&gt;. the rewrite "cleaned up" that try/catch.&lt;/p&gt;

&lt;p&gt;agent saw 40 lines. actual scope was the whole webhook chain, the legacy contract, the production data that occasionally hits the malformed path. none of that was in the prompt.&lt;/p&gt;

&lt;p&gt;deployed the rewrite to staging. crashed within 6 hours when the daily webhook v1 batch fired. rolled back, restored the original try/catch, added a regression test that explicitly fires malformed JSON. lesson cost ~3 hours and a degraded staging window.&lt;/p&gt;

&lt;p&gt;this isnt fixed by more context tokens. the context that matters is implicit — "this comment was load-bearing", "this duplication was intentional", "this naming convention was chosen for a reason". agent reads the lines but doesnt have the memory of why theyre there.&lt;/p&gt;

&lt;p&gt;related: agents over-abstract. asked one to extract a pattern shared by 3 functions. it produced a beautiful generic helper that the 4th similar function — written 2 weeks later — could never quite fit. the 3 specific implementations were better than the 1 generic abstraction. agent has no "predict the 4th case" capability.&lt;/p&gt;

&lt;h2&gt;
  
  
  reading PEOPLE
&lt;/h2&gt;

&lt;p&gt;this one i underestimated. agents are bad at reading tone in human messages.&lt;/p&gt;

&lt;p&gt;specifics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;client going silent for 3 days is a strong signal — possibly losing interest, possibly stuck on internal decision, possibly got a competing quote. agents read silence as "no update yet" and continue per plan.&lt;/li&gt;
&lt;li&gt;"can we add X?" (genuine question) vs "can we add X?" (testing whether u'll pushback on scope creep) is invisible to agents. senior knows from timing, prior conversations, how it was phrased.&lt;/li&gt;
&lt;li&gt;tone for difficult convos — scope-creep pushback, missed-deadline notes, refund discussions — agent versions are either too soft (gets walked over) or too corporate (loses earned trust).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;specific exchange from last month. a client asked for a "small change" 6 weeks into a project. agent drafted a polite, structured reply explaining the change-request process. i sent something different: "sure, let me think about whether this needs a CR or if it fits the current scope — give me 24 hours."&lt;/p&gt;

&lt;p&gt;agent reply was formally correct. actual right reply was warmer + bought thinking time. the relationship needed the warmth more than it needed the formality.&lt;/p&gt;

&lt;p&gt;i now never let agents send client comms without human review. tone-reading is unreliable enough that the risk isnt worth it.&lt;/p&gt;

&lt;h2&gt;
  
  
  eval / judgment about correctness
&lt;/h2&gt;

&lt;p&gt;this is where i most expected agents to excel + where theyre most disappointing.&lt;/p&gt;

&lt;p&gt;building LLM-based products requires evals. what does "good" mean? what threshold do we ship at? which test cases matter? upstream of implementation, heavily judgment-based.&lt;/p&gt;

&lt;p&gt;agents do badly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;generate exhaustive test cases but cant tell me which 5 matter most for product viability&lt;/li&gt;
&lt;li&gt;measure whats measurable (BLEU, semantic similarity, response length) instead of what matters (does the customer find this helpful)&lt;/li&gt;
&lt;li&gt;cant design human-in-the-loop eval samples — recommend either fully-automated or fully-manual, never the right hybrid&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;specific case: building the support agent eval harness for the e-commerce project (last quarter). agent suggested measuring response accuracy via semantic similarity to a "golden answer" set. that would have been wrong in 2 ways. first, the golden answers themselves were judgment calls. second, the actual metric that mattered was "did the customer ask a follow-up that suggests they were confused." the eval design needed real customer conversation data + human classification of "this was helpful" / "this missed." cant be done from training data.&lt;/p&gt;

&lt;p&gt;eval-design is the failure i expect least progress on in 2026. requires judgment about what humans value. not in training data.&lt;/p&gt;

&lt;h2&gt;
  
  
  bonus failure: estimating real-world performance
&lt;/h2&gt;

&lt;p&gt;agent says "this query should be fast" based on indexed columns. in production with cold cache + network jitter + concurrent load, its 800ms slow. agents are bad at production reality because they reason from the code, not from operational behavior.&lt;/p&gt;

&lt;p&gt;ive seen agents recommend caching strategies that look correct on paper, but ignore the cache invalidation cost when the cached data changes 50x/day. or recommend "just add an index" without thinking about write amplification on a write-heavy table.&lt;/p&gt;

&lt;p&gt;senior knows "this looks fast but will be slow in production for THESE reasons" because senior has seen production reality. agent has seen the docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  what this means in practice
&lt;/h2&gt;

&lt;p&gt;i keep the 35 agents because the 70% they do well saves real time. but i architect the workflow so the 30% they cant do has explicit human handoffs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"should we start?" decision (judgment under partial info): human only&lt;/li&gt;
&lt;li&gt;cross-codebase refactors where load-bearing weirdness lives: human-driven, agents as implementation tools&lt;/li&gt;
&lt;li&gt;client-facing communication: human review minimum, often human-authored&lt;/li&gt;
&lt;li&gt;eval design and threshold-setting: human-authored, agents run the harness&lt;/li&gt;
&lt;li&gt;production-readiness assessments: human walks through the operational model, agent helps document it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;the hype frames this as "AI will do everything." the doom frames this as "AI will replace everything." the practical layer is neither: AI does 70% of any workflow that doesnt require judgment under uncertainty. the 30% that does is exactly where senior engineers earn their living.&lt;/p&gt;

&lt;p&gt;if ur building agent systems in 2026, plan the workflow around what they cant do, not what they can. the wont-do list is more load-bearing than the will-do list.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>claude</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>vite HMR is silently the reason ur laptop fan wont stop</title>
      <dc:creator>BaoDev Studio</dc:creator>
      <pubDate>Thu, 21 May 2026 12:19:02 +0000</pubDate>
      <link>https://forem.com/baodev-studio/vite-hmr-is-silently-the-reason-ur-laptop-fan-wont-stop-1h0i</link>
      <guid>https://forem.com/baodev-studio/vite-hmr-is-silently-the-reason-ur-laptop-fan-wont-stop-1h0i</guid>
      <description>&lt;p&gt;ur working on a react app. ur fan kicks on. u assume chrome is the culprit, or slack, or the LSP. close tabs, kill apps, fan keeps spinning.&lt;/p&gt;

&lt;p&gt;actual cause in 6 out of 10 projects i audit on macOS: vite's hot module replacement (HMR) doing way more work than needed. default config keeps websocket connections alive, polls file changes, rebuilds bundles aggressively. on a multi-monitor M-series macbook this lands as a consistent fan-on state even when ur not typing.&lt;/p&gt;

&lt;h2&gt;
  
  
  confirm its HMR first
&lt;/h2&gt;

&lt;p&gt;quit ur dev server. wait 30 seconds. does the fan calm down?&lt;/p&gt;

&lt;p&gt;yes → its HMR. no → look elsewhere (chrome tabs, docker, slack helper).&lt;/p&gt;

&lt;h2&gt;
  
  
  one-line fix
&lt;/h2&gt;

&lt;p&gt;vite respects &lt;code&gt;HMR&lt;/code&gt; env. flip it off when u dont need live reload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;HMR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;off pnpm dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or in &lt;code&gt;vite.config.js&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;hmr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;HMR&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;off&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with HMR off, vite still serves files but stops the websocket + file-watcher loop. CPU drops 30-50% on my M2 air. fan stops in 60 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  when to keep it on
&lt;/h2&gt;

&lt;p&gt;active feature dev: yes, u want HMR. saves 5-10s per save.&lt;/p&gt;

&lt;p&gt;code reviews, doc reading, screencast watching while dev server is technically running but ur not editing: HMR is pure overhead. run with &lt;code&gt;HMR=off&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;i added this as a package.json script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scripts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"dev"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"dev:cold"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HMR=off vite"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;then &lt;code&gt;pnpm dev:cold&lt;/code&gt; for any session where im not actively editing the frontend.&lt;/p&gt;

&lt;h2&gt;
  
  
  why this is silent
&lt;/h2&gt;

&lt;p&gt;vites docs treat HMR as always-on. perf cost is documented in their FAQ but not the getting-started flow. most react tutorials install vite + dont mention the HMR escape hatch. result: devs ship apps assuming "vite is fast" without realizing their dev session keeps the fan on 8 hours a day.&lt;/p&gt;

&lt;h2&gt;
  
  
  net
&lt;/h2&gt;

&lt;p&gt;across 4 projects, this default flip moved my macbook battery from 4h to 6h on a typical dev day. real number, not a guess.&lt;/p&gt;

&lt;p&gt;took 30 seconds to add. pays back every session 🤷&lt;/p&gt;

</description>
      <category>vite</category>
      <category>webdev</category>
      <category>performance</category>
      <category>react</category>
    </item>
    <item>
      <title>How a one-person studio writes 35 Claude Code agents that don't fight each other</title>
      <dc:creator>BaoDev Studio</dc:creator>
      <pubDate>Wed, 20 May 2026 07:27:14 +0000</pubDate>
      <link>https://forem.com/baodev-studio/how-a-one-person-studio-writes-35-claude-code-agents-that-dont-fight-each-other-5cdg</link>
      <guid>https://forem.com/baodev-studio/how-a-one-person-studio-writes-35-claude-code-agents-that-dont-fight-each-other-5cdg</guid>
      <description>&lt;p&gt;Last Friday afternoon the &lt;code&gt;quality-gate&lt;/code&gt; agent reviewed a PR from &lt;code&gt;backend-developer&lt;/code&gt; and rejected it with a 312-word critique. Fair feedback. The PR went back, &lt;code&gt;backend-developer&lt;/code&gt; rewrote three functions, re-submitted. &lt;code&gt;quality-gate&lt;/code&gt; rejected it again. Same 312-word critique. Same three functions.&lt;/p&gt;

&lt;p&gt;I was watching this and realized &lt;code&gt;backend-developer&lt;/code&gt; had been told to "improve test coverage" by &lt;code&gt;quality-gate&lt;/code&gt; in the previous turn, had written tests, and &lt;code&gt;quality-gate&lt;/code&gt;'s second pass was now complaining the tests existed because they overlapped with what &lt;code&gt;backend-developer&lt;/code&gt; had earlier been instructed to skip. The agents were in a loop. Neither was wrong. Both were operating on the spec they had been handed.&lt;/p&gt;

&lt;p&gt;This is what happens when you let 35 specialized agents act on the same codebase without rules. They don't fight humans. They fight each other.&lt;/p&gt;

&lt;h2&gt;
  
  
  The orchestration problem
&lt;/h2&gt;

&lt;p&gt;I keep 35 agents in &lt;code&gt;~/.claude/agents/&lt;/code&gt;. They include &lt;code&gt;backend-developer&lt;/code&gt;, &lt;code&gt;frontend-developer&lt;/code&gt;, &lt;code&gt;postgres-pro&lt;/code&gt;, &lt;code&gt;golang-pro&lt;/code&gt;, &lt;code&gt;quality-gate&lt;/code&gt;, &lt;code&gt;flow-architect&lt;/code&gt;, &lt;code&gt;security-engineer&lt;/code&gt;, &lt;code&gt;test-automator&lt;/code&gt;, &lt;code&gt;client-communicator&lt;/code&gt;, &lt;code&gt;cfo&lt;/code&gt;, &lt;code&gt;cto&lt;/code&gt;, &lt;code&gt;ceo&lt;/code&gt;, &lt;code&gt;inbox-monitor&lt;/code&gt;, and 22 others. Most invocations involve 2 to 4 of them in a chain. About 1 in 7 sessions hits a real conflict like the one above.&lt;/p&gt;

&lt;p&gt;The problem is not that any agent is wrong. The problem is that 35 specialists with 35 specs will pull the codebase in 35 directions if you do not constrain who decides what.&lt;/p&gt;

&lt;p&gt;This is a writeup of the three patterns that mostly work, the three that do not, and the one problem I have not solved.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three patterns that work
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Single source of truth, per concern, written down
&lt;/h3&gt;

&lt;p&gt;For every concern that more than one agent touches, there is exactly one file that owns the answer.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CLAUDE.md&lt;/code&gt; at the project root owns: build commands, deploy folder convention, test framework. Every agent reads this. None of them argue with it.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;masterings/secure-code-patterns.md&lt;/code&gt; owns the 40 rules about input validation, secrets handling, SQL safety. &lt;code&gt;security-engineer&lt;/code&gt; and &lt;code&gt;quality-gate&lt;/code&gt; both reference the same file. They cannot disagree about a pattern because they are reading the same checklist.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;FreelanceOS/baseline-form.md&lt;/code&gt; owns the 22 test cases that any form must pass. &lt;code&gt;frontend-developer&lt;/code&gt; implements them. &lt;code&gt;quality-gate&lt;/code&gt; verifies them. The list of 22 is the contract.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The original loop I described above happened because there was no source of truth for "what's the minimum test coverage for backend code". &lt;code&gt;quality-gate&lt;/code&gt; had its opinion. &lt;code&gt;backend-developer&lt;/code&gt; had its opinion. Once I wrote the rule into &lt;code&gt;CLAUDE.md&lt;/code&gt; ("statement coverage minimum 85% on new code, integration tests over unit tests for DB-touching paths"), the loop stopped on the next session.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Explicit ownership: one agent per task class
&lt;/h3&gt;

&lt;p&gt;If two agents could plausibly own a task, neither does. I assign it to a third agent who is a layer up.&lt;/p&gt;

&lt;p&gt;Concrete: who owns Postgres performance? Could be &lt;code&gt;backend-developer&lt;/code&gt; (the SQL is part of the API code). Could be &lt;code&gt;postgres-pro&lt;/code&gt; (it is a database concern). If I dispatch a slow-query investigation to &lt;code&gt;backend-developer&lt;/code&gt;, the answer comes back with application-layer caching. If I dispatch it to &lt;code&gt;postgres-pro&lt;/code&gt;, the answer comes back with an index rewrite. Both are correct. Neither is the right level.&lt;/p&gt;

&lt;p&gt;The fix is to dispatch the question to &lt;code&gt;flow-architect&lt;/code&gt; first. &lt;code&gt;flow-architect&lt;/code&gt; reads the trace, decides whether this is an app-layer fix or a database-layer fix, and then dispatches the specific work to the right specialist with a clear scope. The specialists never fight because they are receiving non-overlapping work.&lt;/p&gt;

&lt;p&gt;This is a router pattern, not a coordination pattern. The router is itself an agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Locked context per agent invocation
&lt;/h3&gt;

&lt;p&gt;Before dispatching an agent that will write to the codebase, I cache the relevant context in Redis with an explicit key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;redis-cli SET &lt;span class="s2"&gt;"agent:ctx:backend-developer:2026-05-20-feature-x"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;current-task.md schema.sql relevant-files.txt&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; EX 3600
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent reads from that key at the start of its run. If a parallel agent dispatch is happening, they see the same frozen context. The thing that used to fight them is the floor moving while they walked on it. The thing that stops them fighting is the floor not moving.&lt;/p&gt;

&lt;p&gt;This pattern came out of a 2026-04 incident where &lt;code&gt;quality-gate&lt;/code&gt; ran simultaneously with &lt;code&gt;refactor-agent&lt;/code&gt;, and the file &lt;code&gt;refactor-agent&lt;/code&gt; was rewriting got reviewed by &lt;code&gt;quality-gate&lt;/code&gt; mid-rewrite. &lt;code&gt;quality-gate&lt;/code&gt; flagged the half-finished code as broken. It was. Once locked-context was enforced, that class of bug disappeared.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three patterns that do NOT work
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. "Let the agents negotiate"
&lt;/h3&gt;

&lt;p&gt;I tried this for two weeks. &lt;code&gt;backend-developer&lt;/code&gt; proposes, &lt;code&gt;quality-gate&lt;/code&gt; reviews, they go back and forth until they agree. In theory clean. In practice the agents do not negotiate. They restate their original position more politely. After three or four turns, one of them gives in not because the argument was better but because it was running out of context window.&lt;/p&gt;

&lt;p&gt;The decision quality from "exhausted agent gives in" is worse than the decision quality from "router agent decides upfront". Negotiation is the expensive way to lose.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. "Run multiple agents in parallel and pick the best output"
&lt;/h3&gt;

&lt;p&gt;This sounds safer. Run &lt;code&gt;backend-developer-A&lt;/code&gt;, &lt;code&gt;backend-developer-B&lt;/code&gt;, &lt;code&gt;backend-developer-C&lt;/code&gt; in parallel, take the version with the highest &lt;code&gt;quality-gate&lt;/code&gt; score.&lt;/p&gt;

&lt;p&gt;Three problems. Token cost is 3x. Quality is unbounded because the three runs share most of the same biases (they are the same agent reading the same spec; they tend to converge on similar answers, not diverge). And the picker becomes a single point of failure. If &lt;code&gt;quality-gate&lt;/code&gt; has a blind spot, all three "winners" share it.&lt;/p&gt;

&lt;p&gt;I keep one specialist per task. Cheaper. The output quality difference vs the parallel-and-pick version is within noise on the projects I run.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. "Agent voting"
&lt;/h3&gt;

&lt;p&gt;Same problem as parallel-and-pick, with extra coordination cost. Skipped after one week.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I was wrong about
&lt;/h2&gt;

&lt;p&gt;I assumed that as agent count grew the coordination overhead would scale linearly. More agents = more rules to write = more fights to mediate.&lt;/p&gt;

&lt;p&gt;The real curve has a kink. Up to about 10 agents, a flat dispatcher works. You hold the dispatch logic in your head and assign work by intuition. Above 12 agents you cannot hold the roster in working memory anymore, and a flat dispatcher loses to a routing agent. So coordination overhead does not go up linearly with agent count. It is roughly flat from 1 to 10, then steps up at the routing-agent threshold, then is roughly flat again from 12 to whatever ceiling.&lt;/p&gt;

&lt;p&gt;I jumped from 8 to 22 to 35 agents in three months. The middle period was painful. The jump from 22 to 35 was much easier because the routing infrastructure was already there.&lt;/p&gt;

&lt;h2&gt;
  
  
  The one problem I have not solved
&lt;/h2&gt;

&lt;p&gt;Agents occasionally regress each other's work. A new instance of &lt;code&gt;backend-developer&lt;/code&gt;, dispatched two weeks after the last one, sometimes deletes a workaround the previous instance had added (with no comment because comments are noise). The trace looks like the workaround "appeared from nowhere" and the new agent removes it as dead code. The workaround was load-bearing.&lt;/p&gt;

&lt;p&gt;I have partially mitigated this with structured commit messages that explain why a workaround exists, and by making the test that breaks if the workaround is removed. But the gap is real. The agents do not yet read git history before deletion. The discipline lives in the prompt and the tests, and prompts get truncated.&lt;/p&gt;

&lt;p&gt;If I solve this it will probably be by making one of the agents read &lt;code&gt;git blame&lt;/code&gt; on any line it touches before recommending deletion. That is on the list.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shape that emerged
&lt;/h2&gt;

&lt;p&gt;What looks like 35 independent agents is, in practice, a layered system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;1 router&lt;/strong&gt; (&lt;code&gt;flow-architect&lt;/code&gt;) decides what kind of work a task is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;5 to 7 specialists per layer&lt;/strong&gt; (backend, frontend, DB, security, devops, testing) execute scoped work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1 reviewer&lt;/strong&gt; (&lt;code&gt;quality-gate&lt;/code&gt;) verifies against the agreed checklist.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Three orthogonal C-levels&lt;/strong&gt; (&lt;code&gt;cto&lt;/code&gt;, &lt;code&gt;cfo&lt;/code&gt;, &lt;code&gt;ceo&lt;/code&gt;) handle cross-cutting strategy questions that should not block the engineering loop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The remaining 18 to 20&lt;/strong&gt; are domain agents that are dispatched rarely (e.g., &lt;code&gt;postgres-pro&lt;/code&gt; only for hard DB problems, &lt;code&gt;tls-config-agent&lt;/code&gt; only when certs come up).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern that prevents fights is not "more rules". It is "fewer overlapping responsibilities, an explicit router, a frozen context per dispatch". Three things written down. Most of the conflicts you would otherwise spend a week debugging do not happen.&lt;/p&gt;

&lt;p&gt;If you are starting an agent stack today, the order I would build it in is: write &lt;code&gt;CLAUDE.md&lt;/code&gt; first, then 5 specialists, then one router, then add the rest. Trying to coordinate 35 specialists without a router and a written source of truth is the slow way to learn the same lesson.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>productivity</category>
      <category>devjournal</category>
    </item>
    <item>
      <title>AI-assisted development cost breakdown — real numbers from 3 projects</title>
      <dc:creator>BaoDev Studio</dc:creator>
      <pubDate>Mon, 18 May 2026 13:27:40 +0000</pubDate>
      <link>https://forem.com/baodev-studio/ai-assisted-development-cost-breakdown-real-numbers-from-3-projects-12ak</link>
      <guid>https://forem.com/baodev-studio/ai-assisted-development-cost-breakdown-real-numbers-from-3-projects-12ak</guid>
      <description>&lt;p&gt;Most of what's written about AI-assisted development cost is theoretical. "Could save 50%". "Up to 10x faster". "Game-changing efficiency". Useful for slide decks, useless for budgeting.&lt;/p&gt;

&lt;p&gt;Here are three projects from the last few months with the actual numbers. Hours billed, tokens consumed, what AI agents did, what they did not do, and what the final invoice looked like. Names are abstracted (NDA-protected) but the numbers are real.&lt;/p&gt;

&lt;h2&gt;
  
  
  Project A: SaaS dashboard MVP
&lt;/h2&gt;

&lt;p&gt;The ask: a B2B SaaS dashboard for a small ops team. Backend in Go, frontend in Next.js, Postgres for data, Stripe for billing. Standard CRUD plus 8 reports plus a roles-and-permissions layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  What was quoted
&lt;/h3&gt;

&lt;p&gt;Quoted at 4 weeks. The discovery call surfaced 22 distinct features. I priced it as a fixed-rate engagement based on previous similar projects, with a stated ±20% accuracy band.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quoted price: $9,800&lt;/li&gt;
&lt;li&gt;Quoted timeline: 28 days&lt;/li&gt;
&lt;li&gt;Quoted hours (internal estimate): 78 hours&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What was actually delivered
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Delivered: 23 days (5 days under quote)&lt;/li&gt;
&lt;li&gt;Actual hours billed: 71 hours&lt;/li&gt;
&lt;li&gt;Claude Sonnet token cost (development only, not runtime): $141&lt;/li&gt;
&lt;li&gt;Lines of application code shipped: ~14,200&lt;/li&gt;
&lt;li&gt;Lines of test code shipped: ~4,400 (statement coverage: 91%)&lt;/li&gt;
&lt;li&gt;Bugs found in client UAT: 3 (1 critical, 2 cosmetic)&lt;/li&gt;
&lt;li&gt;Critical bug fix time: 4 hours including regression test&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Where the time actually went
&lt;/h3&gt;

&lt;p&gt;Of the 71 hours billed, the breakdown was roughly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Senior architecture decisions and spec refinement: ~9 hours&lt;/li&gt;
&lt;li&gt;Agent-supervised feature build (CRUD, reports, billing, auth): ~38 hours&lt;/li&gt;
&lt;li&gt;Direct senior code on the harder bits (roles-and-permissions logic, Stripe webhook retry, multi-tenant data isolation): ~14 hours&lt;/li&gt;
&lt;li&gt;Test writing and review: ~6 hours&lt;/li&gt;
&lt;li&gt;Deployment, monitoring setup, runbook: ~4 hours&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agents handled the boring 70% well. The senior owned the 30% that required real judgment.&lt;/p&gt;

&lt;h3&gt;
  
  
  What this teaches about pricing
&lt;/h3&gt;

&lt;p&gt;A traditional freelance team at $75-100/hr would have charged ~$10,000-15,000 for this scope and delivered in 5-7 weeks. The agent-assisted version landed at $9,800, in 23 days, with 91% test coverage. The savings ratio is real but the marginal token cost ($141) is so small it does not even show up on the invoice as a line item. Clients care about the total number and the delivery date, not the token accounting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Project B: Bug fix sprint on a Go service
&lt;/h2&gt;

&lt;p&gt;The ask: an existing production Go service was crashing under specific concurrent load. The team had been hunting the bug for two weeks without success. The need was triage, fix, regression test, and a runbook so it would not happen again.&lt;/p&gt;

&lt;h3&gt;
  
  
  What was quoted
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Quoted price: $1,200 fixed&lt;/li&gt;
&lt;li&gt;Quoted timeline: 2 days&lt;/li&gt;
&lt;li&gt;Quoted hours (internal estimate): 14 hours&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This was priced as a sprint because the bug was contained and the team had already done the partial reproduction work.&lt;/p&gt;

&lt;h3&gt;
  
  
  What was actually delivered
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Delivered: 6 hours total elapsed&lt;/li&gt;
&lt;li&gt;Actual hours billed: 6 hours&lt;/li&gt;
&lt;li&gt;Claude Sonnet token cost: $8.30&lt;/li&gt;
&lt;li&gt;Outcome: race condition in a worker-pool goroutine fixed; regression test added; root cause documented in the team runbook&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Where the time actually went
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Reading the existing code and the team's investigation notes: ~1.5 hours (human only; agents are bad at "read this 8000-line repo and tell me what's wrong")&lt;/li&gt;
&lt;li&gt;Running the race detector on suspicious paths (&lt;code&gt;go test -race -count=10 ./pool/...&lt;/code&gt;): ~0.5 hours&lt;/li&gt;
&lt;li&gt;Reproducing the bug locally with a stress harness: ~1 hour&lt;/li&gt;
&lt;li&gt;Writing the fix (mutex around the shared resource that was being mutated outside the worker's own slot): ~0.5 hours&lt;/li&gt;
&lt;li&gt;Writing a deterministic regression test for the race: ~1 hour&lt;/li&gt;
&lt;li&gt;Writing the runbook entry: ~1.5 hours&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What this teaches about pricing
&lt;/h3&gt;

&lt;p&gt;Bug fix sprints are the wrong place to apply heavy AI assistance. They are 80% reading and 20% writing. The reading is human work; agents cannot reliably hold 8,000 lines of context and reason about the system's invariants. The writing portion (the fix itself, the regression test, the runbook) is where AI helps, but it is the smaller half.&lt;/p&gt;

&lt;p&gt;Pricing reality: I bill bug fix sprints close to a senior hourly rate even though the elapsed hours are short. The work that makes the fix possible is the years of experience that lets a senior glance at a stack trace and know to look at the worker pool. Token cost? $8. The studio's overhead for that hour of focused work? Multiples of that.&lt;/p&gt;

&lt;h2&gt;
  
  
  Project C: LLM-powered support agent for an e-commerce site
&lt;/h2&gt;

&lt;p&gt;The ask: an e-commerce business wanted an AI support agent that answered customer questions from their product catalog plus a small knowledge base. Integration with their existing helpdesk for fallback handoff to humans.&lt;/p&gt;

&lt;h3&gt;
  
  
  What was quoted
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Quoted price: $5,400 for phase 1 (the bot + the integration)&lt;/li&gt;
&lt;li&gt;Phase 2 (analytics dashboard, multi-language support, retainer) deferred until phase 1 shipped&lt;/li&gt;
&lt;li&gt;Quoted timeline: 3 weeks&lt;/li&gt;
&lt;li&gt;Quoted hours (internal estimate): 44 hours&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What was actually delivered
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Delivered phase 1: 18 days (3 days under quote)&lt;/li&gt;
&lt;li&gt;Actual hours billed: 41 hours&lt;/li&gt;
&lt;li&gt;Claude Sonnet token cost (development only): $96&lt;/li&gt;
&lt;li&gt;Phase 2 ongoing as a $1,800/month retainer&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Where the time actually went
&lt;/h3&gt;

&lt;p&gt;The work split was different from the SaaS dashboard because the actual product was an LLM agent. Hours roughly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Knowledge base preprocessing and chunking strategy: ~6 hours (human only; this is judgment work — which fields to include, how to handle product variants)&lt;/li&gt;
&lt;li&gt;Retrieval-augmented generation pipeline: ~10 hours (agents wrote most of it, senior reviewed embeddings model choice and reranking strategy)&lt;/li&gt;
&lt;li&gt;Helpdesk integration (webhook in, conversation handoff API out): ~8 hours&lt;/li&gt;
&lt;li&gt;Frontend chat widget: ~6 hours (agents handled almost all of this)&lt;/li&gt;
&lt;li&gt;Eval harness (sample 50 real customer questions, measure helpfulness, accuracy, escalation rate): ~9 hours (mostly human work — eval design is hard to delegate)&lt;/li&gt;
&lt;li&gt;Deployment and monitoring: ~2 hours&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What this teaches about pricing
&lt;/h3&gt;

&lt;p&gt;LLM projects have a specific cost shape. The retrieval pipeline and integration compress heavily with agent help (60% efficiency gain). The eval harness, the knowledge base preprocessing, and the prompt engineering decisions stay heavily human (negligible efficiency gain from AI tools, because the work is judgment, not boilerplate).&lt;/p&gt;

&lt;p&gt;The blended rate ends up being similar to the SaaS dashboard rate, even though the project felt more "AI-native". This surprised me. The first three LLM projects I built I priced too low because I assumed AI tools would compress the LLM-engineering work the most. They actually compressed the surrounding infrastructure work the most. The LLM-engineering itself stays expensive because it is judgment-heavy.&lt;/p&gt;

&lt;h2&gt;
  
  
  What three projects show in aggregate
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;Quoted&lt;/th&gt;
&lt;th&gt;Delivered hours&lt;/th&gt;
&lt;th&gt;Token cost&lt;/th&gt;
&lt;th&gt;Days early/late&lt;/th&gt;
&lt;th&gt;Margin vs traditional&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SaaS dashboard&lt;/td&gt;
&lt;td&gt;$9,800&lt;/td&gt;
&lt;td&gt;71h&lt;/td&gt;
&lt;td&gt;$141&lt;/td&gt;
&lt;td&gt;5 days early&lt;/td&gt;
&lt;td&gt;~35% reduction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bug fix sprint&lt;/td&gt;
&lt;td&gt;$1,200&lt;/td&gt;
&lt;td&gt;6h&lt;/td&gt;
&lt;td&gt;$8&lt;/td&gt;
&lt;td&gt;Same week&lt;/td&gt;
&lt;td&gt;Same price; faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM support agent&lt;/td&gt;
&lt;td&gt;$5,400&lt;/td&gt;
&lt;td&gt;41h&lt;/td&gt;
&lt;td&gt;$96&lt;/td&gt;
&lt;td&gt;3 days early&lt;/td&gt;
&lt;td&gt;~25% reduction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Three patterns:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Boilerplate-heavy projects compress the most.&lt;/strong&gt; The SaaS dashboard had ~35% time savings vs traditional because most of the build was standard CRUD plus reports. Agents handle that well.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reading-heavy projects compress the least.&lt;/strong&gt; The bug fix sprint compressed almost zero on the &lt;em&gt;thinking&lt;/em&gt; time. The fix took 30 minutes; the understanding took 5 hours. Agents do not help with understanding existing systems much yet.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Judgment-heavy projects compress unevenly.&lt;/strong&gt; The LLM agent project compressed the integration work but not the eval design. Token costs were tiny in all three cases ($8 to $141), making them invisible on invoices.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The token-cost question, answered with real numbers
&lt;/h2&gt;

&lt;p&gt;Three projects, total token spend: &lt;strong&gt;$245.30&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Three projects, total client invoices: &lt;strong&gt;$16,400&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Token cost as a percentage of revenue: &lt;strong&gt;1.5%&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is the part of AI-assisted development that hype articles get most wrong. Tokens are not the meaningful expense. The meaningful expense is the senior judgment that directs the agents and reviews their output. If anyone tries to sell you AI development priced primarily on "low token cost", the math is upside down. Real AI-assisted projects bill the senior judgment, not the tokens.&lt;/p&gt;

&lt;h2&gt;
  
  
  What does NOT compress with AI agents
&lt;/h2&gt;

&lt;p&gt;Independent of project type, these consistently took the same amount of time as a fully-human project would have:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Discovery calls and spec refinement.&lt;/strong&gt; Agents do not run client calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stakeholder communication during the build.&lt;/strong&gt; Async status emails, slack threads, the project-status documents.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Debugging that requires reproducing client-specific environment issues.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance reviews&lt;/strong&gt; (HIPAA, GDPR, PCI). The standards have not changed because AI helps you write code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code review of changes that touch the riskiest 5% of the codebase&lt;/strong&gt; (auth, billing, data deletion paths).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A studio that quotes 50% off because "AI" without specifying which parts of the work compressed is either overestimating the savings or underestimating the work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The take
&lt;/h2&gt;

&lt;p&gt;AI-assisted development is real. The compression is real. The numbers above are not marketing copy; they came off three actual invoices and three actual token spends in 2026. Token cost is real but small. Calendar time savings are real and meaningful.&lt;/p&gt;

&lt;p&gt;The savings are not uniform across project types. Boilerplate-heavy work compresses dramatically. Reading-heavy work barely compresses. Judgment-heavy work compresses unevenly.&lt;/p&gt;

&lt;p&gt;The right way to price an AI-assisted project is the same as the right way to price any project: count the hours by category (boilerplate vs reading vs judgment), apply the right multiplier per category, add a token line item for transparency, and quote a fixed price with a ±20% accuracy band.&lt;/p&gt;

&lt;p&gt;The wrong way is to quote 50% off and hope you can deliver on it.&lt;/p&gt;

&lt;p&gt;If you want the full reasoning behind the cost variables and the worked example for a similar dashboard project, the deeper budgeting framework is at the canonical URL in the post header.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>freelancing</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How to budget a software development project (without the spreadsheet theater)</title>
      <dc:creator>BaoDev Studio</dc:creator>
      <pubDate>Mon, 18 May 2026 10:34:10 +0000</pubDate>
      <link>https://forem.com/baodev-studio/how-to-budget-a-software-development-project-without-the-spreadsheet-theater-33d1</link>
      <guid>https://forem.com/baodev-studio/how-to-budget-a-software-development-project-without-the-spreadsheet-theater-33d1</guid>
      <description>&lt;p&gt;The first software project I quoted as a freelancer was supposed to take three weeks. It took eleven.&lt;/p&gt;

&lt;p&gt;The spec was four bullet points, the budget was a number I picked because it sounded reasonable, and the client was patient until they weren't. By week six everyone was unhappy and nobody was being dishonest. I just had no idea how to budget a software development project.&lt;/p&gt;

&lt;p&gt;Most online cost calculators don't help. They ask three questions and return a number that depends on nothing. Real budgeting is six variables in a trench coat, and the trench coat is held together by how well-defined your scope actually is.&lt;/p&gt;

&lt;p&gt;Here's what I'd tell my past self.&lt;/p&gt;

&lt;h2&gt;
  
  
  The six variables that actually move the number
&lt;/h2&gt;

&lt;p&gt;Every software development cost estimate is some flavor of &lt;code&gt;hours × rate&lt;/code&gt;. The interesting question is where the hour count comes from.&lt;/p&gt;

&lt;p&gt;Six things change it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Project scope&lt;/strong&gt;: the number of distinct features the deliverable must support. Not "a dashboard with reports" but "a dashboard with 6 specific reports, each with 3 filter dimensions and CSV export". The first phrasing is a fantasy. The second is bid-able.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Technical complexity&lt;/strong&gt;: does the code have to handle real-time updates, custom algorithms, regulated data (HIPAA, PCI, GDPR), high concurrency? Each of those adds 30-100% to the matching feature's hour count because of the engineering trade-offs and the verification load.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Integration surface&lt;/strong&gt;: every third-party API the project touches adds roughly 4-12 hours. Stripe is fine. A B2B partner's custom SOAP endpoint with no sandbox is six days of wasted weekends.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Team seniority&lt;/strong&gt;: a senior engineer at $100-150/hr will deliver in a third of the hours of a mid-level at $40-60/hr. The math sometimes makes seniors cheaper per project. The math more often doesn't, because mid-levels are cheaper per hour. Read the actual scope before choosing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Timeline pressure&lt;/strong&gt;: compressing a project timeline by 50% adds roughly 30-50% to total cost, not because engineers work faster but because the team adds people, communication overhead climbs, and weekends start to be billable. Brooks' Law from 1975 is still true in 2026.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Scope volatility&lt;/strong&gt;: every change request after kickoff costs 2-4x what it would have cost in the original scope. A button added in week 1 is 30 minutes. The same button added in week 4, after the page redesign is "almost done", is 2-3 hours plus regression testing.&lt;/p&gt;

&lt;p&gt;A budget that doesn't ask about all six is a budget that's going to surprise someone.&lt;/p&gt;

&lt;h2&gt;
  
  
  The accuracy ceiling, and why it's lower than people think
&lt;/h2&gt;

&lt;p&gt;Here's the part nobody likes hearing.&lt;/p&gt;

&lt;p&gt;A well-scoped estimate for a &lt;em&gt;defined&lt;/em&gt; deliverable (one that the studio has built something similar to before) is typically accurate within ±20%. So if I quote 100 hours, the real number is probably 80-120. That's the honest ceiling for honest work.&lt;/p&gt;

&lt;p&gt;For a &lt;em&gt;greenfield&lt;/em&gt; product, where the spec is "we want a SaaS for X but we're still figuring out what X is", the accuracy collapses to ±40-60%. So if you hear a quote of "$15,000-25,000" for a new product, the real ending number is probably anywhere between $9,000 and $40,000.&lt;/p&gt;

&lt;p&gt;I was wrong about this for the first two years. I thought tighter estimates were a sign of better engineering. They were a sign of clients who hadn't started discovering what they actually wanted.&lt;/p&gt;

&lt;p&gt;The fix isn't tighter estimates. The fix is one of three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lock the spec earlier with a paid discovery sprint (2-5 days, charged separately) before quoting the build.&lt;/li&gt;
&lt;li&gt;Use time-and-materials with a not-to-exceed cap instead of fixed price.&lt;/li&gt;
&lt;li&gt;Quote phase 1 as a fixed price and explicitly defer phase 2 estimation until phase 1 ships.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All three are honest. Pretending you can quote a greenfield product within ±10% is not.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changes when AI agents enter the workflow
&lt;/h2&gt;

&lt;p&gt;This part has been overhyped and underhyped in equal measure.&lt;/p&gt;

&lt;p&gt;The naive overhype is "AI writes 90% of the code, projects ship 10x faster, budget collapses". That has not been my experience.&lt;/p&gt;

&lt;p&gt;The naive underhype is "AI is just autocomplete with more steps". That isn't true either.&lt;/p&gt;

&lt;p&gt;What actually happens, in numbers I can defend from running a studio with this workflow for the past year:&lt;/p&gt;

&lt;p&gt;A mid-complexity integration (webhook pipeline, a few UI screens, basic auth, deploy) used to take 18-22 hours of subcontractor time. Senior reviews on top added 2-3 hours. Total: 20-25 hours.&lt;/p&gt;

&lt;p&gt;The same scope, with a proper agent workflow, lands closer to 6-8 hours total. Not because agents write better code than the subcontractor did. The reason is the review-fix-verify loop. Agents catch their own obvious mistakes when the feedback cycle is wired correctly. The "first draft is rough, second draft is shipping-quality" cycle collapses from days into hours.&lt;/p&gt;

&lt;p&gt;Token cost is real but tiny. A mid-complexity project consumes $12-18 of Claude tokens. The subcontractor equivalent was $400-600 in labor. The ratio looks dramatic but the meaningful gain is calendar time, not cost — most clients care about shipping in 6 days more than they care about a $400 line item.&lt;/p&gt;

&lt;p&gt;Bigger projects scale similarly. A SaaS MVP that traditionally takes 4 weeks of senior engineering can land in 3 weeks with agent support. Token cost across that project is around $140. Lines of code: about 14,000 (without counting tests). Bugs caught in client UAT: typically 2-4, of which one is critical and the rest are cosmetic.&lt;/p&gt;

&lt;p&gt;What AI agents do NOT do: tell the client they're wrong about a product decision, catch the subtle business-rule violation in a 200-line file, decide what NOT to build. Those are still human work. The studio model is "one senior plus an agent workforce", not "an AI subscription replacing engineers".&lt;/p&gt;

&lt;h2&gt;
  
  
  A worked example
&lt;/h2&gt;

&lt;p&gt;Let's budget a real project. The ask: a small e-commerce dashboard for a Shopify store owner who wants daily metrics, abandoned cart recovery emails, and a returns workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1: scope it out.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Daily metrics dashboard (revenue, orders, top products, repeat customer rate): 4 reports, 3 filters each, CSV export&lt;/li&gt;
&lt;li&gt;Abandoned cart email automation: Shopify webhook + email service (Postmark) integration&lt;/li&gt;
&lt;li&gt;Returns workflow: admin can mark return, send pre-paid label link via Postmark, refund processed via Shopify Admin API&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Phase 2: estimate hours per variable.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Without AI agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dashboard frontend (Next.js): ~24 hours (3 hours per report × 4 reports, plus shared filter component and CSV export)&lt;/li&gt;
&lt;li&gt;API + Shopify integration: ~16 hours (webhooks + admin API client + error handling)&lt;/li&gt;
&lt;li&gt;Email automation: ~12 hours (template, Postmark client, queue + retry logic)&lt;/li&gt;
&lt;li&gt;Returns workflow: ~14 hours (UI + state machine + label generation + refund flow)&lt;/li&gt;
&lt;li&gt;Auth + deployment + testing: ~14 hours&lt;/li&gt;
&lt;li&gt;Total: &lt;strong&gt;80 hours&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With agent support, same scope:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dashboard frontend: ~10 hours (agents handle the boilerplate, senior reviews architecture and UX details)&lt;/li&gt;
&lt;li&gt;API + Shopify integration: ~6 hours&lt;/li&gt;
&lt;li&gt;Email automation: ~4 hours&lt;/li&gt;
&lt;li&gt;Returns workflow: ~6 hours&lt;/li&gt;
&lt;li&gt;Auth + deployment + testing: ~6 hours&lt;/li&gt;
&lt;li&gt;Total: &lt;strong&gt;32 hours&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Phase 3: apply rate.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At a senior rate of $125/hr:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Traditional: 80 hours × $125 = &lt;strong&gt;$10,000&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Agent-assisted: 32 hours × $125 + $200 in tokens = &lt;strong&gt;$4,200&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Phase 4: apply uncertainty.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Both numbers are ±20% because the scope is well-defined. So the honest quote is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Traditional: $8,000-12,000, delivered in 3-4 weeks&lt;/li&gt;
&lt;li&gt;Agent-assisted: $3,400-5,000, delivered in 1-2 weeks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A 60% cost reduction. The temptation is to claim this is the typical case. It's not. This works when scope is defined, the integration partners are well-documented (Shopify is excellent here), and the senior knows what to verify. Greenfield products with 20 features and shifting requirements don't compress as cleanly.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to budget if you're the client, not the studio
&lt;/h2&gt;

&lt;p&gt;Three pragmatic tactics if you're commissioning the work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Ask for the breakdown.&lt;/strong&gt; A quote that says "$8,000 for a dashboard" is not a quote. A quote that says "24 hours dashboard, 16 hours integration, 12 hours email, 14 hours returns, 14 hours auth+deploy, $125/hr = $10,000, ±20%" is a quote. The breakdown protects both sides.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Run the calculator on yourself first.&lt;/strong&gt; Before you ask a vendor, write the feature list yourself and put a rough hour count next to each one. You don't need to be right. You just need to know what feels wrong when the vendor's number arrives. A 4-feature project quoted at 200 hours probably has scope you didn't communicate. A 12-feature project quoted at 30 hours probably has scope the vendor didn't read.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Reserve 20% for unknown unknowns.&lt;/strong&gt; Whatever number you arrive at, add 20% to the budget you tell stakeholders. Not because vendors are dishonest. Because the spec isn't done until the product ships. You'll either spend the buffer or you won't, and the conversation in week 6 will be much calmer if you did.&lt;/p&gt;

&lt;h2&gt;
  
  
  The take
&lt;/h2&gt;

&lt;p&gt;Budgeting a software development project well isn't a formula. It's a discipline.&lt;/p&gt;

&lt;p&gt;The discipline is: define scope, count features, pick a rate, apply the right multiplier for complexity, and admit upfront how much the spec might still drift. The variables that move the number are knowable. The honesty about your uncertainty is the part that protects you.&lt;/p&gt;

&lt;p&gt;If you want to play with the inputs, the calculator at baodev.studio/estimate.html shows the same logic with a UI. It's not a quote. It's a defensible reference for the budget conversation you're about to have with whoever is funding the project.&lt;/p&gt;

&lt;p&gt;If you want a real quote, the intake form takes 7 minutes and produces a number within 24 hours. The number comes with the breakdown, which is the only part that matters.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>freelancing</category>
      <category>productivity</category>
      <category>business</category>
    </item>
    <item>
      <title>I ran `go test -race` after 3 months. It found 8 things.</title>
      <dc:creator>BaoDev Studio</dc:creator>
      <pubDate>Mon, 18 May 2026 02:44:01 +0000</pubDate>
      <link>https://forem.com/baodev-studio/i-ran-go-test-race-after-3-months-it-found-8-things-38pl</link>
      <guid>https://forem.com/baodev-studio/i-ran-go-test-race-after-3-months-it-found-8-things-38pl</guid>
      <description>&lt;p&gt;8 race conditions. That's what three months of "I'll add &lt;code&gt;-race&lt;/code&gt; later" bought me.&lt;/p&gt;

&lt;p&gt;The codebase is a Go backend for a freelance studio automation tool. Around 4,000 lines of application code, a handful of goroutines managing job queues, email polling, and an agent dispatch loop. Perfectly ordinary stuff. I had been telling myself &lt;code&gt;-race&lt;/code&gt; was "too slow for CI." It runs in 11s for a 4k-line service.testing I was wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the detector actually outputs
&lt;/h2&gt;

&lt;p&gt;When youtipsprogramming hit a real data race, the output looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;==================
WARNING: DATA RACE
Read at 0x00c0001b4030 by goroutine 18:
  github.com/baodev/flos/internal/dispatch.(*Router).getHandler()
      /home/runner/work/flos/internal/dispatch/router.go:94 +0x6c

Previous write at 0x00c0001b4030 by goroutine 7:
  github.com/baodev/flos/internal/dispatch.(*Router).Register()
      /home/runner/work/flos/internal/dispatch/router.go:61 +0x84

Goroutine 18 (running) created at:
  github.com/baodev/flos/internal/dispatch.(*Router).Start()
      /home/runner/work/flos/internal/dispatch/router.go:112 +0x1e0
==================
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;File and line numbers, both goroutines, the moment of creation. It tells you exactly where to look.&lt;/p&gt;

&lt;h3&gt;
  
  
  The representative case
&lt;/h3&gt;

&lt;p&gt;The most embarrassing one: a &lt;code&gt;map[string]HandlerFunc&lt;/code&gt; being read by worker goroutines while a registration goroutine could still be writing to it. Classic. The map wasn't behind a mutex because I "registered everything at startup." Except one code path registered a handler lazily on first use.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt; type Router struct {
&lt;span class="gd"&gt;-    handlers map[string]HandlerFunc
&lt;/span&gt;&lt;span class="gi"&gt;+    handlers map[string]HandlerFunc
+    mu       sync.RWMutex
&lt;/span&gt; }
&lt;span class="err"&gt;
&lt;/span&gt; func (r *Router) Register(name string, fn HandlerFunc) {
&lt;span class="gi"&gt;+    r.mu.Lock()
+    defer r.mu.Unlock()
&lt;/span&gt;     r.handlers[name] = fn
 }
&lt;span class="err"&gt;
&lt;/span&gt; func (r *Router) getHandler(name string) HandlerFunc {
&lt;span class="gi"&gt;+    r.mu.RLock()
+    defer r.mu.RUnlock()
&lt;/span&gt;     return r.handlers[name]
 }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;12 lines changed. Bug had been live since the initial commit in February.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adding it to CI is one line
&lt;/h2&gt;

&lt;p&gt;If you're on GitHub Actions and not already running this, add it to your test job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Test with race detector&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;go test -race -count=1 -timeout=120s ./...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or if you run tests via a Makefile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nl"&gt;test-race&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    go &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="nt"&gt;-race&lt;/span&gt; &lt;span class="nt"&gt;-count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nt"&gt;-timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;120s ./...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-count=1&lt;/code&gt; disables the test result cache so every CI run actually executes. Without it, Go can return cached results even on &lt;code&gt;-race&lt;/code&gt;, which defeats the point.&lt;/p&gt;

&lt;h3&gt;
  
  
  What the other 7 were
&lt;/h3&gt;

&lt;p&gt;I won't detail all of them. Mostly they were the same pattern: shared state accessed from spawned goroutines, written once somewhere "safe" and read everywhere else, with no synchronization because the write "always finished first." The race detector disagreed with that assumption on 7 separate occasions.&lt;/p&gt;

&lt;p&gt;Two of them were in test helpers, not production code. Still real races — test helpers spin goroutines too, and a flaky test that fails once every 40 runs is its own kind of tax.&lt;/p&gt;

&lt;h2&gt;
  
  
  The honest accounting
&lt;/h2&gt;

&lt;p&gt;Three months of technical debt on a four-person equivalent codebase (it's mostly me and agents). Eight findings in 11 seconds of wall time. One of those findings was in the agent dispatch path that runs on every job — meaning every job that completed without incident was getting lucky with goroutine scheduling.&lt;/p&gt;

&lt;p&gt;That's the uncomfortable part about race conditions: they don't fail loudly. They fail intermittently, or they corrupt state silently, or they don't fail at all on your machine because your CPU happens to schedule goroutines in a forgiving order.&lt;/p&gt;

&lt;p&gt;The race detector doesn't care about your scheduler's mood.&lt;/p&gt;

&lt;p&gt;Running it weekly now. Should have been in CI from day one — &lt;code&gt;-race&lt;/code&gt; exists precisely because humans are bad at reasoning about concurrent memory access under load.&lt;/p&gt;

</description>
      <category>go</category>
      <category>testing</category>
      <category>programming</category>
      <category>tips</category>
    </item>
    <item>
      <title>How a one-person dev studio runs with autonomous AI agents</title>
      <dc:creator>BaoDev Studio</dc:creator>
      <pubDate>Sat, 16 May 2026 15:22:48 +0000</pubDate>
      <link>https://forem.com/baodev-studio/how-a-one-person-dev-studio-runs-with-autonomous-ai-agents-4a7m</link>
      <guid>https://forem.com/baodev-studio/how-a-one-person-dev-studio-runs-with-autonomous-ai-agents-4a7m</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR (for the impatient)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;BaoDev.studio is a one-person dev studio paired with a fleet of autonomous AI agents.&lt;/li&gt;
&lt;li&gt;The workflow is built around Claude Code and ~35 specialized agents that operate like a fractional engineering team.&lt;/li&gt;
&lt;li&gt;Pricing: $800 for a sprint (1–5 days), $4,500 for a system build (2–8 weeks), $3,000/month retainer (3-month minimum).&lt;/li&gt;
&lt;li&gt;Delivery times are honest. Token costs are real. AI agents do not replace senior judgment — they multiply it.&lt;/li&gt;
&lt;li&gt;This post: the stack, the actual numbers, the trade-offs, and what AI still cannot do.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The gap nobody talks about
&lt;/h2&gt;

&lt;p&gt;A founder needs an MVP, a payments integration, or a Go service that does not fall over. Two options usually surface:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1: Freelancers on Upwork / Sribu / Kalibrr.&lt;/strong&gt; Cheap. Sometimes great. Often a coin flip. Someone billing 8 clients in parallel, copy-pasting from Stack Overflow, disappearing for 3 days when something breaks. $15-30/hour and a "1 week project" becomes 6 weeks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 2: An agency.&lt;/strong&gt; Senior people, real processes, predictable output. Also $20,000 minimum, 8-week kickoff, and a project manager emailing status updates while the actual work waits in a sprint queue.&lt;/p&gt;

&lt;p&gt;The gap in the middle — "senior engineering quality, this month, for less than the price of a used car" — is where most actual SME projects live. Nobody was serving it well, because the unit economics of doing senior work at freelance prices have not existed.&lt;/p&gt;

&lt;p&gt;Until AI agents got good enough.&lt;/p&gt;

&lt;h2&gt;
  
  
  What BaoDev actually does all day
&lt;/h2&gt;

&lt;p&gt;This is not "vibe coding". Not pasting prompts into ChatGPT and pretending. The studio runs an instrumented engineering pipeline that looks something like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Intake.&lt;/strong&gt; Client fills the form on baodev.studio describing what they want. The studio reviews. If a project is outside what can be delivered well, the answer is no — saying no honestly is worth more than saying yes to fail later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Plan and contract.&lt;/strong&gt; A planning agent drafts the PLAN.md (14 sections: scope, architecture, risks, milestones, dependencies). A senior engineer reviews and edits. A contract is signed before any code is written.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Build.&lt;/strong&gt; Most people imagine the AI just writes everything. It does not. The actual flow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;code&gt;flow-architect&lt;/code&gt; agent maps every page → API → DB → job → notification connection and flags gaps.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;backend-developer&lt;/code&gt; or &lt;code&gt;frontend-developer&lt;/code&gt; writes the first pass.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;quality-gate&lt;/code&gt; reviews every output before it hits the codebase.&lt;/li&gt;
&lt;li&gt;An &lt;code&gt;integration-test-agent&lt;/code&gt; writes tests that hit a real database, not mocks.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;security-engineer&lt;/code&gt; runs OWASP checks before deploy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When something is non-obvious — a tricky algorithm, an architectural fork, a client decision — a human takes over. Agents are good at executing the boring 80%. Senior judgment owns the 20% where it matters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Ship.&lt;/strong&gt; The CI pipeline includes the agents as checks. Every PR runs &lt;code&gt;0qa&lt;/code&gt;, scoring on completeness, security, performance, and tests. Score has to be ≥90 with zero critical findings or the PR does not merge.&lt;/p&gt;

&lt;p&gt;This stack is ~35 agents, all custom-built and version-controlled in the same repo. The roster lives in &lt;code&gt;~/.claude/agents/&lt;/code&gt;. They run on Claude Code locally — no cloud bill, no quota anxiety.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real numbers from real projects
&lt;/h2&gt;

&lt;p&gt;Vague case studies are useless. Concrete data points instead:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Project: SaaS dashboard MVP&lt;/strong&gt; (Next.js + Go API + Postgres + Stripe)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quoted: $4,500, 4 weeks&lt;/li&gt;
&lt;li&gt;Delivered: $4,500, 3 weeks 2 days&lt;/li&gt;
&lt;li&gt;Token cost end-to-end: ~$140 of Claude usage&lt;/li&gt;
&lt;li&gt;Lines shipped: ~14,000 (without counting tests)&lt;/li&gt;
&lt;li&gt;Tests: ~4,200 lines, 91% statement coverage&lt;/li&gt;
&lt;li&gt;Bugs found in client UAT: 3 (1 critical fixed in 4 hours; 2 cosmetic)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Project: Bug fix sprint&lt;/strong&gt; (Go service, race condition in worker pool)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quoted: $800, 2 days&lt;/li&gt;
&lt;li&gt;Delivered: $800, 6 hours&lt;/li&gt;
&lt;li&gt;Token cost: ~$8&lt;/li&gt;
&lt;li&gt;Outcome: race fixed, regression test added, root cause documented in the runbook&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Project: AI integration&lt;/strong&gt; (LLM-powered support agent for a small e-commerce site)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quoted: $4,500, 6 weeks (deferred to phase 2 by client after week 3 — normal scope-creep)&lt;/li&gt;
&lt;li&gt;Delivered phase 1: $4,500, 3 weeks&lt;/li&gt;
&lt;li&gt;Token cost (dev only, not runtime): ~$200&lt;/li&gt;
&lt;li&gt;Phase 2 ongoing as retainer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The unit economics work because token cost is a rounding error against engineering time. The constraint is not "how cheaply can the AI write code" — it is "how reliably can a senior direct it without re-doing everything by hand". That part had to be invented.&lt;/p&gt;

&lt;h2&gt;
  
  
  What AI agents cannot do
&lt;/h2&gt;

&lt;p&gt;Honest list. These do not get better with bigger models.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Tell a client they are wrong.&lt;/strong&gt; A founder asks for a feature that will tank their conversion rate. Agents will build it. A senior pushes back.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pick the right database.&lt;/strong&gt; Postgres or Mongo, Redis or Memcached, monolith or microservices — these are architectural decisions tied to business stage and team future. Agents pick whatever you suggest. They will not catch the suggestion that is wrong for your stage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read code politically.&lt;/strong&gt; Some refactors are technically clean and politically dead. Agents do not know your CTO has a feud with the previous lead.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Catch the subtle copy-paste bug.&lt;/strong&gt; Agents will sometimes write a 200-line file that compiles, runs, passes tests, and is silently wrong because they copy-pasted a constant from a similar service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decide what NOT to build.&lt;/strong&gt; Scope discipline is a senior trait. Agents are happy to scaffold a feature you do not need.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is why the model is not "buy an AI subscription instead of a developer". It is "buy a senior developer who has an AI workforce". The agents do not replace humans — they let one senior ship the work of 4-6 mid-level engineers without the overhead of 4-6 people.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pricing model
&lt;/h2&gt;

&lt;p&gt;Three engagement types, all flat:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sprint&lt;/strong&gt; — $800 per deliverable, 1–5 business days. Use this for one isolated thing: a feature, a bug, an integration, a migration. No ongoing commitment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System build&lt;/strong&gt; — $4,500+, 2–8 weeks. Full system delivered: backend, frontend, database, deployment, tests, docs. Fixed scope, fixed price, fixed timeline. SLA signed, system delivered.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ongoing retainer&lt;/strong&gt; — $3,000/month, 3-month minimum. Reserved engineering capacity for continuous work — new features, on-call, code review, iterative product.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No hourly billing. No "+30% if the project runs over". Estimates are honest, with a buffer. If a project hits its deadline early, the client gets the work earlier. If something was missed in scoping, the studio absorbs it — not a change order.&lt;/p&gt;

&lt;p&gt;Bilingual EN/ID. Asynchronous communication by default. Email-first. Async beats meetings 9 times out of 10.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this post exists
&lt;/h2&gt;

&lt;p&gt;Two reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;To find clients who fit.&lt;/strong&gt; If your project lives in the SME band — between Upwork-cheap and agency-expensive — the intake form on baodev.studio is the right starting point. Services page has full pricing. Open-source showcase projects are linked from the projects page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;To document a workflow that was not possible 18 months ago.&lt;/strong&gt; The economics of senior engineering + AI agents are real, and they are reshaping who can sustainably run a small studio. Worth contributing to that conversation.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If this resonates, the intake form on baodev.studio is the right next step. Every legitimate inquiry gets a response within 24 hours, in English or Bahasa. If the project is a fit, a contract is signed within a week. If not, a referral comes instead.&lt;/p&gt;

&lt;p&gt;BaoDev.studio. Senior engineering paired with autonomous AI agents. Production-grade systems. No agency overhead.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://baodev.studio/blog-post.html#one-person-dev-studio-ai-agents" rel="noopener noreferrer"&gt;baodev.studio&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>freelance</category>
      <category>productivity</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
