<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Abhishek</title>
    <description>The latest articles on Forem by Abhishek (@mrcssdev).</description>
    <link>https://forem.com/mrcssdev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3289582%2F39b6e94f-19a3-4eaa-957d-24372e25917c.jpg</url>
      <title>Forem: Abhishek</title>
      <link>https://forem.com/mrcssdev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mrcssdev"/>
    <language>en</language>
    <item>
      <title>Why AI Agents Cost More Than LLMs (And How to Stop Bleeding Tokens)</title>
      <dc:creator>Abhishek</dc:creator>
      <pubDate>Mon, 11 May 2026 07:23:58 +0000</pubDate>
      <link>https://forem.com/mrcssdev/why-ai-agents-cost-more-than-llms-and-how-to-stop-bleeding-tokens-4e4g</link>
      <guid>https://forem.com/mrcssdev/why-ai-agents-cost-more-than-llms-and-how-to-stop-bleeding-tokens-4e4g</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4trtoxgrfk7lh19hm5h7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4trtoxgrfk7lh19hm5h7.png" alt="AI Agents vs LLM pricing"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I was building a small bookmark app last weekend. You send it a URL, Gemini&lt;br&gt;
summarizes and tags the page, the result goes into Postgres. A few hundred lines&lt;br&gt;
of TypeScript.&lt;/p&gt;

&lt;p&gt;The first version cost almost nothing. One LLM call per URL, that's it. Then I&lt;br&gt;
added "tools" so the model could fetch pages, look up similar bookmarks, or&lt;br&gt;
check things against Google Search.&lt;/p&gt;

&lt;p&gt;My token bill quadrupled.&lt;/p&gt;

&lt;p&gt;That's where most people building agents land for the first time. Going from a&lt;br&gt;
plain chat call to an agent loop is way more expensive than the docs make it sound,&lt;br&gt;
and the reason isn't obvious until you watch the round trips happen one by one.&lt;br&gt;
Let's do that.&lt;/p&gt;
&lt;h2&gt;
  
  
  What a plain LLM call costs
&lt;/h2&gt;

&lt;p&gt;Here's the simplest LLM call in TypeScript with &lt;code&gt;@google/genai&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenAI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@google/genai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Summarize this article: ...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One request out, one response back. You pay for two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input tokens&lt;/strong&gt; for your prompt&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output tokens&lt;/strong&gt; for the model's reply&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's it. Two numbers on your bill. If your prompt is 500 tokens and the answer&lt;br&gt;
is 200, you pay for 700 tokens. Done.&lt;/p&gt;
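
&lt;p&gt;If you want to watch those two numbers per request instead of waiting for the bill, the response reports them in its usage metadata. A minimal sketch using the &lt;code&gt;res&lt;/code&gt; object from the snippet above; the field names assume the &lt;code&gt;usageMetadata&lt;/code&gt; shape that &lt;code&gt;@google/genai&lt;/code&gt; returns, so double-check them against your SDK version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Token counts reported on the same `res` object from the call above.
// Field names are assumed from the usageMetadata shape in @google/genai.
const usage = res.usageMetadata;
console.log('input tokens: ', usage?.promptTokenCount);
console.log('output tokens:', usage?.candidatesTokenCount);
console.log('total tokens: ', usage?.totalTokenCount);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
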
&lt;h2&gt;
  
  
  Now add a single tool
&lt;/h2&gt;

&lt;p&gt;Tools are how the model talks to the outside world. Calling an API, querying a&lt;br&gt;
database, fetching a URL, anything. You describe each tool with a small JSON&lt;br&gt;
schema, and the model can ask to "call" one mid-conversation. You actually run&lt;br&gt;
the function, send the result back, and the model writes its final answer using&lt;br&gt;
that result.&lt;/p&gt;

&lt;p&gt;The basic version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenAI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Type&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@google/genai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
  &lt;span class="na"&gt;functionDeclarations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;getWeather&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Get the weather of any city&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OBJECT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;STRING&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;location&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;}];&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;first&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;What is the weather in Tokyo?&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;tools&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;first&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// → undefined&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;undefined&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;The model didn't answer. It returned a structured request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;first&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;functionCalls&lt;/span&gt;
&lt;span class="c1"&gt;// → [{ name: 'getWeather', args: { location: 'Tokyo' } }]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the part that surprises people. The model got asked a question, and&lt;br&gt;
instead of answering, it asked &lt;strong&gt;you&lt;/strong&gt; to run a function. So you do that and&lt;br&gt;
ship the result back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getWeather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Tokyo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// { temperature: 23, condition: 'sunny' }&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;second&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;What is the weather in Tokyo?&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;model&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;functionCall&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;getWeather&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Tokyo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;functionResponse&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;getWeather&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;tools&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;second&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// → "It's 23°C and sunny in Tokyo."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two LLM calls. One question. That's the agent tax.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we can't just do it in one call
&lt;/h2&gt;

&lt;p&gt;The first reaction (mine too): why can't the model just answer in one shot?&lt;/p&gt;

&lt;p&gt;The reason is simple. The model can't predict what the tool will return. The&lt;br&gt;
temperature in Tokyo isn't in its training data, the API hasn't been hit yet,&lt;br&gt;
the result doesn't exist. You can't write "It's 23°C in Tokyo" before you know&lt;br&gt;
it's 23°C.&lt;/p&gt;

&lt;p&gt;So turn 1 is "decide what to do." Turn 2 is "use what you learned." They can't&lt;br&gt;
be merged. The model has no memory between calls.&lt;/p&gt;

&lt;p&gt;One exception is worth knowing about: server-side tools. Things like&lt;br&gt;
&lt;code&gt;googleSearch&lt;/code&gt; or &lt;code&gt;urlContext&lt;/code&gt; in Gemini run inside Google's own servers, and&lt;br&gt;
the API returns one merged response. From your side it looks like a single call.&lt;br&gt;
You lose some control (you can't see exactly what got searched), but you save&lt;br&gt;
a round trip.&lt;/p&gt;
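
&lt;p&gt;For completeness, here's what opting into one of those server-side tools looks like. A minimal sketch assuming Gemini's built-in &lt;code&gt;googleSearch&lt;/code&gt; tool; the grounding work happens on Google's side, so from your end it's still one billed request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Server-side grounding: Google runs the search, you get one merged response.
// Sketch only; check the current docs for which models support googleSearch.
const grounded = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather in Tokyo right now?',
  config: {
    tools: [{ googleSearch: {} }],
  },
});

console.log(grounded.text);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
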
&lt;h2&gt;
  
  
  Counting the actual tokens
&lt;/h2&gt;

&lt;p&gt;Here's where the cost lives. Look at what turn 2 has to send compared to turn 1:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Turn 1 in&lt;/th&gt;
&lt;th&gt;Turn 1 out&lt;/th&gt;
&lt;th&gt;Turn 2 in&lt;/th&gt;
&lt;th&gt;Turn 2 out&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;System prompt&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;yes, billed again&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool schemas&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;yes, billed again&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User question&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;yes, billed again&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Model's tool call&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes, as input&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Your tool result&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Final answer&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Your system prompt and tool definitions get sent to the API &lt;strong&gt;twice&lt;/strong&gt;. Turn 1&lt;br&gt;
doesn't free you from re-sending everything in turn 2, because the model is&lt;br&gt;
stateless. It forgets the whole conversation between calls.&lt;/p&gt;

&lt;p&gt;Real numbers from my bookmark agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;System prompt: ~200 tokens&lt;/li&gt;
&lt;li&gt;4 tool declarations: ~400 tokens&lt;/li&gt;
&lt;li&gt;User question: ~50 tokens&lt;/li&gt;
&lt;li&gt;Tool result (a few rows from Postgres): ~300 tokens
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Plain LLM call:    ~650 in  +  ~200 out  =  ~850 tokens
One-tool agent:   ~1300 in  +  ~230 out  = ~1530 tokens (about 1.8x)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;And that's the best case. Exactly one tool call, no follow-ups. Real agents are&lt;br&gt;
worse. A lot worse.&lt;/p&gt;
&lt;h2&gt;
  
  
  Real agents grow quadratically
&lt;/h2&gt;

&lt;p&gt;The bookmark agent does three things on a new URL:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch the page (&lt;code&gt;fetchUrl&lt;/code&gt; tool)&lt;/li&gt;
&lt;li&gt;Look for similar existing bookmarks in the DB (&lt;code&gt;searchSimilar&lt;/code&gt; tool)&lt;/li&gt;
&lt;li&gt;Pick a category from the user's existing taxonomy (&lt;code&gt;getTaxonomy&lt;/code&gt; tool)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's 4 LLM turns total. Ask, get tool calls, send back results, ask again,&lt;br&gt;
get more calls, send results, finally write the summary.&lt;/p&gt;
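
&lt;p&gt;In code, those turns are just a loop: call the model, run whatever tools it asks for, append the results, and call again until it stops asking. A minimal sketch of that loop, assuming a &lt;code&gt;dispatchers&lt;/code&gt; object that maps each tool name to its implementation (the same idea as the dispatch snippet further down) and a hard cap on turns so a confused model can't burn tokens forever:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Minimal agent loop. `dispatchers` (assumed) maps tool names to functions.
// The history grows every turn, and every turn re-sends all of it.
async function runAgent(userText: string, maxTurns = 6) {
  const history: any[] = [{ role: 'user', parts: [{ text: userText }] }];

  for (let turn = 0; turn &amp;lt; maxTurns; turn++) {
    const res = await ai.models.generateContent({
      model: 'gemini-2.5-flash',
      contents: history,
      config: { tools },
    });

    const calls = res.functionCalls ?? [];
    if (calls.length === 0) return res.text;  // no tool requests left: final answer

    // Replay the model's tool-call turn, then attach one result per call,
    // using the same manual shape as the weather example above.
    const modelParts: any[] = [];
    const resultParts: any[] = [];
    for (const call of calls) {
      modelParts.push({ functionCall: { name: call.name, args: call.args } });
      resultParts.push({
        functionResponse: { name: call.name, response: await dispatchers[call.name!](call.args) },
      });
    }
    history.push({ role: 'model', parts: modelParts });
    history.push({ role: 'user', parts: resultParts });
  }
  throw new Error('Agent did not finish within maxTurns');
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
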

&lt;p&gt;What the cumulative input size looks like each turn:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Turn&lt;/th&gt;
&lt;th&gt;What gets sent&lt;/th&gt;
&lt;th&gt;Input tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;system + schemas + URL&lt;/td&gt;
&lt;td&gt;700&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;+ previous calls + &lt;code&gt;fetchUrl&lt;/code&gt; result (~1500 tokens of page text)&lt;/td&gt;
&lt;td&gt;2200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;+ &lt;code&gt;searchSimilar&lt;/code&gt; result&lt;/td&gt;
&lt;td&gt;2400&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;+ &lt;code&gt;getTaxonomy&lt;/code&gt; result&lt;/td&gt;
&lt;td&gt;2600&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Total input across all turns: about &lt;strong&gt;7900 tokens&lt;/strong&gt; to summarize one webpage.&lt;/p&gt;

&lt;p&gt;For comparison, a plain &lt;code&gt;generateContent({ contents: "summarize this:\n" + pageText })&lt;/code&gt;&lt;br&gt;
costs ~1500 input + 200 output. About &lt;strong&gt;1700 tokens&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Same task. Almost 5x the bill.&lt;/p&gt;

&lt;p&gt;It gets worse. Cost grows &lt;strong&gt;quadratically&lt;/strong&gt; with the number of turns, because&lt;br&gt;
each turn replays everything that came before. A 10-turn agent isn't 10x the&lt;br&gt;
cost. It's closer to 30x.&lt;/p&gt;
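
&lt;p&gt;To put rough numbers on that claim: if every turn re-sends the whole history and each tool result adds about the same amount of new context, the total input is an arithmetic series. A back-of-the-envelope sketch with assumed round numbers, not measurements from the bookmark agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Rough cost model, not a measurement. Assumes a fixed base prompt and a
// fixed chunk of new context per turn, all of it replayed on every turn.
const basePrompt = 700;    // system prompt + tool schemas + question (tokens)
const addedPerTurn = 300;  // average tool result size (tokens)

function totalInputTokens(turns: number): number {
  let total = 0;
  for (let t = 0; t &amp;lt; turns; t++) {
    total += basePrompt + addedPerTurn * t;  // turn t replays t earlier results
  }
  return total;
}

console.log(totalInputTokens(4));   // 4600
console.log(totalInputTokens(10));  // 20500, roughly 30x a single ~700-token call
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
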
&lt;h2&gt;
  
  
  Three ways to stop the bleeding
&lt;/h2&gt;

&lt;p&gt;You're not stuck. Here's what actually works.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Prompt caching
&lt;/h3&gt;

&lt;p&gt;The biggest lever by far. Every major provider supports it now: OpenAI,&lt;br&gt;
Anthropic, Google. The system prompt and tool schemas don't change between&lt;br&gt;
turns, so cache them once and pay a fraction of the normal input price on&lt;br&gt;
every reuse (roughly a quarter on Gemini; the exact discount varies by provider).&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;@google/genai&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;systemInstruction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You are a bookmark organizer...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// these never change across turns&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;history&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;cachedContent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For my 4-turn flow this cuts input costs by roughly half. Anthropic and OpenAI&lt;br&gt;
do the same thing with different syntax.&lt;/p&gt;

&lt;p&gt;Gemini also has &lt;em&gt;implicit&lt;/em&gt; caching. It auto-caches recent prefixes for you with&lt;br&gt;
zero code changes; you just see cheaper requests whenever a prompt prefix repeats.&lt;br&gt;
Check whether your provider has it on before reinventing the wheel.&lt;/p&gt;
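
&lt;p&gt;Either way, you can verify the cache is actually being hit: cached prompt tokens show up separately in the usage metadata on each response. A small sketch; again, the field names are an assumption about what your SDK version exposes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Cached prompt tokens are reported separately from the full prompt count.
// Field names assumed from @google/genai's usageMetadata; verify for your version.
console.log('prompt tokens:   ', res.usageMetadata?.promptTokenCount);
console.log('of which cached: ', res.usageMetadata?.cachedContentTokenCount);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
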
&lt;h3&gt;
  
  
  2. Different model per turn
&lt;/h3&gt;

&lt;p&gt;The "decide which tool to call" turn is dumb work. It barely needs reasoning.&lt;br&gt;
It's pattern matching on a question. The final synthesis turn is where you&lt;br&gt;
actually want a smart model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Cheap, fast: decides what to do&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decision&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.5-flash-lite&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Smarter: writes the actual answer&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;finalAnswer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.5-pro&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In a 4-turn flow, three of the turns can run on the cheap model. Only the last&lt;br&gt;
one, the user-facing answer, needs the expensive one. For high-volume agents&lt;br&gt;
this saves more than caching does.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Parallel tool calls
&lt;/h3&gt;

&lt;p&gt;The model can ask for multiple tools in a single response. Code I see in&lt;br&gt;
tutorials usually does &lt;code&gt;functionCalls[0]&lt;/code&gt; and silently drops the rest, turning&lt;br&gt;
what could be one round trip into many.&lt;/p&gt;

&lt;p&gt;The fix is one line of &lt;code&gt;Promise.all&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;functionCalls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;dispatchers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;}))&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For "summarize all my React bookmarks from last month," the model might call&lt;br&gt;
&lt;code&gt;searchBookmarks&lt;/code&gt; and &lt;code&gt;getDateRange&lt;/code&gt; in parallel. Handle both, and you save a&lt;br&gt;
whole round trip.&lt;/p&gt;
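
&lt;p&gt;From there, everything goes back in a single follow-up call, one &lt;code&gt;functionResponse&lt;/code&gt; part per tool, the same shape as the weather example earlier. A sketch assuming you've kept the prior turns in a &lt;code&gt;history&lt;/code&gt; array:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Ship every result back at once, then let the model write the final answer.
const followUp = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    ...history,  // prior turns, including the model's tool-call turn
    {
      role: 'user',
      parts: results.map(function (r) {
        return { functionResponse: { name: r.name, response: r.response } };
      }),
    },
  ],
  config: { tools },
});

console.log(followUp.text);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
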

&lt;h2&gt;
  
  
  What you can't optimize away
&lt;/h2&gt;

&lt;p&gt;Tools have a real cost, and they buy you real value. The reason you reach for&lt;br&gt;
them is the same reason they're expensive. You're forcing the model to use&lt;br&gt;
facts that exist outside its head instead of making them up.&lt;/p&gt;

&lt;p&gt;A plain LLM call will happily tell you the weather in Tokyo. It'll just be&lt;br&gt;
wrong.&lt;/p&gt;

&lt;p&gt;Quick way to think about it when picking an architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Plain LLM&lt;/strong&gt; is a guess from training data. Cheap, fast, hallucinates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools / agent&lt;/strong&gt; is real data. Expensive, slower, honest.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most apps shouldn't be agents. If your task is "summarize this text I'm pasting&lt;br&gt;
in" or "rewrite this email," you don't need tools. You need one call. A lot of&lt;br&gt;
agent frameworks make it really easy to add tools by default, which makes it&lt;br&gt;
really easy to spend 5x what you should.&lt;/p&gt;

&lt;p&gt;Tools earn their cost when you have side effects (writing to a DB, sending a&lt;br&gt;
message), grounded data (today's weather, this user's bookmarks, current docs),&lt;br&gt;
or chained reasoning where intermediate steps actually need verification.&lt;/p&gt;

&lt;p&gt;They don't earn it on anything you could solve with one good prompt.&lt;/p&gt;

&lt;h2&gt;
  
  
  The receipt
&lt;/h2&gt;

&lt;p&gt;Last week I added one tool to a Gemini call and watched the cost go from 850&lt;br&gt;
tokens to 1530 for the same question. Once I started parallelizing calls and&lt;br&gt;
caching the system prompt, I got the bookmark agent down to about 4500 tokens&lt;br&gt;
across all four turns. Still 2.5x a plain call, but way better than the 7900&lt;br&gt;
the naive version was burning.&lt;/p&gt;

&lt;p&gt;Your agent isn't a smarter LLM. It's the same LLM with a longer receipt. Once&lt;br&gt;
you can read the receipt, every optimization becomes obvious.&lt;/p&gt;

&lt;p&gt;If you like my content, support it with a like and a share 💟. Also, don't forget to follow me on &lt;a href="https://x.com/0bhishek" rel="noopener noreferrer"&gt;Twitter/X&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/0bhishek" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;. If you want to connect, check out &lt;a href="https://0bhishek.tech/" rel="noopener noreferrer"&gt;my site&lt;/a&gt;. See you in the next one.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How I Built an NPM Package that Lets You Scaffold React Apps</title>
      <dc:creator>Abhishek</dc:creator>
      <pubDate>Wed, 03 Sep 2025 17:31:48 +0000</pubDate>
      <link>https://forem.com/mrcssdev/how-i-build-an-npm-package-that-lets-you-scaffold-react-apps-5a5h</link>
      <guid>https://forem.com/mrcssdev/how-i-build-an-npm-package-that-lets-you-scaffold-react-apps-5a5h</guid>
      <description>&lt;p&gt;Okay, so I finally made an NPM package so that you don't need to hunt down each and every dependency. In this blog, I am going to walk through how you can try this package yourself. &lt;br&gt;
I will also share how I came to make one more npm thingy. Here we go. &lt;/p&gt;

&lt;p&gt;Let's first start with a question: &lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;What is An NPM Package?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;=&amp;gt; NPM is a package manager for Node.js packages, or modules if you like. &lt;a href="http://www.npmjs.com" rel="noopener noreferrer"&gt;www.npmjs.com&lt;/a&gt; hosts thousands of free packages to download and use. The NPM program is installed on your computer when you install Node.js. &lt;br&gt;
That's a pretty boring definition, so here is the simpler version:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;An npm package is basically a &lt;u&gt;bundle of reusable code&lt;/u&gt; that you (or anyone) can install and use in a Node.js project. Think of it as a little Lego piece—easy to plug in, whether it’s for adding a button component, handling dates, or powering a whole framework. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;My First NPM package&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;I found the process of setting up each package one by one tedious, so why not make a scaffold tool so that anyone can just run one command, hit 'enter... enter' a few times, and get their React project as is?&lt;br&gt;
I also added some CI/CD files and a vercel.json to make the process even more helpful. &lt;/p&gt;

&lt;p&gt;To use the package, enter this command in your terminal: &lt;/p&gt;

&lt;p&gt;&lt;code&gt;npx react-starter-plus&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The prompts you need to follow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Project name&lt;/strong&gt; → &lt;code&gt;my-react-app&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Language&lt;/strong&gt; → JavaScript / TypeScript&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git setup&lt;/strong&gt; → Initialize repo &amp;amp; push to GitHub (provide remote URL)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extras&lt;/strong&gt; →
&lt;ul&gt;
&lt;li&gt;CI/CD with GitHub Actions?&lt;/li&gt;
&lt;li&gt;Zustand for state?&lt;/li&gt;
&lt;li&gt;React Testing Library?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment&lt;/strong&gt; → Deploy with Vercel (make sure you’re logged in with &lt;code&gt;vercel login&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy now or later&lt;/strong&gt; → Your call.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Get the Summary &amp;amp; Setup
&lt;/h3&gt;

&lt;p&gt;The CLI shows a summary before proceeding. If everything looks good, it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Installs dependencies&lt;/li&gt;
&lt;li&gt;Sets up Tailwind + routing&lt;/li&gt;
&lt;li&gt;Initializes Git &amp;amp; pushes to remote&lt;/li&gt;
&lt;li&gt;Configures CI/CD&lt;/li&gt;
&lt;li&gt;Deploys to Vercel&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  And done
&lt;/h3&gt;

&lt;p&gt;You’ll end up with something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✔ Deployment successful:
https://johndoes-project.vercel.app

→ Run it locally with `npm run dev`
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then you can choose how your React project is going to be set up, &lt;br&gt;
and finally you get this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foc82kx8f1tclk1kwlw7o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foc82kx8f1tclk1kwlw7o.png" alt=" " width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's all for this project. If you found it helpful at any point, &lt;a href="https://github.com/iCoderabhishek/npm-package/tree/main/react-starter-plus" rel="noopener noreferrer"&gt;leave a star 🌟&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>react</category>
      <category>cli</category>
      <category>npm</category>
    </item>
  </channel>
</rss>
