<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Alexandros Gounis</title>
    <description>The latest articles on Forem by Alexandros Gounis (@alexandros_gounis_dfacceb).</description>
    <link>https://forem.com/alexandros_gounis_dfacceb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2205025%2Fc258e36b-4ccf-4ce4-aaf3-905c07b59314.jpg</url>
      <title>Forem: Alexandros Gounis</title>
      <link>https://forem.com/alexandros_gounis_dfacceb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/alexandros_gounis_dfacceb"/>
    <language>en</language>
    <item>
      <title>MCP Is Costing You 37% More Tokens Than Necessary</title>
      <dc:creator>Alexandros Gounis</dc:creator>
      <pubDate>Thu, 19 Mar 2026 13:50:44 +0000</pubDate>
      <link>https://forem.com/alexandros_gounis_dfacceb/mcp-is-costing-you-37-more-tokens-than-necessary-a5d</link>
      <guid>https://forem.com/alexandros_gounis_dfacceb/mcp-is-costing-you-37-more-tokens-than-necessary-a5d</guid>
      <description>&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;MCP is powerful, but there's a cost issue that is often neglected.&lt;/p&gt;

&lt;p&gt;MCP servers contain tools, and each tool comes with a JSON schema. As more tools are added, input token usage increases significantly. &lt;/p&gt;

&lt;p&gt;Every MCP request sends the full JSON schema for &lt;strong&gt;every registered tool&lt;/strong&gt;, including the ones the model won't actaully use. There has to be a more efficient way to do this. &lt;/p&gt;

&lt;p&gt;I built a tool called &lt;a href="https://github.com/AlexandrosGounis/clifast" rel="noopener noreferrer"&gt;clifast&lt;/a&gt; that converts functions into CLI commands and I benchmarked exactly how much MCP costs vs. a CLI approach. Here's the &lt;a href="https://github.com/alexandrosGounis/mcp-vs-cli" rel="noopener noreferrer"&gt;repository&lt;/a&gt;, so you can test it yourself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With just 5 tools:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;37% more input tokens&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;74% larger tool definitions&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;it scales linearly&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Analysis
&lt;/h2&gt;

&lt;p&gt;MCP tool discovery works like this: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;your client calls &lt;code&gt;listTools()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;gets back full JSON schemas for every tool&lt;/li&gt;
&lt;li&gt;converts them to the Anthropic format&lt;/li&gt;
&lt;li&gt;sends &lt;strong&gt;all of them on every API call&lt;/strong&gt;. No caching, no partial loading.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The CLI alternative: one generic &lt;code&gt;run_command&lt;/code&gt; tool with a text description of available commands. Fixed size regardless of how many commands you have.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two approaches, same result
&lt;/h2&gt;

&lt;p&gt;Both approaches use the same weather API, same model, same prompt. The only difference is how tools are presented to Claude.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP&lt;/strong&gt; — 5 individual tools, each with a full Zod schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;registerTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;get_forecast&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Get daily weather forecast for a location for 1 to 3 days ahead&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;City name or location&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;days&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Number of days to forecast (1-3)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;days&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetchWeatherData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;errorResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Weather API error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;successResponse&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// ... repeat for all 5 tools&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;CLI&lt;/strong&gt; — one tool, plain text description:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;weatherCliToolDefinition&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Tool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;run_command&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Run a weather CLI command. Available commands:
  getCurrentWeather &amp;lt;location&amp;gt;
  getForecast &amp;lt;location&amp;gt; [days]
  getHourlyForecast &amp;lt;location&amp;gt; [day]
  getAstronomy &amp;lt;location&amp;gt;
  getWindAndPressure &amp;lt;location&amp;gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;input_schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Command name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;array&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Command arguments&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;command&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;arguments&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI tools are generated from plain TypeScript exports using &lt;a href="https://github.com/AlexandrosGounis/clifast" rel="noopener noreferrer"&gt;clifast&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getCurrentWeather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getForecast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;days&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getHourlyForecast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;day&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getAstronomy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getWindAndPressure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One command generates the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx clifast generate cli/get-weather.ts &lt;span class="nt"&gt;-y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;CLI&lt;/th&gt;
&lt;th&gt;MCP&lt;/th&gt;
&lt;th&gt;Difference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Input tokens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1,498&lt;/td&gt;
&lt;td&gt;2,368&lt;/td&gt;
&lt;td&gt;CLI uses 37% fewer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Output tokens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;228&lt;/td&gt;
&lt;td&gt;230&lt;/td&gt;
&lt;td&gt;Roughly equal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total tokens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1,726&lt;/td&gt;
&lt;td&gt;2,598&lt;/td&gt;
&lt;td&gt;CLI uses 34% fewer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;7.3s&lt;/td&gt;
&lt;td&gt;7.1s&lt;/td&gt;
&lt;td&gt;Comparable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tool definition size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;479 B&lt;/td&gt;
&lt;td&gt;1,851 B&lt;/td&gt;
&lt;td&gt;CLI is 74% smaller&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Benchmarked on &lt;code&gt;claude-sonnet-4-6&lt;/code&gt;. Both produced equivalent, correct answers. The token gap comes entirely from tool definitions sent on each request.&lt;/p&gt;

&lt;h3&gt;
  
  
  The scaling math:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;MCP grows linearly: each new tool adds its full schema to every request. &lt;/li&gt;
&lt;li&gt;CLI stays constant: adding a command is one more line of text. &lt;/li&gt;
&lt;li&gt;At 20 tools, you're looking at ~4x the schema overhead on every single API call, across every turn of the conversation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The hallucination risk (and how to mitigate it)
&lt;/h2&gt;

&lt;p&gt;Let's be honest, the CLI approach trades schema precision for token efficiency. With MCP, Claude gets structured type information for every parameter. With CLI, it reads a text description and could hallucinate a command that doesn't exist or pass wrong argument types.&lt;/p&gt;

&lt;p&gt;In practice, models like Claude are really good at understanding prompt intent. But if you need stronger guarantees:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Group by domain:&lt;/strong&gt; instead of one mega-tool with 50 commands, create a few CLI tools per domain (&lt;code&gt;run_weather_command&lt;/code&gt;, &lt;code&gt;run_billing_command&lt;/code&gt;). You still get massive token savings over MCP while reducing the hallucination surface.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validate before execution:&lt;/strong&gt; check the command name against a known list before running it. Return a clear error if it doesn't match.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Be specific in descriptions:&lt;/strong&gt; include argument types and constraints in the text (Claude respects these).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The sweet spot is usually 5-15 commands per CLI tool. Enough to save tokens, few enough that the model doesn't get confused.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to use which
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Use MCP&lt;/strong&gt; when interoperability matters: shared tool servers consumed by multiple clients, strict parameter validation, or when you need the ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use CLI&lt;/strong&gt; when cost and speed matter: high-volume production, many tools per agent, token-constrained budgets, or when you control the full stack.&lt;/p&gt;

&lt;p&gt;They're not mutually exclusive. You can use MCP for discovery and path execution through CLI tools (mix and match).&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it yourself
&lt;/h2&gt;

&lt;p&gt;Benchmark repository: &lt;a href="https://github.com/AlexandrosGounis/mcp-vs-cli.git" rel="noopener noreferrer"&gt;https://github.com/AlexandrosGounis/mcp-vs-cli.git&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The benchmark outputs a timestamped JSON report and prints the full comparison to stdout. MCP is a great protocol. But abstractions have costs, and in AI tooling, those costs show up on every request. Measure yours.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>cli</category>
      <category>codemode</category>
      <category>claude</category>
    </item>
  </channel>
</rss>
