<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: vinmay</title>
    <description>The latest articles on Forem by vinmay (@vinmay).</description>
    <link>https://forem.com/vinmay</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F202000%2F65ee2a0d-c9a2-466d-baae-d1795223c454.jpeg</url>
      <title>Forem: vinmay</title>
      <link>https://forem.com/vinmay</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/vinmay"/>
    <language>en</language>
    <item>
      <title>Your `pip install` Just Stole Your SSH Keys: The LiteLLM Supply Chain Attack Explained</title>
      <dc:creator>vinmay</dc:creator>
      <pubDate>Tue, 24 Mar 2026 20:19:13 +0000</pubDate>
      <link>https://forem.com/vinmay/your-pip-install-just-stole-your-ssh-keys-the-litellm-supply-chain-attack-explained-4l52</link>
      <guid>https://forem.com/vinmay/your-pip-install-just-stole-your-ssh-keys-the-litellm-supply-chain-attack-explained-4l52</guid>
      <description>&lt;p&gt;A single &lt;code&gt;pip install litellm==1.82.8&lt;/code&gt; was enough to drain everything off your machine. No suspicious imports. No weird prompts. Just a package install, and your AWS credentials, SSH keys, and API keys were already heading to an attacker's server.&lt;/p&gt;

&lt;p&gt;Here's what happened, why it's scary, and what you can actually do about it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Happened
&lt;/h2&gt;

&lt;p&gt;On March 24, 2026, LiteLLM version 1.82.8 landed on PyPI with a malicious file bundled inside: &lt;code&gt;litellm_init.pth&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That &lt;code&gt;.pth&lt;/code&gt; extension is why this attack is so nasty.&lt;/p&gt;

&lt;p&gt;At startup, Python's &lt;code&gt;site&lt;/code&gt; module processes every &lt;code&gt;.pth&lt;/code&gt; file in your &lt;code&gt;site-packages&lt;/code&gt; directory, and any line in such a file that begins with &lt;code&gt;import&lt;/code&gt; gets executed. That happens &lt;strong&gt;every time the Python interpreter starts&lt;/strong&gt;. No import of the package needed, no user interaction. The attacker hid a double base64-encoded payload inside this file. The moment Python ran, the payload ran too.&lt;/p&gt;
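
&lt;p&gt;To see the mechanism without touching a real environment, here's a harmless sketch. It writes a throwaway &lt;code&gt;.pth&lt;/code&gt; file into a temporary directory and hands that directory to &lt;code&gt;site.addsitedir()&lt;/code&gt;, which processes &lt;code&gt;.pth&lt;/code&gt; files the same way startup does for &lt;code&gt;site-packages&lt;/code&gt;. The file name &lt;code&gt;demo.pth&lt;/code&gt; and the &lt;code&gt;PTH_DEMO_RAN&lt;/code&gt; marker are made up for this example, and the one-liner is a stand-in, not the attacker's payload.&lt;/p&gt;

```python
import os
import site
import sys
import tempfile

# Hypothetical marker name, used only for this demonstration.
MARKER = "PTH_DEMO_RAN"


def run_pth_demo():
    """Write a harmless .pth file and let the site module process it."""
    os.environ.pop(MARKER, None)
    with tempfile.TemporaryDirectory() as tmp:
        pth_path = os.path.join(tmp, "demo.pth")
        with open(pth_path, "w", encoding="utf-8") as fh:
            # A line that starts with "import" is executed, not treated as a path.
            fh.write('import os; os.environ["PTH_DEMO_RAN"] = "1"\n')
        # addsitedir() adds the directory to sys.path and processes its .pth files.
        site.addsitedir(tmp)
        # Undo the sys.path change so the demo leaves nothing behind.
        if tmp in sys.path:
            sys.path.remove(tmp)
    return os.environ.get(MARKER)


result = run_pth_demo()
print("payload ran:", result == "1")
```

&lt;p&gt;Nothing ever imports a module by name here. The line runs only because it sits in a &lt;code&gt;.pth&lt;/code&gt; file and starts with &lt;code&gt;import&lt;/code&gt;.&lt;/p&gt;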

&lt;p&gt;What did it grab? Pretty much everything:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All your environment variables (&lt;code&gt;OPENAI_API_KEY&lt;/code&gt;, &lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt;, all of it)&lt;/li&gt;
&lt;li&gt;SSH private keys (&lt;code&gt;~/.ssh/id_rsa&lt;/code&gt;, &lt;code&gt;id_ed25519&lt;/code&gt;, and more)&lt;/li&gt;
&lt;li&gt;AWS, GCP, Azure, and Kubernetes credentials&lt;/li&gt;
&lt;li&gt;Git credentials and &lt;code&gt;.gitconfig&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Shell history (&lt;code&gt;~/.bash_history&lt;/code&gt;, &lt;code&gt;~/.zsh_history&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Docker configs, npm tokens, database passwords&lt;/li&gt;
&lt;li&gt;Crypto wallet files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then it encrypted everything with AES-256, wrapped the key with a hardcoded RSA-4096 public key, and shipped it all off to &lt;code&gt;https://models.litellm.cloud/&lt;/code&gt;. Note that domain: &lt;code&gt;litellm.cloud&lt;/code&gt;, not the real &lt;code&gt;litellm.ai&lt;/code&gt;. Classic.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why the Scale Is Scary
&lt;/h2&gt;

&lt;p&gt;LiteLLM gets &lt;strong&gt;97 million downloads per month&lt;/strong&gt;. That alone is a huge problem.&lt;/p&gt;

&lt;p&gt;But supply chain attacks don't stop at the direct install. They travel through dependency trees. If you installed &lt;code&gt;dspy&lt;/code&gt;, &lt;code&gt;langchain&lt;/code&gt;, or any of the other popular AI packages that depend on &lt;code&gt;litellm&amp;gt;=1.64.0&lt;/code&gt;, you were also exposed without ever typing &lt;code&gt;pip install litellm&lt;/code&gt; yourself.&lt;/p&gt;

&lt;p&gt;The attack was only live for about an hour. It got discovered almost by accident: a developer's machine ran out of RAM and crashed because the payload was executing inside Cursor through an MCP plugin that pulled in litellm as a transitive dependency. A bug in the attacker's own code gave them away.&lt;/p&gt;

&lt;p&gt;If that bug hadn't been there, this could have run quietly for days or weeks across thousands of CI/CD pipelines, dev machines, and prod servers. Nobody would have noticed.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Problem: You Can't See What's Inside Your Dependencies
&lt;/h2&gt;

&lt;p&gt;When you run &lt;code&gt;pip install something&lt;/code&gt;, you're not just installing one thing. You're pulling in a whole tree of packages, and any one of them could be compromised.&lt;/p&gt;

&lt;p&gt;This isn't new, but it's getting worse as the AI package ecosystem keeps exploding. New packages, new versions, new dependencies dropping every single day. The attack surface is growing way faster than anyone can audit it.&lt;/p&gt;

&lt;p&gt;We're taught to think of dependencies as a good thing. Reusable building blocks, standing on the shoulders of giants, all that. The LiteLLM incident is a reminder that every dependency is also a trust decision, and most of us are making those decisions without really thinking about it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You Should Actually Do
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you installed litellm 1.82.8:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check for &lt;code&gt;litellm_init.pth&lt;/code&gt; in your &lt;code&gt;site-packages/&lt;/code&gt; directory&lt;/li&gt;
&lt;li&gt;Rotate everything: every API key, SSH key, and cloud credential that was on that machine&lt;/li&gt;
&lt;li&gt;Check any CI/CD environment where litellm might have been installed too&lt;/li&gt;
&lt;/ol&gt;
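
&lt;p&gt;Step 1 is easy to script. Here's a minimal sketch, assuming a standard CPython layout, that lists every &lt;code&gt;.pth&lt;/code&gt; file in the current interpreter's site directories that contains an executable &lt;code&gt;import&lt;/code&gt; line. The function names are mine, not part of any tool. A hit is not proof of compromise: setuptools' &lt;code&gt;distutils-precedence.pth&lt;/code&gt; and some editable installs legitimately use this feature, so treat the output as a list of files to read, not a verdict.&lt;/p&gt;

```python
import site
import sysconfig
from pathlib import Path


def executable_lines(text):
    """Return the lines of a .pth file that Python would execute at startup."""
    return [ln for ln in text.splitlines() if ln.startswith(("import ", "import\t"))]


def site_dirs():
    """Collect the site-packages directories of the running interpreter."""
    dirs = {sysconfig.get_paths()["purelib"], site.getusersitepackages()}
    # getsitepackages() is missing in some older virtualenv setups, so guard it.
    dirs.update(getattr(site, "getsitepackages", list)())
    return sorted(Path(d) for d in dirs if Path(d).is_dir())


def scan(dirs):
    """Map each .pth file that contains executable lines to those lines."""
    findings = {}
    for directory in dirs:
        for pth in sorted(directory.glob("*.pth")):
            lines = executable_lines(pth.read_text(encoding="utf-8", errors="replace"))
            if lines:
                findings[str(pth)] = lines
    return findings


for path, lines in scan(site_dirs()).items():
    # litellm_init.pth is the file name reported for this incident.
    flag = "  (file name from this incident)" if path.endswith("litellm_init.pth") else ""
    print(path + flag)
    for line in lines:
        print("    " + line[:120])
```

&lt;p&gt;Run it with the same interpreter (and the same virtualenv) that you ran &lt;code&gt;pip install&lt;/code&gt; with, otherwise you're checking the wrong &lt;code&gt;site-packages&lt;/code&gt;.&lt;/p&gt;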

&lt;p&gt;&lt;strong&gt;Going forward:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pin your dependencies.&lt;/strong&gt; Exact version locks (&lt;code&gt;==&lt;/code&gt;) in production instead of &lt;code&gt;&amp;gt;=&lt;/code&gt;. It won't stop a poisoned release from getting in if you're on that version, but it stops silent upgrades pulling in something bad later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use lockfiles.&lt;/strong&gt; &lt;code&gt;pip-compile&lt;/code&gt;, &lt;code&gt;poetry.lock&lt;/code&gt;, &lt;code&gt;uv.lock&lt;/code&gt;, whatever fits your setup. Know exactly what you're running.&lt;/li&gt;
&lt;li&gt;
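&lt;strong&gt;Audit transitive dependencies.&lt;/strong&gt; &lt;code&gt;pip-audit&lt;/code&gt; and &lt;code&gt;safety&lt;/code&gt; check your full dependency tree against databases of known vulnerabilities. They only flag a poisoned release once it has been reported, but they're still worth running in CI.&lt;/li&gt;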
&lt;li&gt;
&lt;strong&gt;Don't pip install as root.&lt;/strong&gt; Limits how much damage a compromise can actually do.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep an eye out for &lt;code&gt;.pth&lt;/code&gt; files.&lt;/strong&gt; They're a legit Python feature, but they're also a perfect delivery mechanism for malware. If you see one in &lt;code&gt;site-packages&lt;/code&gt; from a package you don't recognise, that's worth investigating.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Can We Do Better Than Grep?
&lt;/h2&gt;

&lt;p&gt;Most of the advice above is reactive. It helps you recover or reduce damage after something gets in. What's actually hard is knowing &lt;em&gt;before&lt;/em&gt; you run anything that a package can even reach your credentials.&lt;/p&gt;

&lt;p&gt;This is the gap I've been trying to close with something I'm building called &lt;a href="https://github.com/vinmay/reachscan" rel="noopener noreferrer"&gt;ReachScan&lt;/a&gt;. The idea is pretty simple: instead of just matching against a list of known bad packages, it maps what a codebase or its dependencies can actually &lt;em&gt;reach&lt;/em&gt;: filesystem paths, environment variables, system resources. If a package has no business touching &lt;code&gt;~/.ssh/&lt;/code&gt;, you should know that before it runs, not after.&lt;/p&gt;

&lt;p&gt;It won't catch everything. But knowing the capability surface of what you're about to install is a lot better than just hoping nothing in the tree is malicious.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Uncomfortable Truth
&lt;/h2&gt;

&lt;p&gt;Karpathy put it well after this incident: the way we think about dependencies needs to change. The whole "building pyramids from bricks" model assumes the bricks are trustworthy. In 2026, that's a harder assumption to stand behind.&lt;/p&gt;

&lt;p&gt;That doesn't mean stop using dependencies. That's not realistic. It just means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Be deliberate about what you pull in&lt;/li&gt;
&lt;li&gt;Actually understand what each dependency can do on your machine&lt;/li&gt;
&lt;li&gt;Have a rotation plan for credentials that treats compromise as a when, not an if&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The LiteLLM attack got caught by luck. The next one might not be.&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>python</category>
      <category>llm</category>
    </item>
    <item>
      <title>I built "npm audit" for AI agents</title>
      <dc:creator>vinmay</dc:creator>
      <pubDate>Sat, 21 Mar 2026 02:23:50 +0000</pubDate>
      <link>https://forem.com/vinmay/i-built-npm-audit-for-ai-agents-5fc8</link>
      <guid>https://forem.com/vinmay/i-built-npm-audit-for-ai-agents-5fc8</guid>
      <description>&lt;p&gt;I was adding MCP tools to a project when I realized something uncomfortable: I had no idea what the code I was installing could actually do.&lt;/p&gt;

&lt;p&gt;The README said "connects Claude to Blender." What it didn't say was that one of the registered tools passes a raw string parameter to Python's &lt;code&gt;exec()&lt;/code&gt; with no builtin restriction. The LLM doesn't get "Blender API access." It gets full Python execution on the host machine.&lt;/p&gt;

&lt;p&gt;I wanted a way to know this &lt;em&gt;before&lt;/em&gt; running the code. So I built one.&lt;/p&gt;

&lt;h2&gt;
  
  
  What reachscan does
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/vinmay/reachscan" rel="noopener noreferrer"&gt;reachscan&lt;/a&gt; is a static analysis CLI for Python and TypeScript/JavaScript AI agent codebases. Point it at a repo, a PyPI package, or an MCP endpoint, and it tells you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What the code can do (shell exec, file access, network calls, credential access, dynamic code execution)&lt;/li&gt;
&lt;li&gt;Which of those capabilities the LLM can actually trigger (reachability analysis)&lt;/li&gt;
&lt;li&gt;The exact call path from the LLM entry point to the dangerous code
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;reachscan

&lt;span class="c"&gt;# Scan a GitHub repo&lt;/span&gt;
reachscan https://github.com/user/repo

&lt;span class="c"&gt;# Scan a PyPI package before installing&lt;/span&gt;
reachscan pypi:some-agent-package

&lt;span class="c"&gt;# Scan local code&lt;/span&gt;
reachscan ./my-agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No config, no API keys, no cloud service. It runs offline and produces a report in about 2 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;When you give an LLM tools, you're granting it real-world capabilities: file access, shell commands, network calls, and credential reads. Most frameworks make it easy to add tools and hard to audit what you've exposed.&lt;/p&gt;

&lt;p&gt;Here's real code from a popular MCP server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_blender_code&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Execute arbitrary Python code in Blender.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;blender&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_blender_connection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blender&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;execute_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;code: str&lt;/code&gt; parameter? It ends up here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bpy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bpy&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;  &lt;span class="c1"&gt;# No __builtins__ restriction
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Passing &lt;code&gt;{"bpy": bpy}&lt;/code&gt; as the globals dict &lt;em&gt;looks&lt;/em&gt; like a sandbox. It isn't. Without an explicit &lt;code&gt;__builtins__&lt;/code&gt; key, Python injects the full builtins module. The LLM can &lt;code&gt;import os&lt;/code&gt;, run &lt;code&gt;subprocess&lt;/code&gt;, read your files, anything.&lt;/p&gt;

&lt;p&gt;Here's what reachscan shows for this server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  DYNAMIC  exec()                  server.py:431  reachable
           path: execute_blender_code → send_command → execute_code

  EXECUTE  subprocess.run()        addon.py:89    reachable

  SEND     requests.post()         server.py:198  reachable
           path: generate_3d_model → _call_api

  SECRETS  os.environ[...]         server.py:12   module_level
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;reachable&lt;/code&gt; tag is the key part. It means the LLM can trigger this code through a registered tool, not merely that the code exists somewhere in the repo. &lt;code&gt;module_level&lt;/code&gt; means it runs on import. &lt;code&gt;unreachable&lt;/code&gt; means the code exists but no LLM call path leads to it.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works (briefly)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Detectors&lt;/strong&gt; scan the AST for 7 capability categories: EXECUTE, READ, WRITE, SEND, SECRETS, DYNAMIC, AUTONOMY&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Entry point detection&lt;/strong&gt; finds the functions exposed to the LLM — &lt;code&gt;@tool&lt;/code&gt;, &lt;code&gt;@mcp.tool()&lt;/code&gt;, &lt;code&gt;@function_tool&lt;/code&gt;, &lt;code&gt;BaseTool&lt;/code&gt; subclasses, etc. across LangChain, OpenAI Agents SDK, MCP, Pydantic AI, CrewAI, Semantic Kernel, and AutoGen&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Call graph + BFS&lt;/strong&gt; traces up to 8 hops from each entry point to determine which capabilities are actually reachable&lt;/li&gt;
&lt;li&gt;Every finding gets one of 5 states: &lt;code&gt;reachable&lt;/code&gt;, &lt;code&gt;unreachable&lt;/code&gt;, &lt;code&gt;module_level&lt;/code&gt;, &lt;code&gt;unknown&lt;/code&gt;, &lt;code&gt;no_entry_points&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
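
&lt;p&gt;To make step 3 concrete, here's a rough sketch of a hop-limited breadth-first search over a call graph. This is not reachscan's actual code. The graph is a hand-written dictionary loosely modelled on the report above, &lt;code&gt;legacy_helper&lt;/code&gt; is an invented function that no tool calls, and &lt;code&gt;reachable_within&lt;/code&gt; is a name I picked for the example.&lt;/p&gt;

```python
from collections import deque


def reachable_within(graph, entry_points, max_hops=8):
    """Return {function: hops} for everything reachable within max_hops calls."""
    hops = {name: 0 for name in entry_points}
    queue = deque(entry_points)
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # hop budget spent, do not expand this node further
        for callee in graph.get(current, ()):
            if callee not in hops:
                hops[callee] = hops[current] + 1
                queue.append(callee)
    return hops


# Hypothetical call graph: caller mapped to the functions it calls.
CALL_GRAPH = {
    "execute_blender_code": ["send_command"],
    "send_command": ["execute_code"],
    "execute_code": ["exec"],
    "legacy_helper": ["subprocess.run"],  # invented: never called from a tool
}

hops = reachable_within(CALL_GRAPH, ["execute_blender_code"])
for capability in ("exec", "subprocess.run"):
    state = "reachable" if capability in hops else "unreachable"
    print(capability, state)
```

&lt;p&gt;In this toy graph &lt;code&gt;exec&lt;/code&gt; is three hops from the tool entry point, while &lt;code&gt;subprocess.run&lt;/code&gt; sits behind a function nothing exposed to the LLM ever calls. The hop limit keeps the search bounded on large graphs, at the cost of missing paths longer than the limit.&lt;/p&gt;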

&lt;p&gt;The false positive rate is 0.47% across 1,912 labeled findings on 10 real-world repos. I care about this number a lot because a noisy scanner is a useless scanner.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I built it
&lt;/h2&gt;

&lt;p&gt;The short version: I was evaluating third-party MCP servers and realized there was no &lt;code&gt;npm audit&lt;/code&gt; equivalent for AI agent code. I could run &lt;code&gt;pip-audit&lt;/code&gt; to check for known vulnerabilities in dependencies, but nothing told me "this package gives the LLM shell access on your machine."&lt;/p&gt;

&lt;p&gt;The existing tools I found either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Require API calls per scan (expensive, not offline)&lt;/li&gt;
&lt;li&gt;Produce flat capability lists without reachability context (noisy)&lt;/li&gt;
&lt;li&gt;Don't handle the MCP/agent-specific entry point patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I built the tool I wanted.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it found across 50 real MCP servers
&lt;/h2&gt;

&lt;p&gt;I ran reachscan against 50 of the most popular MCP server repos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;1 in 3&lt;/strong&gt; has shell execution capability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1 in 3&lt;/strong&gt; has outbound network I/O&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1 in 4&lt;/strong&gt; accesses credentials from environment variables&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;10 of 50&lt;/strong&gt; had 4+ capabilities active simultaneously&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The highest-risk combination: credential access + network egress. That appeared in 8 of 50 repos. If the LLM can read your AWS keys AND make HTTP calls, that's an exfiltration path.&lt;/p&gt;

&lt;p&gt;Not all of these are bugs. An AWS MCP server &lt;em&gt;should&lt;/em&gt; talk to AWS. The question is whether the LLM can misuse those capabilities — and whether you know about them before you deploy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;reachscan

&lt;span class="c"&gt;# Scan any GitHub repo&lt;/span&gt;
reachscan https://github.com/ahujasid/blender-mcp

&lt;span class="c"&gt;# Scan a PyPI package before installing&lt;/span&gt;
reachscan pypi:openai-agents

&lt;span class="c"&gt;# JSON output for CI&lt;/span&gt;
reachscan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="nt"&gt;--severity&lt;/span&gt; high
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apache 2.0, pure Python, runs offline. No API keys, no cloud service.&lt;/p&gt;

&lt;p&gt;If something looks wrong — false positive, missed pattern, bad output — &lt;a href="https://github.com/vinmay/reachscan/issues" rel="noopener noreferrer"&gt;open an issue&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/vinmay/reachscan" rel="noopener noreferrer"&gt;vinmay/reachscan&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;a href="https://pypi.org/project/reachscan/" rel="noopener noreferrer"&gt;reachscan&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Full scan results (50 repos):&lt;/strong&gt; &lt;a href="https://medium.com/@vinmayN/i-scanned-50-mcp-servers-to-see-what-they-can-actually-do-46144659ceca" rel="noopener noreferrer"&gt;Medium writeup&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>agents</category>
      <category>security</category>
      <category>python</category>
    </item>
    <item>
      <title>I scanned 50 MCP servers to see what they can actually do — here's what I found</title>
      <dc:creator>vinmay</dc:creator>
      <pubDate>Thu, 12 Mar 2026 03:35:44 +0000</pubDate>
      <link>https://forem.com/vinmay/i-scanned-50-mcp-servers-to-see-what-they-can-actually-do-heres-what-i-found-49m5</link>
      <guid>https://forem.com/vinmay/i-scanned-50-mcp-servers-to-see-what-they-can-actually-do-heres-what-i-found-49m5</guid>
      <description>&lt;p&gt;One of the 50 MCP servers I scanned gives the LLM a full Python shell on your machine. The tool is called &lt;code&gt;execute_blender_code&lt;/code&gt;. The &lt;code&gt;exec()&lt;/code&gt; call has no builtin restriction. I verified it: imports, file reads, and subprocess execution all work.&lt;/p&gt;

&lt;p&gt;That's what I built &lt;a href="https://github.com/vinmay/reachscan" rel="noopener noreferrer"&gt;reachscan&lt;/a&gt; to find.&lt;/p&gt;
&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;MCP servers aren't plugins in a sandboxed browser extension model. They run as normal OS processes with your user permissions. If a server calls &lt;code&gt;subprocess.run()&lt;/code&gt;, the LLM can trigger shell commands. If it calls &lt;code&gt;exec()&lt;/code&gt; without restricting builtins, the LLM gets Python execution on your machine.&lt;/p&gt;

&lt;p&gt;Most people don't know which of the servers they're running fall into which category.&lt;/p&gt;
&lt;h2&gt;
  
  
  What reachscan does
&lt;/h2&gt;

&lt;p&gt;It's a static analysis CLI for Python and TypeScript/JavaScript agent codebases. It maps seven capability categories:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;What it means&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;EXECUTE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Shell execution via subprocess, os.system&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;READ/WRITE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Local filesystem access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SEND&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Outbound HTTP, sockets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SECRETS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Env var credential access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DYNAMIC&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;exec(), eval(), importlib&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AUTONOMY&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Background threads, autonomous loops&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Entry points&lt;/td&gt;
&lt;td&gt;LLM-callable tool registrations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
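
&lt;p&gt;For a feel of what detecting one of these categories involves, here's a minimal sketch of a &lt;code&gt;DYNAMIC&lt;/code&gt; detector built on Python's standard &lt;code&gt;ast&lt;/code&gt; module. It is an illustration under my own assumptions, not reachscan's detector: the &lt;code&gt;find_dynamic_calls&lt;/code&gt; name, the two-entry call list, and the sample source are all invented for the example.&lt;/p&gt;

```python
import ast

# Hypothetical, deliberately tiny list of call names to flag.
DYNAMIC_CALLS = {"exec", "eval"}


def find_dynamic_calls(source):
    """Return (call_name, line_number) for each direct exec()/eval() call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only plain-name calls are matched; builtins.exec(...) would be missed.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DYNAMIC_CALLS:
                hits.append((node.func.id, node.lineno))
    return sorted(hits, key=lambda hit: hit[1])


# Invented sample source, shaped like the addon code discussed in this post.
SAMPLE = '''
def execute_code(code):
    namespace = {"bpy": object()}
    exec(code, namespace)

def compute(expr):
    return eval(expr)
'''

findings = find_dynamic_calls(SAMPLE)
for name, line in findings:
    print("DYNAMIC", name + "()", "line", line)
```

&lt;p&gt;A detector like this only says that the call exists somewhere in the file. It says nothing about whether an LLM-callable tool can get to it.&lt;/p&gt;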

&lt;p&gt;The key distinction from a linter: reachscan tracks &lt;strong&gt;reachability&lt;/strong&gt;. A &lt;code&gt;subprocess.run()&lt;/code&gt; buried in dead code is a different risk than one called directly from a registered MCP tool.&lt;/p&gt;

&lt;p&gt;Install it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;reachscan
reachscan https://github.com/any-org/any-mcp-server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;False positive rate: &lt;strong&gt;0.47%&lt;/strong&gt; across a labeled corpus of ~3,900 findings.&lt;/p&gt;

&lt;h2&gt;
  
  
  The headline finding: blender-mcp
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/ahujasid/blender-mcp" rel="noopener noreferrer"&gt;blender-mcp&lt;/a&gt; has a registered MCP tool called &lt;code&gt;execute_blender_code&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_blender_code&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Execute arbitrary Python code in Blender.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;blender&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_blender_connection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blender&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;execute_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That string travels over a local TCP socket to the Blender addon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_code&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;namespace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bpy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bpy&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# ← line 431
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why &lt;code&gt;namespace = {"bpy": bpy}&lt;/code&gt; doesn't protect you
&lt;/h2&gt;

&lt;p&gt;This looks like a sandbox. It isn't.&lt;/p&gt;

&lt;p&gt;When you call &lt;code&gt;exec(code, namespace)&lt;/code&gt; without setting &lt;code&gt;namespace["__builtins__"]&lt;/code&gt;, Python automatically injects the full builtins module. I verified this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;namespace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bpy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;

&lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;import os; print(os.getcwd())&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# → /home/user/...  ✓
&lt;/span&gt;
&lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;print(open(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/etc/hostname&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).read())&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# → my-machine  ✓
&lt;/span&gt;
&lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;import subprocess; r = subprocess.run([&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;], &lt;/span&gt;&lt;span class="se"&gt;\
&lt;/span&gt;&lt;span class="s"&gt;capture_output=True, text=True); print(r.stdout)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# → uid=1000(user) gid=1000(user)  ✓
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM doesn't just get Blender API access. It gets full Python execution on the host.&lt;/p&gt;
&lt;h2&gt;
  
  
  The same primitive, done right
&lt;/h2&gt;

&lt;p&gt;Finding &lt;code&gt;exec()&lt;/code&gt; in a scan result doesn't automatically mean critical. Two projects in this dataset handle it correctly:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;awslabs/mcp&lt;/strong&gt; — strips builtins before exec():&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__builtins__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_SAFE_BUILTINS&lt;/span&gt;
&lt;span class="c1"&gt;# Removes: __import__, exec, eval, compile, open, getattr...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;StarRocks MCP&lt;/strong&gt; — validates the AST before eval():&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;validate_plotly_expr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plotly_expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# must be exactly px.&amp;lt;method&amp;gt;(...)
&lt;/span&gt;&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plotly_expr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;px&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;local_vars&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same pattern. Very different trust boundary.&lt;/p&gt;
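
&lt;p&gt;The builtins-stripping approach can be sketched in a few lines. This is my own illustration, not the awslabs/mcp code: &lt;code&gt;SAFE_BUILTINS&lt;/code&gt; here is a made-up allowlist and &lt;code&gt;run_restricted&lt;/code&gt; is a hypothetical helper. It shows that once &lt;code&gt;__builtins__&lt;/code&gt; is set explicitly, &lt;code&gt;import&lt;/code&gt; and &lt;code&gt;open()&lt;/code&gt; stop working inside the executed string.&lt;/p&gt;

```python
# Hypothetical allowlist for illustration; not the list used by awslabs/mcp.
SAFE_BUILTINS = {"len": len, "range": range, "sum": sum, "print": print}


def run_restricted(code):
    """Execute code with only SAFE_BUILTINS available; report how it ended."""
    namespace = {"__builtins__": SAFE_BUILTINS}
    try:
        exec(code, namespace)
    except Exception as exc:
        return type(exc).__name__
    return "ok"


print(run_restricted("total = sum(range(4))"))  # ok
print(run_restricted("import os"))              # ImportError: no __import__ available
print(run_restricted("open('/etc/hostname')"))  # NameError: open is not defined
```

&lt;p&gt;Treat this as raising the bar, not as a real sandbox. Restricted builtins block the obvious routes, but Python's object introspection has well-known ways around allowlists like this, so it works best combined with input validation or process-level isolation.&lt;/p&gt;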

&lt;h2&gt;
  
  
  Results across 50 repos
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;1,139 findings across 18 repos with non-zero activity&lt;/li&gt;
&lt;li&gt;1 in 3 repos has shell execution capability&lt;/li&gt;
&lt;li&gt;1 in 4 accesses credentials from environment variables&lt;/li&gt;
&lt;li&gt;10 of 50 had 4+ capabilities active simultaneously&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The TypeScript gap
&lt;/h2&gt;

&lt;p&gt;22 repos showed 0 capability findings, and many of those are TypeScript-only. reachscan detects TypeScript entry points (452 found) but doesn't yet analyze TypeScript function bodies. "Clean" for a TS-heavy repo means "no Python capability findings", not "verified safe".&lt;/p&gt;

&lt;p&gt;TypeScript capability analysis is next.&lt;/p&gt;
&lt;h2&gt;
  
  
  Main takeaway
&lt;/h2&gt;

&lt;p&gt;Adoption is moving faster than visibility. Treat MCP servers like privileged code, not plugins. Audit tool boundaries before you deploy.&lt;/p&gt;

&lt;p&gt;Try it on any MCP server or agent repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;reachscan
reachscan https://github.com/any-org/any-mcp-server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/vinmay/reachscan" rel="noopener noreferrer"&gt;github.com/vinmay/reachscan&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Full writeup with all 50 results and methodology: &lt;br&gt;
&lt;a href="https://medium.com/@vinmayN/i-scanned-50-mcp-servers-to-see-what-they-can-actually-do-46144659ceca" rel="noopener noreferrer"&gt;Medium article&lt;/a&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>mcp</category>
      <category>python</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
