<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: William Baker </title>
    <description>The latest articles on Forem by William Baker  (@asterview).</description>
    <link>https://forem.com/asterview</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3912742%2Fe6c5de6a-38ad-462c-88c9-1c05579eb3b2.png</url>
      <title>Forem: William Baker </title>
      <link>https://forem.com/asterview</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/asterview"/>
    <language>en</language>
    <item>
      <title>Building a Multi-Agent Fleet with No Central Server</title>
      <dc:creator>William Baker </dc:creator>
      <pubDate>Fri, 08 May 2026 23:20:02 +0000</pubDate>
      <link>https://forem.com/asterview/building-a-multi-agent-fleet-with-no-central-server-12fp</link>
      <guid>https://forem.com/asterview/building-a-multi-agent-fleet-with-no-central-server-12fp</guid>
      <description>&lt;p&gt;Most multi-agent architectures have the same shape: a coordinator talks to workers through a central hub. The hub is usually a message queue, a shared database, or an orchestration service like Ray or Temporal.&lt;/p&gt;

&lt;p&gt;That hub is also the first thing that breaks. It's a single point of failure, a scaling bottleneck, and an operational cost you pay even when the agents aren't working.&lt;/p&gt;

&lt;p&gt;Here's how to build a fleet where agents find each other and route tasks without any central intermediary.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Central Hub Problem
&lt;/h2&gt;

&lt;p&gt;When you're spinning up a 5-agent prototype, a central coordinator makes sense. It's simple, debuggable, and gets out of your way.&lt;/p&gt;

&lt;p&gt;At 50 agents it starts to fray. At 500 it becomes your hardest reliability problem.&lt;/p&gt;

&lt;p&gt;The hub becomes a global lock. Every message goes through it. Every failure cascades through it. Every scaling decision has to account for it.&lt;/p&gt;

&lt;p&gt;The alternative — having agents discover and contact each other directly — sounds appealing but has historically been hard. How does Agent A know Agent B's address? How do you handle NAT traversal? How do you authenticate the connection?&lt;/p&gt;

&lt;p&gt;These are solved problems in networking. We just haven't applied the solutions to agents until now.&lt;/p&gt;




&lt;h2&gt;
  
  
  Peer-to-Peer at the Session Layer
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://pilotprotocol.network/" rel="noopener noreferrer"&gt;Pilot Protocol&lt;/a&gt; operates at OSI Layer 5 — the session layer, the same slot TLS occupies for the web. It gives each agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A permanent 48-bit address (&lt;code&gt;0:A91F.0000.7C2E&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Automatic NAT traversal (STUN → hole-punch → relay fallback for symmetric NATs)&lt;/li&gt;
&lt;li&gt;End-to-end encrypted tunnels (X25519 key exchange, AES-256-GCM, Ed25519 identity)&lt;/li&gt;
&lt;li&gt;A global directory (the backbone) for agent discovery&lt;/li&gt;
&lt;/ul&gt;
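
&lt;p&gt;The traversal ladder in that second bullet is easy to reason about as a decision function. Here's a minimal Python simulation of the fallback order; the NAT labels and function name are illustrative, not Pilot's internals:&lt;/p&gt;

```python
# Sketch of the NAT traversal ladder described above: try a direct
# STUN-discovered path, fall back to UDP hole-punching, and only use
# a relay when the peer sits behind a symmetric NAT. All names here
# are hypothetical -- this simulates the decision order, not Pilot's code.

def traverse(peer_nat_type: str) -> str:
    """Return the connection strategy for a peer's NAT type."""
    if peer_nat_type == "open":
        return "direct"        # STUN showed a publicly reachable address
    if peer_nat_type in ("full-cone", "restricted", "port-restricted"):
        return "hole-punch"    # both sides send simultaneously to punch through
    return "relay"             # symmetric NAT: fall back to a relay hop

for nat in ["open", "restricted", "symmetric"]:
    print(nat, "->", traverse(nat))
```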

&lt;p&gt;With Pilot, the hub isn't a server you run. It's the network itself — and the network is maintained by the protocol, not by your ops team.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Fleet Pattern That Actually Works
&lt;/h2&gt;

&lt;p&gt;Here's a concrete pattern for a research fleet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Coordinator agent
    ↓ Pilot (P2P, encrypted)
[Specialist A] [Specialist B] [Specialist C]
    ↓                ↓               ↓
  Papers           FX data       News feeds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each specialist registers its capabilities on the Pilot backbone when it starts. The coordinator queries the backbone — "I need a peer that can resolve academic citations" — and gets back the address of Specialist A. Direct connection from there.&lt;/p&gt;

&lt;p&gt;No service registry you maintain. No hardcoded addresses. No configuration file you update when a worker moves.&lt;/p&gt;
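
&lt;p&gt;To make the discovery flow concrete, here's a toy in-memory stand-in for the backbone directory. The real directory is a network service; &lt;code&gt;register()&lt;/code&gt; and &lt;code&gt;find()&lt;/code&gt; are hypothetical names, not the Pilot API:&lt;/p&gt;

```python
# A toy in-memory model of capability-based discovery: specialists
# register what they can do, the coordinator asks by capability and
# gets back an address. Simulation only, not Pilot's directory code.

class Directory:
    def __init__(self):
        self._peers = {}                  # capability -> list of addresses

    def register(self, address: str, capabilities: list[str]) -> None:
        for cap in capabilities:
            self._peers.setdefault(cap, []).append(address)

    def find(self, capability: str):
        peers = self._peers.get(capability)
        return peers[0] if peers else None

backbone = Directory()
backbone.register("0:4B2E.0000.1A3D", ["resolve-citations"])
backbone.register("0:77A0.0000.0C11", ["fx-history"])

# Coordinator asks by capability, not by address:
print(backbone.find("resolve-citations"))   # 0:4B2E.0000.1A3D
```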




&lt;h2&gt;
  
  
  The Code
&lt;/h2&gt;

&lt;p&gt;Getting an agent online:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://pilotprotocol.network/install.sh | sh
pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; coordinator
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The agent is addressable, authenticated, and reachable from any other Pilot peer — regardless of NAT, firewall, or cloud region.&lt;/p&gt;

&lt;p&gt;For the specialists:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On each worker node&lt;/span&gt;
pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; specialist-papers
pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; specialist-fx
pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; specialist-news
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each one joins the backbone automatically. The coordinator can ping them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pilotctl ping specialist-papers
&lt;span class="c"&gt;# ✓ reply from 0:4B2E.0000.1A3D · 22ms&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Self-Organization: How Groups Work
&lt;/h2&gt;

&lt;p&gt;Beyond individual peer connections, Pilot has a concept of groups — clusters of agents that self-organize around a shared domain.&lt;/p&gt;

&lt;p&gt;A trading fleet might form a TRADING group. A research fleet might join RESEARCH. Agents within a group can broadcast to all members or route to the most relevant peer within the domain.&lt;/p&gt;
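
&lt;p&gt;A toy model of that broadcast behavior, with illustrative names rather than the Pilot API:&lt;/p&gt;

```python
# Groups in miniature: members share a domain channel and a message
# can fan out to everyone in it except the sender. The class and
# method names are invented for illustration.

class Group:
    def __init__(self, name: str):
        self.name = name
        self.members = {}                 # address -> inbox of (sender, message)

    def join(self, address: str) -> None:
        self.members[address] = []

    def broadcast(self, sender: str, message: str) -> None:
        for address, inbox in self.members.items():
            if address != sender:
                inbox.append((sender, message))

research = Group("RESEARCH")
research.join("0:A91F.0000.7C2E")
research.join("0:4B2E.0000.1A3D")
research.broadcast("0:A91F.0000.7C2E", "need: citation resolution")

print(research.members["0:4B2E.0000.1A3D"])
```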

&lt;p&gt;This is closer to how human organizations actually work: a new employee joins the company and immediately has access to colleagues in their department, not just a single manager they have to route everything through.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://polo.pilotprotocol.network" rel="noopener noreferrer"&gt;Pilot network status&lt;/a&gt; page shows these groups live: BACKBONE, TRAVEL, TRADING, RESEARCH, INSURANCE, and more, with real-time agent counts.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You Give Up
&lt;/h2&gt;

&lt;p&gt;Centralized orchestration isn't all downside. You give up some things going P2P:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observability.&lt;/strong&gt; A central hub is easy to instrument. A P2P mesh requires distributed tracing from day one. Plan for this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debuggability.&lt;/strong&gt; When something goes wrong, "what was the message queue state at time T" is easier to answer than "what was the P2P graph state." Log aggressively at the agent level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simplicity.&lt;/strong&gt; For a 3-agent prototype, a coordinator is simpler. P2P earns its complexity at scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  When to Switch
&lt;/h2&gt;

&lt;p&gt;The right time to move to a P2P architecture is usually later than you think but earlier than you want. Signals that you're ready:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're spending meaningful eng time on coordinator reliability&lt;/li&gt;
&lt;li&gt;Agents in different cloud regions are paying latency costs to route through a central server&lt;/li&gt;
&lt;li&gt;You want agents from different operators to collaborate without giving either access to your infrastructure&lt;/li&gt;
&lt;li&gt;Your fleet is growing fast enough that a central bottleneck is becoming a scaling conversation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If two or more of those are true, the session-layer approach is worth the investment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://pilotprotocol.network/docs/" rel="noopener noreferrer"&gt;Pilot Protocol documentation&lt;/a&gt; — addressing, groups, NAT traversal&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pilotprotocol.network/for/setups" rel="noopener noreferrer"&gt;Multi-agent setups on Pilot&lt;/a&gt; — pre-wired fleet configurations&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pilotprotocol.network/blog/ietf-internet-draft-pilot-protocol" rel="noopener noreferrer"&gt;The IETF Internet-Draft&lt;/a&gt; — the protocol spec if you want to go deep&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The network is live: ~163,000 agents, 12.7B+ requests routed, +28% growth in the past week.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;One line to get started: &lt;code&gt;curl -fsSL https://pilotprotocol.network/install.sh | sh&lt;/code&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>Stop Making Your AI Agent Scrape the Web. There's a Better Way.</title>
      <dc:creator>William Baker </dc:creator>
      <pubDate>Fri, 08 May 2026 23:17:56 +0000</pubDate>
      <link>https://forem.com/asterview/stop-making-your-ai-agent-scrape-the-web-theres-a-better-way-36fl</link>
      <guid>https://forem.com/asterview/stop-making-your-ai-agent-scrape-the-web-theres-a-better-way-36fl</guid>
      <description>&lt;p&gt;There's an absurd loop at the heart of most AI agent architectures right now:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Agent needs data (a research paper, an FX rate, a flight status, a CVE)&lt;/li&gt;
&lt;li&gt;Agent calls a web scraper or fires an HTTP request to a public endpoint&lt;/li&gt;
&lt;li&gt;The endpoint returns HTML designed for a human to read in a browser&lt;/li&gt;
&lt;li&gt;Agent burns tokens parsing, cleaning, and extracting the actual value&lt;/li&gt;
&lt;li&gt;Agent retries when the scraper breaks because the page layout changed&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We've built genuinely intelligent agents and then made them spend half their time doing remedial text processing on documents that weren't meant for them.&lt;/p&gt;

&lt;p&gt;Let me show you what the alternative looks like.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Root Cause: Wrong Layer
&lt;/h2&gt;

&lt;p&gt;HTTP is a Layer 7 protocol built in 1991 to serve documents to human-operated browsers. It's brilliant at that. Every design decision — HTML rendering, cookies, sessions, REST conventions — optimizes for a human reading a page.&lt;/p&gt;

&lt;p&gt;Agents don't read pages. They consume structured data. They don't need the presentation layer, the session cookies, or the retry logic that only exists because the web assumed humans would be patient with slow servers.&lt;/p&gt;

&lt;p&gt;The right fix isn't a better scraper. It's operating at a different layer — one where agents talk directly to other agents that have already done the hard work of acquiring, normalizing, and maintaining the data you need.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Specialized Data Agents Look Like in Practice
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://pilotprotocol.network/" rel="noopener noreferrer"&gt;Pilot Protocol&lt;/a&gt; runs a network of ~163,000 agents. About 350 of them are specialized data service agents — peers that exist to answer a specific category of query cleanly and fast.&lt;/p&gt;

&lt;p&gt;Here's what a few of them replace:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Crossref specialist&lt;/strong&gt;&lt;br&gt;
Resolves a DOI against the global paper registry in one call. No scraping PubMed, no HTML parsing, no fighting rate limits. If you're building a legal research agent that needs to verify citations, this is one hop instead of a brittle pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Historical FX specialist&lt;/strong&gt;&lt;br&gt;
Spot rate at an arbitrary timestamp. Not today's rate from a public API that expires — the actual rate at the moment a transaction happened. Replaces three bank statement screenshots and a manual lookup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Aviation weather specialist&lt;/strong&gt;&lt;br&gt;
Real-time METAR data for any airport. If your agent is managing travel or logistics, it gets structured weather data directly from a peer that's already watching the feeds, not from scraping a flight status page.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;crt.sh / certificate transparency specialist&lt;/strong&gt;&lt;br&gt;
Streams CT hits on your domains. Your security agent gets new certificate issuances the moment they appear, not after the next cron runs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FDA recalls specialist&lt;/strong&gt;&lt;br&gt;
Filters against the live recall feed for a specific condition or ingredient. No crawling FDA's website, no pagination, no HTML tables.&lt;/p&gt;

&lt;p&gt;The pattern is consistent: instead of your agent scraping a source and parsing the result, a specialist on the network has already done that work — once, for everyone — and serves structured answers directly.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Network Effect That Makes This Work
&lt;/h2&gt;

&lt;p&gt;The reason this improves over time is the same reason any network improves: each new agent adds value for every existing one.&lt;/p&gt;

&lt;p&gt;When a new operator connects their SEC filing parser to Pilot, every agent on the network gains access to cleaner financial data without writing any code. When a localization agent joins that has a native speaker in Manchester on the other end, every agent building for UK markets benefits.&lt;/p&gt;

&lt;p&gt;Pilot calls this "a hive mind that gets smarter with every new agent." It's less poetic if you think about it mechanically: it's a network with positive externalities, where the marginal cost of adding a new data source approaches zero for consumers.&lt;/p&gt;

&lt;p&gt;Compare that to the current model, where every agent team independently builds and maintains scrapers for the same 20 data sources. The waste is staggering.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Latency Numbers
&lt;/h2&gt;

&lt;p&gt;From the Pilot benchmarks: &lt;strong&gt;12 seconds on Pilot vs 51 seconds via the web&lt;/strong&gt; for equivalent data retrieval tasks.&lt;/p&gt;

&lt;p&gt;That's not a small difference. It's roughly a 4x speedup in wall-clock time for the same result. In an agentic pipeline where you're making dozens of these calls, that's the difference between a task that completes in a minute and one that takes five.&lt;/p&gt;

&lt;p&gt;The speed comes from two places:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No parsing overhead&lt;/strong&gt; — the data arrives structured, not as HTML you have to strip&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UDP transport&lt;/strong&gt; — Pilot runs peer-to-peer over UDP with its own reliable-stream layer, avoiding the head-of-line blocking that makes TCP slow for parallel requests&lt;/li&gt;
&lt;/ol&gt;
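
&lt;p&gt;To see where the parsing half of that speedup comes from, compare fetching the same fact from rendered page text versus a structured payload (both invented for illustration):&lt;/p&gt;

```python
# The same fact retrieved two ways. The scraper path needs a pattern
# tied to the page's wording, which breaks when the layout changes;
# the specialist path is one field read. Page text and rate are made up.

import json
import re

page = "Markets Today | EUR/USD rate is 1.0842 as of 14:02 UTC | footer links"
payload = '{"pair": "EUR/USD", "rate": 1.0842}'

# Scraper path: regex extraction from presentation text.
rate_scraped = float(re.search(r"EUR/USD rate is ([\d.]+)", page).group(1))

# Specialist path: one field read from a structured answer.
rate_structured = json.loads(payload)["rate"]

assert rate_scraped == rate_structured == 1.0842
```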




&lt;h2&gt;
  
  
  Getting Your Agent Connected
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Pilot (single static binary, no SDK, no API key)&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://pilotprotocol.network/install.sh | sh

&lt;span class="c"&gt;# Start the daemon&lt;/span&gt;
pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; my-research-agent

&lt;span class="c"&gt;# Your agent is now on the network&lt;/span&gt;
&lt;span class="c"&gt;# Address: 0:A91F.0000.7C2E&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From there, your agent can query the backbone for any of the 350+ service agents by capability. No URL directory to maintain, no API keys to manage per-service.&lt;/p&gt;




&lt;h2&gt;
  
  
  When You Still Need the Web
&lt;/h2&gt;

&lt;p&gt;To be direct: Pilot doesn't replace the web for everything. If you need to take a screenshot of a specific page, or submit a form on a site that has no API, you still need a browser or a scraper.&lt;/p&gt;

&lt;p&gt;But for structured data — the kind that lives behind an API or in a database somewhere — the web route is almost never the right choice for an agent. The data exists, someone has it clean, and there's now an agent network where you can get it directly.&lt;/p&gt;

&lt;p&gt;The scraping loop is a workaround. The network is the fix.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Pilot Protocol: &lt;a href="https://pilotprotocol.network/" rel="noopener noreferrer"&gt;pilotprotocol.network&lt;/a&gt; — peer-to-peer encrypted tunnels for agents, one line of code, no central dependency.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>webdev</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Why Your MCP Server Needs a Network Layer (And How to Add One in 30 Seconds)</title>
      <dc:creator>William Baker </dc:creator>
      <pubDate>Fri, 08 May 2026 23:14:00 +0000</pubDate>
      <link>https://forem.com/asterview/why-your-mcp-server-needs-a-network-layer-and-how-to-add-one-in-30-seconds-3mbh</link>
      <guid>https://forem.com/asterview/why-your-mcp-server-needs-a-network-layer-and-how-to-add-one-in-30-seconds-3mbh</guid>
      <description>&lt;p&gt;You've got an MCP server running. Locally, it's perfect. Then someone asks: "Can another agent on a different machine call it?"&lt;/p&gt;

&lt;p&gt;You spin up a VPN. Or punch a hole in the firewall. Or route it through a cloud proxy. Half a day gone, and now you've got a central dependency you didn't want.&lt;/p&gt;

&lt;p&gt;There's a cleaner way.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem with MCP's Transport Layer
&lt;/h2&gt;

&lt;p&gt;MCP is genuinely great at what it does: connecting an agent to its tools via a clean, structured protocol. But it was designed with a human-run server in mind. The transport story is essentially "use HTTP" or "use stdio." Both assume you control both endpoints and they can reach each other.&lt;/p&gt;

&lt;p&gt;In 2026, that assumption breaks constantly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agent A is on AWS, Agent B is behind a corporate NAT&lt;/li&gt;
&lt;li&gt;You want two agents from different operators to collaborate without either exposing a public endpoint&lt;/li&gt;
&lt;li&gt;You're building a fleet where agents need to discover &lt;em&gt;and&lt;/em&gt; call each other dynamically&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MCP doesn't solve this. It isn't supposed to — it's an application-layer protocol. The transport is your problem.&lt;/p&gt;

&lt;p&gt;Until now, "your problem" meant a lot of yak shaving.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a Session Layer Gives You
&lt;/h2&gt;

&lt;p&gt;The OSI model has a slot for exactly this: Layer 5, the session layer. It's the layer that manages connections between peers — maintaining them, authenticating them, and routing them across NATs.&lt;/p&gt;

&lt;p&gt;The web uses TLS here. Agents need something that speaks agent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pilotprotocol.network/" rel="noopener noreferrer"&gt;Pilot Protocol&lt;/a&gt; is a peer-to-peer network built specifically for this slot. Instead of routing agent traffic through HTTP (a document protocol built for browsers), Pilot operates at UDP with its own reliable-stream layer on top — X25519 key exchange, AES-256-GCM per tunnel, Ed25519 identity, automatic NAT traversal via STUN + hole-punching.&lt;/p&gt;

&lt;p&gt;Each agent gets a 48-bit address. Direct, authenticated, no intermediary required.&lt;/p&gt;
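
&lt;p&gt;The address format is three dot-separated 16-bit hex groups after the &lt;code&gt;0:&lt;/code&gt; prefix, which is where the 48 bits come from. A quick check in Python:&lt;/p&gt;

```python
# Unpack a Pilot-style address: strip the "0:" prefix, join the three
# 16-bit hex groups, and confirm the value fits in 48 bits.

addr = "0:A91F.0000.7C2E"
groups = addr.split(":")[1].split(".")
value = int("".join(groups), 16)

assert len(groups) == 3            # three 16-bit groups
assert 48 >= value.bit_length()    # fits in a 48-bit identifier
print(hex(value))
```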




&lt;h2&gt;
  
  
  One Line of Code
&lt;/h2&gt;

&lt;p&gt;Here's what adding Pilot to your MCP server actually looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://pilotprotocol.network/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That installs a single static binary. No SDK. No API key. No account.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; my-mcp-server
&lt;span class="c"&gt;# Daemon running (pid 24817)&lt;/span&gt;
&lt;span class="c"&gt;# Address: 0:A91F.0000.7C2E&lt;/span&gt;
&lt;span class="c"&gt;# Hostname: my-mcp-server&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your MCP server now has a Pilot address. Any other agent on the network — regardless of what NAT it's behind — can reach it directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pilotctl ping agent-alpha
&lt;span class="c"&gt;# ✓ reply from 0:4B2E.0000.1A3D · 38ms&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No VPN. No public endpoint. No relay server you have to run.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why UDP, Not TCP?
&lt;/h2&gt;

&lt;p&gt;TCP is great for browsers loading pages. It wasn't designed for the round-trip latency profile of agent-to-agent calls.&lt;/p&gt;

&lt;p&gt;Head-of-line blocking is the killer: if one packet is dropped, everything queues behind it. For a browser loading a web page, that's fine — you're waiting for HTML to render anyway. For an agent making 50 parallel data requests, it's a disaster.&lt;/p&gt;

&lt;p&gt;Pilot runs over UDP with its own reliable-stream implementation: sliding window, AIMD congestion control, selective acknowledgement (SACK). You get reliability without the head-of-line blocking tax. The benchmark from the Pilot homepage: &lt;strong&gt;12s on Pilot vs 51s via the web&lt;/strong&gt; for the same data retrieval task.&lt;/p&gt;
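
&lt;p&gt;AIMD itself is simple enough to sketch in a few lines. The update rule below is the textbook version, not Pilot's actual implementation:&lt;/p&gt;

```python
# Additive-Increase / Multiplicative-Decrease in miniature: grow the
# congestion window by a fixed increment while transfers succeed,
# halve it on loss, never drop below one packet.

def aimd_step(cwnd: float, loss: bool, incr: float = 1.0, decr: float = 0.5) -> float:
    """One congestion-window update: +incr per RTT, halved on loss."""
    return max(1.0, cwnd * decr) if loss else cwnd + incr

cwnd = 10.0
trace = []
for loss in [False, False, False, True, False]:
    cwnd = aimd_step(cwnd, loss)
    trace.append(cwnd)

print(trace)   # [11.0, 12.0, 13.0, 6.5, 7.5]
```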




&lt;h2&gt;
  
  
  The MCP + Pilot Pattern
&lt;/h2&gt;

&lt;p&gt;The natural pairing looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent A (MCP client)
    ↓ Pilot tunnel (encrypted, P2P)
Agent B (MCP server)
    ↓ MCP tool calls
Tools / data / capabilities
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pilot handles the transport: addressing, NAT traversal, encryption. MCP handles the application layer: tool definitions, structured responses. Neither replaces the other.&lt;/p&gt;
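
&lt;p&gt;In code, that division of labor means the tunnel only has to carry framed JSON-RPC bytes. Here's a minimal sketch of newline-delimited framing, the style MCP's stdio transport uses; the &lt;code&gt;lookup_doi&lt;/code&gt; tool is hypothetical:&lt;/p&gt;

```python
# MCP messages are JSON-RPC; over a byte-stream tunnel they just need
# framing. This sketches newline-delimited framing -- the tunnel itself
# is out of scope here, and the tool name is invented for illustration.

import json

def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message for a byte-stream tunnel."""
    return (json.dumps(message) + "\n").encode()

def unframe(data: bytes) -> list[dict]:
    """Split a received chunk back into JSON-RPC messages."""
    return [json.loads(line) for line in data.decode().splitlines() if line]

call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "lookup_doi", "arguments": {"doi": "10.1000/xyz"}}}

wire = frame(call)         # what Agent A writes into the tunnel
received = unframe(wire)   # what Agent B's MCP server reads out
assert received[0]["method"] == "tools/call"
```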

&lt;p&gt;Pilot even has a dedicated page for this pattern: &lt;a href="https://pilotprotocol.network/for/mcp" rel="noopener noreferrer"&gt;MCP + Pilot&lt;/a&gt; — your MCP server gets a network address and becomes reachable from anywhere on the Pilot network.&lt;/p&gt;




&lt;h2&gt;
  
  
  Discovery Is Solved Too
&lt;/h2&gt;

&lt;p&gt;Once your server is on Pilot, it joins the backbone — a global directory where agents can find peers by capability rather than by hostname.&lt;/p&gt;

&lt;p&gt;That means another agent can query "I need a tool that does X" and Pilot routes it to you, without you publishing a URL anywhere. Agent discovery stops being a directory you maintain and becomes a property of the network itself.&lt;/p&gt;

&lt;p&gt;There are already 350+ specialized service agents on the backbone: Crossref for paper lookups, historical FX data, aviation weather, crt.sh for certificate transparency, FDA recalls. They're just peers on the network.&lt;/p&gt;




&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;MCP is the right protocol for tool-calling. But it needs a transport layer that wasn't designed for humans loading documents in browsers.&lt;/p&gt;

&lt;p&gt;Adding Pilot solves the NAT problem, the discovery problem, and the "two agents from different operators need to talk" problem — in one binary, one command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://pilotprotocol.network/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then go back to building the agent, not the plumbing.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Pilot Protocol is live at &lt;a href="https://pilotprotocol.network/" rel="noopener noreferrer"&gt;pilotprotocol.network&lt;/a&gt; — ~163,000 agents, 12.7B+ requests routed, published as an IETF Internet-Draft.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>mcp</category>
      <category>networking</category>
    </item>
    <item>
      <title>How to Deploy Multi-Agent Systems Cross-Cloud [Python]</title>
      <dc:creator>William Baker </dc:creator>
      <pubDate>Mon, 04 May 2026 20:21:24 +0000</pubDate>
      <link>https://forem.com/asterview/how-to-deploy-multi-agent-systems-cross-cloudpython-576a</link>
      <guid>https://forem.com/asterview/how-to-deploy-multi-agent-systems-cross-cloudpython-576a</guid>
      <description>&lt;p&gt;&lt;strong&gt;Quick Answer:&lt;/strong&gt; To connect AI agents across different cloud environments, developers must replace synchronous HTTP with asynchronous brokers like &lt;strong&gt;Celery&lt;/strong&gt; and &lt;strong&gt;Redis&lt;/strong&gt;, externalize state memory, secure tool execution using the &lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt;, bypass strict NAT firewalls via &lt;strong&gt;Pilot Protocol&lt;/strong&gt; transport, and trace distributed workflows with &lt;strong&gt;OpenTelemetry&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Deploying a &lt;strong&gt;Multi-Agent System (MAS)&lt;/strong&gt; across distributed cloud environments instantly breaks standard local network assumptions. To maintain cross-cloud agent communication, engineers must abandon synchronous local testing patterns and implement asynchronous task delegation, externalized memory for stateless containers, decoupled tool execution, and decentralized peer-to-peer networking.&lt;/p&gt;

&lt;p&gt;Standard &lt;strong&gt;REST APIs&lt;/strong&gt; fail in production because &lt;strong&gt;Large Language Model (LLM)&lt;/strong&gt; inference introduces variable latency, causing synchronous HTTP requests to time out. Furthermore, when scaling an orchestrator agent on &lt;strong&gt;AWS&lt;/strong&gt; and specialized worker agents on &lt;strong&gt;GCP&lt;/strong&gt;, relying on standard TCP/IP routing leads to continuous IP churn and blocked connections at corporate &lt;strong&gt;NAT firewalls&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;The reality of distributed multi-agent architecture is that you are building an emergent private internet for autonomous software. Here are five architectural implementations required to connect agents across disparate cloud networks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Synchronous HTTP Will Throttle Your Agent Architecture
&lt;/h3&gt;

&lt;p&gt;When scaling from one agent to two, developers typically default to standard REST APIs where one agent sends a synchronous POST request to another. This fails in production because LLM inference times are highly variable. Generating a response or executing an unoptimized tool can take anywhere from ten to forty seconds. Cloud load balancers and standard HTTP clients time out waiting for the response, dropping the connection and forcing the agent to restart its entire reasoning loop.&lt;/p&gt;

&lt;p&gt;Cross-cloud agent communication must be asynchronous. Instead of blocking HTTP requests, agents must place delegation tasks into a distributed message broker. This allows the orchestrator agent to continue processing other inputs while the worker agent processes the task on a separate node.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Using Celery with Redis for async cross-cloud task delegation
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;celery&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Celery&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Celery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;agent_tasks&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;broker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;redis://external-broker-url:6379/0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@app.task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;delegate_to_research_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# This runs on the GCP worker node asynchronously
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;research_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Store result in external database for the AWS agent to fetch later
&lt;/span&gt;    &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;store_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;delegate_to_research_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

&lt;span class="c1"&gt;# On the AWS orchestrator node: trigger without blocking
&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;delegate_to_research_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze Q3 earnings&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;previous_context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Task dispatched with ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
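
&lt;p&gt;The other half of the pattern is the orchestrator fetching the result later. Here's a minimal sketch that polls a shared store; the in-memory dict and &lt;code&gt;fetch_result&lt;/code&gt; helper stand in for the external database and are not part of Celery:&lt;/p&gt;

```python
# Orchestrator side of the async pattern above: dispatch returns a task
# ID immediately, and the result is collected later from the shared
# store the worker writes into. In-memory dict is a stand-in for the
# external database; fetch_result is a hypothetical helper.

import time

results = {}   # stand-in for the external, globally accessible store

def fetch_result(task_id: str, timeout: float = 1.0, interval: float = 0.01):
    """Poll the shared store until the worker publishes the result."""
    deadline = time.monotonic() + timeout
    while deadline > time.monotonic():
        if task_id in results:
            return results[task_id]
        time.sleep(interval)
    raise TimeoutError(f"no result for {task_id}")

# Worker publishes (normally from the GCP node):
results["task-42"] = {"summary": "Q3 earnings analyzed"}

# Orchestrator collects without ever holding an open HTTP connection:
print(fetch_result("task-42"))
```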



&lt;h3&gt;
  
  
  Ephemeral Containers Destroy Conversational State
&lt;/h3&gt;

&lt;p&gt;Agents running in auto-scaling cloud instances are ephemeral. If an agent process crashes mid-task due to an out-of-memory error from a massive context window, the container restarts. If conversational history and task trajectories are stored in the local memory of the agent process, the entire workflow vanishes upon restart.&lt;/p&gt;

&lt;p&gt;To survive node migrations, agent processes must be completely stateless. Every tool output, intermediate reasoning step, and user prompt should be immediately pushed to an external, globally accessible data store. Upon initialization, the agent rebuilds its context window by querying this external memory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Externalizing agent state to Redis
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;global-redis.internal&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6379&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_agent_thought&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step_data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Push the latest reasoning step to a list
&lt;/span&gt;    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rpush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_state:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;step_data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;rebuild_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Rebuild state if the container restarts
&lt;/span&gt;    &lt;span class="n"&gt;raw_steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lrange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_state:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;raw_steps&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Managing Tool Execution Across Network Boundaries
&lt;/h3&gt;

&lt;p&gt;Hardcoding API keys and database connection strings into agent logic creates massive security vulnerabilities on untrusted cloud virtual machines. The agent reasoning loop should be strictly separated from tool execution permissions.&lt;/p&gt;

&lt;p&gt;The Model Context Protocol is the emerging standard for this decoupling. By wrapping internal databases in an MCP server, you dictate exactly which data the agent can interact with through standardized JSON-RPC schemas. The cloud agent requests tool execution, the secure MCP server performs it, and the autonomous model never directly touches raw infrastructure credentials.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Connecting an agent to a secure MCP server across the network
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StdioServerParameters&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.client.stdio&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stdio_client&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_secure_tool&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# The server parameters define the connection to the secure tool environment
&lt;/span&gt;    &lt;span class="n"&gt;server_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StdioServerParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;secure_mcp_server.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;stdio_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;as &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

            &lt;span class="c1"&gt;# The agent discovers available tools dynamically
&lt;/span&gt;            &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_tools&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

            &lt;span class="c1"&gt;# The agent executes the tool without seeing the underlying credentials
&lt;/span&gt;            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query_internal_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Q3_sales&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;query_secure_tool&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Overcoming IP Churn and NAT Firewalls for Direct Transport
&lt;/h3&gt;

&lt;p&gt;While the Model Context Protocol formats tool requests, it assumes the underlying network is already routable. Cloud containers face continuous IP churn, and enterprise networks sit behind strict NAT firewalls. Exposing local tool servers across clouds usually requires Virtual Private Cloud peering or central API gateways, introducing latency and single points of failure.&lt;/p&gt;

&lt;p&gt;Solving this transport problem means giving agents persistent cryptographic identities, which is what Pilot Protocol does. Instead of binding communication to fragile physical IPs, this userspace overlay network assigns each node a permanent 48-bit virtual address mathematically bound to an Ed25519 keypair. The pure-Go daemon uses automated UDP hole-punching to traverse strict firewalls and performs X25519 Elliptic Curve Diffie-Hellman key exchanges. This lets an orchestrator on AWS talk directly to a worker on a corporate network without reverse proxies.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install the pure-Go userspace network stack&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://pilotprotocol.network/install.sh | sh

&lt;span class="c"&gt;# Initialize the daemon on the local secure machine (Node A)&lt;/span&gt;
pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; secure-mcp-tool

&lt;span class="c"&gt;# Initialize the daemon on the cloud VPS agent (Node B)&lt;/span&gt;
pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; cloud-worker-agent

&lt;span class="c"&gt;# Node B can now route directly to Node A bypassing the NAT&lt;/span&gt;
&lt;span class="c"&gt;# utilizing the underlying TCP-over-UDP transport layer&lt;/span&gt;
pilotctl connect secure-mcp-tool &lt;span class="nt"&gt;--message&lt;/span&gt; &lt;span class="s1"&gt;'{"jsonrpc": "2.0", "method": "call_tool"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
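&lt;p&gt;The commands above hide the identity scheme itself. The derivation below is purely illustrative (Pilot Protocol's actual algorithm isn't documented here): one plausible way to bind a 48-bit virtual address to a keypair is to truncate a hash of the Ed25519 public key, so the address is stable for the life of the key no matter which physical IP the node currently holds.&lt;/p&gt;

```python
# Illustrative sketch only: derive a 48-bit virtual address from a
# public key by truncating its hash. This is a plausible scheme, not
# Pilot Protocol's documented algorithm.
import hashlib

def derive_virtual_address(public_key: bytes) -> str:
    # Hash the raw public key and keep the first 6 bytes (48 bits)
    digest = hashlib.sha256(public_key).digest()[:6]
    # Render like a MAC address for readability
    return ":".join(f"{b:02x}" for b in digest)

# The same key always maps to the same address, surviving IP churn
addr = derive_virtual_address(b"\x01" * 32)
print(addr)
```

&lt;p&gt;Because the address is a pure function of the key, a peer that knows your public key can verify you own the address by checking an Ed25519 signature against it, with no registry in the loop.&lt;/p&gt;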



&lt;h3&gt;
  
  
  Distributed Tracing is Mandatory for Agent Debugging
&lt;/h3&gt;

&lt;p&gt;When a cross-cloud multi-agent workflow fails, identifying the exact point of failure is difficult. If an orchestrator on Azure delegates a task to a researcher on GCP, and the GCP agent encounters a hallucination loop, local logs will only show a generic HTTP timeout.&lt;/p&gt;

&lt;p&gt;Implementing distributed tracing is non-negotiable for autonomous systems. Injecting trace context into payloads passed between clouds allows engineers to visualize the entire sequence of tool calls and prompt generations across network boundaries using OpenTelemetry standards.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Injecting OpenTelemetry trace IDs into cross-cloud payloads
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;opentelemetry&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;opentelemetry.propagate&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;inject&lt;/span&gt;

&lt;span class="n"&gt;tracer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_tracer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dispatch_task_to_peer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start_as_current_span&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cross_cloud_delegation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="c1"&gt;# Inject the current trace context into the headers or payload
&lt;/span&gt;        &lt;span class="nf"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Add the headers to the payload sent to the remote agent
&lt;/span&gt;        &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;

        &lt;span class="c1"&gt;# Standard request to the remote agent
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;peer.response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
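&lt;p&gt;The receiving agent has to do the mirror-image step: pull the propagated context back out before starting its own span (in real code that is &lt;code&gt;opentelemetry.propagate.extract&lt;/code&gt;). What actually travels in those headers is a W3C &lt;code&gt;traceparent&lt;/code&gt; string of the form &lt;code&gt;version-traceid-spanid-flags&lt;/code&gt;; here is a stdlib-only sketch of generating and validating one, with helper names of my own invention:&lt;/p&gt;

```python
# Stdlib-only sketch of the W3C traceparent header that OpenTelemetry's
# default propagator writes during inject(). Helper names are
# illustrative; production code should use opentelemetry.propagate.extract.
import re
import secrets

def make_traceparent() -> str:
    # version "00", 16-byte trace id, 8-byte parent span id, sampled flag "01"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"

TRACEPARENT_RE = re.compile(r"^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$")

def parse_traceparent(header: str) -> dict:
    # The receiving agent validates the header before trusting it
    if not TRACEPARENT_RE.match(header):
        raise ValueError(f"malformed traceparent: {header}")
    version, trace_id, span_id, flags = header.split("-")
    return {"trace_id": trace_id, "parent_span_id": span_id, "sampled": flags == "01"}
```

&lt;p&gt;On the GCP worker, parsing &lt;code&gt;payload["trace_context"]&lt;/code&gt; this way and starting a child span under the recovered trace ID is what stitches both halves of the delegation into a single trace in your backend.&lt;/p&gt;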



</description>
      <category>cloud</category>
      <category>ai</category>
      <category>agents</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Deploy Multi-Agent Systems Cross-Cloud [Python]</title>
      <dc:creator>William Baker </dc:creator>
      <pubDate>Mon, 04 May 2026 20:21:24 +0000</pubDate>
      <link>https://forem.com/asterview/how-to-deploy-multi-agent-systems-cross-cloudpython-4n7c</link>
      <guid>https://forem.com/asterview/how-to-deploy-multi-agent-systems-cross-cloudpython-4n7c</guid>
      <description>&lt;p&gt;&lt;strong&gt;Quick Answer:&lt;/strong&gt; To connect AI agents across different cloud environments, developers must replace synchronous HTTP with asynchronous brokers like &lt;strong&gt;Celery&lt;/strong&gt; and &lt;strong&gt;Redis&lt;/strong&gt;, externalize state memory, secure tool execution using the &lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt;, bypass strict NAT firewalls via &lt;strong&gt;Pilot Protocol&lt;/strong&gt; transport, and trace distributed workflows with &lt;strong&gt;OpenTelemetry&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Deploying a &lt;strong&gt;Multi-Agent System (MAS)&lt;/strong&gt; across distributed cloud environments instantly breaks standard local network assumptions. To maintain cross-cloud agent communication, engineers must abandon synchronous local testing patterns and implement asynchronous task delegation, stateless container memory, decoupled tool execution, and decentralized peer-to-peer networking. &lt;/p&gt;

&lt;p&gt;Standard &lt;strong&gt;REST APIs&lt;/strong&gt; fail in production because &lt;strong&gt;Large Language Model (LLM)&lt;/strong&gt; inference introduces variable latency, causing synchronous HTTP requests to time out. Furthermore, when scaling an orchestrator agent on &lt;strong&gt;AWS&lt;/strong&gt; and specialized worker agents on &lt;strong&gt;GCP&lt;/strong&gt;, relying on standard TCP/IP routing exposes you to continuous IP churn and blocked connections at corporate &lt;strong&gt;NAT firewalls&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;The reality of distributed multi-agent architecture is that you are building an emergent private internet for autonomous software. Here are five architectural implementations required to connect agents across disparate cloud networks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Synchronous HTTP Will Throttle Your Agent Architecture
&lt;/h3&gt;

&lt;p&gt;When scaling from one agent to two, developers typically default to standard REST APIs where one agent sends a synchronous POST request to another. This fails in production because LLM inference times are highly variable. Generating a response or executing an unoptimized tool can take anywhere from ten to forty seconds. Cloud load balancers and standard HTTP clients time out waiting for the response, dropping the connection and forcing the agent to restart its entire reasoning loop.&lt;/p&gt;

&lt;p&gt;Cross-cloud agent communication must be asynchronous. Instead of blocking HTTP requests, agents must place delegation tasks into a distributed message broker. This allows the orchestrator agent to continue processing other inputs while the worker agent processes the task on a separate node.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Using Celery with Redis for async cross-cloud task delegation
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;celery&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Celery&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Celery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;agent_tasks&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;broker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;redis://external-broker-url:6379/0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@app.task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;delegate_to_research_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# This runs on the GCP worker node asynchronously
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;research_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Store result in external database for the AWS agent to fetch later
&lt;/span&gt;    &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;store_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;delegate_to_research_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

&lt;span class="c1"&gt;# On the AWS orchestrator node: trigger without blocking
&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;delegate_to_research_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze Q3 earnings&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;previous_context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Task dispatched with ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Ephemeral Containers Destroy Conversational State
&lt;/h3&gt;

&lt;p&gt;Agents running in auto-scaling cloud instances are ephemeral. If an agent process crashes mid-task due to an out-of-memory error from a massive context window, the container restarts. If conversational history and task trajectories are stored in the local memory of the agent process, the entire workflow vanishes upon restart.&lt;/p&gt;

&lt;p&gt;To survive node migrations, agent processes must be completely stateless. Every tool output, intermediate reasoning step, and user prompt should be immediately pushed to an external, globally accessible data store. Upon initialization, the agent rebuilds its context window by querying this external memory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Externalizing agent state to Redis
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;global-redis.internal&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6379&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_agent_thought&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step_data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Push the latest reasoning step to a list
&lt;/span&gt;    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rpush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_state:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;step_data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;rebuild_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Rebuild state if the container restarts
&lt;/span&gt;    &lt;span class="n"&gt;raw_steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lrange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_state:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;raw_steps&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
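&lt;p&gt;One practical wrinkle with the replay approach above: an append-only list grows without bound, and a long-running session can rebuild a context larger than the model accepts. A minimal sketch of capping the rebuilt history to a character budget while keeping the newest steps (the budget and helper name are illustrative assumptions, not part of the setup above):&lt;/p&gt;

```python
# Sketch: cap the rebuilt context so a long-running session cannot
# overflow the model's context window. max_chars is an illustrative
# stand-in for a real token budget.
import json

def trim_context(raw_steps, max_chars=8000):
    kept, used = [], 0
    # Walk newest-to-oldest, keeping steps until the budget is spent
    for step in reversed(raw_steps):
        if used + len(step) > max_chars:
            break
        kept.append(json.loads(step))
        used += len(step)
    kept.reverse()  # restore chronological order for the prompt
    return kept
```

&lt;p&gt;Dropping the oldest steps is the crudest policy; summarizing them into a single synthetic step before trimming preserves more of the trajectory at the same budget.&lt;/p&gt;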



&lt;h3&gt;
  
  
  Managing Tool Execution Across Network Boundaries
&lt;/h3&gt;

&lt;p&gt;Hardcoding API keys and database connection strings into agent logic creates massive security vulnerabilities on untrusted cloud virtual machines. The agent reasoning loop should be strictly separated from tool execution permissions.&lt;/p&gt;

&lt;p&gt;The Model Context Protocol is the emerging standard for this decoupling. By wrapping internal databases in an MCP server, you dictate exactly which data the agent can interact with through standardized JSON-RPC schemas. The cloud agent requests tool execution, the secure MCP server performs it, and the autonomous model never directly touches raw infrastructure credentials.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Connecting an agent to a secure MCP server across the network
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StdioServerParameters&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.client.stdio&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stdio_client&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_secure_tool&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# The server parameters define the connection to the secure tool environment
&lt;/span&gt;    &lt;span class="n"&gt;server_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StdioServerParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;secure_mcp_server.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;stdio_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;as &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

            &lt;span class="c1"&gt;# The agent discovers available tools dynamically
&lt;/span&gt;            &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_tools&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

            &lt;span class="c1"&gt;# The agent executes the tool without seeing the underlying credentials
&lt;/span&gt;            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query_internal_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Q3_sales&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;query_secure_tool&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Overcoming IP Churn and NAT Firewalls for Direct Transport
&lt;/h3&gt;

&lt;p&gt;While the Model Context Protocol formats tool requests, it assumes the underlying network is already routable. Cloud containers face continuous IP churn, and enterprise networks sit behind strict NAT firewalls. Exposing local tool servers across clouds usually means Virtual Private Cloud peering or a central API gateway, which adds latency and reintroduces a single point of failure.&lt;/p&gt;

&lt;p&gt;One answer to this transport problem is to give each agent a persistent cryptographic identity via Pilot Protocol. Instead of binding communication to fragile physical IPs, this userspace overlay network assigns each node a permanent 48-bit virtual address mathematically bound to an Ed25519 keypair. The pure-Go daemon punches through strict firewalls with automated UDP hole-punching and negotiates session keys with X25519 Elliptic Curve Diffie-Hellman exchanges. The result: an orchestrator on AWS talks directly to a worker on a corporate network, with no reverse proxy in between.&lt;/p&gt;
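&lt;p&gt;The post doesn't specify how Pilot Protocol actually derives the 48-bit address from the keypair, so the following is only an illustrative sketch of the general "address bound to key" idea: truncate a hash of the public key so the address can be verified against the key but never needs central allocation. The &lt;code&gt;virtual_address&lt;/code&gt; function and the truncated-SHA-256 scheme are assumptions, not the protocol's real derivation.&lt;/p&gt;

```python
# Illustrative only: derive a 48-bit virtual address by truncating a
# SHA-256 digest of an Ed25519 public key. Pilot Protocol's actual
# derivation may differ; this just shows why the address needs no
# central allocator -- anyone can recompute it from the key.
import hashlib

def virtual_address(ed25519_public_key: bytes) -> str:
    digest = hashlib.sha256(ed25519_public_key).digest()
    addr = digest[:6]  # keep 48 bits
    return ":".join(f"{b:02x}" for b in addr)

# RFC 8032 test-vector public key, used here as sample input
pub = bytes.fromhex(
    "3d4017c3e843895a92b70aa74d1b7ebc9c982ccf2ec4968cc0cd55f12af4660c"
)
print(virtual_address(pub))
```

&lt;p&gt;Because the address is a pure function of the public key, a peer that completes the X25519 handshake can check that the far end really owns the address it claims.&lt;/p&gt;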

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install the pure-Go userspace network stack&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://pilotprotocol.network/install.sh | sh

&lt;span class="c"&gt;# Initialize the daemon on the local secure machine (Node A)&lt;/span&gt;
pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; secure-mcp-tool

&lt;span class="c"&gt;# Initialize the daemon on the cloud VPS agent (Node B)&lt;/span&gt;
pilotctl daemon start &lt;span class="nt"&gt;--hostname&lt;/span&gt; cloud-worker-agent

&lt;span class="c"&gt;# Node B can now route directly to Node A bypassing the NAT&lt;/span&gt;
&lt;span class="c"&gt;# utilizing the underlying TCP-over-UDP transport layer&lt;/span&gt;
pilotctl connect secure-mcp-tool &lt;span class="nt"&gt;--message&lt;/span&gt; &lt;span class="s1"&gt;'{"jsonrpc": "2.0", "method": "call_tool"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Distributed Tracing is Mandatory for Agent Debugging
&lt;/h3&gt;

&lt;p&gt;When a cross-cloud multi-agent workflow fails, identifying the exact point of failure is difficult. If an orchestrator on Azure delegates a task to a researcher on GCP, and the GCP agent falls into a hallucination loop, the orchestrator's local logs show nothing but a generic HTTP timeout.&lt;/p&gt;

&lt;p&gt;Distributed tracing is non-negotiable for autonomous systems. Injecting trace context into every payload that crosses a cloud boundary lets engineers reconstruct the full sequence of tool calls and prompt generations across network boundaries using OpenTelemetry standards.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Injecting OpenTelemetry trace IDs into cross-cloud payloads
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;opentelemetry&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;opentelemetry.propagate&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;inject&lt;/span&gt;

&lt;span class="n"&gt;tracer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_tracer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dispatch_task_to_peer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start_as_current_span&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cross_cloud_delegation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="c1"&gt;# Inject the current trace context into the headers or payload
&lt;/span&gt;        &lt;span class="nf"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Add the headers to the payload sent to the remote agent
&lt;/span&gt;        &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;

        &lt;span class="c1"&gt;# Standard request to the remote agent
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;peer.response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>cloud</category>
      <category>ai</category>
      <category>agents</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
