<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: port</title>
    <description>The latest articles on Forem by port (@port).</description>
    <link>https://forem.com/port</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2459712%2Fd03984f1-d806-4886-ad9d-55a9e1c42928.jpeg</url>
      <title>Forem: port</title>
      <link>https://forem.com/port</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/port"/>
    <language>en</language>
    <item>
      <title>Simple just works: how i built puddleswap</title>
      <dc:creator>port</dc:creator>
      <pubDate>Wed, 20 May 2026 11:18:48 +0000</pubDate>
      <link>https://forem.com/port/simple-just-works-how-i-built-puddleswap-gmi</link>
      <guid>https://forem.com/port/simple-just-works-how-i-built-puddleswap-gmi</guid>
      <description>&lt;p&gt;Any problem yields to enough complexity.&lt;/p&gt;

&lt;p&gt;I caught myself almost doing exactly that on puddleswap. Here's how that went, plus the gut-check I run now before writing anything clever. If you ever feel yourself overengineering things, this is for you.&lt;/p&gt;

&lt;p&gt;I was at a &lt;a href="https://blitz.devnads.com" rel="noopener noreferrer"&gt;Monad Blitz&lt;/a&gt; event, if I am not mistaken it was the one in Ankara, and I was watching everyone around me hack on cool stuff while I sat in the corner answering their questions. I mean that's my job but it felt weird not building stuff.&lt;/p&gt;

&lt;p&gt;So at some point I figured I should just build something (while talking to people at the same time lol). Something simple enough that the brag would be how little it took, not how clever it was.&lt;/p&gt;

&lt;p&gt;That's how puddleswap happened. A no-bs DEX on Monad testnet, the kind a weekend buys you.&lt;/p&gt;

&lt;p&gt;Going in, I wanted the fewest moving parts I could get away with. The thing I'd be most proud of would be how little there was to maintain.&lt;/p&gt;

&lt;p&gt;Most of the actual work was done by an AI agent. It wrote the React frontend, deployed the contracts, and put together the swap UI. The contracts are stock Uniswap V2, audited a thousand times over the years (centuries in web3) and not something I wanted to fork. The frontend is Vite plus React with no backend anywhere. The swap accepts real Circle USDC, a mock USDT we deployed for testnet liquidity, and WMON. A small rebalancer service keeps the price pegs roughly honest.&lt;/p&gt;

&lt;p&gt;It's live at &lt;a href="https://app.puddleswap.org/" rel="noopener noreferrer"&gt;app.puddleswap.org&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The build was mostly uneventful. The agent did its thing, I reviewed diffs, we iterated. What I want to talk about is the one decision I almost got wrong: the routing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The thing I almost overengineered
&lt;/h2&gt;

&lt;p&gt;Standard answer for "how does a DEX UI route swaps" is a graph algorithm. You have N tokens and M pools, build the liquidity graph, run shortest-path weighted by output amount, return the best route. 1inch and Matcha both work this way and every aggregator article online tells you to do the same, so I started writing it.&lt;/p&gt;

&lt;p&gt;Then I looked at my actual data.&lt;/p&gt;

&lt;p&gt;Three "core" tokens: USDC, USDT, WMON. Maybe ten pools, every one of them touching at least one core. I was writing a graph algorithm to solve a problem I didn't have.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fsimple-just-works-how-i-built-puddleswap%2Fstar-routing.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fsimple-just-works-how-i-built-puddleswap%2Fstar-routing.svg" title="the whole graph: A and B connect through three core hubs, plus a direct edge when one exists" alt="Star routing diagram with A on the left, B on the right, and three core hubs USDC, USDT and WMON in the middle" width="1200" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So I deleted it and wrote this instead (s/o to @danielvf for the idea + the initial PRD).&lt;/p&gt;

&lt;h2&gt;
  
  
  The enumeration
&lt;/h2&gt;

&lt;p&gt;For any swap A → B, enumerate every plausible route through the hubs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Direct: &lt;code&gt;A → B&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Through one hub: &lt;code&gt;A → USDC → B&lt;/code&gt;, &lt;code&gt;A → USDT → B&lt;/code&gt;, &lt;code&gt;A → WMON → B&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Through two hubs: &lt;code&gt;A → USDC → USDT → B&lt;/code&gt;, &lt;code&gt;A → USDC → WMON → B&lt;/code&gt;, &lt;code&gt;A → USDT → WMON → B&lt;/code&gt;, and reverses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's at most ten candidate paths. Send all ten quote requests in one multicall, pick the path with the highest output, swap on that.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;routes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;buildCandidateRoutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tokenIn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tokenOut&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cores&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;publicClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;multicall&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;contracts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;routes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;address&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;abi&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;routerAbi&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;functionName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;getAmountsOut&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;amountIn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;})),&lt;/span&gt;
  &lt;span class="na"&gt;allowFailure&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;best&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;selectBestQuote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The whole router is around 50 lines. It builds the candidate list, dedups it, and returns whichever path the multicall said had the highest quote.&lt;/p&gt;
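
&lt;p&gt;A minimal sketch of that shape in TypeScript, under stated assumptions: the function names &lt;code&gt;buildCandidateRoutes&lt;/code&gt; and &lt;code&gt;selectBestQuote&lt;/code&gt; come from the snippet above, but the bodies, the &lt;code&gt;QuoteResult&lt;/code&gt; type, the extra &lt;code&gt;routes&lt;/code&gt; argument, and the lowercase address comparison are illustrative and not copied from the actual routing.ts.&lt;/p&gt;

```typescript
// Illustrative sketch only; names and shapes below are assumptions.
type Path = string[];

// Assumed result shape, modeled on a multicall made with allowFailure: true.
type QuoteResult =
  | { status: "success"; result: readonly bigint[] }
  | { status: "failure"; error: Error };

// Build the direct, one-hub, and two-hub paths from tokenIn to tokenOut.
function buildCandidateRoutes(tokenIn: string, tokenOut: string, cores: string[]): Path[] {
  // A hub that is also an endpoint would create a loop, so drop it.
  const ends = [tokenIn.toLowerCase(), tokenOut.toLowerCase()];
  const hubs = cores.filter((c) => !ends.includes(c.toLowerCase()));

  const candidates: Path[] = [[tokenIn, tokenOut]];
  for (const h of hubs) {
    candidates.push([tokenIn, h, tokenOut]);
  }
  for (const h1 of hubs) {
    for (const h2 of hubs) {
      if (h1 !== h2) candidates.push([tokenIn, h1, h2, tokenOut]);
    }
  }

  // Dedup in case the cores list itself contains repeats.
  const seen = new Set([] as string[]);
  return candidates.filter((p) => {
    const key = p.join(">").toLowerCase();
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Pick the path whose final hop returns the largest amount; skip failed quotes.
function selectBestQuote(
  routes: Path[],
  results: QuoteResult[],
): { path: Path; amountOut: bigint } | null {
  let best: { path: Path; amountOut: bigint } | null = null;
  for (const [i, r] of results.entries()) {
    if (r.status !== "success") continue;
    const path = routes[i];
    // getAmountsOut returns one amount per token in the path; the last is the output.
    const amountOut = r.result[r.result.length - 1];
    if (path === undefined || amountOut === undefined) continue;
    if (best === null || amountOut > best.amountOut) {
      best = { path, amountOut };
    }
  }
  return best;
}
```

&lt;p&gt;With three hubs and two non-hub endpoints this produces the ten paths listed earlier. When one endpoint is itself a hub, the list shrinks to five, and a quote that reverts (for example because a pool doesn't exist) is simply skipped.&lt;/p&gt;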

&lt;h2&gt;
  
  
  Why this matters (and not just for DEXes)
&lt;/h2&gt;

&lt;p&gt;I'm not saying graph routing is wrong. For a mainnet aggregator routing across thousands of pools and dozens of DEXes, it's the right tool. I'm saying I wasn't building that.&lt;/p&gt;

&lt;p&gt;Here's the lesson: &lt;strong&gt;a lot of code over-solves its problem.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You see it everywhere once you start looking. Sorting algorithm where the data is always ≤ 10 items (insertion sort is fine, stop). Caching layer where the data hits the database twice a day (the database is already a cache). Pub/sub where there's one publisher and one subscriber (call the function directly).&lt;/p&gt;

&lt;p&gt;The smart-looking solution is usually someone solving the &lt;em&gt;general&lt;/em&gt; problem, because that's what they were trained on. The general problem is harder, more interesting, and absolutely useless to you if your constraints are narrower.&lt;/p&gt;

&lt;p&gt;On puddleswap, my constraints are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One chain, one DEX, mine&lt;/li&gt;
&lt;li&gt;Three hub tokens I control&lt;/li&gt;
&lt;li&gt;Operator-maintained liquidity&lt;/li&gt;
&lt;li&gt;Test users with low gas budgets&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Within those constraints, enumeration is exhaustive over every route that matters (direct, one hub, or two hubs), faster than graph traversal (one batched RPC, not N round-trips), and a fraction of the code. The day any of those constraints stops holding is the day I'll bother writing the graph router.&lt;/p&gt;

&lt;h2&gt;
  
  
  When this breaks
&lt;/h2&gt;

&lt;p&gt;I'd be lying if I said this scales. Obvious failure modes are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Exotic-to-exotic pools that bypass the hubs entirely. Enumeration misses them.&lt;/li&gt;
&lt;li&gt;A hub runs dry of liquidity on one side. Router still checks routes through it and eats a bad quote.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The end
&lt;/h2&gt;

&lt;p&gt;If you're building on Monad testnet and need swaps for your tests, puddleswap is live at &lt;a href="https://app.puddleswap.org/" rel="noopener noreferrer"&gt;app.puddleswap.org&lt;/a&gt;. The router is at &lt;a href="https://github.com/portdeveloper/puddleswap/blob/main/web/src/lib/routing.ts" rel="noopener noreferrer"&gt;puddleswap/web/src/lib/routing.ts&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And next time you reach for the complex solution, check whether your problem actually needs it. It probably doesn't.&lt;/p&gt;

&lt;p&gt;And maybe ask your agent if there are any easier solutions to the problem you are trying to solve.&lt;/p&gt;

&lt;p&gt;Questions?&lt;/p&gt;

</description>
      <category>puddleswap</category>
      <category>monad</category>
      <category>dex</category>
    </item>
    <item>
      <title>You don't know how to vibe-code</title>
      <dc:creator>port</dc:creator>
      <pubDate>Sun, 17 May 2026 12:30:08 +0000</pubDate>
      <link>https://forem.com/port/you-dont-know-how-to-vibe-code-9m9</link>
      <guid>https://forem.com/port/you-dont-know-how-to-vibe-code-9m9</guid>
      <description>&lt;p&gt;It's 2026. We have AGI (or at least the ability to code almost anything thanks to models like Opus 4.5 from Anthropic and GPT 5.2 from OpenAI).&lt;/p&gt;

&lt;p&gt;But there's one problem. What you create in minutes creates problems you spend hours trying to fix. And if you're unlucky, you end up with a spaghetti codebase that no LLM can untangle. You no longer understand the code. It doesn't even make sense to read it anymore.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So, what are you doing wrong and what could you do better, and how do some people get everything right when they are vibe-coding?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Honestly, vibe coding kinda gave people the wrong impression about using LLMs to write code. Somehow everyone ended up thinking "yeah i can do this with ONE PROMPT, without EVER LOOKING AT THE CODE".&lt;/p&gt;

&lt;p&gt;That just won't work, unless you consider this good work:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fgeneric-vibeslop.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fgeneric-vibeslop.jpg" title="generic vibeslop, with lots of ai-purple" alt="Generic AI-styled application screenshot" width="800" height="444"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the code behind it is even worse. The AI's knowledge is months old, maybe a year. It doesn't know your codebase. It doesn't know what "done" means.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Alright, here's how I actually vibe-code. Or rather, how I use my current favorite tool (claude code) to build real projects.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I'm going to walk you through how I built &lt;a href="https://execevents.xyz/" rel="noopener noreferrer"&gt;execevents.xyz&lt;/a&gt;, a real-time execution visualizer for Monad. Blocks race across the screen as they go through consensus. Transactions stream in live. You can see state changes, call traces, gas usage.&lt;/p&gt;

&lt;p&gt;A short glance at execevents.xyz&lt;/p&gt;

&lt;p&gt;This isn't a toy project. Under the hood, execevents connects to Monad's Execution Events API—a Rust service that reads blockchain data directly from shared memory, HFT-style. We're talking sub-millisecond latency for real-time block and transaction data. Building something that interfaces with infrastructure this performant would normally require deep systems knowledge.&lt;/p&gt;

&lt;p&gt;But here's the thing: I built this in HOURS, not days, not weeks. Using Claude Code and the methodology below, anyone can build high-performance applications on Monad without being a systems engineer or even a regular developer.&lt;/p&gt;

&lt;p&gt;Below is my methodology for vibe-coding, or simply how I code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Think about the end goal
&lt;/h2&gt;

&lt;p&gt;Visualize the most basic version of what you want to build. I usually ask claude something like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I read about execution events from Monad docs and I want to build an app showing how to use them. Here is the page about execution events: (i paste the markdown here) Do not start building until I confirm. Tell me how you are planning to build this. Then ask me to confirm. Also, ask me any questions you have. Our first goal is to reach to a basic MVP.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fclaude-plan.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fclaude-plan.jpg" title="possible answer from my besto-frendo, kraudu kodu-san" alt="Claude Code implementation plan screenshot" width="800" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Above is the answer I got from claude. Notice how it told me exactly what it's going to do. I can now visualize what I am gonna be getting and can direct the project better. This is the point where I want to stop and think. If everything looks OK, I move on to the questions claude asks and start answering them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Much like real coding, you want to spend time thinking about the code rather than writing it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Ftime-spent.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Ftime-spent.jpg" title="here is how I would suggest you to spend time" alt="Suggested distribution of time spent while building with AI coding tools" width="800" height="700"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You might do several iterations before you even tell claude to build. In every message, I ask it not to build until I like the implementation plan.&lt;/p&gt;

&lt;p&gt;I also use plan mode a lot. It is the new way of telling claude to ask you questions, and it just works really well!&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Build the MVP, then use it
&lt;/h2&gt;

&lt;p&gt;Then, ask claude to start building. When it finishes, test it. This is the part people LOVE skipping, not knowing that the problems that arise later stem from skipping it. When you find an issue, ask claude to fix it. After it fixes the issue, go back and find another problem to fix. Do this until there are no issues left.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fvibing-cycle.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fvibing-cycle.jpg" title="the cycle of vibing" alt="The cycle of prompting, testing, finding issues, and fixing them" width="800" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Iterate with small, focused prompts
&lt;/h2&gt;

&lt;p&gt;This is where most people mess up. They find five things wrong and try to fix them all in one massive prompt.&lt;/p&gt;

&lt;p&gt;Don't do that.&lt;/p&gt;

&lt;p&gt;Every time you find something broken, fix just that one thing. Here's what my prompts actually looked like:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt 1:&lt;/strong&gt; "The TPS calculation is wrong. It's counting blocks that arrive in batches over WebSocket. Make it only count consecutive block numbers."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt 2:&lt;/strong&gt; "This doesn't work on mobile. Add a responsive layout with a bottom sheet for block details."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt 3:&lt;/strong&gt; "The block state transitions are too abrupt. Add CSS transitions so blocks slide smoothly between states."&lt;/p&gt;

&lt;p&gt;Each prompt is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Specific -&amp;gt; I'm telling it exactly what's wrong&lt;/li&gt;
&lt;li&gt;Small -&amp;gt; targeting one thing&lt;/li&gt;
&lt;li&gt;Reviewable -&amp;gt; I can read the diff and understand what changed&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 4: Read the Code
&lt;/h2&gt;

&lt;p&gt;Or at least, take a quick glance at it. Every time Claude makes a change, I read the diff. Not because I don't trust it, but because I need to understand what I'm shipping.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fclaude-ascii.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fclaude-ascii.jpg" title="claude is surprisingly good at creating ascii stuff" alt="Claude Code creating ASCII art" width="800" height="667"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reading doesn't mean auditing every line. It usually means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Skimming the diff&lt;/li&gt;
&lt;li&gt;Understanding the approach&lt;/li&gt;
&lt;li&gt;Asking yourself "does this make sense?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By reading the code, you will catch mistakes, learn, and stay in control. The moment you stop understanding your codebase is the moment you can't fix it anymore. Do not turn your project into a mess you can't make sense of.&lt;/p&gt;

&lt;p&gt;And if you don't understand something in the code, you can open a new terminal window and ask claude code to explain it to you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Building execevents taught me things I wouldn't have learned from tutorials.&lt;/p&gt;

&lt;p&gt;On the systems side: I now understand how Monad's Execution Events work at a low level, how the Rust API pulls data from shared memory, why certain event types arrive in batches, and how to handle the timing edge cases that come with real-time blockchain data. Claude didn't just write code; it explained the architecture as we built it. When the TPS calculation was wrong, debugging it meant understanding WebSocket message ordering and block finality.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fme-with-claude.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-dont-know-how-to-vibe-code%2Fme-with-claude.png" title="me with claude" alt="Me with Claude" width="236" height="236"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the vibe-coding side: I learned that the quality of your output directly reflects the quality of your iteration loop. The people who fail at vibe-coding aren't bad at prompting, they're bad at testing and reading diffs. They skip the boring parts.&lt;/p&gt;

&lt;p&gt;The real unlock is this: with the right methodology, AI tools let you punch above your weight. You can build performant, production-grade applications that interface with serious infrastructure, even if you've never written Rust or worked with shared memory systems. The barrier isn't coding ability anymore. It's knowing how to guide the process.&lt;/p&gt;

&lt;p&gt;Now, go.&lt;/p&gt;

&lt;p&gt;And do magic, for we live in a magical era.&lt;/p&gt;

</description>
      <category>vibecoding</category>
      <category>claudecode</category>
      <category>monad</category>
    </item>
    <item>
      <title>You are prompting GPT 5.5 wrong.</title>
      <dc:creator>port</dc:creator>
      <pubDate>Sun, 17 May 2026 12:29:54 +0000</pubDate>
      <link>https://forem.com/port/you-are-prompting-gpt-55-wrong-505n</link>
      <guid>https://forem.com/port/you-are-prompting-gpt-55-wrong-505n</guid>
      <description>&lt;p&gt;Source: OpenAI.&lt;/p&gt;

&lt;p&gt;Prompting GPT 5.5 is A LOT different from how you prompted any model before. And GPT 5.5 can't even write good prompts for itself! See the screenshot below from &lt;a class="mentioned-user" href="https://dev.to/victortaelin"&gt;@victortaelin&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-are-prompting-gpt-5-5-wrong%2Fvictor-taelin-prompt.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-are-prompting-gpt-5-5-wrong%2Fvictor-taelin-prompt.jpg" title="btw def follow Taelin!" alt="Screenshot of a Victor Taelin post about GPT 5.5 prompting"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, in this short article, I will be talking about how to create good prompts for GPT 5.5 so that you can do your work better &amp;amp; faster.&lt;/p&gt;

&lt;p&gt;Btw before we go any further, this guide is for using GPT 5.5 inside Codex.&lt;/p&gt;

&lt;p&gt;So here's what changed. Older models needed you to walk them through the steps. First do this, then check that, then call this tool. GPT 5.5 reasons more efficiently and that kind of prompting actively makes it worse. It narrows the search space &amp;amp; you end up with mechanical answers.&lt;/p&gt;

&lt;p&gt;The fix is the opposite of what people are doing. Describe the destination, not the route. Let the model figure out the path.&lt;/p&gt;

&lt;p&gt;I've been changing how I prompt since 5.5 dropped. Here are the 5 moves with the highest hit rate, with examples you can paste in (or modify) directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Lead with the outcome
&lt;/h2&gt;

&lt;p&gt;Stop telling the model HOW to solve the problem. Instead, tell it what the result should look like.&lt;/p&gt;

&lt;p&gt;(btw the full examples are at the end)&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Resolve the customer's issue end to end.

Success means:
- the eligibility decision is made from the available policy and account data
- any allowed action is completed before responding
- the final answer includes completed_actions, customer_message, and blockers
- if evidence is missing, ask for the smallest missing field
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. Kill the preamble
&lt;/h2&gt;

&lt;p&gt;Codex loves to narrate. "I'll start by examining the file structure." "Let me first check the existing implementation." "Now I'll proceed to make the changes."&lt;/p&gt;

&lt;p&gt;You don't need any of this. You can see what it's doing. The preamble is noise &amp;amp; it eats latency before any real work happens.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Skip preambles. Do not narrate what you are about to do before doing it. Do not announce tool calls. Do not end with "Let me know if you'd like adjustments" or "Feel free to ask if you have questions."

When you finish, report what changed in 2-4 lines. File paths, what was modified, anything I need to know to use the change. That's it.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3. Bias to action, finish what you start
&lt;/h2&gt;

&lt;p&gt;Default Codex behavior on a hard task is to surface a plan and stop. We don't want that. We want action. Here's how to get it:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Bias to action. If the request is clear and the next step is reversible, just do it. Do not stop at analysis, do not stop at a plan, do not stop after the first file change.

Persist until the task is fully handled end to end in this turn:
- carry changes through implementation, verification, and a clear summary
- if you hit a blocker, try one more reasonable approach before stopping
- only stop early if the next step is irreversible, destructive, or genuinely ambiguous

Unless I explicitly ask for a plan or a question, assume I want code shipped.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(btw this is from the OpenAI Codex starter prompt)&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Read in parallel, not one file at a time
&lt;/h2&gt;

&lt;p&gt;Watch Codex on a real task. It reads package.json, waits, reads src/index.ts, waits, reads src/utils.ts, aaaand waits some more... Use this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;When you need to read multiple files, read them in parallel in a single batch, not sequentially.

Workflow:
1. Plan all the files you need before reading any
2. Issue one parallel batch of reads
3. Analyze together
4. Only do another batch if new unpredictable reads come up

Same for searches. If you need to grep for 3 patterns, run 3 searches in parallel. Sequential reads are only justified when one result genuinely determines the next.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5. Make it actually verify
&lt;/h2&gt;

&lt;p&gt;Run validation and tests. Don't trust "this should work":&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;After making changes, run the relevant validation:
- targeted tests for the behavior you changed
- typecheck and lint
- build, if the change touches anything build-time sensitive
- a quick smoke test on the running app if it's user-facing

If validation fails, fix it before reporting done. If validation can't run in this environment, say so &amp;amp; describe the next best check I can run myself.

"Done" means verified, not "code is written."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here are 3 simple rules to follow when prompting GPT 5.5:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add a completeness rule&lt;/li&gt;
&lt;li&gt;Add a stop condition&lt;/li&gt;
&lt;li&gt;Force verification&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-are-prompting-gpt-5-5-wrong%2Ffour-rules.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fyou-are-prompting-gpt-5-5-wrong%2Ffour-rules.jpg" title="the four rules" alt="Screenshot summarizing the prompting rules"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are full examples you can adjust to your use case:&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Building a feature
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Build [feature]. Done = it works in the running app, has at least one test for the new behavior, types and lint clean, diff scoped to this change only.

Stop &amp;amp; ask only if: the next step is destructive, requirements are genuinely ambiguous, or you'd need to expand scope to 3+ unrelated files. Otherwise just ship it.

No preamble. Don't narrate before doing. When done, report changed files + what was modified in 2-4 lines.

Verify before reporting done: run affected tests, typecheck, lint. If anything fails, fix it. "Should work" is not done.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. Fixing a bug
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Fix [bug]. Done = root cause is fixed (not the symptom), a test exists that fails before the fix and passes after, no other behavior regressed, diff scoped to the fix.

Stop &amp;amp; ask only if: the bug isn't reproducible from what I gave you, the root cause is in unexpected scope (different module, infra, dependency), or two plausible root causes exist and the wrong fix would mask the real bug.

No preamble. Don't walk me through your hypothesis before testing it. When done, report root cause + fix + what you verified in 3-5 lines.

Verify before reporting done: run the regression test, run the affected module's full suite, confirm the original repro is gone.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3. Refactoring
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Refactor [target]. Done = behavior is byte-identical before and after, all existing tests pass without modification, types and lint clean, diff scoped to the refactor.

Stop &amp;amp; ask if: you can't preserve behavior without changing a test (means the refactor changed semantics), the refactor naturally pulls in a 3rd+ file beyond what we discussed, or you find a real bug while refactoring (surface it separately, don't silently fix it inside the refactor diff).

No preamble. Don't explain the refactor plan before doing it. When done, report what moved, what's now where, and what was verified in 2-4 lines.

Verify before reporting done: run the FULL test suite (refactors break unexpected places), typecheck, build.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Migration / upgrade
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Migrate [target] from [old] to [new]. Done = the codebase compiles and runs on the new version, all existing tests pass without behavior changes, deprecation warnings from the migration are resolved (not suppressed), diff is scoped to the migration only.

Stop &amp;amp; ask if: the new version requires a behavior change that affects users (don't make that call alone), the migration touches config, infra, or build files in ways we didn't discuss, or you find code that depends on the old version's bugs (genuinely tricky - surface it, don't paper over it).

No preamble. Don't list every breaking change in the changelog before starting - read the changelog yourself and apply what's needed. When done, report what was migrated, what was left untouched and why, and any deprecation warnings still standing.

Verify before reporting done: run the full test suite (migrations break unexpected places), typecheck, build. If the project has integration or e2e tests, run those too - unit tests keep passing through broken migrations more often than you'd think.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5. Adding tests to existing code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Add tests for [target]. Done = the tests exercise the actual behavior (not implementation details), they pass against the current code, they would fail if the behavior broke, coverage hits the meaningful branches not just the happy path.

Stop &amp;amp; ask if: the code is genuinely hard to test because of how it's structured (don't refactor it to make testing easier without checking), you find a real bug while writing tests (surface it separately, don't quietly fix it), or the existing tests already cover this and I missed it.

No preamble. Don't outline the test plan before writing - just write the tests. When done, report what's covered, what's intentionally not covered, and anything you found while writing them.

Verify before reporting done: run the new tests (must pass), then mutate the code under test in a small way and rerun (the tests must fail - if they don't, they're testing the wrong thing). Run the full suite to make sure nothing else broke.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  And here are 5 things to avoid:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Telling Codex HOW to solve it instead of what done looks like&lt;/li&gt;
&lt;li&gt;Asking GPT to create a prompt for itself&lt;/li&gt;
&lt;li&gt;Using the same chat for more than one task&lt;/li&gt;
&lt;li&gt;Sequential file reads on multi-file tasks (waste of latency)&lt;/li&gt;
&lt;li&gt;Trusting "this should work" without running the tests (never do this)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Alright, if you take one thing from this: before you reach for that Extra High button, rewrite the prompt using the tips above. (and give me a follow)&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Read more: &lt;a href="https://developers.openai.com/api/docs/guides/prompt-guidance" rel="noopener noreferrer"&gt;developers.openai.com/api/docs/guides/prompt-guidance&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>gpt55</category>
      <category>prompting</category>
      <category>codex</category>
    </item>
    <item>
      <title>Skills don't work the way we think they do</title>
      <dc:creator>port</dc:creator>
      <pubDate>Sun, 17 May 2026 12:29:53 +0000</pubDate>
      <link>https://forem.com/port/skills-dont-work-the-way-we-think-they-do-494j</link>
      <guid>https://forem.com/port/skills-dont-work-the-way-we-think-they-do-494j</guid>
      <description>&lt;p&gt;I just finished reading the SkillBench paper: &lt;a href="https://arxiv.org/pdf/2602.12670" rel="noopener noreferrer"&gt;https://arxiv.org/pdf/2602.12670&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the results are definitely not what most people expect.&lt;/p&gt;

&lt;h2&gt;
  
  
  What researchers did
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8mpe0bst31q027kojnfa.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8mpe0bst31q027kojnfa.jpg" alt="SkillBench research setup screenshot" width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;They took 86 real-world tasks across 11 domains and executed 7,308 runs.&lt;/p&gt;

&lt;p&gt;Each task was tested in three modes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Baseline (no skills)&lt;/li&gt;
&lt;li&gt;Curated skills (human-written)&lt;/li&gt;
&lt;li&gt;Self-generated skills by the model&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fskills-dont-work-the-way-we-think-they-do%2Fhaiku-skills-opus.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fskills-dont-work-the-way-we-think-they-do%2Fhaiku-skills-opus.jpg" title="haiku with good skills is better than vanilla opus" alt="SkillBench result comparing smaller models with skills to larger models without skills" width="800" height="489"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Without further ado, below are some conclusions that I found interesting in the paper.&lt;/p&gt;

&lt;h2&gt;
  
  
  Self-generated skills don't help
&lt;/h2&gt;

&lt;p&gt;One of the most hyped ideas in agent research is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Let the model write its own tools / skills."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But it is mostly wasted effort. In this research, self-generated skills produced no meaningful improvement over baseline.&lt;/p&gt;

&lt;p&gt;In some cases, they made performance worse.&lt;/p&gt;

&lt;p&gt;Today's models simply cannot reliably create useful reusable procedural abstractions.&lt;/p&gt;

&lt;p&gt;This matters because a huge part of current agent research assumes models can recursively improve by generating better skills/tools. This benchmark suggests that assumption is premature.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgvhskj7mvris5aob4l7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgvhskj7mvris5aob4l7.jpg" alt="SkillBench chart showing self-generated skills did not meaningfully improve performance" width="800" height="356"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Human-made skills work A LOT better
&lt;/h2&gt;

&lt;p&gt;When Skills were carefully written by humans, performance jumped 16.2 percentage points on average.&lt;/p&gt;

&lt;p&gt;But here's what's even more surprising:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Domain variance was extreme&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Some domains saw small gains (~4-5 pp)&lt;/li&gt;
&lt;li&gt;Others saw enormous gains (~50+ pp)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4s6huh7gu3ef3tin93y.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4s6huh7gu3ef3tin93y.jpg" alt="SkillBench chart showing high domain variance for human-made skills" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Skills don't help equally across fields. They disproportionately help in structured, procedural domains.&lt;/p&gt;

&lt;h2&gt;
  
  
  Smaller models + skills ≈ bigger models without skills
&lt;/h2&gt;

&lt;p&gt;A smaller model with curated Skills matched or exceeded a larger model without Skills.&lt;/p&gt;

&lt;p&gt;This is huge for cost optimization:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local agents&lt;/li&gt;
&lt;li&gt;Edge deployment&lt;/li&gt;
&lt;li&gt;Open-source models&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Too many skills can hurt
&lt;/h2&gt;

&lt;p&gt;Overly broad or verbose skill libraries degraded performance. Focused, minimal skill modules performed better.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd36f3n4pkpwxitsm20iq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd36f3n4pkpwxitsm20iq.jpg" alt="SkillBench result showing too many skills can degrade performance" width="800" height="307"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Pick your skills carefully. 2-3 skills work better than 4+ skills.&lt;/p&gt;

&lt;h2&gt;
  
  
  Here is my takeaway
&lt;/h2&gt;

&lt;p&gt;If this paper is right (and I think it is, mostly because of my personal experiences with skill files):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scaling alone isn't enough&lt;/li&gt;
&lt;li&gt;Autonomy narratives are premature&lt;/li&gt;
&lt;li&gt;Skill architecture design is now a first-class research problem&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Read the full paper: &lt;a href="https://arxiv.org/pdf/2602.12670" rel="noopener noreferrer"&gt;https://arxiv.org/pdf/2602.12670&lt;/a&gt;&lt;/p&gt;

</description>
      <category>claudeskills</category>
      <category>skillbench</category>
      <category>aiagents</category>
    </item>
    <item>
      <title>so... how to create a skill that works?</title>
      <dc:creator>port</dc:creator>
      <pubDate>Sun, 17 May 2026 12:29:52 +0000</pubDate>
      <link>https://forem.com/port/so-how-to-create-a-skill-that-works-3k7p</link>
      <guid>https://forem.com/port/so-how-to-create-a-skill-that-works-3k7p</guid>
      <description>&lt;p&gt;In my previous article, I argued that skills don't work the way most people expect.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Related: &lt;a href="https://portdeveloper.github.io/articles/skills-dont-work-the-way-we-think-they-do.html" rel="noopener noreferrer"&gt;Skills don't work the way we think they do&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The data from SkillBench supports this. Attaching skills doesn't automatically guarantee better performance.&lt;/p&gt;

&lt;p&gt;So the real question becomes:&lt;/p&gt;

&lt;p&gt;If skills don't magically fix models... How do you engineer them properly?&lt;/p&gt;

&lt;p&gt;To answer that, we need to understand how knowledge itself works.&lt;/p&gt;

&lt;p&gt;I think human knowledge is like a block of cheese.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fhow-to-create-a-skill-that-works%2Fhuman-knowledge-cheese.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fportdeveloper.github.io%2Fassets%2Farticles%2Fhow-to-create-a-skill-that-works%2Fhuman-knowledge-cheese.png" title="human knowledge or a block of cheese" alt="A block of cheese representing human knowledge" width="540" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It grows over time, with holes ever-present.&lt;/p&gt;

&lt;p&gt;When we hit something we don't know, we:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;look it up&lt;/li&gt;
&lt;li&gt;learn it&lt;/li&gt;
&lt;li&gt;apply it&lt;/li&gt;
&lt;li&gt;patch the hole and move forward&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;LLMs don't do this.&lt;/p&gt;

&lt;p&gt;When they hit a hole, they don't say "I don't know."&lt;/p&gt;

&lt;p&gt;They hallucinate. They lazily fill the gap with plausible-sounding but incorrect information.&lt;/p&gt;

&lt;p&gt;Aaand that's where things break, and we, being the superior entity, come in to help.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Two Types of Holes
&lt;/h2&gt;

&lt;p&gt;Through trial and error, I've noticed there are two kinds.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Knowledge gaps
&lt;/h3&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;p&gt;My OpenClaw agent tries to open a browser extension. It fails.&lt;/p&gt;

&lt;p&gt;I tell it:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"You already have a browser. Open that."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Suddenly the dumdum understands the task and opens the freaking browser.&lt;/p&gt;

&lt;p&gt;It wasn't incapable.&lt;/p&gt;

&lt;p&gt;It just didn't reason through the environment correctly.&lt;/p&gt;

&lt;p&gt;That's a hole.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Moldy knowledge
&lt;/h3&gt;

&lt;p&gt;Sometimes it does know something, but it's outdated.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using &lt;code&gt;useScaffoldContractRead&lt;/code&gt; instead of &lt;code&gt;useScaffoldReadContract&lt;/code&gt; in Scaffold-ETH&lt;/li&gt;
&lt;li&gt;Manually defining Monad mainnet instead of importing from &lt;code&gt;viem/chains&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's stale info on the LLM's side. I call it mold.&lt;/p&gt;

&lt;p&gt;And mold spreads silently. If you don't correct it once, it keeps reappearing in future runs. And you might never notice it.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Create Skill Files
&lt;/h2&gt;

&lt;p&gt;Here's my actual process.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. I let the model fail
&lt;/h3&gt;

&lt;p&gt;For example, when I was building the monad-development skill, I simply said:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Create a token on Monad."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's it. Then I watched it fail.&lt;/p&gt;

&lt;p&gt;I didn't over-direct it.&lt;/p&gt;

&lt;p&gt;I wanted to see where the holes were.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. I take notes on every failure
&lt;/h3&gt;

&lt;p&gt;This sounds weird, but yes, I watch it and take notes, or let it take notes afterwards. After the LLM completes its run, I ask it "What did you have problems with?" and "What did you fail to do on the first try?", and I check whether the thing I asked for is built the way I wanted it to be.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. I create the skill.md file
&lt;/h3&gt;

&lt;p&gt;The skill file contains the patches: they fill the gaps in the LLM's knowledge, remove the mold, and fill the hole left behind where the moldy part was removed.&lt;/p&gt;

&lt;p&gt;The file is concise, specific, and clear.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. I re-run and benchmark
&lt;/h3&gt;

&lt;p&gt;I run the same prompt again with the skill attached. If it still struggles, I refine the skill.&lt;/p&gt;

&lt;p&gt;I repeat until:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First-attempt success rate is high&lt;/li&gt;
&lt;li&gt;Hallucinations drop (mostly)&lt;/li&gt;
&lt;li&gt;Tool usage becomes clean and consistent&lt;/li&gt;
&lt;/ul&gt;
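
&lt;p&gt;If you want numbers instead of vibes for that loop, here is a minimal sketch of how the stop condition could be scored. This is not taken from any tool; the &lt;code&gt;RunResult&lt;/code&gt; shape, the &lt;code&gt;scoreRuns&lt;/code&gt; and &lt;code&gt;needsAnotherPass&lt;/code&gt; helpers, and the target rate are all hypothetical names I am assuming for illustration. You would fill in the records by hand from your notes.&lt;/p&gt;

```typescript
// Hypothetical record of one run of the same prompt (filled in from your notes).
interface RunResult {
  attempts: number;       // tries the agent needed; 1 means it worked first time
  succeeded: boolean;     // whether the result matched what was asked for
  hallucinations: number; // made-up APIs or facts you noticed during review
}

interface SkillScore {
  firstAttemptRate: number;  // share of runs that succeeded on the first try
  avgHallucinations: number; // mean hallucinations per run
}

// Summarize a batch of runs made with the same prompt and skill file.
function scoreRuns(runs: RunResult[]): SkillScore {
  if (runs.length === 0) {
    return { firstAttemptRate: 0, avgHallucinations: 0 };
  }
  const firstTry = runs.filter((r) => (r.succeeded ? r.attempts === 1 : false)).length;
  const totalHallucinations = runs.reduce((sum, r) => sum + r.hallucinations, 0);
  return {
    firstAttemptRate: firstTry / runs.length,
    avgHallucinations: totalHallucinations / runs.length,
  };
}

// Keep refining the skill while the score is below a target you pick yourself.
function needsAnotherPass(score: SkillScore, targetRate: number): boolean {
  return targetRate > score.firstAttemptRate;
}
```

&lt;p&gt;Score one batch without the skill and one with it, then compare. If &lt;code&gt;needsAnotherPass&lt;/code&gt; still returns true for your target, go back to the notes and patch the skill again.&lt;/p&gt;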

&lt;h2&gt;
  
  
  What This Really Is
&lt;/h2&gt;

&lt;p&gt;This is systematic failure harvesting. Treat the LLM as a system with blind spots and engineer around them.&lt;/p&gt;

&lt;p&gt;Prompt. Let it fail. Take notes. Create a skill file out of your notes. Rinse and repeat until you are at a desired success rate.&lt;/p&gt;

&lt;p&gt;This is how you create a skill that actually works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further reading
&lt;/h2&gt;

&lt;p&gt;SkillBench paper:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Skills Don't Always Improve Performance&lt;br&gt;&lt;br&gt;
&lt;a href="https://arxiv.org/pdf/2602.12670" rel="noopener noreferrer"&gt;https://arxiv.org/pdf/2602.12670&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My previous article:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://portdeveloper.github.io/articles/skills-dont-work-the-way-we-think-they-do.html" rel="noopener noreferrer"&gt;Skills don't work the way we think they do&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Vercel's agents.md versus skills.md article:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;AGENTS.md outperforms skills in our agent evals&lt;br&gt;&lt;br&gt;
&lt;a href="https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals" rel="noopener noreferrer"&gt;https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>claudeskills</category>
      <category>aiagents</category>
      <category>skillbench</category>
    </item>
    <item>
      <title>I built a copy-for-LLMs button for Docusaurus. Then Ethereum and Sui shipped it.</title>
      <dc:creator>port</dc:creator>
      <pubDate>Mon, 27 Apr 2026 19:01:36 +0000</pubDate>
      <link>https://forem.com/port/i-built-a-copy-for-llms-button-for-docusaurus-then-ethereum-and-sui-shipped-it-3d7l</link>
      <guid>https://forem.com/port/i-built-a-copy-for-llms-button-for-docusaurus-then-ethereum-and-sui-shipped-it-3d7l</guid>
      <description>&lt;p&gt;&lt;strong&gt;A few months ago I got tired of selecting docs pages and pasting them into Claude. Half the time the nav came along with the content. So I built &lt;code&gt;docusaurus-plugin-copy-page-button&lt;/code&gt;: a one-line install that drops a Copy page button into your Docusaurus sidebar.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When I click the button, I get the page as clean markdown. I also added a dropdown that opens the page directly in ChatGPT, Claude, or Gemini.&lt;/p&gt;

&lt;p&gt;Setup:&lt;/p&gt;

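&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install docusaurus-plugin-copy-page-button
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Then one line in &lt;code&gt;docusaurus.config.js&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plugins: ['docusaurus-plugin-copy-page-button']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That's it.&lt;/p&gt;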

&lt;p&gt;Six months later, I see the plugin running on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ethereum execution-apis&lt;/li&gt;
&lt;li&gt;Sui, Walrus, Seal, SuiNS (Mysten Labs)&lt;/li&gt;
&lt;li&gt;Monad&lt;/li&gt;
&lt;li&gt;Flare&lt;/li&gt;
&lt;li&gt;Kaia&lt;/li&gt;
&lt;li&gt;Nillion&lt;/li&gt;
&lt;li&gt;Chronicle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Around 10k installs a month, mostly blockchain ecosystems. I didn't aim at that niche; it just landed there.&lt;/p&gt;

&lt;h3&gt;
  
  
  What was actually hard
&lt;/h3&gt;

&lt;p&gt;Three things took most of the time.&lt;/p&gt;

&lt;p&gt;Content extraction. Docusaurus pages come wrapped in nav, breadcrumbs, edit-this-page links, footers, and a sidebar. The plugin walks the DOM, finds the article container, drops the chrome, and hands the rest to a markdown converter that handles code blocks, tables, lists, and admonitions.&lt;/p&gt;
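
&lt;p&gt;To make the idea concrete, here is a rough sketch of the "find the article, drop the chrome" step. It is not the plugin's actual code: &lt;code&gt;PageNode&lt;/code&gt; is a simplified stand-in for a DOM element, and the tag and class lists are assumptions I picked for illustration. The markdown conversion step is left out.&lt;/p&gt;

```typescript
// Simplified stand-in for a DOM element; a real page would use the browser DOM.
interface PageNode {
  tag: string;
  classes: string[];
  text?: string;
  children: PageNode[];
}

// Hypothetical tags and class names treated as page chrome.
const CHROME_TAGS = ["nav", "footer", "aside"];
const CHROME_CLASSES = ["breadcrumbs", "edit-this-page", "pagination-nav"];

function isChrome(node: PageNode): boolean {
  if (CHROME_TAGS.indexOf(node.tag) !== -1) return true;
  return node.classes.some((c) => CHROME_CLASSES.indexOf(c) !== -1);
}

// Depth-first search for the first article element.
function findArticle(node: PageNode): PageNode | null {
  if (node.tag === "article") return node;
  for (const child of node.children) {
    const found = findArticle(child);
    if (found !== null) return found;
  }
  return null;
}

// Collect text from a subtree in document order, skipping chrome subtrees.
function extractText(node: PageNode): string[] {
  if (isChrome(node)) return [];
  let out: string[] = node.text ? [node.text] : [];
  for (const child of node.children) {
    out = out.concat(extractText(child));
  }
  return out;
}
```

&lt;p&gt;Call &lt;code&gt;findArticle&lt;/code&gt; on the page root, then &lt;code&gt;extractText&lt;/code&gt; on the result. Breadcrumbs and edit links that sit inside the article get skipped, and whatever is left is what you would hand to the markdown converter.&lt;/p&gt;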

&lt;p&gt;Then SPA route changes. Docusaurus uses client-side navigation. Inject the button on first load and it vanishes when the user clicks a link. The plugin watches popstate, Docusaurus's own events, and URL changes, then re-injects on each route.&lt;/p&gt;
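
&lt;p&gt;The core of that re-injection logic can be sketched as a tiny watcher that remembers the last path and fires a callback when it changes. This is a sketch under my own assumptions, not the plugin's source: &lt;code&gt;createRouteWatcher&lt;/code&gt; and its parameters are hypothetical names, and the browser wiring is only shown in a comment.&lt;/p&gt;

```typescript
// Tracks the last seen path and calls onChange whenever it differs.
function createRouteWatcher(
  getPath: () => string,
  onChange: (path: string) => void
) {
  let lastPath = getPath();
  return {
    // Call this from a popstate listener, a router hook, or a timer.
    check(): boolean {
      const current = getPath();
      if (current === lastPath) return false;
      lastPath = current;
      onChange(current);
      return true;
    },
  };
}

// In a browser, the wiring could look roughly like this (assumed, not run here):
//   const watcher = createRouteWatcher(
//     () => window.location.pathname,
//     () => injectButton() // hypothetical function that adds the button back
//   );
//   window.addEventListener("popstate", () => watcher.check());
```

&lt;p&gt;Keeping the comparison in one place means every trigger (popstate, router events, a fallback timer) can call &lt;code&gt;check&lt;/code&gt; without re-injecting the button twice for the same route.&lt;/p&gt;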

&lt;p&gt;And mobile. Docusaurus collapses the TOC sidebar on small screens. The button needs to live somewhere visible without breaking the layout. Took a few iterations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;If you run a Docusaurus site, install it. If something's missing, open an issue.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/portdeveloper/docusaurus-plugin-copy-page-button" rel="noopener noreferrer"&gt;https://github.com/portdeveloper/docusaurus-plugin-copy-page-button&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;npm: &lt;a href="https://www.npmjs.com/package/docusaurus-plugin-copy-page-button" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/docusaurus-plugin-copy-page-button&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Live demo: &lt;a href="https://portdeveloper.github.io/copy-page-button-showcase/" rel="noopener noreferrer"&gt;https://portdeveloper.github.io/copy-page-button-showcase/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>docusaurus</category>
      <category>ai</category>
      <category>webdev</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
