<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Simon Morley</title>
    <description>The latest articles on Forem by Simon Morley (@simon_morley).</description>
    <link>https://forem.com/simon_morley</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3564652%2Ff1533649-1cc6-4966-8e4c-14c383cda1ef.png</url>
      <title>Forem: Simon Morley</title>
      <link>https://forem.com/simon_morley</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/simon_morley"/>
    <language>en</language>
    <item>
      <title>XDP: The Kernel-Level Powerhouse Behind Modern Network Defence</title>
      <dc:creator>Simon Morley</dc:creator>
      <pubDate>Wed, 12 Nov 2025 12:54:28 +0000</pubDate>
      <link>https://forem.com/simon_morley/xdp-the-kernel-level-powerhouse-behind-modern-network-defense-222n</link>
      <guid>https://forem.com/simon_morley/xdp-the-kernel-level-powerhouse-behind-modern-network-defense-222n</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Traditional packet processing in Linux has always had one problem: latency, just like your Nan.&lt;/p&gt;

&lt;p&gt;Packets climb an almost endless ladder through kernel subsystems before reaching user space, by which time your firewall has probably missed the critical window to act. Shame on you and your Nan.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;eXpress Data Path (XDP)&lt;/strong&gt; changes that completely. It's a fast-path hook that runs &lt;em&gt;inside&lt;/em&gt; the kernel's network driver layer: before sockets, before Netfilter, before the kernel allocates a socket buffer (skb).&lt;/p&gt;

&lt;p&gt;This means you can inspect, modify, drop, or redirect packets &lt;em&gt;as they arrive on the NIC&lt;/em&gt;, with nanosecond-level performance.&lt;/p&gt;

&lt;p&gt;It's like knowing who's going to turn up at the pub before they've left the house.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Idea
&lt;/h2&gt;

&lt;p&gt;XDP extends the Linux kernel with programmable packet handling at the &lt;strong&gt;driver level&lt;/strong&gt;, using &lt;strong&gt;eBPF&lt;/strong&gt; (extended Berkeley Packet Filter) programs compiled into bytecode.&lt;/p&gt;

&lt;p&gt;Instead of pushing packets up the stack, XDP lets you attach logic that decides what happens next, directly in the NIC's receive path.&lt;/p&gt;

&lt;h3&gt;
  
  
  Execution flow:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;NIC receives a packet.
&lt;/li&gt;
&lt;li&gt;XDP hook triggers before skb allocation.
&lt;/li&gt;
&lt;li&gt;eBPF program runs in the kernel VM.
&lt;/li&gt;
&lt;li&gt;Program returns one of several actions:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;XDP_PASS&lt;/code&gt;: let the packet continue to the normal stack
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;XDP_DROP&lt;/code&gt;: discard it immediately
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;XDP_TX&lt;/code&gt;: bounce it back out the same interface
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;XDP_REDIRECT&lt;/code&gt;: forward it to another interface, CPU, or AF_XDP socket
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;XDP_ABORTED&lt;/code&gt;: fail gracefully if something goes wrong
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it.&lt;/p&gt;
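&lt;p&gt;Those five return codes are the whole interface. Here's a toy model of the verdict step in plain userspace Rust. A real XDP program is a &lt;code&gt;no_std&lt;/code&gt; eBPF object checked by the kernel verifier; the blocked-port policy below is invented purely for illustration:&lt;/p&gt;

```rust
// Userspace model of the XDP verdict flow. Real programs run inside the
// kernel's eBPF VM; this only mirrors the decision logic.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum XdpAction {
    Pass,     // XDP_PASS: continue up the normal stack
    Drop,     // XDP_DROP: discard immediately
    Tx,       // XDP_TX: bounce back out the same interface
    Redirect, // XDP_REDIRECT: another interface, CPU, or AF_XDP socket
    Aborted,  // XDP_ABORTED: bail out on error
}

// Hypothetical policy: drop anything aimed at a blocked TCP port.
fn verdict(dst_port: u16, blocked_ports: &[u16]) -> XdpAction {
    if blocked_ports.contains(&dst_port) {
        XdpAction::Drop
    } else {
        XdpAction::Pass
    }
}

fn main() {
    let blocked = [23, 445];
    assert_eq!(verdict(22, &blocked), XdpAction::Pass);
    assert_eq!(verdict(445, &blocked), XdpAction::Drop);
    println!("ok");
}
```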

&lt;h2&gt;
  
  
  Why It Matters
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Performance&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;XDP can process &lt;strong&gt;millions of packets per second per core&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
The original XDP paper measured over 20 million packets per second per core on commodity hardware.&lt;/p&gt;

&lt;p&gt;That's like Mo Farah racing your Nan in an ultra marathon and finishing it 20 million times before she's even put her jeggings on. &lt;/p&gt;
&lt;h3&gt;
  
  
  2. &lt;strong&gt;Programmability&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Unlike fixed-function firewalls or DPDK pipelines, XDP programs are just eBPF bytecode.&lt;/p&gt;

&lt;p&gt;You can dynamically load and unload filters at runtime, without recompiling the kernel or restarting services.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. &lt;strong&gt;Security&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;You can build kernel-resident security controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DDoS mitigation:&lt;/strong&gt; drop floods at line rate
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Port knocking or protocol filtering:&lt;/strong&gt; block unwanted ports before TCP handshake
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inline IDS signatures:&lt;/strong&gt; detect or throttle known attack patterns
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  4. &lt;strong&gt;Observability&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Because XDP operates before skb allocation, it's ideal for high-fidelity telemetry. &lt;/p&gt;

&lt;p&gt;You can capture packet metadata (MACs, IPs, ports, timestamps) and push structured events to user space with ring buffers — no packet copies, no pcap overhead.&lt;/p&gt;
&lt;h2&gt;
  
  
  XDP in the Wild
&lt;/h2&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Meta (Facebook)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Meta was one of the earliest large-scale adopters of XDP.&lt;br&gt;&lt;br&gt;
They use it in &lt;strong&gt;Katran&lt;/strong&gt;, their in-kernel load balancer, to handle tens of millions of connections per second while maintaining microsecond-level latency.&lt;/p&gt;

&lt;p&gt;XDP replaced parts of their older DPDK-based stack, cutting CPU load and enabling dynamic policy updates through eBPF maps.&lt;br&gt;&lt;br&gt;
The same foundation powers Cilium’s kernel datapath and underpins parts of Meta’s edge networking infrastructure.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Cloudflare&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Cloudflare also uses XDP to defend its global edge network against &lt;strong&gt;DDoS attacks&lt;/strong&gt;.  &lt;/p&gt;

&lt;p&gt;By placing mitigation logic directly inside the kernel, they can absorb massive floods — up to &lt;strong&gt;hundreds of millions of packets per second&lt;/strong&gt; — without userspace overhead.  &lt;/p&gt;

&lt;p&gt;Their engineers have written extensively about how XDP allows per-interface rate limiting, SYN flood filtering, and on-the-fly rules pushed from Go and Rust control planes.  &lt;/p&gt;

&lt;p&gt;It's effectively their last-line kernel shield before packets ever reach the proxy layer.&lt;/p&gt;

&lt;p&gt;Together, Meta and Cloudflare have proven that XDP holds up in hyperscale production workloads, not just lab benchmarks.&lt;/p&gt;
&lt;h2&gt;
  
  
  Typical Use Cases
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DDoS Protection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Drop or rate-limit SYN floods directly in driver&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;XDP_DROP&lt;/code&gt; TCP SYNs after threshold&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Load Balancing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Redirect packets to backend queues or CPUs&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;XDP_REDIRECT&lt;/code&gt; to AF_XDP sockets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Firewalling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Kernel-level ACLs&lt;/td&gt;
&lt;td&gt;Filter by IP, port, or protocol&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Telemetry&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stream header data to user space&lt;/td&gt;
&lt;td&gt;XDP + perf ring buffer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inline Remediation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Block C2 connections before userspace&lt;/td&gt;
&lt;td&gt;Combine XDP + LSM hook&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  Writing an XDP Program (Example)
&lt;/h2&gt;

&lt;p&gt;I've been writing these in Rust recently (with the &lt;code&gt;aya&lt;/code&gt; crates), but you can do it in C, Go, etc. Rust is the best, IMHO.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// src/xdp.rs
use aya_bpf::{
    bindings::xdp_action,
    macros::{map, xdp},
    maps::HashMap,
    programs::XdpContext,
};

#[map(name = "SYN_COUNTER")]
static mut SYN_COUNTER: HashMap&amp;lt;u32, u64&amp;gt; = HashMap::&amp;lt;u32, u64&amp;gt;::with_max_entries(1024, 0);

#[xdp(name = "count_syns")]
pub fn count_syns(ctx: XdpContext) -&amp;gt; u32 {
    match try_count_syns(ctx) {
        Ok(ret) =&amp;gt; ret,
        Err(_) =&amp;gt; xdp_action::XDP_ABORTED,
    }
}

fn try_count_syns(ctx: XdpContext) -&amp;gt; Result&amp;lt;u32, ()&amp;gt; {
    let hdr = ctx.ip()?.ok_or(())?;
    if hdr.protocol != aya_bpf::bindings::IPPROTO_TCP as u8 {
        return Ok(xdp_action::XDP_PASS);
    }

    // Parse TCP header
    let tcp = ctx.transport::&amp;lt;aya_bpf::bindings::tcphdr&amp;gt;().ok_or(())?;
    let flags = unsafe { (*tcp).syn() as u8 };

    if flags == 1 {
        let key = hdr.protocol as u32;
        unsafe {
            let counter = SYN_COUNTER.get(&amp;amp;key).copied().unwrap_or(0);
            SYN_COUNTER.insert(&amp;amp;key, &amp;amp;(counter + 1), 0);
        }
    }

    Ok(xdp_action::XDP_PASS)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And a userspace loader:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// src/main.rs
use aya::{Bpf, programs::Xdp};
use std::{env, process};

fn main() -&amp;gt; Result&amp;lt;(), anyhow::Error&amp;gt; {
    let iface = env::args().nth(1).unwrap_or_else(|| {
        eprintln!("Usage: cargo run -- &amp;lt;iface&amp;gt;");
        process::exit(1);
    });

    let mut bpf = Bpf::load_file("target/bpfel-unknown-none/release/xdp-example")?;
    let program: &amp;amp;mut Xdp = bpf.program_mut("count_syns").unwrap().try_into()?;
    program.load()?;
    program.attach(&amp;amp;iface, aya::programs::XdpFlags::default())?;

    println!("XDP program attached to {}", iface);
    loop {
        std::thread::sleep(std::time::Duration::from_secs(60));
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, not much to get the basics going.&lt;/p&gt;

&lt;h2&gt;
  
  
  Advanced Scenarios
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Dynamic Remediation&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Combine XDP with userspace controllers.&lt;br&gt;&lt;br&gt;
For example, an agent monitors traffic patterns and pushes new eBPF maps into the kernel to block malicious IPs dynamically.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. &lt;strong&gt;Programmable Rate Limiting&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Use per-source counters in eBPF maps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Count packets per IP
&lt;/li&gt;
&lt;li&gt;Apply backoff or redirect decisions
&lt;/li&gt;
&lt;li&gt;Synchronize with userspace via shared maps&lt;/li&gt;
&lt;/ul&gt;
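&lt;p&gt;A minimal sketch of the counting logic in plain Rust. In a real XDP deployment this state lives in a shared eBPF map rather than a &lt;code&gt;std&lt;/code&gt; hash map, and the threshold here is arbitrary:&lt;/p&gt;

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

// Per-source packet counter: drop once a source exceeds the threshold.
// In a real XDP deployment this state lives in a shared eBPF map.
struct RateLimiter {
    counts: HashMap<Ipv4Addr, u64>,
    threshold: u64,
}

impl RateLimiter {
    fn new(threshold: u64) -> Self {
        Self { counts: HashMap::new(), threshold }
    }

    // Returns true if this packet should be dropped.
    fn should_drop(&mut self, src: Ipv4Addr) -> bool {
        let count = self.counts.entry(src).or_insert(0);
        *count += 1;
        *count > self.threshold
    }
}

fn main() {
    let mut limiter = RateLimiter::new(3);
    let src = Ipv4Addr::new(203, 0, 113, 7);
    let verdicts: Vec<bool> = (0..5).map(|_| limiter.should_drop(src)).collect();
    assert_eq!(verdicts, vec![false, false, false, true, true]);
    println!("ok");
}
```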
&lt;h3&gt;
  
  
  3. &lt;strong&gt;Hybrid Visibility&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Send metadata to userspace without full payloads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;bpf_perf_event_output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_F_CURRENT_CPU&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pkt_meta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pkt_meta&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Challenges and Limitations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hardware support&lt;/strong&gt; varies by NIC driver.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verifier constraints&lt;/strong&gt;: programs must be bounded and safe.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging&lt;/strong&gt; can be non-trivial — &lt;code&gt;bpftool prog tracelog&lt;/code&gt; helps, but it's still kernel space.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Portability&lt;/strong&gt;: kernel versions differ in helper function availability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge&lt;/strong&gt;: you still gotta know what you're doing fu*king around in there, this ain't no lovable prompt party.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Still, XDP is maturing fast, and the eBPF ecosystem around it (bpftool, libbpf, Cilium, Katran) makes development significantly easier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;XDP represents the most radical shift in Linux networking since Netfilter. &lt;/p&gt;

&lt;p&gt;It lets you run programmable logic where it matters most — &lt;em&gt;at the point of ingress&lt;/em&gt; — turning your kernel into a programmable network processor.&lt;/p&gt;

&lt;p&gt;Whether you're building autonomous defenses, ultra-low-latency telemetry, or custom in-kernel routing, XDP gives you the foundation for it.&lt;/p&gt;

&lt;p&gt;The kernel is no longer a bottleneck: it's a battlefield, and XDP is the armour keeping your loved ones (your Nan etc.) safe.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Now?
&lt;/h2&gt;

&lt;p&gt;I'm working on XDP applications for a couple of projects that are going on right now. More news on this soon.&lt;/p&gt;

&lt;h3&gt;
  
  
  Further Reading
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://xdp-project.net/" rel="noopener noreferrer"&gt;Linux XDP Project&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.cloudflare.com/l4drop-xdp-ebpf-based-ddos-mitigations/" rel="noopener noreferrer"&gt;Cloudflare: How We Use XDP for DDoS Protection&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://engineering.fb.com/2018/05/22/open-source/open-sourcing-katran-a-scalable-network-load-balancer/" rel="noopener noreferrer"&gt;Meta’s Katran Load Balancer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cilium.io/" rel="noopener noreferrer"&gt;Cilium’s XDP Integration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/libbpf/libbpf-bootstrap" rel="noopener noreferrer"&gt;libbpf-bootstrap Templates&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>xdp</category>
      <category>linux</category>
      <category>networking</category>
      <category>observability</category>
    </item>
    <item>
      <title>Why Multi-Validator Hosts Break Traditional Security Scanning</title>
      <dc:creator>Simon Morley</dc:creator>
      <pubDate>Wed, 05 Nov 2025 14:56:18 +0000</pubDate>
      <link>https://forem.com/simon_morley/why-multi-validator-hosts-break-traditional-security-scanning-3nbj</link>
      <guid>https://forem.com/simon_morley/why-multi-validator-hosts-break-traditional-security-scanning-3nbj</guid>
      <description>&lt;p&gt;Determining a host is running a Sui validator is easy.&lt;/p&gt;

&lt;p&gt;Step 1 - scan a couple of ports:&lt;/p&gt;

&lt;p&gt;Port 8080? Sui network endpoint.&lt;br&gt;&lt;br&gt;
Port 9184? Sui metrics.  &lt;/p&gt;

&lt;p&gt;Step 2 - Done. Next host.&lt;/p&gt;

&lt;p&gt;And this is fine, but how do we really know it's a Sui validator? Humour me: in this case we do know, because there's a public list of them.&lt;/p&gt;

&lt;p&gt;But it turns out these Sui validators also frequently have HTTP (80) open, which muddies the water. I don't know why; we're still working on that.&lt;/p&gt;

&lt;p&gt;How do we find an Ethereum node? Same idea, different ports.&lt;/p&gt;

&lt;p&gt;How do we find out if the host is running Sui and Ethereum? &lt;/p&gt;

&lt;p&gt;It gets really messy, really fast. False positives, false negatives. General confusion. The humans have to intervene.&lt;/p&gt;

&lt;p&gt;Traditional scanning starts with understanding. Once that's clear, the scanning can commence. &lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem Nobody Talks About (mostly because they're not interested, but if they are, they're not talking about it).
&lt;/h2&gt;

&lt;p&gt;Validator operators don't run one chain per host like some kind of theoretical best-practice diagram.&lt;/p&gt;

&lt;p&gt;They run multiple validators on the same infrastructure because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hardware costs money&lt;/li&gt;
&lt;li&gt;Operations complexity scales with host count&lt;/li&gt;
&lt;li&gt;A 32-core server running one validator is wasteful&lt;/li&gt;
&lt;li&gt;Most chains don't max out resources simultaneously&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So you get hosts running:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sui + Ethereum&lt;/li&gt;
&lt;li&gt;Solana + Cosmos&lt;/li&gt;
&lt;li&gt;Ethereum + Polygon + Arbitrum&lt;/li&gt;
&lt;li&gt;Some combination I've literally never seen before&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And now your nice clean rule-based scanner that looks for "Sui signatures" doesn't know what to do.&lt;/p&gt;

&lt;p&gt;I spent weeks trying to figure it out, ended up:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Asking the user what services are running - lil bit 2002.&lt;/li&gt;
&lt;li&gt;Using AI to figure it out!&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The AI did a great job, sometimes. At a cost. I walked away for a bit and did something else.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Overlapping Port Problem
&lt;/h2&gt;

&lt;p&gt;It gets worse when chains use similar port ranges or standard services.&lt;/p&gt;

&lt;p&gt;Multiple validators might expose:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Metrics endpoints (Sui on 9184, standard Prometheus on 9090)&lt;/li&gt;
&lt;li&gt;JSON-RPC endpoints (Ethereum 8545, Solana 8899, Sui 8080)&lt;/li&gt;
&lt;li&gt;P2P networking (Ethereum 30303, Solana 8000-10000, Sui 8084)&lt;/li&gt;
&lt;li&gt;WebSocket connections (Solana 8900)&lt;/li&gt;
&lt;li&gt;Monitoring stacks (all using Grafana on 3000)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can't just say "port 9090 = Prometheus therefore monitoring only."&lt;/p&gt;

&lt;p&gt;Because what if that Prometheus instance is exposing metrics for &lt;strong&gt;three different validators&lt;/strong&gt;?&lt;/p&gt;

&lt;p&gt;Now you need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Identify which metrics belong to which chain&lt;/li&gt;
&lt;li&gt;Understand which validators are actually running&lt;/li&gt;
&lt;li&gt;Map CVEs to the correct services&lt;/li&gt;
&lt;li&gt;Determine risk posture across multiple chains&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Rule-based scanning doesn't scale to this.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Configuration Variance Problem
&lt;/h2&gt;

&lt;p&gt;Even if you nail down the ports, validator configurations vary wildly.&lt;/p&gt;

&lt;p&gt;Some operators:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run validators in Docker (different process visibility)&lt;/li&gt;
&lt;li&gt;Use non-standard ports (8545 becomes 18545 because reasons)&lt;/li&gt;
&lt;li&gt;Proxy everything through nginx (now all you see is nginx)&lt;/li&gt;
&lt;li&gt;Run custom monitoring stacks (Prometheus? Grafana? Both? Neither?)&lt;/li&gt;
&lt;li&gt;Use systemd service names that don't match upstream defaults&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So your rules for "detecting an Ethereum validator" need to account for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Standard Geth on 30303/8545&lt;/li&gt;
&lt;li&gt;Dockerized Geth on custom ports&lt;/li&gt;
&lt;li&gt;Proxied Geth behind nginx on 443&lt;/li&gt;
&lt;li&gt;Custom compiled Geth with a weird banner&lt;/li&gt;
&lt;li&gt;Besu instead of Geth (different client, same chain)&lt;/li&gt;
&lt;li&gt;Nethermind or Erigon (different again)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And that's &lt;strong&gt;one chain&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Multiply this across Sui, Solana, Cosmos, Polygon, Avalanche...&lt;/p&gt;

&lt;p&gt;You see the problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Manual Verification Works (But Doesn't Scale)
&lt;/h2&gt;

&lt;p&gt;The only reliable way to identify multi-chain hosts right now?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Human verification.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This works. It's accurate.&lt;/p&gt;

&lt;p&gt;It's also &lt;strong&gt;slower than your nan&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you're scanning hundreds of validator hosts across multiple clients, you can't manually verify every configuration.&lt;/p&gt;

&lt;p&gt;And if the configuration changes (which it does), you have to verify again.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Insight That Changes Everything
&lt;/h2&gt;

&lt;p&gt;After manually verifying enough multi-chain hosts, I started to notice something:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They have a shape.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not a shape you can easily encode in rules.&lt;br&gt;&lt;br&gt;
But a shape you can &lt;strong&gt;recognise&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A Sui+ETH host "feels different" than a Sui-only host.&lt;br&gt;&lt;br&gt;
A Solana+Cosmos host has a different "fingerprint" than either chain alone.&lt;/p&gt;

&lt;p&gt;You can't write down the rules for why.&lt;br&gt;&lt;br&gt;
But you know it when you see it.&lt;/p&gt;

&lt;p&gt;Your brain is doing something that rule-based scanners can't:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Pattern matching across multiple dimensions simultaneously.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You're not looking for specific ports.&lt;br&gt;&lt;br&gt;
You're looking at the &lt;strong&gt;whole configuration&lt;/strong&gt; and recognizing similarity.&lt;/p&gt;

&lt;p&gt;Port 8080 + 8545 + 30303 + 9090 + 3000?&lt;br&gt;&lt;br&gt;
That's a Sui + Ethereum setup with monitoring.&lt;/p&gt;

&lt;p&gt;Port 8899 + 8900 + 8000-8020 + 9090?&lt;br&gt;&lt;br&gt;
That's Solana with standard monitoring.&lt;/p&gt;

&lt;p&gt;Port 8080 + 9184 + 8545 + 8551 + 30303 + 8899 + 9090 + 3000?&lt;br&gt;&lt;br&gt;
That's a three-chain monster that needs close attention.&lt;/p&gt;

&lt;p&gt;You're not consciously running through these rules.&lt;br&gt;&lt;br&gt;
You're just &lt;strong&gt;seeing the pattern&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Goodbye AI, or at least some of it.
&lt;/h2&gt;

&lt;p&gt;Instead of writing rules, we could:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Manually verify a multi-chain host once (or get the user to do this)!&lt;/li&gt;
&lt;li&gt;Store its "fingerprint" (ports, services, banners, everything)&lt;/li&gt;
&lt;li&gt;When we see a new host, search for similar fingerprints&lt;/li&gt;
&lt;li&gt;If it's similar to a verified host, inherit that classification&lt;/li&gt;
&lt;li&gt;If it's novel, verify manually and add to the training set&lt;/li&gt;
&lt;/ol&gt;
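&lt;p&gt;The similarity search in step 3 is just vector maths. A toy sketch with a hand-built fingerprint, one slot per interesting port; a real embedding has far more dimensions, but the comparison works the same way:&lt;/p&gt;

```rust
// Toy host "fingerprint": a fixed-dimension feature vector, one slot per
// interesting port. Embedding models produce the same kind of object,
// just with thousands of dimensions instead of six.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    // Slots: [8080, 9184, 8545, 30303, 9090, 3000]
    let verified_sui_eth = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]; // human-verified host
    let candidate_a      = [1.0, 1.0, 1.0, 1.0, 0.0, 1.0]; // looks Sui+ETH
    let candidate_b      = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0]; // monitoring only

    let sim_a = cosine_similarity(&verified_sui_eth, &candidate_a);
    let sim_b = cosine_similarity(&verified_sui_eth, &candidate_b);
    // Candidate A is closest to the verified host, so it inherits
    // that classification; candidate B stays "novel" for manual review.
    assert!(sim_a > sim_b);
    println!("ok");
}
```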

&lt;p&gt;OK, we're still using OpenAI's embeddings, but those are a lot cheaper than full GPT calls for every host.&lt;/p&gt;

&lt;p&gt;The AI called this &lt;strong&gt;scaled pattern matching with human-verified training data&lt;/strong&gt;!! Weheey.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Comes Next
&lt;/h2&gt;

&lt;p&gt;In Part 2, I'll show you how vector embeddings let you do exactly this: turn a server's full configuration into a numerical fingerprint, then search for "servers that look like this one."&lt;/p&gt;

&lt;p&gt;It's actually quite boring and if you know me, I love boring. Meanwhile, ChatGPT inserted this "it's just high-dimensional similarity search" which I found fun so I left it.&lt;/p&gt;

&lt;p&gt;Spoiler: Postgres + pgvector is good enough! Most of us don't need the MEGA VECTOR DBs.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is Part 1 of 3 on building better security scanning for multi-chain validator infrastructure. Part 2 covers vector embeddings as scaled pattern matching.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Building something for good over here.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vectordatabase</category>
      <category>web3</category>
      <category>cybersecurity</category>
    </item>
    <item>
      <title>How Google Mistook My Sui Node for a Bitcoin Farm (And Banned Me) (again)</title>
      <dc:creator>Simon Morley</dc:creator>
      <pubDate>Thu, 30 Oct 2025 12:14:28 +0000</pubDate>
      <link>https://forem.com/simon_morley/how-google-mistook-my-sui-node-for-a-bitcoin-farm-and-banned-me-again-3883</link>
      <guid>https://forem.com/simon_morley/how-google-mistook-my-sui-node-for-a-bitcoin-farm-and-banned-me-again-3883</guid>
      <description>&lt;p&gt;Google thought I was mining Bitcoin - mining it like it's 2018 baby. But no, wasn't doing that (again).&lt;/p&gt;

&lt;p&gt;I was running an L1 validator test node—you know, the exact kind of blockchain infrastructure that legitimate DeFi platforms actually need. The kind that requires computational resources because that's how distributed consensus works. The kind that works really well on cloud providers...&lt;/p&gt;

&lt;p&gt;But Google's threat detection AI apparently can't tell the difference between "crypto mining operation" and "blockchain validator infrastructure."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So they banned me.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not just for that, mind you. The ban was a trifecta:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Running security scans&lt;/strong&gt; (with explicit authorisation from targets)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operating honeypots&lt;/strong&gt; (literal security research)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Running that "Bitcoin miner"&lt;/strong&gt; (a Sui test node m8)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One day I'm building away, on GCP because that's where I &lt;em&gt;like&lt;/em&gt; to build things, the next day? Locked out. Account suspended. No more deploying. No more scanning infrastructure. No more anything.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv421duyduwk06m9ctdr8.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv421duyduwk06m9ctdr8.gif" alt="Where do I even go!"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Irony Wasn't Lost On Me
&lt;/h2&gt;

&lt;p&gt;Here I am, building a platform to help decentralised networks secure their infrastructure, and I get flagged as a threat &lt;em&gt;by another AI system&lt;/em&gt; that can't distinguish between malicious activity and legitimate development.&lt;/p&gt;

&lt;p&gt;Google: confused about blockchain workloads.&lt;br&gt;
Simon: trying to eliminate false positives in security scanning.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Unbanning Process (AKA: Purgatory)
&lt;/h2&gt;

&lt;p&gt;I deleted this section because it was boring. Eventually, they unbanned me. Weheey!&lt;/p&gt;

&lt;p&gt;I got back in—but with restrictions. Some things I could build on GCP. Other things? Not so much.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conditionally reinstated.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I was annoyed at first - just sat there looking at my computer. I realise now this was a blessing in disguise.&lt;/p&gt;

&lt;p&gt;It looked like all those hours in the data centres in the 2000s would finally pay off. Yeah, I am that old. Am I going back to bare metal!? Probably not but still, it's an option.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmiqvwvc7d3zhhihv1jgl.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmiqvwvc7d3zhhihv1jgl.gif" alt="sbsq"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The Question I Should've Asked Earlier
&lt;/h2&gt;

&lt;p&gt;Sitting there, freshly unbanned and afraid to breathe wrong, I had a realisation. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Even before the ban, I wasn't being smart.&lt;/strong&gt; I'm there burning through AI API calls like they were free. Every scan would:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Discover ports&lt;/li&gt;
&lt;li&gt;Identify services
&lt;/li&gt;
&lt;li&gt;Extract banners and metadata&lt;/li&gt;
&lt;li&gt;Embed everything with OpenAI&lt;/li&gt;
&lt;li&gt;Analyse it with a GPT&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Port 22 open? Ask GPT-4 if it's SSH.&lt;br&gt;&lt;br&gt;
Port 443 responding? Better check with the AI if it's HTTPS.&lt;br&gt;&lt;br&gt;
Port 3000 with a Grafana banner? Let's spend $0.03 to confirm what we already know.&lt;/p&gt;

&lt;p&gt;I was asking a $200 billion company's large language model to tell me things &lt;strong&gt;I learned in 2002.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/Il-an3K9pjg"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;p&gt;Thousands of dollars. Thousands of API calls. Thousands of Nvidia chips spinning up to answer questions like "is port 22 usually SSH?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Four hundred times.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Same ports. Same configurations. Same fucking validator setups across different hosts.&lt;/p&gt;

&lt;p&gt;And I kept asking. Every. Single. Time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Constraint That Changes Everything
&lt;/h2&gt;

&lt;p&gt;The GCP ban forced a question:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How do I operate smarter with fewer resources and less aggressive scanning?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;But the real question—the one I'd been avoiding—was simpler:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why am I re-asking questions I already know the answer to?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your brain doesn't work like this. After twenty years of staring at security scans, you recognise patterns instantly. You see an open port configuration and you &lt;em&gt;know&lt;/em&gt;—not because you're thinking hard, but because you've seen it before. That's not intelligence. That's &lt;strong&gt;memory.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61xcxestm4tcc6kiqli0.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61xcxestm4tcc6kiqli0.gif" alt="MIND BLOWN"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Comes Next
&lt;/h2&gt;

&lt;p&gt;Getting banned taught me something valuable: &lt;strong&gt;constraints force innovation.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I couldn't scan aggressively anymore. I couldn't just throw compute at every problem. I had to be &lt;em&gt;efficient&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;So I stopped asking the AI everything.&lt;/p&gt;

&lt;p&gt;Instead, I built a system that remembers.&lt;/p&gt;
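&lt;p&gt;The core of that system is embarrassingly simple: look up the answer before paying for it. A sketch in plain Rust; the fingerprint format and the &lt;code&gt;classify_with_ai&lt;/code&gt; stand-in are hypothetical, just to show the shape of the idea:&lt;/p&gt;

```rust
use std::collections::HashMap;

// Check a local "memory" before paying for an AI call.
// classify_with_ai is a stand-in for the expensive API round-trip.
struct Classifier {
    memory: HashMap<String, String>, // fingerprint -> classification
    ai_calls: u32,
}

impl Classifier {
    fn classify(&mut self, fingerprint: &str) -> String {
        if let Some(known) = self.memory.get(fingerprint) {
            return known.clone(); // free: we've seen this shape before
        }
        self.ai_calls += 1; // expensive path
        let label = classify_with_ai(fingerprint);
        self.memory.insert(fingerprint.to_string(), label.clone());
        label
    }
}

// Hypothetical stand-in for the GPT call.
fn classify_with_ai(fingerprint: &str) -> String {
    match fingerprint {
        "22,8080,9184" => "sui-validator".to_string(),
        _ => "unknown".to_string(),
    }
}

fn main() {
    let mut c = Classifier { memory: HashMap::new(), ai_calls: 0 };
    for _ in 0..400 {
        c.classify("22,8080,9184"); // same validator setup, 400 hosts
    }
    assert_eq!(c.ai_calls, 1); // paid once, remembered 399 times
    println!("ok");
}
```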

&lt;p&gt;&lt;strong&gt;In Part 2&lt;/strong&gt;, I'll show you exactly how much money I was burning on redundant AI calls, why "asking the model" isn't the same as "being intelligent," and how a $200 Postgres instance became smarter than my entire AI pipeline.&lt;/p&gt;

&lt;p&gt;Spoiler: Vector databases aren't about replacing AI. They're about remembering what you already figured out.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Building AI-powered security infrastructure for decentralised networks. This is Part 1 of 3 on getting banned from GCP and what it taught me about building smarter systems.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>vectordatabase</category>
      <category>googlecloud</category>
      <category>security</category>
      <category>devops</category>
    </item>
    <item>
      <title>PGDN Sentinel — an OSS security toolkit for Sui validators, inside Discord</title>
      <dc:creator>Simon Morley</dc:creator>
      <pubDate>Wed, 29 Oct 2025 11:40:42 +0000</pubDate>
      <link>https://forem.com/simon_morley/pgdn-sentinel-an-oss-security-toolkit-for-sui-validators-inside-discord-31cg</link>
      <guid>https://forem.com/simon_morley/pgdn-sentinel-an-oss-security-toolkit-for-sui-validators-inside-discord-31cg</guid>
      <description>&lt;p&gt;When I published the &lt;em&gt;State of Sui&lt;/em&gt; report, the biggest surprise wasn't the 39.6 % of voting power exposed — it was actually how little people 'cared' about external hygiene. I thought I was doing a good thing for the network. Alas, Sui really didn't seem that fussed.&lt;/p&gt;

&lt;p&gt;The data was acknowledged, questioned, and dismissed as 'not a bug bounty'. But that was never the aim of the project.&lt;/p&gt;

&lt;p&gt;So I thought it would be cool to try to visualise this data without requiring Yet Another Dashboard Tool To Log In To (YADTTLIT).&lt;/p&gt;

&lt;p&gt;Instead, I built a wee Discord bot that lets validators have a look at their scores, and lets regular users, like me, check how secure a validator is.&lt;/p&gt;

&lt;p&gt;And here it is!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4qb3k8003yupmbudi93.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4qb3k8003yupmbudi93.gif" alt="PGDN Sentinel" width="480" height="1039"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a Discord bot?
&lt;/h2&gt;

&lt;p&gt;Most validator operators already live in Discord.&lt;br&gt;&lt;br&gt;
You're there for epoch coordination, validator channels, and announcements — so security should meet you there too.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PGDN Sentinel&lt;/strong&gt; is a private, agentic security toolkit for Sui validators that runs entirely through Discord DMs.&lt;br&gt;&lt;br&gt;
No dashboards, no credentials, no installs.&lt;br&gt;&lt;br&gt;
Just slash commands. That's what she said.&lt;/p&gt;

&lt;p&gt;I've released the code as an open-source project that you can use, although I haven't worked out how to make the backend data public yet. For now, that lives in a private database.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Keep The Data Private, Simon?!
&lt;/h2&gt;

&lt;p&gt;As with most external analysis, I did uncover a large number of validators with actual issues: CVEs, misconfigurations, and so on. I figured that it probably wouldn't be the best idea to publish these.&lt;/p&gt;

&lt;p&gt;That said, I did create a 'validation' logic that would allow the 'validators' to prove ownership and then get a list of these. And I've been offering some free advice to them too. Because that's how I roll.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is the architecture, I hear you asking?!
&lt;/h2&gt;

&lt;p&gt;I created two repos: the main 'bot', which subscribes to the Discord webhooks, and an API. The API is connected to the db, and I'm running it all in a Kubernetes cluster. In theory the bot can run anywhere, but I've locked the API's ingress down.&lt;/p&gt;

&lt;p&gt;It's all in Python. And Claude gave me a helping hand, as usual.&lt;/p&gt;
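&lt;p&gt;For flavour, here's roughly the kind of reply-building logic the bot runs when someone asks for a validator's score. The field names, score banding, and thresholds below are illustrative, not the actual PGDN schema:&lt;/p&gt;

```python
def score_reply(validator):
    """Build a Discord-friendly summary line from a scan-result dict.
    Fields and banding are illustrative, not the real PGDN schema."""
    name = validator["name"]
    score = validator["score"]  # 0-100, higher is better
    if score >= 80:
        band = "🟢 solid"
    elif score >= 50:
        band = "🟡 needs attention"
    else:
        band = "🔴 exposed"
    issues = validator.get("open_issues", 0)
    return f"{name}: {score}/100 ({band}), {issues} open finding(s)"
```

&lt;p&gt;Keeping this as a pure function makes it trivial to unit-test without a Discord connection; the slash-command handler just DMs whatever it returns.&lt;/p&gt;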

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;In &lt;em&gt;Simulated Attack&lt;/em&gt;, I modelled how an attacker could disable enough validators to cross the 33 % halt threshold. Sentinel exists to close that gap — to make external posture checks routine and effortless.&lt;/p&gt;

&lt;p&gt;You don't need a SOC team to know if your node is exposed.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;➡️ &lt;strong&gt;&lt;a href="https://pgdn.ai/pgdn-sentinel-discord" rel="noopener noreferrer"&gt;Add PGDN Sentinel in Discord&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Works in any server or direct DM.&lt;/p&gt;

&lt;p&gt;The code can be found below. It's MIT licensed, which means you're totally welcome to do what you want with it.&lt;/p&gt;

&lt;p&gt;API: &lt;a href="https://github.com/pgdn-oss/pgdn-api-discord" rel="noopener noreferrer"&gt;https://github.com/pgdn-oss/pgdn-api-discord&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bot: &lt;a href="https://github.com/pgdn-oss/pgdn-discord" rel="noopener noreferrer"&gt;https://github.com/pgdn-oss/pgdn-discord&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Get in touch
&lt;/h2&gt;

&lt;p&gt;I'm a CTO with 20 years' experience who most recently managed to exit a crypto exchange. I would love to connect on Twitter - please do DM and follow me, I still have limited frens on there :) &lt;a href="https://x.com/simonpmorley" rel="noopener noreferrer"&gt;https://x.com/simonpmorley&lt;/a&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>sui</category>
      <category>discord</category>
      <category>web3</category>
    </item>
    <item>
      <title>It's true, the web3 world is as decentralised as your nan's underwear.</title>
      <dc:creator>Simon Morley</dc:creator>
      <pubDate>Wed, 22 Oct 2025 14:06:42 +0000</pubDate>
      <link>https://forem.com/simon_morley/its-true-the-web3-world-is-as-decentralised-as-your-nans-underwear-53cd</link>
      <guid>https://forem.com/simon_morley/its-true-the-web3-world-is-as-decentralised-as-your-nans-underwear-53cd</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/simon_morley" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3564652%2Ff1533649-1cc6-4966-8e4c-14c383cda1ef.png" alt="simon_morley"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/simon_morley/the-tesla-generator-paradox-and-why-web3-is-still-about-as-decentralised-as-your-nans-underwear-4g26" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;The Tesla Generator Paradox And Why Web3 Is Still About as Decentralised as Your Nan’s Underwear&lt;/h2&gt;
      &lt;h3&gt;Simon Morley ・ Oct 22&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#web3&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#decentralization&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#blockchain&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#architecture&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>web3</category>
      <category>decentralization</category>
      <category>blockchain</category>
      <category>architecture</category>
    </item>
    <item>
      <title>The Tesla Generator Paradox And Why Web3 Is Still About as Decentralised as Your Nan’s Underwear</title>
      <dc:creator>Simon Morley</dc:creator>
      <pubDate>Wed, 22 Oct 2025 10:34:48 +0000</pubDate>
      <link>https://forem.com/simon_morley/the-tesla-generator-paradox-and-why-web3-is-still-about-as-decentralised-as-your-nans-underwear-4g26</link>
      <guid>https://forem.com/simon_morley/the-tesla-generator-paradox-and-why-web3-is-still-about-as-decentralised-as-your-nans-underwear-4g26</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;“If your blockchain goes down because AWS goes down, you’re not decentralized.”&lt;br&gt;&lt;br&gt;
— Ben Schiller, &lt;em&gt;CoinDesk&lt;/em&gt;, Oct 2025&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🧭 The Outage That Shook the Backbone
&lt;/h2&gt;

&lt;p&gt;Bla bla bla, we've all heard the news. Maybe your Alexa broke. Whatever. For context:&lt;/p&gt;

&lt;p&gt;On &lt;strong&gt;October 20–21, 2025&lt;/strong&gt;, AWS suffered a &lt;strong&gt;major outage&lt;/strong&gt; centred in the &lt;code&gt;us-east-1&lt;/code&gt; region.&lt;br&gt;&lt;br&gt;
Snapchat, Fortnite, Roblox, and even parts of Amazon itself went dark. Bla bla bla.&lt;/p&gt;

&lt;p&gt;For nearly fifteen hours, the internet’s most “redundant” infrastructure was anything but.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;If AWS sneezes, the internet catches a cold.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(I don't know who said that FYI).&lt;/p&gt;

&lt;p&gt;This time the fallout went deeper — &lt;strong&gt;blockchains, RPC endpoints, and validator APIs&lt;/strong&gt; started failing too.&lt;/p&gt;

&lt;p&gt;The world’s “decentralized” systems suddenly felt very centralized.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔍 The Tesla Generator Paradox
&lt;/h2&gt;

&lt;p&gt;Web3 today is a bit like &lt;strong&gt;running a Tesla off a petrol generator&lt;/strong&gt;. Which apparently people do; someone on Twitter even fact-checked this image, so it must be true.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://x.com/brechtcastel/status/1570425739811459075" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6vkgesbwxo6goxj9uboh.png" alt="Not a tesla fanboy" width="526" height="701"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Everyone is banging on about decentralisation, but peel back the layers and you’ll find the same old fossil infrastructure humming underneath. I've been saying this for years, including during my tenure as CTO at a DeFi startup. So, I must be right.&lt;/p&gt;

&lt;h2&gt;
  
  
  📊 What the Data Says
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Because I haz got the datas, I did look at them datas!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you've been reading my thrilling articles about how Sui is probably going to collapse quite soon, you'll know what's coming. My analysis of 122 Sui validators and public nodes reveals moderate decentralisation on paper, but concentration in practice.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latitude.sh&lt;/strong&gt;: 18.0 %
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OVH&lt;/strong&gt;: 18.0 %
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Constant Company&lt;/strong&gt;: 7.4 %
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Top 3 providers = 43 %&lt;/strong&gt; of all validators
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;US jurisdiction&lt;/strong&gt;: 31 %
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Western Europe&lt;/strong&gt; (UK + DE + FR + NL + IE): 41 %
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS + GCP + Azure combined&lt;/strong&gt;: 10.7 %&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;strong&gt;Herfindahl–Hirschman Index (HHI)&lt;/strong&gt; is 859 — “unconcentrated” by regulatory standards but that’s misleading when dozens of nodes share the same upstream providers, fibre routes, and power grids.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BTW - I learned myself something today - HHI - fancy new term for Simon!&lt;/strong&gt;&lt;/p&gt;
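&lt;p&gt;For anyone else who just learned the term: HHI is simply the sum of squared market-share percentages. A quick sketch (only the top-three shares come from the report; the long tail of smaller providers is what brings the full total to 859):&lt;/p&gt;

```python
def hhi(shares_pct):
    """Herfindahl-Hirschman Index: the sum of squared market shares,
    with shares expressed in percent. 10000 is a pure monopoly;
    regulators treat values under 1500 as 'unconcentrated'."""
    return sum(s * s for s in shares_pct)

# The top three providers from the report already contribute most of
# the concentration; the full provider list sums to 859.
print(round(hhi([18.0, 18.0, 7.4]), 2))  # 702.76
```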

&lt;p&gt;You can see all the data here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/pgdn-oss/pgdn-research/blob/main/reports/2025-10-sui-decentralisation.md" rel="noopener noreferrer"&gt;https://github.com/pgdn-oss/pgdn-research/blob/main/reports/2025-10-sui-decentralisation.md&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Oh, and here's something fun: Sui once told me that their infra runs in this super-private network in Switzerland, but actually they have their stuff in Google Cloud. Sweet.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"On paper it’s diverse. In practice, it’s the same house with many doors."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🧩 Web3 Built on Web2 Foundations
&lt;/h2&gt;

&lt;p&gt;The irony is painful: even networks designed for fault tolerance often rely on a few hyperscalers for uptime. Hyperscalers! (ChatGPT put this word in here; I decided it sounded cool.)&lt;/p&gt;

&lt;p&gt;When &lt;strong&gt;AWS&lt;/strong&gt; stumbles, so do RPC providers like &lt;strong&gt;Infura&lt;/strong&gt;, &lt;strong&gt;Alchemy&lt;/strong&gt;, and &lt;strong&gt;QuickNode&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
When &lt;strong&gt;Cloudflare&lt;/strong&gt; misconfigures, half of Solana RPC endpoints vanish.&lt;br&gt;&lt;br&gt;
When &lt;strong&gt;OVH&lt;/strong&gt; catches fire (literally, in 2021), validators go dark.&lt;/p&gt;

&lt;p&gt;It’s decentralisation &lt;strong&gt;in code&lt;/strong&gt;, centralisation &lt;strong&gt;in practice&lt;/strong&gt;. Fact. I think it's the damn thought leaders again saying words.&lt;/p&gt;

&lt;h2&gt;
  
  
  ⚙️ What Real Infrastructure Decentralisation Requires
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Jurisdictional diversity&lt;/strong&gt; — not just different countries, but different &lt;em&gt;regulators&lt;/em&gt; and &lt;em&gt;risk domains&lt;/em&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bare-metal or sovereign hosting&lt;/strong&gt; — own or colocate hardware. Don’t rent from hyperscalers. Actually, Latitude.sh claims to be bare metal, but the point is we need diversity!&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-provider topology&lt;/strong&gt; — AWS + OVH + local datacenters + community nodes.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transparent infrastructure maps&lt;/strong&gt; — disclose where nodes live and how they’re connected.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resilience testing&lt;/strong&gt; — simulate region loss, BGP leaks, and power faults.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic monitoring&lt;/strong&gt; — track correlated risk across clouds, not just node count. I've put this in because this is what I am building, hint hint.&lt;/li&gt;
&lt;/ol&gt;
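&lt;p&gt;Point 6 is the measurable one: instead of counting nodes, group voting power by shared infrastructure and check how much goes offline together if a single provider fails. A minimal sketch, with an entirely illustrative stake map:&lt;/p&gt;

```python
from collections import defaultdict

# Illustrative data: (validator, hosting provider, share of voting power in %)
validators = [
    ("v1", "ovh", 12.0), ("v2", "ovh", 6.0),
    ("v3", "latitude", 18.0), ("v4", "aws", 4.0), ("v5", "home-dc", 2.0),
]

def correlated_exposure(vals):
    """Voting power that goes offline together if one provider fails."""
    by_provider = defaultdict(float)
    for _, provider, share in vals:
        by_provider[provider] += share
    return dict(by_provider)

# The largest single-provider failure domain, in % of voting power.
worst = max(correlated_exposure(validators).values())  # 18.0
```

&lt;p&gt;Two validators in different countries on the same provider are one failure domain, which is exactly what a raw node count hides.&lt;/p&gt;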

&lt;h2&gt;
  
  
  🧠 The Hard Truth
&lt;/h2&gt;

&lt;p&gt;We call it &lt;em&gt;Web3&lt;/em&gt;, but until our nodes can survive an AWS outage,&lt;br&gt;&lt;br&gt;
it’s still &lt;strong&gt;Web2 in disguise&lt;/strong&gt; — centralised scaffolding painted in decentralised colours.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Decentralisation isn’t about the number of validators.&lt;br&gt;&lt;br&gt;
It’s about how many can survive when the lights go out.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The good news?&lt;/p&gt;

&lt;p&gt;Tools like agentic scanners, peer diversity metrics, and open telemetry are starting to expose these weak points. The next step is acting on them — &lt;strong&gt;before&lt;/strong&gt; the next outage does it for us.&lt;/p&gt;

</description>
      <category>web3</category>
      <category>decentralization</category>
      <category>blockchain</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Simulated Attack: How a 33% consensus risk puts Sui one incident away from a network halt</title>
      <dc:creator>Simon Morley</dc:creator>
      <pubDate>Tue, 21 Oct 2025 10:28:55 +0000</pubDate>
      <link>https://forem.com/simon_morley/simulated-attack-how-a-33-consensus-risk-puts-sui-one-incident-away-from-a-network-halt-1jlf</link>
      <guid>https://forem.com/simon_morley/simulated-attack-how-a-33-consensus-risk-puts-sui-one-incident-away-from-a-network-halt-1jlf</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
In my external posture analysis of Sui validator infrastructure, I found ≈39.6% of voting power was externally vulnerable - above the 33% consensus halt threshold by ~6.6 percentage points (equivalent to &lt;strong&gt;621 voting power&lt;/strong&gt; in our dataset).&lt;/p&gt;

&lt;p&gt;This simulated attack models how an attacker could chain public signals and operational misconfigurations to disable enough validators to cross that threshold.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The result is a resilience warning: the network was, at scan time, within striking distance of a service-impacting halt.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Full details here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/pgdn-oss/sui-network-report-250819/blob/main/simulated_attack.md" rel="noopener noreferrer"&gt;https://github.com/pgdn-oss/sui-network-report-250819/blob/main/simulated_attack.md&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Ethics &amp;amp; scope
&lt;/h2&gt;

&lt;p&gt;This was a non-exploitative simulation using only publicly-observable data. I did &lt;strong&gt;not&lt;/strong&gt; access private systems, exfiltrate data, or run exploits. I have redacted IPs, hostnames, step-by-step exploit primitives and any reproduction commands that would enable misuse. Operators who find themselves in the report and need confidential help: open an issue on the repo or contact me privately via the repo's issue tracker. I follow coordinated disclosure best practices.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the numbers matter
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;33% halt threshold:&lt;/strong&gt; Sui's consensus can be materially impacted if ≥33% of voting power goes offline or is disabled.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observed exposure (~39.6%):&lt;/strong&gt; My scans found roughly 39.6% of voting power had externally-observable vulnerabilities or misconfigurations that an attacker could plausibly target.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delta:&lt;/strong&gt; That is ~6.6 percentage points above the halt threshold — &lt;strong&gt;621 voting power&lt;/strong&gt; in our dataset. In plain terms: the network was within a single coordinated incident of crossing a critical resilience boundary.&lt;/li&gt;
&lt;/ul&gt;
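&lt;p&gt;The arithmetic above is simple, but it's worth making the mapping explicit (the two percentages are from the report's dataset):&lt;/p&gt;

```python
HALT_THRESHOLD_PCT = 33.0     # consensus liveness boundary
observed_exposure_pct = 39.6  # externally vulnerable voting power found by the scan

# How far above the halt threshold the exposed population sits.
margin_pp = round(observed_exposure_pct - HALT_THRESHOLD_PCT, 1)
print(margin_pp)  # 6.6

def halts(offline_pct):
    """An attacker doesn't need all 39.6%: any subset of the exposed
    voting power totalling 33% or more threatens liveness."""
    return offline_pct >= HALT_THRESHOLD_PCT
```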

&lt;p&gt;This isn't an abstract metric — it maps operational exposure to consensus risk. When you combine exposed validator surfaces at scale, you stop abstracting "nodes" and start measuring real systemic fragility.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the simulated attack actually shows
&lt;/h2&gt;

&lt;p&gt;The simulated attack is a modelling exercise — it demonstrates attacker decision-making rather than executing an exploit. Steps (sanitised):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reconnaissance:&lt;/strong&gt; collect public signals (metrics, HTTP banners, management port responses).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enrichment:&lt;/strong&gt; parse metric labels and banners to infer roles and topology (which nodes are validators, leaders, etc.).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prioritisation:&lt;/strong&gt; rank targets by attacker attractiveness — validators with exposed metrics + reachable management surfaces are high-value.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confirmatory enumeration:&lt;/strong&gt; light, non-destructive probes to validate co-residency and service fingerprints.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attack-path modelling:&lt;/strong&gt; chain the signals into a plausible escalation path that, if realized, could disable selected validators (e.g., by misconfigurations, exposed management APIs, or operational errors), potentially pushing cumulative offline voting power above the halt threshold.&lt;/li&gt;
&lt;/ol&gt;
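&lt;p&gt;The prioritisation step above can be sketched as a simple scoring pass over the recon results. The field names and weights here are illustrative, invented for the sketch — the report's actual ranking model is not published:&lt;/p&gt;

```python
def attractiveness(node):
    """Illustrative attacker-attractiveness score for a recon result."""
    score = 0
    if node.get("is_validator"):
        score += node.get("voting_power", 0)  # disabling validators moves the 33% needle
    if node.get("metrics_public"):
        score += 10   # role/topology leakage aids targeting
    if node.get("mgmt_api_reachable"):
        score += 25   # plausible path to actually disabling the node
    return score

nodes = [
    {"name": "v1", "is_validator": True, "voting_power": 40, "mgmt_api_reachable": True},
    {"name": "v2", "is_validator": True, "voting_power": 90, "metrics_public": True},
    {"name": "rpc1", "metrics_public": True},
]
ranked = sorted(nodes, key=attractiveness, reverse=True)  # v2, v1, rpc1
```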

&lt;p&gt;Key point: the simulation ties &lt;em&gt;what is observable from the outside&lt;/em&gt; to &lt;em&gt;what an attacker would prioritise&lt;/em&gt;. It’s the mapping from telemetry -&amp;gt; decisions -&amp;gt; systemic outcome.&lt;/p&gt;

&lt;h2&gt;
  
  
  Concrete quantitative findings (sanitised)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Total endpoints analysed:&lt;/strong&gt; ~122 Sui-related endpoints.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Voting-power exposure observed:&lt;/strong&gt; ≈39.6% of total voting power showed externally-observable vulnerabilities by our conservative scanner and confidence policy.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consensus threshold context:&lt;/strong&gt; The 33% threshold is a critical operational boundary for consensus liveness; the observed exposure exceeded this by ~6.6 percentage points (621 voting power in dataset terms).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Common signal types driving exposure:&lt;/strong&gt; public metrics exposing role labels, management APIs reachable on common ports (container management fingerprints appeared repeatedly), and default HTTP/admin pages leaking product/type info.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(Exact tables, heatmaps, and per-validator rows are in the full report; I have redacted host-level identifiers from this article.)&lt;/p&gt;

&lt;h2&gt;
  
  
  What this means for Sui (and similar networks)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resilience is operational, not just cryptographic.&lt;/strong&gt; Excellent protocol design doesn't prevent nodes from being misconfigured or deployed insecurely. If enough validators share similar deployment mistakes, the protocol's liveness assumptions are at risk.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decentralisation ≠ diversity of security posture.&lt;/strong&gt; A network of validators operated similarly, with shared misconfigurations, concentrates systemic risk.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The practical impact of a coordinated incident:&lt;/strong&gt; crossing the 33% threshold could cause temporary halts, delays in finality, staking reward disruption, and a loss of confidence among users and delegators. Even short outages can have outsized reputational costs for a young ecosystem.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What operators and the ecosystem should do now
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;For validators (immediate):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Inventory externally-exposed services&lt;/strong&gt; for your validator and associated infra. If you can't list them, you're blind.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Close management APIs to the public&lt;/strong&gt; (bind to localhost or private networks; require VPN/mTLS jump hosts).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protect metrics&lt;/strong&gt; — use private scraping or authenticated gateways; remove internal hostnames and role labels from public metrics.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Silence banners &amp;amp; versions&lt;/strong&gt; that leak product/version info.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run external posture checks&lt;/strong&gt; against your own endpoints and triage findings immediately.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;For the Sui ecosystem (coordination &amp;amp; incentives):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Require external-risk audits&lt;/strong&gt; as part of validator onboarding. Make passing an external posture check a first-class requirement.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incentivise ops maturity&lt;/strong&gt; — link staking, eligibility, or onboarding checks to evidence of secure deployment.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Support operator tooling&lt;/strong&gt; — provide vetted scanner tooling and an official remediation playbook.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Share anonymised telemetry&lt;/strong&gt; so the community can track progress and systemic risk without exposing individual operators.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Limitations &amp;amp; responsible framing
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The 39.6% figure is based on conservative heuristics and an externally-observable posture scan — operator verification can reduce false positives. Some "exposed" signals are port-only observations or default pages that do not necessarily imply 'compromiseability'.
&lt;/li&gt;
&lt;li&gt;This is not a claim that the network was attacked, only that the modeled conditions could — with additional operational error or a coordinated attack — cross the consensus threshold.
&lt;/li&gt;
&lt;li&gt;My goal is operational improvement: to turn a surprising statistic into urgent, practical action.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Reproducibility &amp;amp; where to find the data
&lt;/h2&gt;

&lt;p&gt;Full dataset, scripts, heatmaps and appendices are in the report: &lt;a href="https://github.com/pgdn-oss/sui-network-report-250819" rel="noopener noreferrer"&gt;https://github.com/pgdn-oss/sui-network-report-250819&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you operate validators or infrastructure that appear in the report and want private assistance, please open an issue on the repo and I will respond via coordinated disclosure.&lt;/p&gt;

&lt;p&gt;And there's a cool Discord bot called PGDN Sentinel that you can use too. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://pgdn.ai/pgdn-sentinel-discord" rel="noopener noreferrer"&gt;https://pgdn.ai/pgdn-sentinel-discord&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Final word
&lt;/h2&gt;

&lt;p&gt;This isn't an alarmist headline. It's a measured warning based on data: if multiple operators expose similar surfaces, consensus-level fragility is not hypothetical — it's quantifiable and fixable. The immediate wins (close management APIs, protect metrics, automate posture checks) dramatically reduce the chance of a coordinated incident.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(I’m working on something new here — automating external risk discovery at scale. I’ll share details soon.)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>web3</category>
      <category>blockchain</category>
      <category>security</category>
    </item>
    <item>
      <title>The state of Sui: What external-facing risk looks like (and why top engineers miss it)</title>
      <dc:creator>Simon Morley</dc:creator>
      <pubDate>Mon, 20 Oct 2025 09:02:07 +0000</pubDate>
      <link>https://forem.com/simon_morley/the-state-of-sui-what-external-facing-risk-looks-like-and-why-top-engineers-miss-it-4m0k</link>
      <guid>https://forem.com/simon_morley/the-state-of-sui-what-external-facing-risk-looks-like-and-why-top-engineers-miss-it-4m0k</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;br&gt;
I analysed the externally observable posture of 122 Sui network endpoints. What I found isn't about whether the Sui team builds great software; it's about how even 'good' engineers can miss &lt;em&gt;external&lt;/em&gt; operational risk: exposed services, misconfigured infrastructure, and public metrics that leak sensitive operational data. This piece summarises my main findings, why they matter, and the practical steps operators can take today.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I did this
&lt;/h2&gt;

&lt;p&gt;I wanted to show, with data, how external attack surface and operational misconfigurations can defeat even excellent engineering. The Sui protocol has strong engineering — my goal is educational: to help teams measure and close external exposure before an attacker finds it.&lt;/p&gt;

&lt;p&gt;The data was shared with the Sui security team in August 2025. &lt;/p&gt;

&lt;h2&gt;
  
  
  What I scanned and how
&lt;/h2&gt;

&lt;p&gt;Briefly (full methodology in the linked report):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I measured 122 Sui-related endpoints for externally reachable services (HTTP, RPC, Docker API, metrics endpoints, etc.).&lt;/li&gt;
&lt;li&gt;My approach focused on &lt;strong&gt;externally observable posture&lt;/strong&gt; — what an internet attacker can see and reach — not on private code or internal access.&lt;/li&gt;
&lt;li&gt;I applied conservative confidence thresholds for version/CVE mapping and logged only reproducible findings.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See the full methodology and raw data in my published findings. (link: &lt;a href="https://github.com/pgdn-oss/sui-network-report-250819" rel="noopener noreferrer"&gt;https://github.com/pgdn-oss/sui-network-report-250819&lt;/a&gt;)&lt;/p&gt;

&lt;h2&gt;
  
  
  Topline findings
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;non-trivial percent&lt;/strong&gt; of observed endpoints exposed services that should never be public (for example, metrics endpoints reachable from the internet, and port 2375 — Docker remote API — observed in a surprising number of hosts). SSH all over the shop.&lt;/li&gt;
&lt;li&gt;Many public websites were default vendor landing pages or misconfigured web servers (these can leak service versions and admin consoles).&lt;/li&gt;
&lt;li&gt;Only a small fraction had WAFs present when an HTTP endpoint existed.&lt;/li&gt;
&lt;li&gt;Several hosts returned service banners or version strings that mapped to known CVEs (I used a conservative confidence policy; the “CVE-affected” label is an upper bound pending operator verification).&lt;/li&gt;
&lt;li&gt;The distribution of problems is not uniform — some operators were well locked down, others left obvious signals that an external attacker could use.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(Full counts, tables and heatmaps are available in the full report.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;External visibility is an attacker’s map.&lt;/strong&gt; Public metrics, misconfigured HTTP endpoints and exposed management APIs are high-value reconnaissance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated attacks scale.&lt;/strong&gt; An exposed metrics endpoint or Docker API is trivial for automated tooling to find and target at scale.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engineers think inside-out.&lt;/strong&gt; Teams often focus on consensus and cryptography (rightly), and under-invest in hardening the network/ops layer that faces the internet.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Concrete examples (anonymised)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Metrics endpoints reachable on the public internet that expose internal state and operational metrics.&lt;/li&gt;
&lt;li&gt;Docker remote API (2375/tcp) responding with service banners — a trivial path to container escape or remote code execution in the wrong hands.&lt;/li&gt;
&lt;li&gt;Default web server landing pages that leak version information or provide admin paths.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(Again — see the report for technical reproduction notes and timeline.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Remediation checklist (for operators &amp;amp; you)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Inventory your externally reachable endpoints.&lt;/strong&gt; If you can’t list them, you can’t secure them. Use internal scans + trusted external scans.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Close management interfaces to the public.&lt;/strong&gt; Docker APIs, admin consoles, metrics scrape endpoints — bind them to localhost / private networks only.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Require auth and network controls.&lt;/strong&gt; Where management APIs must be reachable externally, place them behind a mutual-TLS gateway, VPN, or tightly-scoped firewall rules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Harden metrics endpoints.&lt;/strong&gt; Don’t expose Prometheus or similar scrapers to the public internet. Use an internal scraper or secure gateway.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remove verbose banners &amp;amp; version strings.&lt;/strong&gt; Configure servers to not reveal build/versioning in HTTP headers or service banners.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor for drift.&lt;/strong&gt; Re-run external posture scans regularly and detect when previously-closed ports reappear.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Patch management.&lt;/strong&gt; Track service versions and patch known CVEs promptly — but assume some versions may still be exposed until verified.&lt;/li&gt;
&lt;/ol&gt;
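&lt;p&gt;For point 5, you can spot-check your own HTTP endpoints for version leakage with the standard library alone. The header list is a sample of common offenders, not an exhaustive set:&lt;/p&gt;

```python
import urllib.request

# Headers that commonly leak product/version information.
LEAKY = ("server", "x-powered-by", "x-aspnet-version")

def filter_leaky(headers):
    """Keep only the headers that commonly leak version info."""
    return {k: v for k, v in headers.items() if k.lower() in LEAKY}

def leaky_headers(url, timeout=5):
    """HEAD-request a URL you own and report version-leaking headers."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return filter_leaky(dict(resp.headers))
```

&lt;p&gt;If this ever returns something like &lt;code&gt;Server: nginx/1.18.0&lt;/code&gt;, you've just given an attacker a version string to map against known CVEs.&lt;/p&gt;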

&lt;h2&gt;
  
  
  Limitations &amp;amp; ethics
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;My scans are non-invasive and focused on public-facing services. I do not exploit vulnerabilities, nor do I publish private data.&lt;/li&gt;
&lt;li&gt;Some “port-only” observations require operator verification (e.g., distinguishing a ghost port from a genuine service).&lt;/li&gt;
&lt;li&gt;The CVE mappings are conservative upper-bound estimates that need operator confirmation for actionable triage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See the full methodology, opsec and reproducibility appendix for the exact scanner commands and the policy I used for CVE confidence. (link: &lt;a href="https://github.com/pgdn-oss/pgdn-cve" rel="noopener noreferrer"&gt;https://github.com/pgdn-oss/pgdn-cve&lt;/a&gt;)&lt;/p&gt;

&lt;h2&gt;
  
  
  What I recommend to protocol teams and operators
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Fund or mandate periodic external posture reviews as part of release processes.&lt;/li&gt;
&lt;li&gt;Automate external smoke tests that confirm management APIs and metrics are not exposed.&lt;/li&gt;
&lt;li&gt;Make “no-management-exposed” a documented runbook for deployment.&lt;/li&gt;
&lt;li&gt;Share anonymised exposure telemetry so the community can learn and raise the bar.&lt;/li&gt;
&lt;/ul&gt;
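
&lt;p&gt;The smoke-test bullet above can be sketched in a few lines. This is a minimal illustration, not the tooling from the report; the host name and port list are placeholders you would swap for your own:&lt;/p&gt;

```python
import socket

# Hypothetical smoke test: verify that management/metrics ports are NOT
# reachable from outside. The port list is an illustrative placeholder.
MANAGEMENT_PORTS = [2375, 2376, 9184]

def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def exposed_ports(host, ports=MANAGEMENT_PORTS):
    """List management ports that answer from the scanning vantage point."""
    return [p for p in ports if is_reachable(host, p)]

# In CI, fail the deploy when the list is non-empty, e.g.:
# assert exposed_ports("validator.example.com") == []
```

&lt;p&gt;Run from an external vantage point on a schedule, this doubles as the drift monitor from the list above.&lt;/p&gt;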

&lt;h2&gt;
  
  
  Closing — why I published this
&lt;/h2&gt;

&lt;p&gt;This is about shared risk and learning. Great protocol engineering doesn’t immunise an operator against mistakes in deployment and ops. My hope: this write-up becomes a practical resource for teams and operators to make the mesh of Sui (and similar networks) safer for everyone.&lt;/p&gt;

&lt;p&gt;Full report (data, scripts, and appendices): &lt;a href="https://github.com/pgdn-oss/sui-network-report-250819" rel="noopener noreferrer"&gt;https://github.com/pgdn-oss/sui-network-report-250819&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’ve been building something new that takes this kind of analysis much further — automating external risk discovery at scale. More on that soon.&lt;/p&gt;

&lt;p&gt;Thanks for reading. Simon.&lt;/p&gt;

</description>
      <category>security</category>
      <category>blockchain</category>
      <category>sui</category>
      <category>cybersecurity</category>
    </item>
    <item>
      <title>Teaching Security Scanners to Remember - Using Vector Embeddings to Stop Chasing Ghost Ports</title>
      <dc:creator>Simon Morley</dc:creator>
      <pubDate>Tue, 14 Oct 2025 13:19:09 +0000</pubDate>
      <link>https://forem.com/simon_morley/teaching-security-scanners-to-remember-using-vector-embeddings-to-stop-chasing-ghost-ports-of7</link>
      <guid>https://forem.com/simon_morley/teaching-security-scanners-to-remember-using-vector-embeddings-to-stop-chasing-ghost-ports-of7</guid>
      <description>&lt;p&gt;I've scanned the same 118 blockchain validator nodes probably 200 times over the past year. And for most of that time, my scanner was an idiot with amnesia - treating scan #200 exactly like scan #1, learning nothing.&lt;/p&gt;

&lt;p&gt;Every single time, ports 2375 and 2376 showed up as "open." Every single time, my tools dutifully tested them for Docker APIs. Every single time, they found nothing. Ten seconds wasted per scan, multiplied by hundreds of scans, just... gone.&lt;/p&gt;

&lt;p&gt;Then I had a thought: What if my scanner could remember?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Ghost Port Problem
&lt;/h2&gt;

&lt;p&gt;Here's what kept happening across all 118+ nodes, spanning multiple cloud providers and geographies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ports 2375/2376 (standard Docker API ports) responded to TCP handshakes&lt;/li&gt;
&lt;li&gt;But curl hung. Netcat got EOF immediately. No banner, no service, nothing&lt;/li&gt;
&lt;li&gt;Identical TCP fingerprints every time: TTL≈63, window=65408&lt;/li&gt;
&lt;li&gt;These were otherwise hardened validator nodes with strict firewalls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traditional security scanners reported these as "open/tcpwrapped" or "unknown service." Which meant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repeated Docker API testing (10+ seconds per port)&lt;/li&gt;
&lt;li&gt;Manual investigation on every scan&lt;/li&gt;
&lt;li&gt;False positives in my reports&lt;/li&gt;
&lt;li&gt;Wasted scanning budget when cloud providers flagged excessive probes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After the 50th identical scan, I was done. There had to be a better way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Vector Embeddings: Not Just for Chatbots
&lt;/h2&gt;

&lt;p&gt;Vector embeddings are typically associated with NLP and RAG systems — turning text into high-dimensional vectors where semantically similar things cluster together. But the core concept is universal: represent complex data as points in space, then query "what's similar to this?"&lt;/p&gt;

&lt;p&gt;What if each network scan became a vector representing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Port combinations and states&lt;/li&gt;
&lt;li&gt;TCP-level behaviors (TTL, window size, response timing)&lt;/li&gt;
&lt;li&gt;Application-layer responses&lt;/li&gt;
&lt;li&gt;Infrastructure context (hosting provider, network profile)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then instead of treating every scan independently, I could query: &lt;strong&gt;"What have I learned from similar infrastructure before?"&lt;/strong&gt;&lt;/p&gt;
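
&lt;p&gt;As a rough illustration (the field names are mine, not the actual schema), flattening those four feature groups into embedding-ready text might look like:&lt;/p&gt;

```python
# Hypothetical sketch of turning one scan into embedding-ready text.
# Field names are illustrative, not the scanner's real schema.
def scan_to_text(scan):
    """Flatten a scan record into a stable, structured string that an
    off-the-shelf text embedding model can encode."""
    lines = [
        "ports: " + ",".join(str(p) for p in sorted(scan["ports"])),
        "tcp: ttl={ttl} window={window} rtt_ms={rtt_ms}".format(**scan["tcp"]),
        "app: " + (scan.get("banner") or "no banner"),
        "infra: asn={asn} provider={provider}".format(**scan["infra"]),
    ]
    return "\n".join(lines)

example = {
    "ports": [22, 80, 2375, 2376, 9000, 9184],
    "tcp": {"ttl": 63, "window": 65408, "rtt_ms": 12},
    "banner": None,
    "infra": {"asn": "AS20473", "provider": "Vultr"},
}
text = scan_to_text(example)
```

&lt;p&gt;Keeping the field order and formatting stable matters: the more deterministic the text, the more meaningful the distances between embeddings.&lt;/p&gt;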

&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;I built a three-part system:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Alan (AI Planner)&lt;/strong&gt;: LLM-based decision engine that receives scan context and historical patterns, then generates optimized probe sequences&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stan (Executor)&lt;/strong&gt;: Runs the actual scanning commands (nmap, masscan, protocol probes) and captures behavioral metadata&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vince (Vector Memory)&lt;/strong&gt;: PostgreSQL with pgvector extension storing 1536-dimensional embeddings with cosine similarity search&lt;/p&gt;

&lt;p&gt;The flow looks like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Stan discovers open ports → [22, 80, 2375, 2376, 9000, 9184]&lt;/li&gt;
&lt;li&gt;Vector memory finds similar historical scans&lt;/li&gt;
&lt;li&gt;Alan gets enriched context with patterns&lt;/li&gt;
&lt;li&gt;Alan generates optimized probe plan based on what worked before&lt;/li&gt;
&lt;li&gt;Results stored with behavioral fingerprint&lt;/li&gt;
&lt;li&gt;Embedding generated and indexed for future queries&lt;/li&gt;
&lt;/ol&gt;
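
&lt;p&gt;A toy version of that six-step loop, with all three components stubbed out (the function names mirror the Alan/Stan/Vince split, but the interfaces are invented for illustration):&lt;/p&gt;

```python
# Stubbed end-to-end pass over the six-step flow above.
def stan_scan(host):
    # 1. Executor discovers open ports (hard-coded for the sketch).
    return {"host": host, "ports": [22, 80, 2375, 2376, 9000, 9184]}

def vince_similar(scan, memory):
    # 2. Vector memory returns historical scans with matching behavior
    #    (real version: cosine similarity over embeddings, not equality).
    return [m for m in memory if m["ports"] == scan["ports"]]

def alan_plan(scan, history):
    # 3-4. Planner drops probes that history marks as ghost ports.
    ghosts = {p for h in history for p in h.get("ghost_ports", [])}
    return [p for p in scan["ports"] if p not in ghosts]

memory = [{"ports": [22, 80, 2375, 2376, 9000, 9184],
           "ghost_ports": [2375, 2376]}]
scan = stan_scan("203.0.113.7")
plan = alan_plan(scan, vince_similar(scan, memory))
# 5-6. Results plus the behavioral fingerprint would then be embedded
#      and stored for the next query.
```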

&lt;h2&gt;
  
  
  Setting Up pgvector
&lt;/h2&gt;

&lt;p&gt;I chose pgvector because it's PostgreSQL-native, mature, and way more cost-effective than managed vector databases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;validator_scans&lt;/span&gt; 
  &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;validator_scans_embedding_idx&lt;/span&gt; 
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;validator_scans&lt;/span&gt; 
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;ivfflat&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector_cosine_ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarity queries are simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ports&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
       &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;validator_scans&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'90 days'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For embeddings, I use OpenAI's text-embedding-ada-002 (1536 dimensions) because it's dirt cheap ($0.0001 per 1K tokens) and handles structured text well.&lt;/p&gt;
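
&lt;p&gt;For intuition, the 1 - (embedding &amp;lt;=&amp;gt; query_embedding) expression in the SQL above works because pgvector's cosine-distance operator returns one minus cosine similarity. A pure-Python check of that identity:&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def cosine_distance(a, b):
    # pgvector's vector_cosine_ops metric: one minus cosine similarity
    return 1.0 - cosine_similarity(a, b)

a, b = [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]
similarity = 1.0 - cosine_distance(a, b)   # mirrors the SQL expression
```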

&lt;h2&gt;
  
  
  Beyond Simple Signatures
&lt;/h2&gt;

&lt;p&gt;Traditional fingerprinting is rule-based:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;IF port == 2375 AND banner contains "Docker" 
  THEN service = Docker API
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Vector-based learning captures behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Port 2375: SYN-ACK succeeds, TTL=63, window=65408, 
 no banner, immediate FIN on data send, 
 appears alongside ports 9000+9184 (Sui consensus/metrics),
 ASN indicates Vultr hosting"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarity search returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"47 similar scans: 46 showed identical 'ghost port' behavior,
 1 had actual Docker (flagged as anomaly)"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Conclusion: 98% confidence this is NOT Docker, likely cloud infrastructure artifact&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Watching It Learn
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scans 1-20 (Initial learning)&lt;/strong&gt;: System tests Docker APIs as expected, stores behavioral metadata showing timeouts and connection refusals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scans 21-50 (Pattern recognition)&lt;/strong&gt;: Vector similarity search starts clustering:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query: Scan with ports [22, 80, 2375, 2376, 9000, 9184]

Top matches:
- Scan #14: 96% similarity → 2375/2376 ghost ports
- Scan #8:  94% similarity → 2375/2376 ghost ports  
- Scan #19: 93% similarity → 2375/2376 ghost ports

Pattern confidence: 0.85 (17/20 matching scans)
Recommendation: Skip Docker testing on 2375/2376
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
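
&lt;p&gt;The confidence arithmetic in that output is easy to reproduce; the 0.80 skip threshold below is illustrative, not a tuned value:&lt;/p&gt;

```python
import operator

# 17 of the 20 sufficiently-similar scans showed ghost-port behavior.
matching = 20
ghost = 17
confidence = ghost / matching          # 0.85

SKIP_FLOOR = 0.80                      # illustrative threshold
recommendation = (
    "Skip Docker testing on 2375/2376"
    if operator.ge(confidence, SKIP_FLOOR)
    else "Probe Docker APIs"
)
```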



&lt;p&gt;&lt;strong&gt;Scans 51+ (Optimized)&lt;/strong&gt;: High confidence behavioral signatures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"similar_scan_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;47&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.96&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"2375_behavior"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ghost_port - skip Docker probes"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"estimated_time_saved"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"45s per scan"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Results
&lt;/h2&gt;

&lt;p&gt;After 200 scans of the same infrastructure:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Time efficiency&lt;/strong&gt;: 58 seconds per scan → 20 seconds per scan (66% reduction)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Probe efficiency&lt;/strong&gt;: 7.2 probes per host → 3.8 probes per host (47% less network traffic)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;False positives&lt;/strong&gt;: 2.4 per scan → 0.3 per scan (87% reduction)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern recognition speed&lt;/strong&gt;: Confident patterns (&amp;gt;0.85 similarity) after just 18-25 similar scans&lt;/p&gt;

&lt;p&gt;But here's the coolest part: &lt;strong&gt;anomaly detection&lt;/strong&gt;. On scan #73, port 2375 actually responded with a Docker API. The system immediately flagged it: "Unusual behavior — historical data shows 0.02% Docker response rate." Turned out to be a misconfigured node that needed immediate attention.&lt;/p&gt;
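
&lt;p&gt;A sketch of that anomaly flag (the 1% rarity threshold and the counts are invented for illustration): a behavior is anomalous when its historical frequency on similar infrastructure is rare.&lt;/p&gt;

```python
import operator

RARITY_THRESHOLD = 0.01   # flag anything seen in under 1% of similar scans

def is_anomaly(behavior, history_counts, total_scans):
    """True when this behavior is historically rare for this cluster."""
    rate = history_counts.get(behavior, 0) / total_scans
    return operator.lt(rate, RARITY_THRESHOLD)

# Scan #73: port 2375 suddenly speaks Docker, which history says is rare.
counts = {"ghost_port": 199, "docker_api_response": 1}
flagged = is_anomaly("docker_api_response", counts, total_scans=200)
```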

&lt;h2&gt;
  
  
  Practical Considerations
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Similarity thresholds matter&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Homogeneous infrastructure (like validators): 0.75-0.85&lt;/li&gt;
&lt;li&gt;Mixed environments: 0.65-0.75&lt;/li&gt;
&lt;li&gt;Pentesting diverse targets: 0.60-0.70&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cold start problem&lt;/strong&gt;: First 10-20 scans of new infrastructure provide no optimization. Mitigation: seed database with known patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Temporal drift&lt;/strong&gt;: Infrastructure changes over time. I time-weight similarity to prefer recent scans.&lt;/p&gt;
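
&lt;p&gt;One simple way to implement that time weighting is exponential decay; the 30-day half-life below is illustrative, not the exact production formula:&lt;/p&gt;

```python
import math

HALF_LIFE_DAYS = 30.0

def time_weighted(similarity, age_days):
    """Decay raw cosine similarity so older scans count for less."""
    decay = math.exp(-age_days * math.log(2) / HALF_LIFE_DAYS)
    return similarity * decay

fresh = time_weighted(0.90, 0)      # a recent scan keeps full weight
stale = time_weighted(0.90, 30)     # one half-life halves the weight
```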

&lt;p&gt;&lt;strong&gt;Embedding overhead&lt;/strong&gt;: Adds 50-100ms per scan. I generate embeddings asynchronously in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;Traditional security scanners treat every scan as a fresh start. They're like someone with no short-term memory, asking the same questions over and over. This made sense 20 years ago when each network was unique.&lt;/p&gt;

&lt;p&gt;But modern security teams scan thousands of similar nodes repeatedly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Development environments that clone production&lt;/li&gt;
&lt;li&gt;Auto-scaling cloud infrastructure&lt;/li&gt;
&lt;li&gt;Container clusters with identical configurations&lt;/li&gt;
&lt;li&gt;Blockchain validator networks (my use case)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Vector-based behavioral fingerprinting lets scanners accumulate institutional knowledge that compounds over time. They get smarter with every scan, building confidence about what's normal and what's anomalous.&lt;/p&gt;

&lt;p&gt;As cloud infrastructure grows more complex — with synthetic network responses, polymorphic services, and dynamic topologies — we need security tools that learn. Not just from signature databases, but from their own experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;I'm exploring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-modal embeddings combining text with numeric TCP fingerprints&lt;/li&gt;
&lt;li&gt;Transfer learning: do patterns from Sui validators apply to Ethereum nodes?&lt;/li&gt;
&lt;li&gt;Hierarchical clustering to automatically build infrastructure taxonomies&lt;/li&gt;
&lt;li&gt;Tracking temporal pattern evolution to detect infrastructure migrations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core insight stands: &lt;strong&gt;every scan is a training example&lt;/strong&gt;. Stop forgetting. Start remembering.&lt;/p&gt;




&lt;p&gt;I’m publishing the open source code here: &lt;a href="https://github.com/pgdn-oss" rel="noopener noreferrer"&gt;github.com/pgdn-oss&lt;/a&gt;. Built with PostgreSQL, pgvector, and OpenAI embeddings. Part of a new venture, coming soon.&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>vectordatabase</category>
      <category>security</category>
      <category>openai</category>
    </item>
  </channel>
</rss>
