<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: patrickbloem-it</title>
    <description>The latest articles on Forem by patrickbloem-it (@patrickbloemit).</description>
    <link>https://forem.com/patrickbloemit</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3685539%2Ff3413164-fc90-4001-ac4e-cae2491dae49.png</url>
      <title>Forem: patrickbloem-it</title>
      <link>https://forem.com/patrickbloemit</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/patrickbloemit"/>
    <language>en</language>
    <item>
      <title>Goodbye Fail2Ban: Hardening Netbird &amp; Caddy with CrowdSec</title>
      <dc:creator>patrickbloem-it</dc:creator>
      <pubDate>Wed, 31 Dec 2025 07:15:07 +0000</pubDate>
      <link>https://forem.com/patrickbloemit/goodbye-fail2ban-hardening-netbird-caddy-with-crowdsec-29g6</link>
      <guid>https://forem.com/patrickbloemit/goodbye-fail2ban-hardening-netbird-caddy-with-crowdsec-29g6</guid>
      <description>&lt;h1&gt;
  
  
  Goodbye Fail2Ban: Hardening Netbird &amp;amp; Caddy with CrowdSec
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Published:&lt;/strong&gt; December 31, 2025 | &lt;strong&gt;Reading Time:&lt;/strong&gt; 12 min&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;We migrated our Netbird VPN Management Server from &lt;strong&gt;Fail2Ban&lt;/strong&gt; to &lt;strong&gt;CrowdSec&lt;/strong&gt;, reducing SSH/HTTP attack noise by &lt;strong&gt;99%&lt;/strong&gt; and shifting from reactive (ban after 5 failed attempts) to preventive (block IPs from community threat intelligence &lt;em&gt;before&lt;/em&gt; they touch our server). This post dives into &lt;em&gt;why&lt;/em&gt; we made the leap and how you can too—with step-by-step code.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Fail2Ban in 2025
&lt;/h2&gt;

&lt;p&gt;For a decade, &lt;strong&gt;Fail2Ban&lt;/strong&gt; was the gold standard for simple server hardening. You set up a few regex rules, pointed it at &lt;code&gt;/var/log/auth.log&lt;/code&gt;, and called it a day. But here's the thing: &lt;strong&gt;Fail2Ban is architecturally reactive.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Fail2Ban Falls Short
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. &lt;strong&gt;Reactivity is a Liability&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Fail2Ban works like a smoke detector that only triggers &lt;em&gt;after&lt;/em&gt; the fire has already spread. An attacker needs to hit your SSH port &lt;strong&gt;5+ times&lt;/strong&gt; before the rule kicks in. In a world of distributed botnets with 10,000+ IP addresses, that's 50,000 free attempts to probe your system before you even block a single one.&lt;/p&gt;
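
&lt;p&gt;The arithmetic behind that figure can be sketched in a few lines of shell. The function name &lt;code&gt;free_attempts&lt;/code&gt; is hypothetical, and the numbers are simply the ones quoted above (10,000 botnet IPs, 5 tolerated failures per IP).&lt;/p&gt;

```shell
# Illustrative numbers from the paragraph above, not measurements.
botnet_ips=10000
maxretry=5   # failed attempts tolerated per IP before the ban triggers

# free_attempts IPS RETRIES: attempts a botnet gets before every IP is banned.
free_attempts() {
  echo $(( $1 * $2 ))
}

free_attempts "$botnet_ips" "$maxretry"
```

&lt;p&gt;With these inputs the script prints &lt;code&gt;50000&lt;/code&gt;: every additional IP in the botnet buys the attacker another five unblocked attempts.&lt;/p&gt;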

&lt;p&gt;Our logs showed the same pattern: every night, 500+ bogus SSH handshakes from different IPs, each one landing in &lt;code&gt;auth.log&lt;/code&gt; and consuming CPU cycles for regex matching. The attacker's goal isn't to brute-force your password (they know that's futile)—it's to &lt;strong&gt;map your infrastructure, test for open ports, and document your responses for later weaponization.&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  2. &lt;strong&gt;The Silo Problem: You're Alone&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Fail2Ban is completely blind to the outside world. It works in isolation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world scenario:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An IP (let's say &lt;code&gt;203.0.113.42&lt;/code&gt;) is aggressively scanning 500 servers across Europe simultaneously.&lt;/li&gt;
&lt;li&gt;With Fail2Ban, &lt;em&gt;your&lt;/em&gt; server doesn't know about the activity on &lt;em&gt;their&lt;/em&gt; servers.&lt;/li&gt;
&lt;li&gt;You wait passively until &lt;code&gt;203.0.113.42&lt;/code&gt; hits your SSH port 5 times.&lt;/li&gt;
&lt;li&gt;In the meantime, it's already fingerprinted 499 other servers and exfiltrated data from at least 100 of them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;With CrowdSec + CAPI (Community API):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The same IP probes a server in France (CrowdSec instance #1).&lt;/li&gt;
&lt;li&gt;It scans a server in Germany (CrowdSec instance #2).&lt;/li&gt;
&lt;li&gt;It touches your server in the Netherlands (instance #3).&lt;/li&gt;
&lt;li&gt;Within &lt;strong&gt;seconds&lt;/strong&gt;, the community reaches consensus: this IP is malicious.&lt;/li&gt;
&lt;li&gt;All 3 servers (+ 8,000+ others running CrowdSec) block it preventively.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You're no longer fighting alone. You're part of a &lt;strong&gt;"Waze for Cyber-Security"&lt;/strong&gt; where threat signals are shared globally.&lt;/p&gt;

&lt;h4&gt;
  
  
  3. &lt;strong&gt;Regex Hell in the Age of JSON&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Modern web servers like &lt;strong&gt;Caddy&lt;/strong&gt; output structured JSON logs, not plain text. Fail2Ban's strength—regex-based parsing—becomes a liability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A realistic Fail2Ban filter for Caddy:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Definition]&lt;/span&gt;
&lt;span class="py"&gt;failregex&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;^(?P&amp;lt;host&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s"&gt;+) - (?P&amp;lt;user&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s"&gt;+) &lt;/span&gt;&lt;span class="se"&gt;\[&lt;/span&gt;&lt;span class="s"&gt;(?P&amp;lt;time&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="s"&gt;{2}/&lt;/span&gt;&lt;span class="se"&gt;\w&lt;/span&gt;&lt;span class="s"&gt;+/&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="s"&gt;{4}:&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="s"&gt;{2}:&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="s"&gt;{2}:&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="s"&gt;{2}) (?P&amp;lt;tz&amp;gt;[&lt;/span&gt;&lt;span class="se"&gt;\+\-&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="s"&gt;{4})&lt;/span&gt;&lt;span class="se"&gt;\]&lt;/span&gt; &lt;span class="s"&gt;"(?P&amp;lt;method&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s"&gt;+) (?P&amp;lt;uri&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s"&gt;+) (?P&amp;lt;proto&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s"&gt;+)"&lt;/span&gt; &lt;span class="s"&gt;(?P&amp;lt;status&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="s"&gt;+) (?P&amp;lt;size&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s"&gt;+) "(?P&amp;lt;referer&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s"&gt;+)" "(?P&amp;lt;user_agent&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s"&gt;+)" (?P&amp;lt;response_time&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="s"&gt;+)$&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is &lt;strong&gt;fragile.&lt;/strong&gt; The moment Caddy's log format changes (which happens with updates), your filter breaks. You're maintaining a hairball of escape sequences when &lt;strong&gt;CrowdSec just parses JSON natively.&lt;/strong&gt;&lt;/p&gt;
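
&lt;p&gt;For comparison, here is a minimal sketch of field extraction from a structured log line using &lt;code&gt;jq&lt;/code&gt;. The sample entry is hypothetical and trimmed to a few fields, the field names (&lt;code&gt;request.remote_ip&lt;/code&gt;, &lt;code&gt;request.uri&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;) are assumed to match your Caddy version, and &lt;code&gt;extract_fields&lt;/code&gt; is an illustrative helper, not part of CrowdSec.&lt;/p&gt;

```shell
# Sample Caddy access-log entry (hypothetical values, trimmed to a few fields).
sample='{"request":{"remote_ip":"203.0.113.42","method":"GET","uri":"/.env"},"status":404}'

# extract_fields JSON_LINE: print client IP, URI and status, space-separated.
# Fields are addressed by name, so extra or reordered keys do not matter.
extract_fields() {
  printf '%s\n' "$1" | jq -r '[.request.remote_ip, .request.uri, (.status | tostring)] | join(" ")'
}
```

&lt;p&gt;Calling &lt;code&gt;extract_fields "$sample"&lt;/code&gt; prints &lt;code&gt;203.0.113.42 /.env 404&lt;/code&gt;. Nothing in the lookup depends on field order or on the position of a timestamp, which is what makes name-based parsing less brittle than a positional regex.&lt;/p&gt;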

&lt;h4&gt;
  
  
  4. &lt;strong&gt;CPU Overhead at Scale&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;When a DDoS hits or a botnet wakes up, Fail2Ban's Python daemon becomes a bottleneck. Log parsing + regex matching + decision making = CPU spikes. Meanwhile, Go-based CrowdSec handles the same load with a fraction of the resources.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: CrowdSec (Philosophy &amp;amp; Architecture)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;CrowdSec&lt;/strong&gt; is a complete rethinking of intrusion prevention. It decouples detection from response and introduces &lt;strong&gt;collaborative threat intelligence&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Principles
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. &lt;strong&gt;Collaborative Intelligence (CAPI)&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;CrowdSec works like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your server's &lt;strong&gt;CrowdSec Security Engine&lt;/strong&gt; analyzes logs and detects suspicious patterns.&lt;/li&gt;
&lt;li&gt;When a scenario triggers locally, the engine creates a decision for the offending IP and sends a signal about it to the &lt;strong&gt;Community API (CAPI).&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Once enough independent instances flag the same IP, it lands on the &lt;strong&gt;Community Blocklist.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Your &lt;strong&gt;firewall bouncer&lt;/strong&gt; downloads this list and drops traffic from listed IPs &lt;em&gt;before&lt;/em&gt; it reaches any of your services.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;The beauty:&lt;/strong&gt; You benefit from the collective intelligence of 10,000+ admins. You don't have to wait for &lt;em&gt;your&lt;/em&gt; server to be attacked 5 times—you get early warning from the network effect.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. &lt;strong&gt;Decoupled Architecture&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Unlike Fail2Ban's monolithic design, CrowdSec separates concerns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────┐
│   CrowdSec Security Engine (Go)          │
│   - Parses logs                          │
│   - Matches scenarios                    │
│   - Makes decisions                      │
└──────────┬───────────────────────────────┘
           │ (Local API)
      ┌────┴─────────────────────────────────────────┐
      │                                              │
┌─────▼─────────────────┐              ┌─────────────▼─────────────┐
│   Firewall Bouncer    │              │   HTTP Bouncer (WAF)      │
│   (nftables/iptables) │              │   (Layer 7 blocking)      │
└───────────────────────┘              └───────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;You decide where to block:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Firewall level (nftables):&lt;/strong&gt; Fastest, most efficient. Drop packets before they consume resources.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP level (Layer 7):&lt;/strong&gt; Apply business logic. Block based on request headers, paths, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application level:&lt;/strong&gt; Custom responses, logging, rate limiting.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We chose &lt;strong&gt;firewall-level blocking (nftables)&lt;/strong&gt; because it's most efficient for a hardened VPN management server.&lt;/p&gt;

&lt;h4&gt;
  
  
  3. &lt;strong&gt;Scenario-Based Detection (Not Just Counting)&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Fail2Ban counts failures. CrowdSec understands context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example scenario: HTTP Crawling&lt;/strong&gt; (simplified pseudocode to show the idea, not literal CrowdSec scenario syntax)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;crowdsecurity/http-crawl-non_statics&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Detects&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;aggressive&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;crawling&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;of&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;non-static&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;resources"&lt;/span&gt;
&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;http_status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;404&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Many 404s indicates scanning&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;user_agent&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;scrapy&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;nikto&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;sqlmap&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Known scanning tools&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;request_uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!~&lt;/span&gt; &lt;span class="s"&gt;/\.(jpg|css|js|png)$/&lt;/span&gt;  &lt;span class="c1"&gt;# Not static resources&lt;/span&gt;
&lt;span class="na"&gt;detection&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;trigger&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="s"&gt;(count(events) &amp;gt; 20) &amp;amp;&amp;amp;&lt;/span&gt;
      &lt;span class="s"&gt;(duration &amp;lt; 5m) &amp;amp;&amp;amp;&lt;/span&gt;
      &lt;span class="s"&gt;(user_agent matches malicious_patterns)&lt;/span&gt;
&lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ban&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The difference:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fail2Ban:&lt;/strong&gt; "5 failed SSH attempts = ban"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CrowdSec:&lt;/strong&gt; "20 HTTP 404s in 5 minutes + suspicious User-Agent = likely scanner. Check if other instances flagged this IP. If yes, consensus reached = ban."&lt;/li&gt;
&lt;/ul&gt;
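
&lt;p&gt;CrowdSec scenarios are typically built as leaky buckets (a capacity plus a leak rate) rather than plain counters. The following is a minimal sketch of that idea only, not CrowdSec's implementation: &lt;code&gt;bucket_overflows&lt;/code&gt;, the capacity of 3 and the 10-second leak interval are all hypothetical values chosen for illustration.&lt;/p&gt;

```shell
# bucket_overflows CAPACITY LEAK_SECS TS...
# Succeeds (exit 0) if the events at the given ascending timestamps
# (integer seconds) overflow a bucket that holds CAPACITY events and
# leaks one event every LEAK_SECS seconds.
bucket_overflows() {
  capacity=$1
  leak_secs=$2
  shift 2
  level=0
  last_leak=""
  for ts in "$@"; do
    if [ -z "$last_leak" ]; then
      last_leak=$ts
    fi
    # Drain the bucket for the time elapsed since the last leak.
    leaked=$(( (ts - last_leak) / leak_secs ))
    last_leak=$(( last_leak + leaked * leak_secs ))
    level=$(( level - leaked ))
    if [ "$level" -lt 0 ]; then
      level=0
    fi
    # Pour the current event in and check for overflow.
    level=$(( level + 1 ))
    if [ "$level" -gt "$capacity" ]; then
      return 0
    fi
  done
  return 1
}

if bucket_overflows 3 10 0 1 2 3; then echo "burst: overflow"; else echo "burst: ok"; fi
if bucket_overflows 3 10 0 20 40 60; then echo "slow: overflow"; else echo "slow: ok"; fi
```

&lt;p&gt;The same four events overflow the bucket when they arrive within three seconds (&lt;code&gt;burst: overflow&lt;/code&gt;) but not when they are spread over a minute (&lt;code&gt;slow: ok&lt;/code&gt;). A fixed failure counter cannot make that distinction without an additional time window.&lt;/p&gt;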




&lt;h2&gt;
  
  
  Our Infrastructure: Netbird + Caddy + CrowdSec
&lt;/h2&gt;

&lt;h3&gt;
  
  
  System Overview
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Internet Traffic
       ↓
┌──────────────────────────────────────┐
│  nftables (Firewall)                 │
│  ├─ CrowdSec Rules (DROP malicious)  │
│  └─ SSH (Port 2222)                  │
└──────────────────────────────────────┘
       ↓
┌──────────────────────────────────────┐
│  Caddy Reverse Proxy                 │
│  ├─ TLS Termination                  │
│  ├─ JSON Access Logs → CrowdSec      │
│  └─ Reverse Proxy to Netbird (8080)  │
└──────────────────────────────────────┘
       ↓
Netbird VPN Management API
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  OS &amp;amp; Versions
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OS:&lt;/strong&gt; Ubuntu 24.04 LTS (Noble Numbat)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CrowdSec:&lt;/strong&gt; v1.6+&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caddy:&lt;/strong&gt; v2.x (latest release; built from source or installed from the package)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Firewall:&lt;/strong&gt; nftables (Ubuntu 24.04 default)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bouncer:&lt;/strong&gt; crowdsec-firewall-bouncer-nftables&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Implementation: The Code
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Install CrowdSec
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Add repository&lt;/span&gt;
curl &lt;span class="nt"&gt;-s&lt;/span&gt; https://install.crowdsec.net | &lt;span class="nb"&gt;sudo &lt;/span&gt;sh
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update

&lt;span class="c"&gt;# Install security engine&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; crowdsec

&lt;span class="c"&gt;# Install collections (SSH, syslog, etc.)&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;cscli collections &lt;span class="nb"&gt;install &lt;/span&gt;crowdsecurity/linux
&lt;span class="nb"&gt;sudo &lt;/span&gt;cscli collections &lt;span class="nb"&gt;install &lt;/span&gt;crowdsecurity/caddy
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl reload crowdsec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Configure Caddy for JSON Logging
&lt;/h3&gt;

&lt;p&gt;CrowdSec's Caddy parser expects JSON logs. Configure your &lt;code&gt;Caddyfile&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Your reverse proxy
netbird.example.com {
    # Access logs are enabled per site; a "log" block in the global
    # options only configures Caddy's runtime logger.
    log {
        output file /var/log/caddy/access.log {
            roll_size 100mb
            roll_keep 5
            roll_keep_for 720h
        }
        format json
        level info
    }

    encode gzip
    reverse_proxy localhost:8080 {
        header_up Host {host}
        header_up X-Real-IP {remote_host}
        header_up X-Forwarded-For {remote_host}
        header_up X-Forwarded-Proto {scheme}
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restart Caddy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart caddy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify JSON output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo tail&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; 5 /var/log/caddy/access.log | jq &lt;span class="s1"&gt;'.'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Configure CrowdSec to Parse Caddy Logs
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;/etc/crowdsec/acquis.d/caddy.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;filenames&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/var/log/caddy/access.log&lt;/span&gt;
&lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;caddy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reload CrowdSec:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl reload crowdsec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify parsing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cscli metrics show acquisition

&lt;span class="c"&gt;# Expect a row for file:/var/log/caddy/access.log whose&lt;/span&gt;
&lt;span class="c"&gt;# "Lines read" and "Lines parsed" counters grow as requests arrive.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Install Firewall Bouncer (nftables)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; crowdsec-firewall-bouncer-nftables
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;crowdsec-firewall-bouncer
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start crowdsec-firewall-bouncer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify bouncer is registered:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cscli bouncers list

&lt;span class="c"&gt;# Expected output:&lt;/span&gt;
&lt;span class="c"&gt;# Name: crowdsec-firewall-bouncer-nftables&lt;/span&gt;
&lt;span class="c"&gt;# Status: ✓ active&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Customize Ban Duration
&lt;/h3&gt;

&lt;p&gt;By default, CrowdSec bans for 4 hours. We extended it to 48 hours for persistent botnets:&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;/etc/crowdsec/profiles.yaml.local&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Same filter as the stock default profile; only the duration changes.&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ip_remediation_48h&lt;/span&gt;
&lt;span class="na"&gt;filters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Alert.Remediation == true &amp;amp;&amp;amp; Alert.GetScope() == "Ip"&lt;/span&gt;
&lt;span class="na"&gt;decisions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ban&lt;/span&gt;
    &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;48h&lt;/span&gt;
&lt;span class="na"&gt;on_success&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;break&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl reload crowdsec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Results &amp;amp; Metrics
&lt;/h2&gt;

&lt;p&gt;After the migration, here's what we observed:&lt;/p&gt;

&lt;h3&gt;
  
  
  Metrics
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cscli metrics show
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output (snapshot):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Acquisition (Logs being read):
  crowdsecurity/caddy-logs:     12,450 lines | 0 parse errors
  crowdsecurity/sshd-logs:       5,230 lines | 0 parse errors

Scenarios (Detection rules):
  crowdsecurity/http-crawl-non_statics:    142 decisions | 28 IPs banned
  crowdsecurity/ssh-bf:                    89 decisions | 15 IPs banned
  crowdsecurity/web-application-attacks:   34 decisions | 8 IPs banned

Bouncers:
  crowdsec-firewall-bouncer-nftables:     112 active bans
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key Findings
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;99% Reduction in Log Noise:&lt;/strong&gt; Before CrowdSec, &lt;code&gt;/var/log/auth.log&lt;/code&gt; filled 2GB per day (SSH probes). Now: 20MB per day. Why? IPs are blocked at the firewall level—the packets never reach sshd.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Community Blocklist Efficiency:&lt;/strong&gt; Of 112 active bans, &lt;strong&gt;95+ were from the community blocklist.&lt;/strong&gt; We never saw the initial attack; CrowdSec's CAPI blocked it preemptively.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Caddy JSON Parsing:&lt;/strong&gt; Zero failed parses. CrowdSec handled log format updates seamlessly (JSON is self-describing).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CPU Impact:&lt;/strong&gt; CrowdSec Security Engine consistently ~2-5% CPU. Caddy logs parsed in real-time without overhead.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
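
&lt;p&gt;As a quick sanity check on the figures above, the percentages can be recomputed with shell integer arithmetic. The helper names &lt;code&gt;pct_reduction&lt;/code&gt; and &lt;code&gt;share&lt;/code&gt; are hypothetical, 2GB is taken as 2000MB, and "95+" is treated as exactly 95.&lt;/p&gt;

```shell
# pct_reduction BEFORE AFTER: integer percentage by which AFTER is below BEFORE.
pct_reduction() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# share PART TOTAL: integer percentage (rounded down) that PART is of TOTAL.
share() {
  echo $(( $1 * 100 / $2 ))
}

pct_reduction 2000 20   # auth.log volume in MB per day, before and after
share 95 112            # community-blocklist bans among all active bans
```

&lt;p&gt;This prints &lt;code&gt;99&lt;/code&gt; and &lt;code&gt;84&lt;/code&gt;: the log volume dropped by 99%, and at least 84% of the active bans originated from the community blocklist.&lt;/p&gt;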




&lt;h2&gt;
  
  
  Operational Insights
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Monitoring &amp;amp; Debugging
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Check active bans:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
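&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Shows local decisions; add --all to include community blocklist entries&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;cscli decisions list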

&lt;span class="c"&gt;# Output:&lt;/span&gt;
&lt;span class="c"&gt;# Duration │ Scope │ Value           │ Decision │ Reason&lt;/span&gt;
&lt;span class="c"&gt;# 48h      │ ip    │ 192.0.2.100     │ ban      │ crowdsecurity/http-crawl-non_statics&lt;/span&gt;
&lt;span class="c"&gt;# 48h      │ ip    │ 198.51.100.42   │ ban      │ crowdsecurity/ssh-bf&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;View alerts (why decisions were made):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cscli alerts list &lt;span class="nt"&gt;--ip&lt;/span&gt; 192.0.2.100

&lt;span class="c"&gt;# Output:&lt;/span&gt;
&lt;span class="c"&gt;# Alert ID: 4521&lt;/span&gt;
&lt;span class="c"&gt;# Start Time: 2025-12-31T10:15:30Z&lt;/span&gt;
&lt;span class="c"&gt;# End Time: 2025-12-31T10:20:45Z&lt;/span&gt;
&lt;span class="c"&gt;# Scenario: crowdsecurity/http-crawl-non_statics&lt;/span&gt;
&lt;span class="c"&gt;# Events Count: 145&lt;/span&gt;
&lt;span class="c"&gt;# Remediation: ban for 48h&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Live nftables monitoring:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Watch ruleset events (e.g. IPs added to or removed from sets); individual dropped packets are not shown&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;nft monitor

&lt;span class="c"&gt;# Or check statistics&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;nft list ruleset | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A&lt;/span&gt; 10 &lt;span class="s2"&gt;"crowdsec-drop"&lt;/span&gt;

&lt;span class="c"&gt;# Example:&lt;/span&gt;
&lt;span class="c"&gt;# chain crowdsec-drop (priority filter -1; policy accept;)&lt;/span&gt;
&lt;span class="c"&gt;#   packets 28,432 bytes 1,842,560&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Lessons Learned
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Community Blocklist is worth its weight in gold.&lt;/strong&gt; About 85% of our active bans (95+ of 112) came from the community blocklist, so those IPs were blocked &lt;em&gt;before&lt;/em&gt; they touched our infrastructure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;JSON logging is non-negotiable.&lt;/strong&gt; If you're using a modern web server (Caddy, Nginx with JSON output, etc.), do yourself a favor and enable it. Regex-based parsing is yesterday's technology.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Go &amp;gt; Python for performance.&lt;/strong&gt; CrowdSec's Go engine is fast enough that you can parse 10,000+ log lines per second on a modest server. Fail2Ban would choke.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bouncers are flexible.&lt;/strong&gt; We chose nftables, but CrowdSec supports HTTP bouncers (Layer 7), Nginx modules, cloud API integrations (Cloudflare, AWS), and more. Pick what fits your architecture.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Potential Pitfalls &amp;amp; Solutions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Issue: Bouncer Not Authenticating
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; &lt;code&gt;crowdsec-firewall-bouncer&lt;/code&gt; status shows "offline" or "error."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Regenerate credentials&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt reinstall &lt;span class="nt"&gt;-y&lt;/span&gt; crowdsec-firewall-bouncer-nftables

&lt;span class="c"&gt;# Restart both&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart crowdsec
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart crowdsec-firewall-bouncer

&lt;span class="c"&gt;# Verify&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;cscli bouncers list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Issue: No Decisions Being Made
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; &lt;code&gt;cscli decisions list&lt;/code&gt; returns empty.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Verify logs are being read:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;cscli metrics show acquisition
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If counts are flat, CrowdSec isn't reading logs.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Check file permissions:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; /var/log/caddy/access.log
   &lt;span class="c"&gt;# crowdsec user must have read permissions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
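
&lt;p&gt;The permission check can also be scripted. This is a small sketch with a hypothetical helper, &lt;code&gt;check_readable&lt;/code&gt;; it tests readability for whichever user runs it, so run it from a shell of the account your CrowdSec service uses.&lt;/p&gt;

```shell
# check_readable PATH: report whether the invoking user can read PATH.
check_readable() {
  if [ -r "$1" ]; then
    echo "readable: $1"
  else
    echo "NOT readable: $1"
    return 1
  fi
}

check_readable /var/log/caddy/access.log || true
```

&lt;p&gt;A "NOT readable" result also appears when the file does not exist yet, for example before Caddy has served its first request.&lt;/p&gt;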



&lt;ol start="3"&gt;
&lt;li&gt;Reload CrowdSec:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl reload crowdsec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Issue: False Positives (Legitimate Traffic Blocked)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; Users report access denied, but they're legitimate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lift the ban. For a permanent exemption, also add the IP to a whitelist parser under &lt;code&gt;/etc/crowdsec/parsers/s02-enrich/&lt;/code&gt;, because &lt;code&gt;cscli decisions add&lt;/code&gt; creates remediation decisions such as bans, not exemptions:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;cscli decisions delete &lt;span class="nt"&gt;--ip&lt;/span&gt; 203.0.113.99
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Or temporarily remove a specific scenario (reinstall it later with &lt;code&gt;cscli scenarios install&lt;/code&gt;):
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;cscli scenarios remove crowdsecurity/http-crawl-non_statics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Conclusions &amp;amp; Recommendations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why We Recommend CrowdSec for Production
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Security Posture:&lt;/strong&gt; Preventive &amp;gt; reactive. You're protected by the collective intelligence of 10,000+ instances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational Simplicity:&lt;/strong&gt; JSON parsing, decoupled bouncers, rich dashboards.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance:&lt;/strong&gt; Go-based engine, minimal CPU overhead, scales to 10,000+ rules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transparency:&lt;/strong&gt; Open-source, community-driven, audit-friendly.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Automate backups&lt;/strong&gt; of &lt;code&gt;/etc/crowdsec/&lt;/code&gt; for disaster recovery.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set up dashboards&lt;/strong&gt; at &lt;a href="https://console.crowdsec.net" rel="noopener noreferrer"&gt;console.crowdsec.net&lt;/a&gt; to visualize threats across your fleet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable notifications&lt;/strong&gt; (Slack, email) for critical alerts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fine-tune scenarios&lt;/strong&gt; by adjusting thresholds and ban durations for your use case.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrate with your SIEM&lt;/strong&gt; (ELK, Splunk, etc.) for centralized logging.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.crowdsec.net/" rel="noopener noreferrer"&gt;CrowdSec Official Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.crowdsec.net/blog/crowdsec-not-your-typical-fail2ban-clone" rel="noopener noreferrer"&gt;CrowdSec vs. Fail2Ban: A Deep Dive&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.crowdsec.net/blog/secure-caddy-crowdsec-remediation-waf-guide" rel="noopener noreferrer"&gt;Caddy + CrowdSec Setup Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://owasp.org/www-community/attacks/Web_Application_Firewall" rel="noopener noreferrer"&gt;OWASP: Web Application Firewall (WAF)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Author:&lt;/strong&gt; Patrick Bloem &lt;br&gt;
&lt;strong&gt;Published:&lt;/strong&gt; December 31, 2025&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Tested On:&lt;/strong&gt; Ubuntu 24.04 LTS | CrowdSec v1.6+ | Caddy v2.x&lt;/p&gt;

&lt;p&gt;Have questions? Drop them in the comments or &lt;a href="https://github.com/patrickbloem-it/server-hardening-crowdsec/" rel="noopener noreferrer"&gt;open an issue on GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>security</category>
      <category>tutorial</category>
      <category>devops</category>
      <category>linux</category>
    </item>
    <item>
      <title>Beyond `apt upgrade`: Automating Linux Hardening for Public Sector Workloads</title>
      <dc:creator>patrickbloem-it</dc:creator>
      <pubDate>Wed, 31 Dec 2025 06:28:31 +0000</pubDate>
      <link>https://forem.com/patrickbloemit/beyond-apt-upgrade-automating-linux-hardening-for-public-sector-workloads-4m69</link>
      <guid>https://forem.com/patrickbloemit/beyond-apt-upgrade-automating-linux-hardening-for-public-sector-workloads-4m69</guid>
      <description>&lt;h1&gt;
  
  
  The Myth of the "Secure Default"
&lt;/h1&gt;

&lt;p&gt;There is a prevalent misconception in public sector IT that deploying an LTS release of Ubuntu or Debian implies a baseline of security. It does not. It implies stability, not hardening.&lt;/p&gt;

&lt;p&gt;A standard cloud image is designed for &lt;strong&gt;compatibility&lt;/strong&gt; and &lt;strong&gt;low-friction onboarding&lt;/strong&gt;. It is engineered to ensure that &lt;code&gt;ssh root@&amp;lt;ip&amp;gt;&lt;/code&gt; works immediately. Conversely, a BSI-compliant or CIS-hardened system is designed for &lt;strong&gt;isolation&lt;/strong&gt; and &lt;strong&gt;auditability&lt;/strong&gt;. These two design philosophies are mutually exclusive.&lt;/p&gt;

&lt;p&gt;In regulated environments—specifically under BSI IT-Grundschutz (SYS.1.3) or GDPR Art. 32 requirements—manual hardening is an anti-pattern. If you are editing &lt;code&gt;/etc/ssh/sshd_config&lt;/code&gt; by hand in 2025, you have already failed the audit. You cannot prove consistency across 50 nodes if your configuration method relies on human memory.&lt;/p&gt;

&lt;p&gt;This article outlines an architectural approach to automated, idempotent server hardening, moving beyond simple package updates to systemic attack surface reduction.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Compliance Gap
&lt;/h2&gt;

&lt;p&gt;When we deploy a fresh Debian 12 or Ubuntu 24.04 image, we inherit technical debt immediately. Let's look at the delta between a "Fresh Install" and a "Compliance-Ready" state:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Default State&lt;/th&gt;
&lt;th&gt;Required State (CIS/BSI)&lt;/th&gt;
&lt;th&gt;The Risk&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SSH&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Port 22, Password Auth&lt;/td&gt;
&lt;td&gt;Port 2222 (obscurity), Key-Only, Crypto Policies&lt;/td&gt;
&lt;td&gt;Brute-force botnets, Credential Stuffing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Kernel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;IPv4 Forwarding disabled (mostly), ICMP Redirects enabled&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;accept_redirects=0&lt;/code&gt;, &lt;code&gt;dmesg_restrict=1&lt;/code&gt;, &lt;code&gt;bpf_jit_harden=2&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;MITM, Kernel Pointer Leaks, eBPF exploits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Audit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;auditd&lt;/code&gt; package often missing&lt;/td&gt;
&lt;td&gt;Rules for &lt;code&gt;execve&lt;/code&gt;, &lt;code&gt;passwd&lt;/code&gt;, &lt;code&gt;sudo&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;No forensic trail for privilege escalation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;/tmp&lt;/code&gt; executable&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;noexec&lt;/code&gt;, &lt;code&gt;nosuid&lt;/code&gt;, &lt;code&gt;nodev&lt;/code&gt; on tmpfs&lt;/td&gt;
&lt;td&gt;Malware execution in world-writable dirs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Architecture of an Automated Hardening Pipeline
&lt;/h2&gt;

&lt;p&gt;We do not write "scripts". We write &lt;strong&gt;state enforcement modules&lt;/strong&gt;. Whether you use Ansible, Salt, or a bootstrap shell framework, the logic remains identical.&lt;/p&gt;

&lt;p&gt;The repository &lt;code&gt;hardened-vps-bootstrap&lt;/code&gt; (linked below) implements this logic in pure Bash to remain dependency-free on air-gapped systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. SSH: Crypto Policy and Obscurity
&lt;/h3&gt;

&lt;p&gt;Changing the SSH port is controversial. Purists argue it is "Security by Obscurity". In practice, moving SSH to port &lt;code&gt;2222&lt;/code&gt; (or higher) reduces log noise by approximately 99%. This is not about hiding from a targeted attacker; it is about improving the signal-to-noise ratio so your SIEM can actually detect the targeted attacker.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Implementation:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Force post-quantum key exchange and high-security ciphers
echo "Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com" &amp;gt;&amp;gt; /etc/ssh/sshd_config
echo "KexAlgorithms sntrup761x25519-sha512@openssh.com,curve25519-sha256" &amp;gt;&amp;gt; /etc/ssh/sshd_config
echo "MACs hmac-sha2-512-etm@openssh.com" &amp;gt;&amp;gt; /etc/ssh/sshd_config

# Disable legacy auth (-E makes "#?" match an optional leading comment marker)
sed -i -E 's/^#?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sed -i -E 's/^#?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We explicitly disable &lt;code&gt;PasswordAuthentication&lt;/code&gt;. Relying on weak passwords in an era of GPU-accelerated cracking clusters is negligence.&lt;/p&gt;
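
&lt;p&gt;To keep the SSH settings above repeatable, one option is to write them to a drop-in file rather than appending to &lt;code&gt;sshd_config&lt;/code&gt; on every run. The following is a minimal sketch, not the repository's implementation. It assumes the main config includes &lt;code&gt;/etc/ssh/sshd_config.d/*.conf&lt;/code&gt; (the Debian 12 and Ubuntu 24.04 default); the function name &lt;code&gt;write_sshd_dropin&lt;/code&gt; and the target file name are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# write_sshd_dropin TARGET  (hypothetical helper, not part of OpenSSH)
# Overwrites TARGET with the hardened settings. Running it twice yields
# the same file, unlike repeated "echo ... &amp;gt;&amp;gt; sshd_config" appends.
write_sshd_dropin() {
  local target="$1"
  cat &amp;gt; "$target" &amp;lt;&amp;lt;'EOF'
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com
KexAlgorithms sntrup761x25519-sha512@openssh.com,curve25519-sha256
MACs hmac-sha2-512-etm@openssh.com
PasswordAuthentication no
PermitRootLogin no
EOF
  chmod 600 "$target"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Run as root, this could be called as &lt;code&gt;write_sshd_dropin /etc/ssh/sshd_config.d/01-hardening.conf&lt;/code&gt;, followed by &lt;code&gt;sshd -t&lt;/code&gt; to validate the result before &lt;code&gt;systemctl reload ssh&lt;/code&gt;. sshd keeps the first value it reads for each keyword, so a low numeric prefix lets this file take precedence over drop-ins that sort after it.&lt;/p&gt;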

&lt;h3&gt;
  
  
  2. Kernel Hardening: The Silent Layer
&lt;/h3&gt;

&lt;p&gt;The kernel network stack is permissive by default. We need to lock down ICMP handling and memory access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Sysctl Parameters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;code&gt;net.ipv4.tcp_syncookies = 1&lt;/code&gt;: Essential protection against SYN flood DoS attacks.&lt;/li&gt;
&lt;li&gt;  &lt;code&gt;net.ipv4.conf.all.accept_redirects = 0&lt;/code&gt;: Prevents a rogue router on the same subnet from manipulating routing tables.&lt;/li&gt;
&lt;li&gt;  &lt;code&gt;kernel.dmesg_restrict = 1&lt;/code&gt;: Prevents unprivileged users from viewing the kernel ring buffer (&lt;code&gt;dmesg&lt;/code&gt;), which can leak memory addresses useful for exploit development (ASLR bypass).&lt;/li&gt;
&lt;li&gt;  &lt;code&gt;kernel.unprivileged_bpf_disabled = 1&lt;/code&gt;: Disables unprivileged eBPF usage. Recent kernel vulnerabilities often leverage eBPF; if your web app doesn't need it, disable it.&lt;/li&gt;
&lt;/ul&gt;
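
&lt;p&gt;Whether a node still carries these values can be checked against the running kernel. The sketch below assumes procfs is mounted at &lt;code&gt;/proc&lt;/code&gt;; both function names are hypothetical, and the key-to-path mapping does not handle interface names that themselves contain dots.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# sysctl_proc_path KEY  (hypothetical helper)
# Maps a sysctl key such as net.ipv4.tcp_syncookies to its /proc/sys path.
sysctl_proc_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# check_sysctl KEY EXPECTED  (hypothetical helper)
# Returns 0 when the running kernel value equals EXPECTED; otherwise
# prints the drift and returns 1. Unreadable keys count as drift.
check_sysctl() {
  local key="$1" expected="$2" actual
  actual="$(cat "$(sysctl_proc_path "$key")" 2&amp;gt;/dev/null || echo unreadable)"
  if [ "$actual" != "$expected" ]; then
    echo "DRIFT: $key is $actual, expected $expected"
    return 1
  fi
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;check_sysctl kernel.dmesg_restrict 1&lt;/code&gt; can be run for each of the four keys above after the values have been persisted under &lt;code&gt;/etc/sysctl.d/&lt;/code&gt; and applied with &lt;code&gt;sysctl --system&lt;/code&gt;.&lt;/p&gt;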

&lt;h3&gt;
  
  
  3. Audit Trails: The "Flight Recorder"
&lt;/h3&gt;

&lt;p&gt;Installing &lt;code&gt;auditd&lt;/code&gt; is useless without rules. Standard rulesets often miss the critical vector: &lt;strong&gt;Execution&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We need to know &lt;em&gt;what&lt;/em&gt; commands were run, not just &lt;em&gt;who&lt;/em&gt; logged in.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# /etc/audit/rules.d/exec.rules
# Capture all command executions (sys_execve) for valid UIDs
-a always,exit -F arch=b64 -S execve -F euid&amp;gt;=1000 -F euid!=4294967295 -k audit_cmd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This ensures that if an attacker manages to run &lt;code&gt;./exploit.sh&lt;/code&gt;, the execution event—including arguments—is logged to &lt;code&gt;/var/log/audit/audit.log&lt;/code&gt;.&lt;/p&gt;
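
&lt;p&gt;Reading those events back is a matter of searching for the &lt;code&gt;audit_cmd&lt;/code&gt; key. A small sketch, assuming the standard &lt;code&gt;auditd&lt;/code&gt; userspace tools are installed and the commands run as root; the wrapper name &lt;code&gt;exec_events_since&lt;/code&gt; is hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# exec_events_since START  (hypothetical helper)
# Prints execve events recorded under the audit_cmd key since START,
# an ausearch time keyword such as "today" or "recent". The -i flag
# interprets numeric fields such as UIDs into names.
exec_events_since() {
  ausearch -k audit_cmd --start "${1:-today}" -i
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After the rule file is in place, &lt;code&gt;augenrules --load&lt;/code&gt; loads it and &lt;code&gt;auditctl -l&lt;/code&gt; lists the active rules; &lt;code&gt;exec_events_since recent&lt;/code&gt; should then show the commands run in the last few minutes.&lt;/p&gt;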

&lt;h2&gt;
  
  
  Automation vs. Documentation
&lt;/h2&gt;

&lt;p&gt;A runbook is dead the moment it is written. Code is alive.&lt;/p&gt;

&lt;p&gt;By encapsulating these hardening steps into a repository, we achieve:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Idempotency:&lt;/strong&gt; Re-running the script enforces the state again (correcting drift).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Version Control:&lt;/strong&gt; We can trace &lt;em&gt;when&lt;/em&gt; we decided to disable &lt;code&gt;UsePAM&lt;/code&gt; via Git commit history.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Speed:&lt;/strong&gt; Mean Time To Recover (MTTR) drops significantly when server provisioning is automated.&lt;/li&gt;
&lt;/ol&gt;
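
&lt;p&gt;Idempotency in a Bash-based framework usually comes down to checking state before changing it. The helper below is an illustrative sketch, not code taken from the repository; the name &lt;code&gt;ensure_line&lt;/code&gt; is hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# ensure_line FILE LINE  (hypothetical helper)
# Appends LINE to FILE only when an identical line is not already
# present, so repeated runs leave the file unchanged.
ensure_line() {
  local file="$1" line="$2"
  touch "$file"
  # -x matches whole lines only, -F treats LINE as a fixed string
  if ! grep -qxF -- "$line" "$file"; then
    printf '%s\n' "$line" &amp;gt;&amp;gt; "$file"
  fi
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Calling &lt;code&gt;ensure_line /etc/sysctl.d/99-hardening.conf "kernel.dmesg_restrict = 1"&lt;/code&gt; any number of times leaves exactly one such line in the file, which is the property that lets a re-run correct drift without piling up duplicates.&lt;/p&gt;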

&lt;h2&gt;
  
  
  The "Hardened VPS Bootstrap" Repository
&lt;/h2&gt;

&lt;p&gt;I have open-sourced the internal framework I use for public sector infrastructure projects. It is designed to be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Minimal:&lt;/strong&gt; No Python/Ruby dependencies.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Modular:&lt;/strong&gt; Enable/Disable features via flags.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Audit-Ready:&lt;/strong&gt; Logs every change it makes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It covers SSH, Sysctl, Fail2Ban/CrowdSec, UFW, and Auto-Updates.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;GitHub Repository:&lt;/strong&gt; &lt;a href="https://github.com/patrick-bloem/hardened-vps-bootstrap" rel="noopener noreferrer"&gt;patrick-bloem/hardened-vps-bootstrap&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Security is not a product; it is a configuration state. Standard Linux distributions prioritize the "Out of the Box" experience. As infrastructure engineers, our job is to pivot that priority towards "Secure by Design".&lt;/p&gt;

&lt;p&gt;Stop trusting the defaults. Verify your sysctls. Automate your hardening.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About the Author&lt;/strong&gt;&lt;br&gt;
Patrick Bloem is a Senior Infrastructure Engineer specializing in BSI-compliant Linux environments, ZFS storage solutions, and network segregation in the public sector.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>security</category>
      <category>automation</category>
      <category>devops</category>
    </item>
    <item>
      <title>Self-Hosting Netbird: A Privacy-First Alternative to Managed Overlay Networks</title>
      <dc:creator>patrickbloem-it</dc:creator>
      <pubDate>Tue, 30 Dec 2025 08:10:11 +0000</pubDate>
      <link>https://forem.com/patrickbloemit/self-hosting-netbird-a-privacy-first-alternative-to-managed-overlay-networks-3b3k</link>
      <guid>https://forem.com/patrickbloemit/self-hosting-netbird-a-privacy-first-alternative-to-managed-overlay-networks-3b3k</guid>
      <description>&lt;h1&gt;
  
  
  Self-Hosting Netbird: A Privacy-First Alternative to Managed Overlay Networks
&lt;/h1&gt;

&lt;p&gt;As infrastructure engineers in regulated environments, we often face a dilemma: modern overlay network solutions like Tailscale and Twingate offer excellent user experience, but their centralized control planes raise compliance concerns. For organizations operating under strict data governance frameworks (GDPR, BSI IT-Grundschutz, sector-specific regulations), self-hosted alternatives become a necessity rather than a preference.&lt;/p&gt;

&lt;p&gt;This article documents a production-ready deployment of &lt;strong&gt;Netbird&lt;/strong&gt;, an open-source WireGuard-based overlay network with a self-hosted management server, hardened using &lt;strong&gt;CrowdSec&lt;/strong&gt; for threat detection.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Self-Hosted Overlay Networks?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Managed Service Trade-Off
&lt;/h3&gt;

&lt;p&gt;Managed solutions like Tailscale and Twingate provide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zero-configuration deployment&lt;/li&gt;
&lt;li&gt;Automatic NAT traversal&lt;/li&gt;
&lt;li&gt;Centralized access control&lt;/li&gt;
&lt;li&gt;Enterprise SSO integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, they introduce &lt;strong&gt;architectural dependencies&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Managed (Tailscale/Twingate)&lt;/th&gt;
&lt;th&gt;Self-Hosted (Netbird)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Control Plane Location&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;US/Cloud Provider&lt;/td&gt;
&lt;td&gt;On-premises/Private VPS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Metadata Exposure&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Connection logs, peer IPs visible to provider&lt;/td&gt;
&lt;td&gt;Fully isolated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Sovereignty&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Dependent on provider's infrastructure&lt;/td&gt;
&lt;td&gt;Complete control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Vendor Lock-In&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Proprietary coordination protocol&lt;/td&gt;
&lt;td&gt;Open protocol (WireGuard)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Audit Trail&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Provider-controlled&lt;/td&gt;
&lt;td&gt;Self-managed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For public sector entities or organizations handling sensitive data, the control plane location becomes a &lt;strong&gt;compliance blocker&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Netbird Architecture Overview
&lt;/h2&gt;

&lt;p&gt;Netbird decouples the control plane from the data plane:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────┐
│ Management Server (Self-Hosted)         │
│ ├─ Peer Registration &amp;amp; Authentication   │
│ ├─ Network Policy Distribution          │
│ └─ STUN/TURN Coordination (Coturn)      │
└─────────────────────────────────────────┘
             ↓ (Metadata only)
┌─────────────────────────────────────────┐
│ Peer-to-Peer WireGuard Tunnels          │
│ (Direct connections, no relay)          │
└─────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key Properties&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No traffic routing through management server&lt;/strong&gt;: After initial coordination, peers establish direct WireGuard tunnels.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;STUN/TURN fallback&lt;/strong&gt;: Only used when direct connections fail (corporate firewalls, symmetric NAT).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identity Provider integration&lt;/strong&gt;: Uses OIDC (OpenID Connect) for authentication—works with Zitadel, Keycloak, Authentik.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Security Hardening with CrowdSec
&lt;/h2&gt;

&lt;p&gt;Standard Netbird deployments expose management APIs and STUN/TURN services to the internet. To mitigate brute-force attacks and resource exhaustion, we integrate &lt;strong&gt;CrowdSec&lt;/strong&gt;, a collaborative intrusion prevention system.&lt;/p&gt;

&lt;h3&gt;
  
  
  CrowdSec Integration Benefits
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Custom Log Parsing&lt;/strong&gt;: Netbird's Go-based logging requires custom Grok patterns to detect authentication failures.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Behavioral Analysis&lt;/strong&gt;: Leaky bucket scenarios identify repeated failed peer login attempts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Firewall Enforcement&lt;/strong&gt;: Direct &lt;code&gt;iptables&lt;/code&gt;/&lt;code&gt;nftables&lt;/code&gt; integration blocks malicious IPs before they reach application logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Community Intelligence&lt;/strong&gt;: Shares threat data with CrowdSec's global blocklist (opt-in).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Example Scenario&lt;/strong&gt;: Detect 5+ Netbird peer authentication failures within 30 minutes → ban source IP for 48 hours.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deployment Architecture
&lt;/h2&gt;

&lt;p&gt;The stack I maintain consists of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Caddy&lt;/strong&gt;: Reverse proxy with automatic TLS (Let's Encrypt).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Netbird Management&lt;/strong&gt;: Peer coordination, policy enforcement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zitadel&lt;/strong&gt;: Self-hosted OIDC identity provider (can be replaced with Authentik/Keycloak).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coturn&lt;/strong&gt;: STUN/TURN server for NAT traversal (network-isolated via explicit port bindings).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CrowdSec + Firewall Bouncer&lt;/strong&gt;: Real-time threat blocking.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Configuration Philosophy&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Avoid &lt;code&gt;network_mode: host&lt;/code&gt; for service isolation.&lt;/li&gt;
&lt;li&gt;Use explicit IPv4/IPv6 port bindings instead of wildcard listeners.&lt;/li&gt;
&lt;li&gt;Log rotation limits (100MB per container) to prevent disk exhaustion.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full Docker Compose configuration is available in my repository:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;&lt;a href="https://github.com/patrickbloem-it/Netbird-self-hosted-stack" rel="noopener noreferrer"&gt;→ Netbird-self-hosted-stack on GitHub&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Coturn Hardening: A Critical Detail
&lt;/h2&gt;

&lt;p&gt;Many Netbird guides recommend deploying Coturn with &lt;code&gt;network_mode: host&lt;/code&gt; for simplicity. This bypasses Docker's network isolation and exposes the host directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Our approach&lt;/strong&gt;: Explicit port binding to public IP addresses only.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;coturn:
  image: coturn/coturn:latest
  networks: [netbird]
  ports:
    - '${PUBLIC_IP}:3478:3478/udp'
    - '${PUBLIC_IP}:3478:3478/tcp'
    - '${PUBLIC_IP}:5349:5349/tcp'
    - '${PUBLIC_IP}:49152-65535:49152-65535/udp'
  volumes:
    - ./config/turnserver.conf:/etc/turnserver.conf:ro
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Impact&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Container remains within Docker's bridge network.&lt;/li&gt;
&lt;li&gt;CrowdSec firewall rules apply uniformly across all services.&lt;/li&gt;
&lt;li&gt;No accidental exposure of host services on ephemeral ports.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  CrowdSec Custom Parsers
&lt;/h2&gt;

&lt;p&gt;Netbird's JSON-formatted logs don't match default CrowdSec parsers. Custom Grok patterns are required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example: Netbird Management Authentication Failure Parser&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# /etc/crowdsec/parsers/s01-parse/netbird-auth.yaml
filter: "evt.Parsed.program == 'netbird-management'"
name: patrickbloem/netbird-auth-parser
nodes:
  - grok:
      pattern: '%{TIMESTAMP_ISO8601:timestamp}.*failed logging in peer %{DATA:peer_id}.*: %{GREEDYDATA:failure_reason}'
      apply_on: message
statics:
  - meta: log_type
    value: netbird_auth_failure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Corresponding Scenario&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# /etc/crowdsec/scenarios/netbird-brute-force.yaml
type: leaky
name: patrickbloem/netbird-auth-brute-force
filter: "evt.Meta.log_type == 'netbird_auth_failure'"
leakspeed: "30m"
capacity: 5
labels:
  remediation: true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Full parser configurations are included in the repository.&lt;/p&gt;
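
&lt;p&gt;Before relying on a custom parser, it helps to replay a real log line through it. The sketch below wraps &lt;code&gt;cscli explain&lt;/code&gt;; the wrapper name is hypothetical, and it assumes the stock non-syslog parser copies the &lt;code&gt;--type&lt;/code&gt; label into &lt;code&gt;evt.Parsed.program&lt;/code&gt;, which is what the filter above matches on.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# explain_netbird_line LINE  (hypothetical helper)
# Replays a single log line through the installed CrowdSec parsers and
# prints which parser nodes matched or rejected it.
explain_netbird_line() {
  cscli explain --log "$1" --type netbird-management --verbose
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Pass it a line copied from your own management server log. In the Docker-based stack described here, the same &lt;code&gt;cscli&lt;/code&gt; call would run inside the container via &lt;code&gt;docker compose exec crowdsec&lt;/code&gt;. If the custom parser does not appear in the output, the grok pattern or the filter needs adjusting.&lt;/p&gt;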




&lt;h2&gt;
  
  
  Operational Results
&lt;/h2&gt;

&lt;p&gt;After deploying this stack on a Hetzner VPS (Ubuntu 24.04 LTS):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resource Usage&lt;/strong&gt;: 0.12 load average, ~12% RAM utilization.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Threat Mitigation&lt;/strong&gt;: ~28,000 IPs blocked via CrowdSec (CAPI + local decisions).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSH Attack Reduction&lt;/strong&gt;: Changing SSH to port 2222 + CrowdSec reduced attacks from ~40/day to &amp;lt;1/day.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero False Positives&lt;/strong&gt;: No legitimate Netbird clients blocked after 3 months of operation.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Comparison: Netbird vs. Tailscale vs. Twingate
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Netbird (Self-Hosted)&lt;/th&gt;
&lt;th&gt;Tailscale&lt;/th&gt;
&lt;th&gt;Twingate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Plane&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Direct WireGuard&lt;/td&gt;
&lt;td&gt;Direct WireGuard&lt;/td&gt;
&lt;td&gt;Relay-based (no P2P)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Control Plane Location&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Self-hosted&lt;/td&gt;
&lt;td&gt;Tailscale Inc. (US)&lt;/td&gt;
&lt;td&gt;Twingate Inc. (US)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Authentication&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Self-hosted OIDC&lt;/td&gt;
&lt;td&gt;Tailscale SSO&lt;/td&gt;
&lt;td&gt;Twingate SSO&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Metadata Visibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Zero (internal only)&lt;/td&gt;
&lt;td&gt;Provider has access&lt;/td&gt;
&lt;td&gt;Provider has access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost (10 users)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;VPS cost (~€5/month)&lt;/td&gt;
&lt;td&gt;Free tier&lt;/td&gt;
&lt;td&gt;Starts at $5/user/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Audit Compliance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full control&lt;/td&gt;
&lt;td&gt;Trust-based&lt;/td&gt;
&lt;td&gt;Trust-based&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Custom Policies&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ACL rules (JSON)&lt;/td&gt;
&lt;td&gt;ACL tags&lt;/td&gt;
&lt;td&gt;Application policies&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  When to Choose Self-Hosting
&lt;/h2&gt;

&lt;p&gt;Self-hosting Netbird makes sense when:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Regulatory Compliance&lt;/strong&gt;: You operate under GDPR, HIPAA, or government-specific frameworks requiring data sovereignty.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero Trust Architecture&lt;/strong&gt;: You need proof that no third party can access connection metadata.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit Requirements&lt;/strong&gt;: Internal security audits demand full control over authentication logs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Optimization&lt;/strong&gt;: You already maintain infrastructure (VPS/on-prem servers) and can absorb operational overhead.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;When managed services are better&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Small teams (&amp;lt;20 users) without dedicated DevOps resources.&lt;/li&gt;
&lt;li&gt;Organizations comfortable with US-based SaaS providers.&lt;/li&gt;
&lt;li&gt;Environments requiring enterprise support SLAs.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;The deployment process is documented in the repository README:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Clone the repository:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/patrickbloem-it/Netbird-self-hosted-stack.git
cd Netbird-self-hosted-stack
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Initialize directory structure:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;chmod +x init.sh
./init.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Configure environment variables:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cp .env.example .env
nano .env # Set DOMAIN, PUBLIC_IP, secrets
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Deploy the stack:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up -d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Generate CrowdSec bouncer key:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose exec crowdsec cscli bouncers add firewall-bouncer

# Add key to .env as CROWDSEC_BOUNCER_KEY
docker compose up -d --force-recreate cs-firewall-bouncer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Self-hosting Netbird provides a viable path to compliant overlay networking without sacrificing usability. The integration with CrowdSec demonstrates that security hardening can be achieved without custom application code—by leveraging log-based threat detection at the infrastructure layer.&lt;/p&gt;

&lt;p&gt;For organizations where &lt;strong&gt;data sovereignty is non-negotiable&lt;/strong&gt;, this stack offers a production-ready alternative to managed services.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/patrickbloem-it/Netbird-self-hosted-stack" rel="noopener noreferrer"&gt;github.com/patrickbloem-it/Netbird-self-hosted-stack&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article reflects lessons learned from deploying overlay networks in public sector environments. Configurations are provided as-is for educational purposes and should be reviewed against your organization's specific security policies before production deployment.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>networking</category>
      <category>security</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Cost-Effective Disaster Recovery: Managing ZFS Snapshots on Proxmox VE</title>
      <dc:creator>patrickbloem-it</dc:creator>
      <pubDate>Tue, 30 Dec 2025 07:31:13 +0000</pubDate>
      <link>https://forem.com/patrickbloemit/cost-effective-disaster-recovery-managing-zfs-snapshots-on-proxmox-ve-4pbf</link>
      <guid>https://forem.com/patrickbloemit/cost-effective-disaster-recovery-managing-zfs-snapshots-on-proxmox-ve-4pbf</guid>
      <description>&lt;h1&gt;
  
  
  Why Simple is Better for Public Sector IT
&lt;/h1&gt;

&lt;p&gt;In my daily work as an Infrastructure Engineer in the public sector, I often face a common dilemma: We need enterprise-grade data integrity and auditability, but we don't always have the budget for high-end backup appliances.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complexity is the enemy of security.&lt;/strong&gt; That's why I prefer leveraging the native capabilities of &lt;strong&gt;ZFS&lt;/strong&gt; directly on our Proxmox VE hosts, rather than adding layers of third-party software that might introduce new vulnerabilities or compliance issues.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: Automated Snapshots without Bloat
&lt;/h2&gt;

&lt;p&gt;I needed a way to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create recurring snapshots of critical VMs and containers&lt;/li&gt;
&lt;li&gt;Rotate them automatically (GFS: hourly, daily, weekly retention)&lt;/li&gt;
&lt;li&gt;Have zero external dependencies (just a shell script)&lt;/li&gt;
&lt;li&gt;Be fully auditable via syslog integration&lt;/li&gt;
&lt;li&gt;Fail gracefully if a dataset is unavailable&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;I wrote a lightweight wrapper script around the native &lt;code&gt;zfs&lt;/code&gt; command. It's designed to be run as a cron job on any Debian-based Proxmox node.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Features
&lt;/h3&gt;

&lt;p&gt;Here's the production-ready version with proper error handling:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/bin/bash
# ZFS Snapshot Manager for Proxmox VE
# Retention: 24 hourly, 7 daily, 4 weekly
# Author: Patrick Bloem
set -euo pipefail # Exit on error, undefined vars, pipe failures

# Configuration
DATASET="${1:-rpool/data}"
LOG_FACILITY="local0"
TIMESTAMP=$(date +"%Y%m%d-%H%M%S")
RETENTION_HOURLY=24
RETENTION_DAILY=7

# Logging function
log() {
    logger -t "zfs-snapshot" -p "${LOG_FACILITY}.info" "$1"
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] $1"
}

error_exit() {
    logger -t "zfs-snapshot" -p "${LOG_FACILITY}.err" "ERROR: $1"
    echo "ERROR: $1" &amp;gt;&amp;amp;2
    exit 1
}

# Verify dataset exists
if ! zfs list -H -o name "$DATASET" &amp;amp;&amp;gt;/dev/null; then
    error_exit "Dataset $DATASET not found"
fi

# Create snapshot
SNAPSHOT_NAME="${DATASET}@auto-hourly-${TIMESTAMP}"
log "Creating snapshot: $SNAPSHOT_NAME"

if ! zfs snapshot -r "$SNAPSHOT_NAME" 2&amp;gt;&amp;amp;1 | logger -t "zfs-snapshot"; then
    error_exit "Failed to create snapshot $SNAPSHOT_NAME"
fi

# Prune old hourly snapshots (keep last N)
log "Pruning old snapshots (keeping last $RETENTION_HOURLY hourly)"

# destroy -r also removes the same-named snapshots that "zfs snapshot -r"
# created on child datasets; without it they would accumulate forever
zfs list -H -t snapshot -o name -s creation \
    | grep "${DATASET}@auto-hourly-" \
    | head -n -"$RETENTION_HOURLY" \
    | while read -r old_snap; do
        log "Destroying old snapshot: $old_snap"
        zfs destroy -r "$old_snap" || log "Warning: Could not destroy $old_snap"
    done

log "Snapshot rotation completed successfully"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Deployment
&lt;/h3&gt;

&lt;p&gt;Add this to your crontab for hourly execution:&lt;/p&gt;

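&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Run every hour at minute 5
5 * * * * /usr/local/bin/zfs-snapshot-manager.sh rpool/data 2&amp;gt;&amp;amp;1 | logger -t zfs-snapshot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;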

&lt;p&gt;For daily/weekly snapshots, create separate scripts with adjusted retention policies or use a tag-based approach.&lt;/p&gt;
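
&lt;p&gt;A tag-based variant could look like the sketch below, which reuses the &lt;code&gt;auto-hourly-&lt;/code&gt; naming scheme with the tier as a parameter. Both function names are hypothetical, the code is not taken from the published repository, and it assumes GNU &lt;code&gt;head&lt;/code&gt; for the negative line count.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# select_expired KEEP  (hypothetical helper)
# Reads snapshot names on stdin, oldest first (the order produced by
# "zfs list -s creation"), and prints every name except the newest KEEP.
select_expired() {
  local keep="$1"
  head -n -"$keep"
}

# prune_tier DATASET TIER KEEP  (hypothetical helper)
# Destroys all but the newest KEEP snapshots named auto-TIER-* on DATASET.
# -d 1 limits the listing to snapshots of DATASET itself; destroy -r then
# removes the same-named snapshots on its child datasets.
prune_tier() {
  local dataset="$1" tier="$2" keep="$3"
  zfs list -H -t snapshot -o name -s creation -d 1 "$dataset" \
    | grep -F "@auto-${tier}-" \
    | select_expired "$keep" \
    | while read -r snap; do
        zfs destroy -r "$snap"
      done
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A daily cron entry would then create a snapshot named &lt;code&gt;auto-daily-&lt;/code&gt; plus a timestamp and call &lt;code&gt;prune_tier rpool/data daily 7&lt;/code&gt;, leaving the hourly tier untouched because the tier name is part of the match.&lt;/p&gt;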

&lt;h2&gt;
  
  
  Why This Works for Compliance
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Auditability:&lt;/strong&gt; All actions are logged to syslog, which can be forwarded to a central SIEM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atomicity:&lt;/strong&gt; ZFS snapshots are atomic and crash-consistent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transparency:&lt;/strong&gt; No proprietary tools; every action is traceable via &lt;code&gt;zfs list -t snapshot&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotency:&lt;/strong&gt; Safe to run multiple times. Each run creates a uniquely timestamped snapshot, so names never collide, and pruning keeps the total bounded.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Advanced: Integration with Offsite Replication
&lt;/h2&gt;

&lt;p&gt;In production, I combine this with &lt;code&gt;zfs send/recv&lt;/code&gt; for offsite replication:&lt;/p&gt;

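&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Example: Replicate to remote NAS
# -d 1 rpool/data restricts the listing to the parent dataset's own snapshots,
# so a child dataset's snapshot is never picked as the replication source
LATEST_SNAP=$(zfs list -H -t snapshot -o name -s creation -d 1 rpool/data | grep "auto-hourly" | tail -1)
zfs send -R "$LATEST_SNAP" | ssh backup-host "zfs recv -F backup/proxmox"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;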

&lt;p&gt;This gives us:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Recovery Point Objective (RPO):&lt;/strong&gt; 1 hour&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recovery Time Objective (RTO):&lt;/strong&gt; Minutes (just mount the dataset)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get the Full Script
&lt;/h2&gt;

&lt;p&gt;I've published the full, hardened version of this tool on GitHub with additional features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-dataset support&lt;/li&gt;
&lt;li&gt;Configurable retention policies&lt;/li&gt;
&lt;li&gt;Integration with Prometheus for monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;strong&gt;Check it out here:&lt;/strong&gt; &lt;a href="https://github.com/patrickbloem-it/proxmox-zfs-snapshot-manager" rel="noopener noreferrer"&gt;proxmox-zfs-snapshot-manager on GitHub&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feel free to fork it or suggest improvements. In the public sector, sharing reliable, open-source tooling is the best way to ensure we all build more resilient infrastructure.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;About me: I'm Patrick, a Senior Infrastructure Engineer focusing on Linux hardening and virtualization in the public sector. Connect with me on &lt;a href="https://www.linkedin.com/in/patrick-bloem-it/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or check out my other projects on &lt;a href="https://github.com/patrickbloem-it" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>pve</category>
      <category>backup</category>
      <category>zfs</category>
      <category>linux</category>
    </item>
  </channel>
</rss>
