<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Okeoghene Akwerigbe</title>
    <description>The latest articles on Forem by Okeoghene Akwerigbe (@okeoghene_akwerigbe_a07a5).</description>
    <link>https://forem.com/okeoghene_akwerigbe_a07a5</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3903163%2Faca72602-9ff7-43c7-b286-a725297498f3.png</url>
      <title>Forem: Okeoghene Akwerigbe</title>
      <link>https://forem.com/okeoghene_akwerigbe_a07a5</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/okeoghene_akwerigbe_a07a5"/>
    <language>en</language>
    <item>
      <title>SwiftDeploy: Building a Declarative Deployment CLI with Observability and OPA Policy Gates</title>
      <dc:creator>Okeoghene Akwerigbe</dc:creator>
      <pubDate>Wed, 06 May 2026 20:56:38 +0000</pubDate>
      <link>https://forem.com/okeoghene_akwerigbe_a07a5/swiftdeploy-building-a-declarative-deployment-cli-with-observability-and-opa-policy-gates-4f10</link>
      <guid>https://forem.com/okeoghene_akwerigbe_a07a5/swiftdeploy-building-a-declarative-deployment-cli-with-observability-and-opa-policy-gates-4f10</guid>
      <description>&lt;p&gt;The idea of SwiftDeploy is really simple: What if I could describe my deployment once, then let a tool generate the infrastructure files for me?&lt;/p&gt;

&lt;p&gt;Most beginner DevOps projects teach you to write a docker-compose.yml, configure Nginx, wire up containers, add health checks, and then remember to keep all of those files in sync. That is useful practice, but in real projects it can become messy very quickly.&lt;/p&gt;

&lt;p&gt;You change a port in one file and forget another. You switch a service from stable to canary in Docker Compose but forget the value in your notes. You edit Nginx by hand and now nobody knows whether the generated config still matches the intended deployment.&lt;/p&gt;

&lt;p&gt;SwiftDeploy is an attempt to solve that problem in a small, understandable way.&lt;/p&gt;

&lt;p&gt;It is a Python CLI tool that reads one file, manifest.yaml, then generates the infrastructure files, deploys the stack, checks policies, exposes metrics, and keeps an audit trail.&lt;/p&gt;

&lt;p&gt;In this post, I will walk through how SwiftDeploy works and how you can rebuild the same idea yourself.&lt;/p&gt;

&lt;p&gt;Repository: &lt;a href="https://github.com/AkwerigbeO/swiftdeploy" rel="noopener noreferrer"&gt;https://github.com/AkwerigbeO/swiftdeploy&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What SwiftDeploy Does&lt;/strong&gt;&lt;br&gt;
SwiftDeploy is a small deployment tool with five main responsibilities:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read a deployment manifest&lt;/li&gt;
&lt;li&gt;Generate Docker Compose and Nginx config files&lt;/li&gt;
&lt;li&gt;Deploy a FastAPI app behind Nginx&lt;/li&gt;
&lt;li&gt;Use OPA policies as safety gates before deploys and promotions&lt;/li&gt;
&lt;li&gt;Expose metrics, show status, and generate an audit report&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The project structure looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.
|-- app/
|   |-- main.py
|   `-- requirements.txt
|-- policies/
|   |-- infra.rego
|   `-- canary.rego
|-- templates/
|   |-- docker-compose.yml.tpl
|   `-- nginx.conf.tpl
|-- Dockerfile
|-- manifest.yaml
|-- swiftdeploy
`-- README.md

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important thing is that manifest.yaml is the source of truth. The generated files are not meant to be edited directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Core Idea: One Manifest Controls Everything&lt;/strong&gt;&lt;br&gt;
Here is an example manifest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;swift-odysia:latest&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3000&lt;/span&gt;
  &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;canary&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
  &lt;span class="na"&gt;restart_policy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;nginx&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx:latest&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8844&lt;/span&gt;
  &lt;span class="na"&gt;proxy_timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
  &lt;span class="na"&gt;contact&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;you@example.com&lt;/span&gt;

&lt;span class="na"&gt;network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;swiftdeploy-net&lt;/span&gt;
  &lt;span class="na"&gt;driver_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridge&lt;/span&gt;

&lt;span class="na"&gt;policy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;opa_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:8181&lt;/span&gt;
  &lt;span class="na"&gt;infra_min_disk_gb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="na"&gt;infra_max_cpu_load&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2.0&lt;/span&gt;
  &lt;span class="na"&gt;canary_max_error_rate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.01&lt;/span&gt;
  &lt;span class="na"&gt;canary_max_p99_latency_seconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;
  &lt;span class="na"&gt;canary_window_seconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This file answers the basic deployment questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What image should run?&lt;/li&gt;
&lt;li&gt;What port does the app use?&lt;/li&gt;
&lt;li&gt;Should the app run as stable or canary?&lt;/li&gt;
&lt;li&gt;What port should Nginx expose?&lt;/li&gt;
&lt;li&gt;What Docker network should the containers use?&lt;/li&gt;
&lt;li&gt;What policy limits should be enforced?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once the manifest exists, SwiftDeploy can generate the rest.&lt;/p&gt;
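&lt;p&gt;To make that concrete, here is a minimal sketch of the kind of fail-fast check a manifest loader can run before generating anything. This is illustrative, not SwiftDeploy's actual loader; it assumes the YAML has already been parsed into a Python dict, and the required keys simply follow the example manifest above.&lt;/p&gt;

```python
# A minimal sketch of a fail-fast manifest check (illustrative, not
# SwiftDeploy's actual loader). Assumes the YAML has already been
# parsed into a dict; the required keys follow the example manifest.

REQUIRED = {
    "services": ("image", "port", "mode"),
    "nginx": ("image", "port"),
    "network": ("name",),
}

def validate_manifest(manifest):
    """Return a list of problems; an empty list means it is usable."""
    problems = []
    for section, keys in REQUIRED.items():
        body = manifest.get(section)
        if body is None:
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            if key not in body:
                problems.append(f"missing key: {section}.{key}")
    mode = manifest.get("services", {}).get("mode")
    if mode is not None and mode not in ("stable", "canary"):
        problems.append(f"mode must be stable or canary, got {mode!r}")
    return problems
```

&lt;p&gt;Validating early means a typo in manifest.yaml fails at init time instead of surfacing later as a broken generated file.&lt;/p&gt;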

&lt;p&gt;&lt;strong&gt;The Design: A Tool That Writes Its Own Infrastructure Files&lt;/strong&gt;&lt;br&gt;
The first command is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./swiftdeploy init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On Windows PowerShell, you can run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python swiftdeploy init

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command reads manifest.yaml, then renders two template files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;templates/docker-compose.yml.tpl -&amp;gt; docker-compose.yml
templates/nginx.conf.tpl         -&amp;gt; nginx.conf

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The template contains placeholders like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;SERVICE_IMAGE&lt;/span&gt;&lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI replaces that with the value given in the manifest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;swift-odysia:latest&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the whole trick. SwiftDeploy is not magically inventing infrastructure. It is using a manifest plus templates to generate consistent config files.&lt;/p&gt;
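&lt;p&gt;The rendering step boils down to placeholder substitution, which can be sketched with plain string replacement. The real templates and placeholder names may differ; this is just the idea.&lt;/p&gt;

```python
# Placeholder substitution sketch: replace every {{NAME}} marker in a
# template with its value from the manifest. Illustrative, not the
# tool's exact rendering code.

def render(template, values):
    """Replace every {{NAME}} placeholder with its manifest value."""
    out = template
    for name, value in values.items():
        out = out.replace("{{" + name + "}}", str(value))
    return out
```

&lt;p&gt;Calling render("image: {{SERVICE_IMAGE}}", {"SERVICE_IMAGE": "swift-odysia:latest"}) produces exactly the generated line shown above.&lt;/p&gt;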

&lt;p&gt;The generated Docker Compose file creates three main containers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;app: the FastAPI service&lt;/li&gt;
&lt;li&gt;nginx: the public reverse proxy&lt;/li&gt;
&lt;li&gt;opa: the policy engine&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The app container does not publish its port directly. It only uses expose, so traffic must go through Nginx.&lt;/p&gt;

&lt;p&gt;Nginx is the public entry point. It listens on the port from the manifest, forwards traffic to the app, adds useful headers, and returns JSON error bodies for gateway failures.&lt;/p&gt;

&lt;p&gt;OPA runs as a sidecar. The CLI talks to OPA when it needs policy decisions.&lt;/p&gt;

&lt;p&gt;The architecture looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89aageoksz95g065qvhs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89aageoksz95g065qvhs.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Building the App&lt;/strong&gt;&lt;br&gt;
The application is a small FastAPI service. It supports two modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stable&lt;/li&gt;
&lt;li&gt;canary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The mode comes from an environment variable, MODE=stable or MODE=canary.&lt;/p&gt;

&lt;p&gt;The app exposes four endpoints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GET /&lt;/li&gt;
&lt;li&gt;GET /healthz&lt;/li&gt;
&lt;li&gt;GET /metrics&lt;/li&gt;
&lt;li&gt;POST /chaos&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The root endpoint returns a welcome response with the mode, version, and timestamp.&lt;/p&gt;

&lt;p&gt;The health endpoint returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"uptime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"canary"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the app is running in canary mode, it adds this header to responses:&lt;br&gt;
X-Mode: canary&lt;/p&gt;

&lt;p&gt;That makes it easy to confirm which version of the service is responding.&lt;/p&gt;
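&lt;p&gt;The health payload and the canary-only header are small enough to sketch as plain functions. This is illustrative, not the app's real FastAPI handlers; the field names mirror the /healthz response shown above.&lt;/p&gt;

```python
# Sketch of the /healthz body and the canary-only X-Mode header,
# written as plain functions rather than real FastAPI handlers.
import time

START = time.monotonic()

def healthz(mode):
    """Return the /healthz body plus any extra response headers."""
    body = {
        "status": "ok",
        "uptime": int(time.monotonic() - START),
        "mode": mode,
    }
    # Only the canary build advertises itself in a header.
    headers = {"X-Mode": "canary"} if mode == "canary" else {}
    return body, headers
```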

&lt;p&gt;&lt;strong&gt;Adding Observability with /metrics&lt;/strong&gt;&lt;br&gt;
Deploying a service is only half the story. You also need to see what it is doing.&lt;/p&gt;

&lt;p&gt;SwiftDeploy exposes a Prometheus-compatible /metrics endpoint. The app tracks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight prometheus"&gt;&lt;code&gt;&lt;span class="n"&gt;http_requests_total&lt;/span&gt;
&lt;span class="n"&gt;http_request_duration_seconds&lt;/span&gt;
&lt;span class="n"&gt;app_uptime_seconds&lt;/span&gt;
&lt;span class="n"&gt;app_mode&lt;/span&gt;
&lt;span class="n"&gt;chaos_active&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The request counter uses labels:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;method
path
status_code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That means you can see how many requests went to /healthz, how many hit /, and how many returned 500.&lt;/p&gt;

&lt;p&gt;The latency histogram lets the CLI estimate P99 latency. That becomes important later when deciding whether a canary is healthy enough to promote.&lt;/p&gt;
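&lt;p&gt;For readers who have not worked with Prometheus histograms: each bucket stores a cumulative count of observations at or below an upper bound, and P99 is estimated by interpolating inside the bucket that crosses the 99% mark. A sketch of that estimation, using the same linear-interpolation idea as PromQL's histogram_quantile (the bucket bounds and counts here are illustrative):&lt;/p&gt;

```python
# Estimate P99 from Prometheus histogram buckets, given as
# (upper_bound_seconds, cumulative_count) pairs in ascending order.
# Uses linear interpolation inside the bucket that crosses 99%.

def p99(buckets):
    """Estimate the 99th-percentile latency from cumulative buckets."""
    total = buckets[-1][1]
    if total == 0:
        return 0.0
    target = 0.99 * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= target:
            # The target observation falls in this bucket: interpolate.
            span = count - prev_count
            frac = (target - prev_count) / span if span else 1.0
            return prev_bound + (bound - prev_bound) * frac
        prev_bound, prev_count = bound, count
    return buckets[-1][0]
```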

&lt;p&gt;You can check metrics with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:8844/metrics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or in PowerShell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;Invoke-WebRequest&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-UseBasicParsing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;http://localhost:8844/metrics&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;Select-Object&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-ExpandProperty&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Content&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Guardrails (OPA)&lt;/strong&gt;&lt;br&gt;
A deployment tool should not just run commands blindly. It should ask: “Is this safe?”&lt;/p&gt;

&lt;p&gt;That is where OPA, Open Policy Agent, comes in.&lt;/p&gt;

&lt;p&gt;OPA lets you write policy rules in Rego. Instead of hardcoding all safety checks inside the CLI, SwiftDeploy sends facts to OPA and lets OPA decide whether an action is allowed.&lt;/p&gt;

&lt;p&gt;This is an important design choice:&lt;/p&gt;

&lt;p&gt;The CLI gathers information. OPA makes the policy decision.&lt;/p&gt;
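&lt;p&gt;That split can be sketched as two small helpers: one shapes the facts the CLI gathered into an OPA input document, the other interprets OPA's answer. The HTTP POST itself (to OPA's data API) is omitted here so the decision handling stands on its own; the field names match the simplified Rego policy shown later in this post.&lt;/p&gt;

```python
# Sketch of the CLI side of an OPA policy check: package the gathered
# facts as an input document, then read the allow/deny answer. The
# actual HTTP POST to opa_url is omitted; this is illustrative.
import json

def build_opa_input(disk_free_gb, cpu_load, min_disk, max_cpu):
    """Package host facts and manifest thresholds as OPA input."""
    return {"input": {
        "disk_free": disk_free_gb,
        "cpu_load": cpu_load,
        "min_disk": min_disk,
        "max_cpu": max_cpu,
    }}

def decision_allows(response_body):
    """OPA replies like {"result": true}; treat anything else as deny."""
    return json.loads(response_body).get("result") is True
```

&lt;p&gt;Treating a missing result as a deny is the safe default: if the policy cannot be evaluated, the deploy should not proceed.&lt;/p&gt;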

&lt;p&gt;SwiftDeploy has two policy files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;policies/infra.rego&lt;/li&gt;
&lt;li&gt;policies/canary.rego&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They answer different questions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure Policy&lt;/strong&gt;&lt;br&gt;
The infrastructure policy answers:&lt;/p&gt;

&lt;p&gt;Is the host safe enough for deployment?&lt;/p&gt;

&lt;p&gt;It checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;free disk space&lt;/li&gt;
&lt;li&gt;CPU load&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The policy denies deployment if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;disk free is less than 10 GB&lt;/li&gt;
&lt;li&gt;CPU load is greater than 2.0&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those limits come from manifest.yaml, not from the Rego file.&lt;/p&gt;

&lt;p&gt;That matters because policy logic and environment configuration are different things. The Rego file defines the rule. The manifest defines the threshold.&lt;/p&gt;

&lt;p&gt;A simplified version of the logic is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;&lt;span class="ow"&gt;package&lt;/span&gt; &lt;span class="n"&gt;infra&lt;/span&gt;

&lt;span class="ow"&gt;default&lt;/span&gt; &lt;span class="n"&gt;allow&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="n"&gt;allow&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="n"&gt;if&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;disk_free&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min_disk&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;allow&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="n"&gt;if&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpu_load&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_cpu&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./swiftdeploy deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SwiftDeploy:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Starts OPA&lt;/li&gt;
&lt;li&gt;Collects host disk and CPU data&lt;/li&gt;
&lt;li&gt;Sends that data to OPA&lt;/li&gt;
&lt;li&gt;Blocks the deploy if OPA denies it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;An example failure looks like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DEPLOY BLOCKED by infra policy:
- Disk space below minimum threshold
FAIL: Deployment aborted due to policy violations.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That is the hard gate. If the environment is unsafe, the deploy stops.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Canary Safety Policy&lt;/strong&gt;&lt;br&gt;
The canary policy answers:&lt;/p&gt;

&lt;p&gt;Is the canary healthy enough to promote?&lt;/p&gt;

&lt;p&gt;It checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;error rate&lt;/li&gt;
&lt;li&gt;P99 latency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The policy denies promotion if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;error rate is greater than 1%&lt;/li&gt;
&lt;li&gt;P99 latency is greater than 500ms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before promoting, SwiftDeploy scrapes /metrics, samples a configured window, calculates the error rate and P99 latency, then sends those facts to OPA.&lt;/p&gt;

&lt;p&gt;The CLI does not decide whether 1% is good or bad. OPA does.&lt;/p&gt;

&lt;p&gt;That keeps the deployment flow flexible. If I want to make the policy stricter later, I can change the policy threshold in the manifest without rewriting the deployment logic.&lt;/p&gt;
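&lt;p&gt;One detail worth spelling out: Prometheus counters only increase, so the error rate over the sampling window comes from the deltas between two /metrics scrapes, not from the raw totals. A sketch of that calculation (illustrative, not SwiftDeploy's exact code):&lt;/p&gt;

```python
# Turn two /metrics scrapes, taken at the edges of the sampling
# window, into an error rate. Counters are cumulative, so only the
# deltas matter.

def error_rate(first, second):
    """first/second: cumulative counts scraped at the window edges."""
    requests = second["total"] - first["total"]
    if requests == 0:
        # No traffic in the window: nothing to judge the canary on.
        return 0.0
    errors = second["errors"] - first["errors"]
    return errors / requests
```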

&lt;p&gt;&lt;strong&gt;Why OPA Isolation Matters&lt;/strong&gt;&lt;br&gt;
OPA is powerful because it answers policy questions. That also means it should not be exposed publicly.&lt;/p&gt;

&lt;p&gt;In SwiftDeploy, Nginx is the public ingress. OPA is bound to 127.0.0.1:8181.&lt;/p&gt;

&lt;p&gt;The CLI can reach it from the host machine, but public traffic through Nginx cannot reach the OPA API.&lt;/p&gt;

&lt;p&gt;That separation matters because users should access the application, not the policy engine.&lt;/p&gt;

&lt;p&gt;You can test that OPA is not leaking through Nginx:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;Invoke-WebRequest&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-UseBasicParsing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;http://localhost:8844/v1/data/infra&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That should not return an OPA policy response through the public app port.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deploying the Stack&lt;/strong&gt;&lt;br&gt;
To replicate the project locally, start by building the app image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;-t&lt;/span&gt; swift-odysia:latest &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Generate infrastructure files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./swiftdeploy init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Validate the setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./swiftdeploy validate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Deploy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./swiftdeploy deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check health:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:8844/healthz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"uptime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"canary"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Status View&lt;/strong&gt;&lt;br&gt;
SwiftDeploy includes a status command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./swiftdeploy status

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a single snapshot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./swiftdeploy status &lt;span class="nt"&gt;--count&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The status view shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;mode&lt;/li&gt;
&lt;li&gt;uptime&lt;/li&gt;
&lt;li&gt;chaos state&lt;/li&gt;
&lt;li&gt;throughput&lt;/li&gt;
&lt;li&gt;error rate&lt;/li&gt;
&lt;li&gt;P99 latency&lt;/li&gt;
&lt;li&gt;policy compliance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;========================================================================
SWIFTDEPLOY STATUS
========================================================================
Time          : 2026-05-06T17:39:30.794723+00:00
Mode          : canary
Uptime        : 10s
Chaos         : 0 (0=none, 1=slow, 2=error)
Throughput    : 0.00 req/s
Error rate    : 0.00%
P99 latency   : 0.100s

Policy Compliance
- infra: PASS - policy allowed
- canary: PASS - policy allowed
========================================================================
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every status scrape is appended to history.jsonl. That file becomes the raw audit trail.&lt;/p&gt;
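&lt;p&gt;The JSONL format is worth a quick sketch: one JSON object per line, appended after every scrape, so the file can be grepped, tailed, or replayed later. The record fields here are illustrative, not the tool's exact schema.&lt;/p&gt;

```python
# Sketch of a JSONL audit trail like history.jsonl: one JSON object
# per status scrape, appended as a single line. Field names are
# illustrative.
import io
import json
import datetime

def append_status(fp, mode, error_rate, p99_latency):
    """Append one status record to an open text file or buffer."""
    record = {
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "mode": mode,
        "error_rate": error_rate,
        "p99_latency": p99_latency,
    }
    fp.write(json.dumps(record) + "\n")

# In the real tool fp would be open("history.jsonl", "a"); a StringIO
# behaves the same way for demonstration.
buf = io.StringIO()
append_status(buf, "canary", 0.0, 0.1)
```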

&lt;p&gt;&lt;strong&gt;The Chaos: Injecting Slow Responses&lt;/strong&gt;&lt;br&gt;
The app has a /chaos endpoint that only works in canary mode.&lt;/p&gt;

&lt;p&gt;To inject slow responses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8844/chaos &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"mode":"slow","duration":2}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then generate traffic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;for &lt;/span&gt;i &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..10&lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do &lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://localhost:8844/ &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On PowerShell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ForEach-Object&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;Invoke-WebRequest&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-UseBasicParsing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;http://localhost:8844/&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Out-Null&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./swiftdeploy status &lt;span class="nt"&gt;--count&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Chaos: Injecting Errors&lt;/strong&gt;&lt;br&gt;
To inject errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8844/chaos &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"mode":"error","rate":0.5}'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Generate traffic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;for &lt;/span&gt;i &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..20&lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do &lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://localhost:8844/ &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In PowerShell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;1..20 | ForEach-Object {
  try {
    Invoke-WebRequest -UseBasicParsing http://localhost:8844/ | Out-Null
  } catch {}
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now the metrics endpoint records more 500 responses. The status view should show the error rate climbing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lessons Learned&lt;/strong&gt;&lt;br&gt;
The first lesson is that a deployment tool is more than a wrapper around docker compose up.&lt;/p&gt;

&lt;p&gt;A good deployment tool needs to know:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what should exist&lt;/li&gt;
&lt;li&gt;whether the environment is safe&lt;/li&gt;
&lt;li&gt;whether the app is healthy&lt;/li&gt;
&lt;li&gt;what changed over time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The second lesson is that generated files are powerful when there is a clear source of truth. By making manifest.yaml the only file operators need to edit, the system becomes easier to reason about.&lt;/p&gt;

&lt;p&gt;The third lesson is that policy belongs in its own layer. The CLI should collect facts, but OPA should make the allow/deny decision. That separation makes the system easier to test and safer to extend.&lt;/p&gt;

&lt;p&gt;The fourth lesson is that canary deployments need metrics. A container can be running and still be a bad candidate for promotion. Error rate and latency tell a better story than health checks alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;br&gt;
SwiftDeploy is not a replacement for Kubernetes, Terraform, or a production deployment platform. It is a learning project that shows the core ideas behind those tools in a smaller package:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;declare desired state&lt;/li&gt;
&lt;li&gt;generate infrastructure&lt;/li&gt;
&lt;li&gt;observe runtime behavior&lt;/li&gt;
&lt;li&gt;enforce safety policies&lt;/li&gt;
&lt;li&gt;keep an audit trail&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are learning DevOps, this kind of project is a good way to understand how deployment automation, observability, policy, and reliability fit together.&lt;/p&gt;

&lt;p&gt;The best part is that the idea is simple enough to rebuild:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start with a manifest&lt;/li&gt;
&lt;li&gt;Add templates&lt;/li&gt;
&lt;li&gt;Write a CLI to render them&lt;/li&gt;
&lt;li&gt;Add health checks&lt;/li&gt;
&lt;li&gt;Add metrics&lt;/li&gt;
&lt;li&gt;Add OPA policy gates&lt;/li&gt;
&lt;li&gt;Add history and audit reporting&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is SwiftDeploy in one sentence:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A small deployment CLI that turns one manifest into a running, observable, policy-protected stack.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>automation</category>
      <category>cli</category>
      <category>devops</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Multi-Layered Defense: Building an Intelligent, Dual-Trigger Firewall</title>
      <dc:creator>Okeoghene Akwerigbe</dc:creator>
      <pubDate>Wed, 29 Apr 2026 13:40:45 +0000</pubDate>
      <link>https://forem.com/okeoghene_akwerigbe_a07a5/multi-layered-defense-building-an-intelligent-dual-trigger-firewall-27kc</link>
      <guid>https://forem.com/okeoghene_akwerigbe_a07a5/multi-layered-defense-building-an-intelligent-dual-trigger-firewall-27kc</guid>
      <description>&lt;p&gt;In modern DevOps, a simple rate-limiter isn't enough. If you set a hard limit of "10 requests per second," what happens when your product goes viral and legitimate customers hit that limit? You end up blocking the exact people you want to serve. Professional security requires intelligence: systems that adapt to your traffic patterns and can tell the difference between a busy hour and a malicious botnet.&lt;/p&gt;

&lt;p&gt;For my &lt;strong&gt;HNG Stage 3&lt;/strong&gt; project, I decided to move beyond basic firewalls. I built a Python-based &lt;strong&gt;Anomaly Detection Engine&lt;/strong&gt; that uses a &lt;strong&gt;Dual-Trigger System&lt;/strong&gt; to protect a server in real-time.&lt;/p&gt;

&lt;p&gt;Here is a deep dive into the architecture, the math, and the automation behind it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The Architecture: Real-Time Log Tailing&lt;/strong&gt;&lt;br&gt;
The foundation of the engine is a continuous loop that monitors the server's pulse. Instead of acting as a proxy that traffic must pass through (which can slow things down), this engine sits entirely out of the way.&lt;/p&gt;

&lt;p&gt;It uses a Python script to "tail" Nginx JSON access logs asynchronously. The moment a request hits Nginx, my script reads the log entry, extracts the IP address, and immediately begins evaluating it. This means the engine is lightweight and doesn't add any latency to the actual web application.&lt;/p&gt;
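&lt;p&gt;A rough sketch of that tailing loop might look like this. It assumes the Nginx &lt;code&gt;log_format&lt;/code&gt; emits JSON with a &lt;code&gt;remote_addr&lt;/code&gt; field; the field name and file path are illustrative assumptions, not the exact production code.&lt;/p&gt;

```python
import json
import time

def follow(path):
    """Yield new lines appended to the file, like `tail -f`."""
    with open(path) as f:
        f.seek(0, 2)  # jump to the end: only new entries matter
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.1)  # nothing new yet; back off briefly
                continue
            yield line

def extract_ip(line):
    """Pull the client IP out of one JSON-format access-log entry."""
    try:
        return json.loads(line).get("remote_addr")
    except json.JSONDecodeError:
        return None  # skip partial or malformed lines

# Typical wiring (path is a placeholder):
# for line in follow("/var/log/nginx/access.json.log"):
#     ip = extract_ip(line)
```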

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wdsonq5gen1lxlxro5k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wdsonq5gen1lxlxro5k.png" alt="Architecture Diagram"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzudjtlmgtseh9s8e66yv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzudjtlmgtseh9s8e66yv.png" alt="Python Script tailing logs running"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How the Sliding Window Works&lt;/strong&gt;&lt;br&gt;
To detect attacks, the engine needs to know how many requests are happening &lt;strong&gt;right now&lt;/strong&gt;, not just how many happened in the last full minute. A normal per-minute counter would be too slow because an attacker could send a burst of traffic and disappear before the next minute ends.&lt;/p&gt;

&lt;p&gt;To solve this, I used a sliding window with Python’s &lt;code&gt;deque&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Think of a &lt;code&gt;deque&lt;/code&gt; like a queue of timestamps. Every time a request comes in, the detector stores the current time inside the queue. Before calculating the request rate, it removes every timestamp older than 60 seconds. Whatever remains in the queue represents traffic from the last 60 seconds only.&lt;/p&gt;

&lt;p&gt;Here is the basic idea:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;python&lt;/span&gt;
&lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;window&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;popleft&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;current_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The detector keeps two types of sliding windows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One global window for all requests hitting the server.&lt;/li&gt;
&lt;li&gt;One per-IP window for each source IP address.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This helps the engine detect two different situations. If one IP is sending too many requests, it can be blocked directly. If traffic from many IPs rises at the same time, the engine treats it as a global traffic spike and sends a Slack alert without banning everyone.&lt;/p&gt;
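&lt;p&gt;The two kinds of windows can be sketched together like this; a minimal illustration of the idea, not the exact engine code:&lt;/p&gt;

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60

class SlidingWindows:
    """One global request window plus one window per source IP."""

    def __init__(self):
        self.global_window = deque()
        self.ip_windows = defaultdict(deque)

    def _prune(self, window, now):
        # Drop timestamps that have slid out of the 60-second window.
        while window and window[0] < now - WINDOW_SECONDS:
            window.popleft()

    def record(self, ip, now=None):
        """Record one request; return (global_rate, per_ip_rate) in req/s."""
        now = time.time() if now is None else now
        for window in (self.global_window, self.ip_windows[ip]):
            self._prune(window, now)
            window.append(now)
        return (len(self.global_window) / WINDOW_SECONDS,
                len(self.ip_windows[ip]) / WINDOW_SECONDS)
```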

&lt;p&gt;&lt;strong&gt;2. The Dual-Trigger System&lt;/strong&gt;&lt;br&gt;
What makes this engine robust is that it doesn't rely on a single detection method. It evaluates every IP address through two distinct logic paths simultaneously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trigger A: The Statistical "Brain" (Z-Score)&lt;/strong&gt;&lt;br&gt;
This trigger is designed to catch "stealthy" attacks—bots that scrape your site slowly to avoid triggering basic alarms. It works by calculating a &lt;strong&gt;Baseline Mean&lt;/strong&gt; (your historical average traffic) and a &lt;strong&gt;Standard Deviation&lt;/strong&gt; (the normal "wobble" or fluctuation of your traffic).&lt;/p&gt;

&lt;p&gt;The engine evaluates incoming traffic using the Z-Score formula:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw0al4zgdprkddqvcaxzg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw0al4zgdprkddqvcaxzg.png" alt="Z-Score formula"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How the Baseline Learns From Traffic&lt;/strong&gt;&lt;br&gt;
The baseline is what tells the detector what “normal” looks like. Instead of hardcoding a fixed number like “10 requests per second,” the engine watches real traffic and learns from it.&lt;/p&gt;

&lt;p&gt;It stores per-second request counts over a rolling 30-minute window. Every 60 seconds, it recalculates the average request rate and the standard deviation. This means the baseline keeps adjusting as traffic changes throughout the day.&lt;/p&gt;

&lt;p&gt;For example, if the server normally gets 1 request per second at night but 8 requests per second during a busy hour, the detector should not treat both periods the same way. A fixed threshold would either be too strict during busy hours or too weak during quiet hours.&lt;/p&gt;

&lt;p&gt;The engine also keeps hourly traffic slots. When the current hour has enough data, it prefers that hour’s baseline because traffic patterns can change depending on the time of day. This makes the detector more adaptive and reduces false positives.&lt;/p&gt;

&lt;p&gt;In a normal distribution, roughly 99.7% of all activity falls within a Z-Score of 3.0. So when an IP's traffic generates a Z-Score of 3.0 or more (which my engine frequently caught during testing), it is statistically almost certain this isn't a normal user, it's an anomaly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trigger B: The Volumetric "Shield" (Rate Multiplier)&lt;/strong&gt;&lt;br&gt;
While the Z-Score is brilliant for complex patterns, it takes a few seconds of data to calculate. What if an attacker tries to crash the server instantly with a massive flood of traffic?&lt;/p&gt;

&lt;p&gt;That is where the &lt;strong&gt;Rate Multiplier&lt;/strong&gt; comes in. This is a fail-safe configured with a multiplier of 5. It constantly compares the current traffic to the baseline. If your normal traffic is 1.0 request per second, and an IP suddenly spikes to over 5.0 requests per second, this trigger trips immediately. It acts as an emergency brake before the Z-Score even has time to finish its math.&lt;/p&gt;
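&lt;p&gt;Put together, the dual-trigger check is short. The baseline value and function name here are illustrative; the multiplier of 5 and the 3.0 Z-Score threshold are the ones described above.&lt;/p&gt;

```python
BASELINE_MEAN = 1.0   # learned req/s (assumed value for this sketch)
RATE_MULTIPLIER = 5   # volumetric fail-safe
Z_THRESHOLD = 3.0     # statistical trigger

def check_triggers(current_rate, z):
    """Return which trigger fired for this IP, or None."""
    if current_rate > RATE_MULTIPLIER * BASELINE_MEAN:
        return "rate-multiplier"  # emergency brake trips first
    if z >= Z_THRESHOLD:
        return "z-score"          # statistical anomaly
    return None
```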

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frvmzmhqqgmjdw4vv4sa6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frvmzmhqqgmjdw4vv4sa6.png" alt="Screenshot showing Rate Multiplier and Z-score"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Automated Response: iptables and Docker Routing&lt;/strong&gt;&lt;br&gt;
Detecting a threat is only half the battle; the system must neutralize it without human intervention.&lt;/p&gt;

&lt;p&gt;When a trigger fires, the Python engine communicates directly with the Linux kernel's firewall (iptables). However, because modern applications run inside Docker containers, standard firewall rules often fail. Docker aggressively rewrites network rules, which means blocking an IP on the standard INPUT chain won't work—the traffic will slip right past it.&lt;/p&gt;

&lt;p&gt;To solve this, my engine targets the DOCKER-USER chain. By inserting a DROP rule here, the malicious IP is blocked at the lowest possible kernel level before the traffic is even allowed to route toward the Docker container.&lt;/p&gt;
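&lt;p&gt;A minimal version of that ban step is below. It inserts a DROP rule at position 1 of the DOCKER-USER chain so it is evaluated before Docker's own rules; running it for real requires root, so this sketch defaults to a dry run.&lt;/p&gt;

```python
import subprocess

def ban_ip(ip, dry_run=True):
    """Build (and optionally apply) a DROP rule in the DOCKER-USER chain."""
    cmd = ["iptables", "-I", "DOCKER-USER", "1", "-s", ip, "-j", "DROP"]
    if not dry_run:
        subprocess.run(cmd, check=True)  # needs root privileges
    return cmd
```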

&lt;p&gt;&lt;strong&gt;The Escalation Ladder&lt;/strong&gt;&lt;br&gt;
Security shouldn't be entirely unforgiving. I programmed an automated background worker (the "Unbanner") to manage a backoff schedule:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First Offense:&lt;/strong&gt; The IP is banned for exactly 10 minutes to cool off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repeat Offenses:&lt;/strong&gt; If the IP returns and attacks again, the penalty increases to 30 minutes, and then to 2 hours.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Final Strike:&lt;/strong&gt; On the fourth offense, the engine flags the IP as purely hostile and issues a PERMANENT ban (999,999 minutes).&lt;/p&gt;
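&lt;p&gt;The ladder itself is just a lookup over the schedule described above:&lt;/p&gt;

```python
BAN_MINUTES = [10, 30, 120, 999_999]  # 4th offense onward is effectively permanent

def ban_duration(offense_count):
    """Minutes to ban for the nth offense (1-indexed)."""
    index = min(offense_count, len(BAN_MINUTES)) - 1
    return BAN_MINUTES[index]
```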

&lt;p&gt;&lt;strong&gt;4. Full Visibility: Live Dashboards &amp;amp; Instant Alerts&lt;/strong&gt;&lt;br&gt;
You can't secure what you can't see. To ensure the engine was performing correctly, I built a full observability suite.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live Metrics:&lt;/strong&gt; A web dashboard that plots a sliding window of the Current Requests Per Second against the Baseline Mean, making it incredibly easy to visualize traffic spikes in real-time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Active Threats:&lt;/strong&gt; A "Currently Banned" panel that lists the IPs currently sitting in the iptables penalty box.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Slack Webhooks:&lt;/strong&gt; Every time a block occurs, the engine constructs a JSON payload and fires it to a Slack channel. The alert details the offending IP, the exact trigger that caught them (e.g., Rate Multiplier), and how long they are banned for.&lt;/p&gt;
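&lt;p&gt;A minimal sender for such an alert could look like this, using Slack's incoming-webhook format. The webhook URL is a placeholder and the message wording is an assumption, not the engine's exact payload.&lt;/p&gt;

```python
import json
import urllib.request

def build_alert(ip, trigger, ban_minutes):
    """JSON payload describing a block, in Slack incoming-webhook format."""
    return {
        "text": (f":rotating_light: Banned {ip} "
                 f"(trigger: {trigger}, duration: {ban_minutes} min)")
    }

def send_alert(webhook_url, payload):
    # webhook_url is a placeholder for a real Slack incoming-webhook URL
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```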

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5pvlt5x6ih8e19i25e8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5pvlt5x6ih8e19i25e8.png" alt="Dashboard"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
Building this engine fundamentally changed how I view server security. Moving from static, hard-coded rules to dynamic, math-driven logic allows us to build systems that scale securely. The &lt;strong&gt;Dual-Trigger System&lt;/strong&gt; covers both bases: the Z-Score outsmarts the slow, sneaky attacks, while the Rate Multiplier stops brute-force floods dead in their tracks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check out the live metrics dashboard at:&lt;/strong&gt; metrics.okeakwerigbe.name.ng&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>python</category>
      <category>security</category>
    </item>
  </channel>
</rss>
