<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Vijay Govindaraja</title>
    <description>The latest articles on Forem by Vijay Govindaraja (@vijaygovindaraja).</description>
    <link>https://forem.com/vijaygovindaraja</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3847163%2Fd79d2436-e162-476b-ad26-80f01e4c119a.jpeg</url>
      <title>Forem: Vijay Govindaraja</title>
      <link>https://forem.com/vijaygovindaraja</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/vijaygovindaraja"/>
    <language>en</language>
    <item>
      <title>Tuning ML hyperparameters with a swarm optimizer inspired by parrot behavior</title>
      <dc:creator>Vijay Govindaraja</dc:creator>
      <pubDate>Sun, 12 Apr 2026 18:39:36 +0000</pubDate>
      <link>https://forem.com/vijaygovindaraja/tuning-ml-hyperparameters-with-a-swarm-optimizer-inspired-by-parrot-behavior-1c3k</link>
      <guid>https://forem.com/vijaygovindaraja/tuning-ml-hyperparameters-with-a-swarm-optimizer-inspired-by-parrot-behavior-1c3k</guid>
      <description>&lt;p&gt;When you train a neural network or any ML model, performance depends heavily on hyperparameters — learning rate, batch size, number of layers, regularization strength. Finding good values is expensive because each evaluation means training a model end to end.&lt;/p&gt;

&lt;p&gt;The standard approaches each have tradeoffs:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Grid search&lt;/strong&gt; tries every combination on a predefined grid. It works for 2-3 parameters but scales exponentially. A 5-parameter search with 10 values each is 100,000 evaluations. Not practical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Random search&lt;/strong&gt; samples uniformly and usually finds a decent region faster than grid search. But it has no memory — it doesn't learn from previous evaluations to focus on promising areas.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bayesian optimization&lt;/strong&gt; (what Optuna and Hyperopt use under the hood) builds a surrogate model of the objective and samples where improvement is most likely. Very sample-efficient in low dimensions. But the surrogate model itself becomes expensive to fit in high-dimensional spaces, and it can get stuck when the objective surface has many local optima.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Swarm methods&lt;/strong&gt; like PSO (Particle Swarm Optimization) maintain a population of candidate solutions that share information about good regions. They scale better to high dimensions than Bayesian methods. The failure mode is premature convergence: every particle gets pulled toward the same global best, the swarm loses diversity, and it can't escape a local optimum once it's trapped.&lt;/p&gt;

&lt;p&gt;That last problem — premature convergence in swarm methods — is what I was trying to address.&lt;/p&gt;

&lt;h2&gt;
  
  
  The idea behind MSPO
&lt;/h2&gt;

&lt;p&gt;Standard PSO gives every particle the same update rule every iteration: move toward your personal best, move toward the swarm's global best, add some inertia. The weights change over time, but the &lt;em&gt;type&lt;/em&gt; of movement is always the same. This makes the swarm predictable, which is exactly the wrong property when you're trying to explore a complex landscape.&lt;/p&gt;
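
&lt;p&gt;For reference, here is a minimal sketch of that canonical update (my own illustration with typical textbook coefficients, not code from any particular library):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    # inertia + pull toward personal best + pull toward the swarm's global best
    rng = rng or np.random.default_rng()
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v_new, v_new
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
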

&lt;p&gt;MSPO (Multi-Strategy Parrot Optimizer) takes a different approach: instead of one update rule, there are four. Each iteration, each agent in the swarm independently and randomly picks one of the four behaviors. Some agents explore aggressively, some exploit locally, some follow the crowd, some deliberately go against it. The swarm maintains diversity because different agents are doing genuinely different things at the same time.&lt;/p&gt;

&lt;p&gt;The four behaviors are loosely inspired by how parrots behave in groups:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Foraging.&lt;/strong&gt; The agent takes a Levy flight — a random step with a heavy-tailed distribution, meaning it usually takes small steps but occasionally jumps far. The step is scaled by the distance to the global best and pulled toward the population mean. This is the exploration behavior. Early in the run the mean-pull is strong (the flock stays together); late in the run it fades and agents explore independently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Staying.&lt;/strong&gt; The agent drifts toward the global best with some random noise that shrinks over time. This is pure exploitation — fine-tuning around a known good region.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Communicating.&lt;/strong&gt; A coin flip. Half the time the agent moves toward the group mean (flocking). The other half it moves in a random direction with a decaying step size (going off alone). This creates a mix of conformist and nonconformist behavior in every iteration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fear of strangers.&lt;/strong&gt; The position update combines attraction toward the best solution with repulsion from the current position, modulated by a chaotic sequence from a Tent map. The chaos prevents the swarm from settling into a fixed pattern during the later stages of optimization.&lt;/p&gt;
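
&lt;p&gt;A rough sketch of that per-agent dispatch, just to make the idea concrete (this is my own simplified illustration, not the package's internal code, and the real update equations carry more terms):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np

def stay(x, gbest, mean_pos, rng, scale=0.1):
    # Exploitation-style move: drift toward the global best with small noise
    return x + scale * rng.random(x.shape) * (gbest - x)

def mspo_iteration(positions, gbest, behaviors, rng):
    # behaviors = [forage, stay, communicate, fear] -- illustrative placeholders
    mean_pos = positions.mean(axis=0)
    for i in range(len(positions)):
        step = behaviors[rng.integers(len(behaviors))]   # each agent picks a behavior at random
        positions[i] = step(positions[i], gbest, mean_pos, rng)
    return positions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
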

&lt;p&gt;Three additional components support the behaviors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sobol initialization&lt;/strong&gt;: instead of scattering agents randomly, a low-discrepancy sequence covers the search space more uniformly from the start. Less wasted exploration in the opening iterations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exponentially decaying inertia weight&lt;/strong&gt;: starts high (broad jumps) and decays toward a low floor (small refinements). The algorithm naturally transitions from exploration to exploitation without manual tuning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parametric Tent map&lt;/strong&gt;: generates a deterministic but unpredictable chaotic sequence that modulates the "fear of strangers" behavior. Structured chaos is better than pure randomness for escaping local optima.&lt;/li&gt;
&lt;/ul&gt;
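
&lt;p&gt;A compact sketch of these three support pieces, using SciPy's Sobol sampler (the decay constants and Tent-map parameter in the package may differ; treat the numbers as illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np
from scipy.stats import qmc

def sobol_init(n_agents, bounds, seed=0):
    # Low-discrepancy starting positions, scaled into the search box
    sampler = qmc.Sobol(d=len(bounds), scramble=True, seed=seed)
    unit = sampler.random(n_agents)          # points in the unit hypercube
    lo, hi = np.array(bounds).T
    return lo + unit * (hi - lo)

def inertia(t, max_iter, w_start=0.9, w_end=0.2):
    # Exponential decay from broad jumps toward small refinements
    return w_end + (w_start - w_end) * np.exp(-3.0 * t / max_iter)

def tent_map(x, mu=0.7):
    # Parametric Tent map: a deterministic but chaotic sequence in (0, 1)
    return x / mu if x &amp;lt; mu else (1.0 - x) / (1.0 - mu)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
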

&lt;h2&gt;
  
  
  How to use it
&lt;/h2&gt;

&lt;p&gt;Install:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;mspo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Basic usage — minimize any function that takes a numpy array and returns a float:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mspo&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MSPO&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;objective&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;opt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MSPO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;objective&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;objective&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;bounds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;n_parrots&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_iter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# ~1.6e-12
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="c1"&gt;# ~zeros
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Tuning a classifier
&lt;/h3&gt;

&lt;p&gt;Here's a more realistic example — tuning a random forest on your own dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.ensemble&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RandomForestClassifier&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.model_selection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cross_val_score&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mspo&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MSPO&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;objective&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;n_estimators&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;max_depth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;min_samples_split&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="n"&gt;clf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RandomForestClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;n_estimators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n_estimators&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_depth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;min_samples_split&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;min_samples_split&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;cross_val_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cv&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;  &lt;span class="c1"&gt;# MSPO minimizes, so negate accuracy
&lt;/span&gt;
&lt;span class="n"&gt;opt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MSPO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;objective&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;objective&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;bounds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;    &lt;span class="c1"&gt;# n_estimators
&lt;/span&gt;        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;      &lt;span class="c1"&gt;# max_depth
&lt;/span&gt;        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;      &lt;span class="c1"&gt;# min_samples_split
&lt;/span&gt;    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;n_parrots&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_iter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Best CV accuracy: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_value&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few practical notes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MSPO minimizes.&lt;/strong&gt; If you want to maximize accuracy, negate the return value.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integer parameters&lt;/strong&gt; need to be cast with &lt;code&gt;int()&lt;/code&gt; inside your objective. The optimizer works in continuous space; you handle the rounding.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The seed parameter&lt;/strong&gt; makes runs fully reproducible — same seed, same result, every time. Use this when comparing different objective functions or parameter bounds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;n_parrots and max_iter&lt;/strong&gt; control the computational budget. Total evaluations = n_parrots × (1 + max_iter). For expensive objectives (each evaluation takes minutes), use fewer parrots and iterations. For cheap objectives (milliseconds per eval), you can afford more.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Tuning a neural network
&lt;/h3&gt;

&lt;p&gt;Same pattern works for PyTorch or TensorFlow — just wrap your training loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mspo&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MSPO&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;objective&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;lr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;          &lt;span class="c1"&gt;# log-scale: params[0] in [-5, -1]
&lt;/span&gt;    &lt;span class="n"&gt;weight_decay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# log-scale: params[1] in [-6, -2]
&lt;/span&gt;    &lt;span class="n"&gt;batch_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Adam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;weight_decay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;weight_decay&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;val_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_and_evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;val_loss&lt;/span&gt;

&lt;span class="n"&gt;opt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MSPO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;objective&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;objective&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;bounds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;       &lt;span class="c1"&gt;# log10(learning_rate)
&lt;/span&gt;        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;       &lt;span class="c1"&gt;# log10(weight_decay)
&lt;/span&gt;        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;      &lt;span class="c1"&gt;# batch_size
&lt;/span&gt;    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;n_parrots&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_iter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the log-scale trick for learning rate and weight decay. These parameters span several orders of magnitude, so searching in log space gives the optimizer a more uniform landscape to work with.&lt;/p&gt;

&lt;h2&gt;
  
  
  Does it actually work?
&lt;/h2&gt;

&lt;p&gt;I validated against the official CEC 2022 benchmark suite — 12 functions (unimodal, multimodal, hybrid, composition) with the published shift vectors and rotation matrices from the competition organizers. This is the standard test used in the metaheuristics community to compare optimizers on a level playing field.&lt;/p&gt;

&lt;p&gt;Setup: 30 agents, 1000 iterations, 10 dimensions, 30 independent runs per function. Compared against canonical PSO (constriction coefficients) and random search (same evaluation budget).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;MSPO&lt;/th&gt;
&lt;th&gt;PSO&lt;/th&gt;
&lt;th&gt;Random&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;F1 Zakharov&lt;/td&gt;
&lt;td&gt;unimodal&lt;/td&gt;
&lt;td&gt;45.06&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.00&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5436&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F2 Rosenbrock&lt;/td&gt;
&lt;td&gt;unimodal&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;10.05&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;62.98&lt;/td&gt;
&lt;td&gt;256&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F3 Schaffer F7&lt;/td&gt;
&lt;td&gt;multimodal&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.7e-04&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.03&lt;/td&gt;
&lt;td&gt;0.20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F4 Rastrigin&lt;/td&gt;
&lt;td&gt;multimodal&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;38.00&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;51.00&lt;/td&gt;
&lt;td&gt;87.90&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F5 Levy&lt;/td&gt;
&lt;td&gt;multimodal&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.55&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.67&lt;/td&gt;
&lt;td&gt;2.47&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F6 Hybrid 1&lt;/td&gt;
&lt;td&gt;hybrid&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;39637&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;58895&lt;/td&gt;
&lt;td&gt;10.4M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F7 Hybrid 2&lt;/td&gt;
&lt;td&gt;hybrid&lt;/td&gt;
&lt;td&gt;60.02&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;29.50&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;214&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F8 Hybrid 3&lt;/td&gt;
&lt;td&gt;hybrid&lt;/td&gt;
&lt;td&gt;472&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;298&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1061&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F9 Composition 1&lt;/td&gt;
&lt;td&gt;composition&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;398&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;426&lt;/td&gt;
&lt;td&gt;516&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F10 Composition 2&lt;/td&gt;
&lt;td&gt;composition&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-1254&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;29.57&lt;/td&gt;
&lt;td&gt;-366&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F11 Composition 3&lt;/td&gt;
&lt;td&gt;composition&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.32&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;79.14&lt;/td&gt;
&lt;td&gt;117&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F12 Composition 4&lt;/td&gt;
&lt;td&gt;composition&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;165&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;170&lt;/td&gt;
&lt;td&gt;242&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Values are median error (f(x) - f*) over 30 runs. Lower is better.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MSPO wins on 9 of 12 functions.&lt;/strong&gt; PSO wins on Zakharov (a smooth unimodal function where simple attraction-to-best is enough) and two hybrid functions. MSPO's advantage shows up most clearly on the composition functions (F9-F12), where the landscape has multiple overlapping basins — exactly where behavioral diversity matters.&lt;/p&gt;

&lt;p&gt;Where it doesn't win: purely unimodal problems where the shortest path to the minimum is a straight line. PSO's simple "move toward best" is hard to beat when there are no local optima to escape from. If your hyperparameter landscape is likely smooth and unimodal, PSO or Bayesian optimization may be more appropriate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adapting it to your problem
&lt;/h2&gt;

&lt;p&gt;Some guidelines based on what I've found works:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When MSPO is a good fit:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hyperparameter spaces with 3+ dimensions where grid search is too expensive&lt;/li&gt;
&lt;li&gt;Objectives with multiple local optima (most neural network training landscapes)&lt;/li&gt;
&lt;li&gt;Situations where you can afford 500-30000 evaluations but need better results than random search&lt;/li&gt;
&lt;li&gt;When you want reproducible tuning (set the seed)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When something else might be better:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Very expensive objectives where you can only afford &amp;lt;100 evaluations — use Bayesian optimization (Optuna)&lt;/li&gt;
&lt;li&gt;Purely combinatorial/discrete spaces — MSPO works in continuous space, so you'd need to round parameters. Evolutionary methods designed for discrete spaces may be more natural.&lt;/li&gt;
&lt;li&gt;Low-dimensional smooth problems (1-2 parameters) — grid search or scipy.optimize will be simpler and effective&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Tuning MSPO itself:&lt;/strong&gt;&lt;br&gt;
The default parameters (from the paper) work well as a starting point. The main knobs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;n_parrots&lt;/code&gt;: more agents = better coverage but more evaluations per iteration. 20-50 is a reasonable range.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;max_iter&lt;/code&gt;: more iterations = finer convergence. 200-1000 depending on your budget.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;seed&lt;/code&gt;: always set this for reproducibility. Run with 3-5 different seeds and take the best if you want robustness.&lt;/li&gt;
&lt;/ul&gt;
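
&lt;p&gt;That multi-seed pattern is just a small loop over the same interface used in the examples above (&lt;code&gt;objective&lt;/code&gt; and &lt;code&gt;bounds&lt;/code&gt; assumed to be defined already):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;best = None
for seed in (0, 1, 2, 3, 4):
    result = MSPO(objective=objective, bounds=bounds,
                  n_parrots=30, max_iter=500, seed=seed).run()
    # Keep the best of the independent runs
    if best is None or result.best_value &amp;lt; best.best_value:
        best = result
print(best.best_value, best.best_params)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
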

&lt;p&gt;The inertia weight, Tent map parameters, and Levy flight beta all have paper-validated defaults and I haven't found a case where changing them helps significantly. Leave them alone unless you have a specific reason.&lt;/p&gt;
&lt;h2&gt;
  
  
  Source code and reproduction
&lt;/h2&gt;

&lt;p&gt;Everything is on GitHub with 73 unit tests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repo&lt;/strong&gt;: &lt;a href="https://github.com/vijaygovindaraja/mspo" rel="noopener noreferrer"&gt;github.com/vijaygovindaraja/mspo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyPI&lt;/strong&gt;: &lt;code&gt;pip install mspo&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To reproduce the CEC 2022 benchmarks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;mspo[benchmark]
python benchmarks/run_cec2022.py &lt;span class="nt"&gt;--quick&lt;/span&gt;  &lt;span class="c"&gt;# ~2 min smoke test&lt;/span&gt;
python benchmarks/run_cec2022.py          &lt;span class="c"&gt;# full run, ~2.5 hours&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The package is 6 source files, each under 200 lines. If you want to understand or modify the algorithm, start with &lt;code&gt;mspo/behaviors.py&lt;/code&gt; (the four update rules) and &lt;code&gt;mspo/optimizer.py&lt;/code&gt; (the main loop).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Paper&lt;/strong&gt;: Govindarajan, V. (2025). MSPO: A machine learning hyperparameter optimization method for enhanced breast cancer image classification. &lt;em&gt;Digital Health&lt;/em&gt;.&lt;/p&gt;

</description>
      <category>python</category>
      <category>parrot</category>
      <category>machinelearning</category>
      <category>opensource</category>
    </item>
    <item>
      <title>How I Built Aegis-5: An Ensemble Framework That Detects 99.98% of IIoT Intrusions</title>
      <dc:creator>Vijay Govindaraja</dc:creator>
      <pubDate>Sun, 29 Mar 2026 03:54:35 +0000</pubDate>
      <link>https://forem.com/vijaygovindaraja/how-i-built-aegis-5-an-ensemble-framework-that-detects-9998-of-iiot-intrusions-39bk</link>
      <guid>https://forem.com/vijaygovindaraja/how-i-built-aegis-5-an-ensemble-framework-that-detects-9998-of-iiot-intrusions-39bk</guid>
      <description>&lt;p&gt;Factory floors in 2026 look nothing like they did a decade ago. Robots collaborate with humans, sensors talk to cloud systems, and every machine is a node on the network. This is Industry 5.0 — and it's a massive attack surface.&lt;/p&gt;

&lt;p&gt;I spent the last year building &lt;strong&gt;Aegis-5&lt;/strong&gt;, a hybrid ensemble framework for intrusion detection in these environments. The work was published in &lt;a href="https://doi.org/10.1145/3787224" rel="noopener noreferrer"&gt;ACM Transactions on Autonomous and Adaptive Systems&lt;/a&gt;, and I've open-sourced the full implementation. Here's how it works and why existing approaches fall short.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Industrial IoT networks generate diverse traffic — normal SCADA commands, sensor telemetry, actuator signals — alongside attack patterns that look increasingly like legitimate traffic. Traditional IDSs struggle here because:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Single classifiers can't generalize&lt;/strong&gt; across the wide variety of IIoT attack types (DDoS, reconnaissance, spoofing, botnet C2, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static models degrade&lt;/strong&gt; as traffic patterns shift during production cycles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero-day attacks&lt;/strong&gt; bypass signature-based and even some ML-based systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Class imbalance&lt;/strong&gt; — in real IIoT traffic, some attack types are extremely rare&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Idea Behind Aegis-5
&lt;/h2&gt;

&lt;p&gt;Instead of betting on one classifier, Aegis-5 combines five fundamentally different learners and lets them vote — but the voting isn't equal. Each classifier's vote is weighted dynamically based on how well it's been performing &lt;em&gt;on each specific attack class&lt;/em&gt; in recent predictions.&lt;/p&gt;

&lt;p&gt;The five classifiers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Random Forest&lt;/strong&gt; — handles high-dimensional feature spaces well&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gradient Boosting&lt;/strong&gt; — strong on structured/tabular data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XGBoost&lt;/strong&gt; — efficient gradient boosting with regularization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SVM&lt;/strong&gt; — effective decision boundaries in transformed feature space&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KNN&lt;/strong&gt; — captures local neighborhood patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each brings a different inductive bias. That diversity is the whole point.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dynamic Weighting: The Core Innovation
&lt;/h2&gt;

&lt;p&gt;Here's where Aegis-5 diverges from standard ensembles. Instead of fixed weights or simple majority voting, we maintain a sliding window of the last K=1000 predictions for each classifier and compute per-class F1 scores in real time.&lt;/p&gt;

&lt;p&gt;The weight for classifier &lt;em&gt;i&lt;/em&gt; on class &lt;em&gt;c&lt;/em&gt; is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;w_i,c = exp(beta * F1_i,c) / sum_j(exp(beta * F1_j,c))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is just a softmax over the per-class F1 scores, with beta (2.0 by default) acting as an inverse temperature: the larger beta is, the more sharply the weights concentrate on the best-performing classifiers. When a classifier is nailing a specific attack type, its weight for that class goes up. When it's struggling, the ensemble naturally shifts trust to the classifiers that are performing better.&lt;/p&gt;

&lt;p&gt;In Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DynamicWeightManager&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_classifiers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_classes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;window_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;windows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;window_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_classifiers&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weights&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ones&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;n_classifiers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_classes&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n_classifiers&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beta&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_recompute_weights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;f1_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ones&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;n_classifiers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;n_classes&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;n_classifiers&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;windows&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;
            &lt;span class="n"&gt;records&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;windows&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;y_true&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;y_pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;per_class_f1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f1_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;y_true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;average&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;n_classes&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;zero_division&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;f1_scores&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;per_class_f1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;per_class_f1&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;n_classes&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;f1_scores&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;exp_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;exp_scores&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;exp_scores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Meta-Learner Layer
&lt;/h2&gt;

&lt;p&gt;On top of the dynamically-weighted base predictions, a Logistic Regression meta-learner synthesizes the final output. It takes the weighted probability vectors from all five classifiers as input features and learns the optimal combination.&lt;/p&gt;
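
&lt;p&gt;In sketch form, that stacking step might look like the following. This is my own simplified illustration of the design described above, not necessarily the repo's exact code; &lt;code&gt;classifiers&lt;/code&gt;, &lt;code&gt;weights&lt;/code&gt;, &lt;code&gt;X_val&lt;/code&gt;, and &lt;code&gt;y_val&lt;/code&gt; are assumed to exist:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np
from sklearn.linear_model import LogisticRegression

def build_meta_features(classifiers, weights, X):
    # One block of class probabilities per base classifier, scaled by that
    # classifier's current per-class weights, stacked side by side
    return np.hstack([clf.predict_proba(X) * weights[i]
                      for i, clf in enumerate(classifiers)])

# Fit the meta-learner on held-out data so it learns how to combine the base votes
meta_learner = LogisticRegression(max_iter=1000)
meta_learner.fit(build_meta_features(classifiers, weights, X_val), y_val)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
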

&lt;p&gt;But here's the twist — we don't blindly trust the meta-learner either.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hybrid Voting Protocol
&lt;/h2&gt;

&lt;p&gt;The final prediction uses a confidence threshold (tau=0.95):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High confidence&lt;/strong&gt; (meta-learner probability &amp;gt;= tau): use soft voting with the meta-learner's output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low confidence&lt;/strong&gt;: fall back to hard voting (weighted majority) across all five classifiers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This hybrid approach means the system is aggressive when it's confident and conservative when it's uncertain — exactly what you want in a security-critical environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_hybrid_predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;meta_proba&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;meta_learner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict_proba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;meta_features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;max_confidence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;meta_proba&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_samples&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;high_conf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_confidence&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;confidence_threshold&lt;/span&gt;
    &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;high_conf&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;meta_proba&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;high_conf&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Hard voting fallback for low-confidence samples
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="n"&gt;high_conf&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;votes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;clf&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;classifiers&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;weight&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight_manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_weights&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;votes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;votes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;weight&lt;/span&gt;
        &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;votes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;votes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Preprocessing Pipeline
&lt;/h2&gt;

&lt;p&gt;IIoT data is messy. Our pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Median imputation&lt;/strong&gt; for missing values (robust to outliers from sensor noise)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;StandardScaler&lt;/strong&gt; normalization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ANOVA F-test + Recursive Feature Elimination with Cross-Validation (RFECV)&lt;/strong&gt; — keeps only statistically significant features&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PCA&lt;/strong&gt; for dimensionality reduction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SMOTE&lt;/strong&gt; to handle class imbalance (rare attack types)&lt;/li&gt;
&lt;/ol&gt;
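
&lt;p&gt;A rough scikit-learn / imbalanced-learn sketch of those steps (the component settings here are illustrative rather than the exact values from the paper, RFECV is omitted for brevity, and &lt;code&gt;X_train&lt;/code&gt;/&lt;code&gt;y_train&lt;/code&gt; are your raw training split):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA
from imblearn.over_sampling import SMOTE

preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # robust to sensor-noise outliers
    ("scale", StandardScaler()),
    ("anova", SelectKBest(f_classif, k=40)),        # k is illustrative
    ("pca", PCA(n_components=0.95)),                # keep 95% of the variance
])

X_proc = preprocess.fit_transform(X_train, y_train)
# Oversample the rare attack classes on the transformed training split only
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_proc, y_train)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
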

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;We evaluated on two benchmark IIoT datasets:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;th&gt;Precision&lt;/th&gt;
&lt;th&gt;Recall&lt;/th&gt;
&lt;th&gt;F1-Score&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IoT-23&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;99.98%&lt;/td&gt;
&lt;td&gt;99.97%&lt;/td&gt;
&lt;td&gt;99.96%&lt;/td&gt;
&lt;td&gt;99.96%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CIC-IoT 2023&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;99.95%&lt;/td&gt;
&lt;td&gt;99.93%&lt;/td&gt;
&lt;td&gt;99.92%&lt;/td&gt;
&lt;td&gt;99.93%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These numbers beat prior state-of-the-art approaches on both datasets. More importantly, the per-class metrics show strong performance even on rare attack types — which is where most single-classifier systems fail.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;The full implementation is open-source:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/vijaygovindaraja/Aegis5.git
&lt;span class="nb"&gt;cd &lt;/span&gt;Aegis5
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
python demo.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or use it in your own project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aegis5&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Aegis5&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aegis5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;confidence_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;use_feature_selection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;use_pca&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;use_smote&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Accuracy: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;accuracy&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Building Aegis-5 reinforced a few things for me:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Diversity beats complexity.&lt;/strong&gt; Five relatively simple classifiers with smart weighting outperformed deeper, more complex individual models. The key is that each classifier fails differently — and the ensemble exploits that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adaptive systems matter in production.&lt;/strong&gt; Static models decay. The sliding window approach means Aegis-5 adapts as traffic patterns change without retraining from scratch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hybrid strategies beat pure strategies.&lt;/strong&gt; The soft/hard voting hybrid outperformed both pure soft voting and pure hard voting. Knowing when to be confident and when to be cautious is underrated in ML system design.&lt;/p&gt;

&lt;h2&gt;
  
  
  Paper
&lt;/h2&gt;

&lt;p&gt;The full paper is published in ACM Transactions on Autonomous and Adaptive Systems:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Govindarajan, V., Ahmed, F., Faheem, Z.B., Bilal, M., Ayadi, M., &amp;amp; Ali, J. (2026). Aegis-5: A Hybrid Ensemble Framework for Intrusion Detection in Industry 5.0 Driven Smart Manufacturing Environment. &lt;em&gt;ACM TAAS&lt;/em&gt;. &lt;a href="https://doi.org/10.1145/3787224" rel="noopener noreferrer"&gt;DOI: 10.1145/3787224&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;If you're working on IIoT security or ensemble methods, I'd love to hear your thoughts. Drop a comment or open an issue on the &lt;a href="https://github.com/vijaygovindaraja/Aegis5" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>iot</category>
      <category>networking</category>
      <category>programming</category>
    </item>
    <item>
      <title>I Built a Free WCAG Accessibility Audit CLI for Government Teams</title>
      <dc:creator>Vijay Govindaraja</dc:creator>
      <pubDate>Sat, 28 Mar 2026 06:57:43 +0000</pubDate>
      <link>https://forem.com/vijaygovindaraja/i-built-a-free-wcag-accessibility-audit-cli-for-government-teams-bbp</link>
      <guid>https://forem.com/vijaygovindaraja/i-built-a-free-wcag-accessibility-audit-cli-for-government-teams-bbp</guid>
      <description>&lt;p&gt;Every government website in the US is required to meet Section 508 accessibility standards. Most commercial tools cost hundreds per month. So I built an open source alternative.&lt;br&gt;
**&lt;br&gt;&lt;br&gt;
The Problem                                                                                                         **&lt;br&gt;&lt;br&gt;
  If you're a developer working on a .gov site, you need to verify WCAG compliance before every deploy. Your options&lt;br&gt;&lt;br&gt;
  are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manual testing — slow, inconsistent, doesn't scale
&lt;/li&gt;
&lt;li&gt;Commercial tools (Siteimprove, Level Access) — $500+/month&lt;/li&gt;
&lt;li&gt;Browser extensions (axe DevTools) — great for one page, but can't scan a whole site or run in CI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wanted something that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Runs from the terminal
&lt;/li&gt;
&lt;li&gt;Scans entire sites via sitemap
&lt;/li&gt;
&lt;li&gt;Outputs JSON/CSV for CI pipelines&lt;/li&gt;
&lt;li&gt;Costs nothing
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  wcag-audit
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npx wcag-audit scan https://your-site.gov
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That's it. No API keys, no account, no config files.&lt;/p&gt;

&lt;p&gt;It launches a headless browser, injects &lt;a href="https://github.com/dequelabs/axe-core" rel="noopener noreferrer"&gt;axe-core&lt;/a&gt; (the same engine Google and Microsoft use), and returns a report with every WCAG violation, the affected elements, and how to fix them.&lt;/p&gt;
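&lt;p&gt;For context, here is roughly what that pattern looks like if you wire it up yourself with Puppeteer and the @axe-core/puppeteer wrapper. This is a minimal sketch of the general approach, not wcag-audit's actual source:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Sketch only: scan one page with Puppeteer + axe-core, then print each violation.
const puppeteer = require('puppeteer');
const { AxePuppeteer } = require('@axe-core/puppeteer');

(async () =&amp;gt; {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://example.gov', { waitUntil: 'networkidle2' });

  // Restrict axe to WCAG A/AA rules
  const results = await new AxePuppeteer(page)
    .withTags(['wcag2a', 'wcag2aa'])
    .analyze();

  for (const violation of results.violations) {
    console.log(`[${violation.impact}] ${violation.id}: ${violation.help}`);
  }

  await browser.close();
})();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;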

&lt;h2&gt;
  
  
  What the output looks like
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;════════════════════════════════════════════════════
    WCAG ACCESSIBILITY AUDIT REPORT
════════════════════════════════════════════════════

URL:      https://example.gov
Title:    Example Government Site
Level:    WCAG AA

Critical: 2  Serious: 5  Moderate: 8  Minor: 3

[critical] image-alt: Images must have alternate text
  WCAG: wcag2a, wcag111
  Elements affected: 4
    → img.hero-banner
      Fix: Element does not have an alt attribute

[serious] color-contrast: Elements must meet minimum color contrast ratio
  WCAG: wcag2aa, wcag143
  Elements affected: 12
    → .nav-link
      Fix: Element has insufficient color contrast
════════════════════════════════════════════════════
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Every violation tells you (a sample JSON entry follows the list):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The rule that failed
&lt;/li&gt;
&lt;li&gt;The severity (critical, serious, moderate, minor)
&lt;/li&gt;
&lt;li&gt;Which WCAG criterion it violates
&lt;/li&gt;
&lt;li&gt;The CSS selector of the affected element&lt;/li&gt;
&lt;li&gt;How to fix it
&lt;/li&gt;
&lt;/ul&gt;
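&lt;p&gt;Because the scan is backed by axe-core, a single entry in the JSON output follows that shape closely. A hypothetical, trimmed example; the exact field names in wcag-audit's report may differ:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "id": "image-alt",
  "impact": "critical",
  "tags": ["wcag2a", "wcag111"],
  "help": "Images must have alternate text",
  "nodes": [
    {
      "target": ["img.hero-banner"],
      "failureSummary": "Element does not have an alt attribute"
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;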

&lt;h2&gt;
  
  
  Scan an Entire Site
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wcag-audit crawl https://example.gov/sitemap.xml --max-pages 50
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Reads the sitemap, scans each page, and produces a consolidated report.&lt;/p&gt;
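&lt;p&gt;The sitemap step is less magic than it sounds. A hypothetical sketch of that part (not the tool's actual code, and assuming Node 18+ for the global fetch):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Sketch only: pull page URLs out of a sitemap, capped at maxPages, then scan each one.
async function urlsFromSitemap(sitemapUrl, maxPages = 50) {
  const xml = await (await fetch(sitemapUrl)).text();
  const locs = [...xml.matchAll(/&amp;lt;loc&amp;gt;\s*([^&amp;lt;]+?)\s*&amp;lt;\/loc&amp;gt;/g)]; // naive &amp;lt;loc&amp;gt; extraction
  return locs.map((m) =&amp;gt; m[1]).slice(0, maxPages);
}

// const urls = await urlsFromSitemap('https://example.gov/sitemap.xml', 50);
// for (const url of urls) { /* run the single-page scan from earlier */ }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;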

&lt;h2&gt;
  
  
  Drop it into CI
&lt;/h2&gt;

&lt;p&gt;The CLI exits with code 1 when violations are found:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# GitHub Actions
- name: Accessibility Audit
  run: npx wcag-audit scan ${{ env.DEPLOY_URL }} --level AA
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now accessibility is enforced on every deploy. No violations, no merge.&lt;/p&gt;

&lt;h2&gt;
  
  
  JSON and CSV for Reporting
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# JSON for programmatic use
wcag-audit scan https://example.gov --format json --output report.json

# CSV for spreadsheets and compliance reports
wcag-audit scan https://example.gov --format csv --output report.csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
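&lt;p&gt;If you'd rather gate a pipeline on the JSON file than use the library directly, a few lines of Node are enough. This assumes the file mirrors the result object shown in the library example below:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// check-report.js: hypothetical gate on report.json
const report = require('./report.json');

const critical = report.summary.impactBreakdown.critical;
console.log(`Critical violations: ${critical}`);
process.exit(critical &amp;gt; 0 ? 1 : 0);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;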

&lt;p&gt;The CSV is designed for compliance teams who need to track violations in spreadsheets and produce audit reports for management.&lt;/p&gt;

&lt;h2&gt;
  
  
  WCAG Levels
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Level A (minimum)
wcag-audit scan https://example.gov --level A

# Level AA (required for US federal, most common)
wcag-audit scan https://example.gov --level AA

# Level AAA (strictest)
wcag-audit scan https://example.gov --level AAA
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Most government sites need AA. The tool defaults to AA.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use it as a Library
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const { scanUrl, formatTextReport } = require('wcag-audit');

(async () =&amp;gt; {
  const results = await scanUrl('https://example.gov', {
    level: 'AA',
    viewport: { width: 1280, height: 720 },
  });

  if (results.summary.impactBreakdown.critical &amp;gt; 0) {
    console.error('Critical accessibility violations found!');
    process.exit(1);
  }
})();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  Why I Built This
&lt;/h2&gt;

&lt;p&gt;I've been contributing to accessibility-related projects across multiple government agencies — the US Web Design System (USWDS), the UK's GOV.UK Frontend, Singapore's GovTech accessibility tool (oobee), and Grafana's colorblind-safe palette. Every one of these projects deals with the same problem: making sure websites are accessible to everyone.&lt;/p&gt;

&lt;p&gt;The tooling gap was obvious. Developers who care about accessibility shouldn't need to pay for the privilege of testing it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/dequelabs/axe-core" rel="noopener noreferrer"&gt;axe-core&lt;/a&gt; — the accessibility engine used by Google, Microsoft, and government agencies worldwide&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pptr.dev/" rel="noopener noreferrer"&gt;Puppeteer&lt;/a&gt; — headless Chrome for reliable page rendering
&lt;/li&gt;
&lt;li&gt;Node.js — runs anywhere, no system dependencies beyond Chrome
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Install
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install -g wcag-audit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Or try without installing:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npx wcag-audit scan https://example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/vijaygovindaraja/wcag-audit" rel="noopener noreferrer"&gt;https://github.com/vijaygovindaraja/wcag-audit&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;npm: &lt;a href="https://www.npmjs.com/package/wcag-audit" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/wcag-audit&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;The tool is MIT licensed. PRs welcome. If you work on a government site and this saves you time, I'd love to hear about it.&lt;/p&gt;

</description>
      <category>a11y</category>
      <category>cli</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
