<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: bing yu</title>
    <description>The latest articles on Forem by bing yu (@bing_yu).</description>
    <link>https://forem.com/bing_yu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3925406%2F697654db-d153-4858-b6df-52fc0bc3de4f.jpg</url>
      <title>Forem: bing yu</title>
      <link>https://forem.com/bing_yu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/bing_yu"/>
    <language>en</language>
    <item>
      <title>Training Your Mouse Behavior Clone: Make AI Browser Agents Move Like You</title>
      <dc:creator>bing yu</dc:creator>
      <pubDate>Sun, 17 May 2026 09:39:18 +0000</pubDate>
      <link>https://forem.com/bing_yu/training-your-mouse-behavior-clone-make-ai-browser-agents-move-like-you-5293</link>
      <guid>https://forem.com/bing_yu/training-your-mouse-behavior-clone-make-ai-browser-agents-move-like-you-5293</guid>
      <description>&lt;p&gt;In May 2026, a paper titled "FP-Agent: Fingerprinting AI Browsing Agents" was published on arXiv. The research team measured 7 mainstream AI browsing agents and found that their mouse trajectories, typing rhythms, and other behavioral features form a &lt;strong&gt;distinctive fingerprint&lt;/strong&gt; -- not only distinguishing AI from humans, but also differentiating between different agent frameworks.&lt;/p&gt;

&lt;p&gt;What's more concerning: the behavioral consistency across existing schemes makes them easy to classify as automated traffic.&lt;/p&gt;

&lt;p&gt;This article explores, from a &lt;strong&gt;browser automation developer's perspective&lt;/strong&gt;, how to use deep learning to make AI agent mouse operations learn &lt;em&gt;your&lt;/em&gt; style instead of applying a generic "humanization" template.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Why Does "Humanization" Become a Fingerprint?
&lt;/h2&gt;

&lt;p&gt;Most mainstream browser automation frameworks handle "humanization" of mouse movements like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;move_mouse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;bezier_curve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_pos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;points&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;mouse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;move&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uniform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.03&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Seems reasonable? But here's the problem:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Everyone using this framework generates the same class of Bezier curves&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Random jitter follows the same distribution&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Overshoot trigger probability is the same fixed value&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you run 1,000 automation instances and cluster their mouse trajectories, you'll find they heavily overlap. This creates a &lt;strong&gt;behavioral fingerprint&lt;/strong&gt; — detect one pattern, identify all instances using that framework.&lt;/p&gt;

&lt;p&gt;The irony: the more "humanization" features you add, the less human it becomes — because there's no unified "human" pattern. &lt;strong&gt;If all your automation instances share the same "humanization" parameters, those parameters themselves become a massive collective fingerprint.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A Different Approach: Learn "You", Not "Human"
&lt;/h2&gt;

&lt;p&gt;If the model is trained on &lt;strong&gt;your personal mouse operation data&lt;/strong&gt; and generates trajectories with &lt;strong&gt;your personal style&lt;/strong&gt;, the situation changes completely:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Generic Humanization&lt;/th&gt;
&lt;th&gt;Personal Behavior Clone&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Trajectory Distribution&lt;/td&gt;
&lt;td&gt;Shared by all users&lt;/td&gt;
&lt;td&gt;Unique per person&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Detection Difficulty&lt;/td&gt;
&lt;td&gt;One cluster identifies all&lt;/td&gt;
&lt;td&gt;Requires per-person modeling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Model Size&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;~2MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The core shift: &lt;strong&gt;instead of using more complex rules to simulate "human", let the model learn "you" from your own data&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Data Collection: Record Your Habits Non-Invasively
&lt;/h2&gt;

&lt;p&gt;To do behavior cloning, first you need your personal mouse trajectory data.&lt;/p&gt;

&lt;p&gt;The implementation is simple — a Tampermonkey userscript that listens to &lt;code&gt;mousemove&lt;/code&gt; events and records the complete trajectory from mouse movement to click. Movements shorter than 20px are treated as stationary clicks and discarded. We care about &lt;strong&gt;movement patterns&lt;/strong&gt;, not the click itself.&lt;/p&gt;

&lt;p&gt;The data format is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"viewport"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2018&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"h"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1075&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"trajectory"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1480&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;322&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"t"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1504&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;317&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"t"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;31&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1501&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;319&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"t"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;69&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"tag"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DIV"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Code"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;x/y are viewport coordinates, t is the time offset in milliseconds relative to the trajectory start. The target HTML tag is included because clicking a button vs. clicking a link produces genuinely different trajectories — buttons have larger target areas and more casual movements; links have smaller targets with more cautious end phases.&lt;/p&gt;

&lt;p&gt;After a few days of normal browsing, you'll have hundreds to thousands of trajectories. Then use the Tampermonkey menu to export a &lt;code&gt;.jsonl&lt;/code&gt; file and drop it into the project's &lt;code&gt;data/&lt;/code&gt; directory.&lt;/p&gt;
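&lt;p&gt;As a quick sanity check on an export, a record in the format above can be parsed and summarized with the standard library alone. This is a minimal sketch, not the project's loader: &lt;code&gt;summarize&lt;/code&gt; and &lt;code&gt;SAMPLE_LINE&lt;/code&gt; are hypothetical names, and measuring length as the sum of point-to-point distances is an assumption.&lt;/p&gt;

```python
import json
import math

# One line of a .jsonl export, copied from the sample record shown above.
SAMPLE_LINE = (
    '{"viewport": {"w": 2018, "h": 1075}, '
    '"trajectory": [{"x": 1480, "y": 322, "t": 0}, '
    '{"x": 1504, "y": 317, "t": 31}, {"x": 1501, "y": 319, "t": 69}], '
    '"target": {"tag": "DIV", "text": "Code"}}'
)


def summarize(line):
    """Parse one JSONL record and return basic per-trajectory statistics."""
    record = json.loads(line)
    points = record["trajectory"]
    # Path length: sum of straight-line distances between consecutive samples.
    length_px = sum(
        math.hypot(b["x"] - a["x"], b["y"] - a["y"])
        for a, b in zip(points, points[1:])
    )
    return {
        "tag": record["target"]["tag"],
        "points": len(points),
        "duration_ms": points[-1]["t"] - points[0]["t"],
        "length_px": length_px,
    }


summary = summarize(SAMPLE_LINE)
print(summary)
```

&lt;p&gt;Running the same function over every line of the export gives a rough picture of how many trajectories you have per target tag before training.&lt;/p&gt;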




&lt;h2&gt;
  
  
  Architecture: Three Models, Each Handles One Thing
&lt;/h2&gt;

&lt;p&gt;With the data collected, the initial approach was to train a large GRU model for end-to-end spatial and temporal prediction. But experiments showed this didn't work well — the model would either learn overly smooth spatial trajectories (losing personalized arc styles) or uniform timing (losing acceleration/deceleration rhythms).&lt;/p&gt;

&lt;p&gt;The solution was to decompose the problem into three modules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Bezier (Skeleton)  →  NoiseModel (Spatial Deviation)  →  GRU (Timing)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bezier Curve&lt;/strong&gt; generates the macroscopic skeleton -- it's a fixed algorithm, doesn't learn anything, just guarantees a reasonable path from start to end.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;NoiseModel&lt;/strong&gt; is a small GRU (~166KB) that receives Bezier control points and outputs your personalized (x, y) path. It learns &lt;strong&gt;how you deviate from the ideal path&lt;/strong&gt; -- what arcs you prefer, how much you jitter.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GRU&lt;/strong&gt; (~2MB) takes the NoiseModel's spatial path and predicts only &lt;strong&gt;when each point is reached&lt;/strong&gt; -- where you speed up, slow down, or hesitate.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Decomposing this way worked much better. Intuitively, the arc you take and when you accelerate/decelerate are two separate things. Learning them independently keeps each model cleaner.&lt;/p&gt;

&lt;h3&gt;
  
  
  How NoiseModel Learns Spatial Deviation
&lt;/h3&gt;

&lt;p&gt;NoiseModel's input is 8 Bezier curve parameters (start point, end point, two control point coordinates, normalized to viewport). It autoregressively generates a series of (x, y) points.&lt;/p&gt;
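&lt;p&gt;The 8-number input can be illustrated with a plain cubic Bezier. This is a sketch under assumptions: the ordering of the 8 values, the control point coordinates, and the names &lt;code&gt;bezier_point&lt;/code&gt; and &lt;code&gt;bezier_features&lt;/code&gt; are all hypothetical, not taken from the project.&lt;/p&gt;

```python
def bezier_point(p0, p1, p2, p3, u):
    """Evaluate a cubic Bezier curve at parameter u in [0, 1]."""
    v = 1.0 - u
    x = v**3 * p0[0] + 3 * v**2 * u * p1[0] + 3 * v * u**2 * p2[0] + u**3 * p3[0]
    y = v**3 * p0[1] + 3 * v**2 * u * p1[1] + 3 * v * u**2 * p2[1] + u**3 * p3[1]
    return (x, y)


def bezier_features(p0, p1, p2, p3, viewport_w, viewport_h):
    """Flatten four points into 8 numbers, each normalized by the viewport size."""
    features = []
    for x, y in (p0, p3, p1, p2):  # start, end, then the two control points
        features.extend([x / viewport_w, y / viewport_h])
    return features


start, end = (1480.0, 322.0), (400.0, 800.0)
ctrl1, ctrl2 = (1200.0, 300.0), (600.0, 700.0)  # hypothetical control points
# 30 evenly spaced samples along the skeleton, from start to end.
skeleton = [bezier_point(start, ctrl1, ctrl2, end, i / 29) for i in range(30)]
features = bezier_features(start, ctrl1, ctrl2, end, 2018, 1075)
```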

&lt;p&gt;During training, the first 5 epochs use pure teacher forcing with real trajectory points, then gradually reduce until the model fully self-predicts. This way it learns real data patterns while being able to generate paths independently at inference time.&lt;/p&gt;
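&lt;p&gt;One way to express that schedule is a per-epoch teacher forcing ratio. The sketch below assumes a linear decay over 20 epochs after the 5 warmup epochs; the decay shape and length are assumptions, and &lt;code&gt;teacher_forcing_ratio&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
def teacher_forcing_ratio(epoch, warmup_epochs=5, decay_epochs=20):
    """Probability of feeding the real previous point instead of the model's own."""
    # Epochs 0..warmup_epochs-1 stay at 1.0, then the ratio falls linearly to 0.0.
    progress = (epoch - warmup_epochs + 1) / decay_epochs
    return 1.0 - min(1.0, max(0.0, progress))


# Ratio for each of the first 30 epochs (epochs are counted from 0).
schedule = [teacher_forcing_ratio(epoch) for epoch in range(30)]
```

&lt;p&gt;At each generation step during training, the real point is fed back with this probability and the model's own prediction otherwise.&lt;/p&gt;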

&lt;p&gt;One detail: points in the final 20% of the trajectory get &lt;strong&gt;4x loss penalty&lt;/strong&gt; during training. The deceleration and fine-tuning phase near the target is where human vs. machine differences are most pronounced -- machines tend to arrive straight-on, while humans show subtle overshoot and correction.&lt;/p&gt;
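&lt;p&gt;The end-phase penalty amounts to a per-point weight vector in the loss. A minimal sketch in plain Python follows; the project itself trains with PyTorch, and dividing by the sum of the weights is an assumption made here for readability.&lt;/p&gt;

```python
def tail_weights(num_points, tail_fraction=0.2, tail_penalty=4.0):
    """Per-point loss weights: 1.0 everywhere, tail_penalty on the final stretch."""
    tail_count = round(num_points * tail_fraction)
    return [1.0] * (num_points - tail_count) + [tail_penalty] * tail_count


def weighted_mse(predicted, actual, weights):
    """Weighted mean squared error over (x, y) points."""
    total = 0.0
    for (px, py), (ax, ay), w in zip(predicted, actual, weights):
        total += w * ((px - ax) ** 2 + (py - ay) ** 2)
    return total / sum(weights)


weights = tail_weights(10)  # the last 2 of 10 points count four times as much
```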

&lt;h3&gt;
  
  
  How GRU Learns Timing
&lt;/h3&gt;

&lt;p&gt;GRU's input is per-step "relative space" features: distance remaining to target, how far the last step moved, how long the last step took, current progress. Not absolute coordinates — because when humans move a mouse, the brain processes "the target is still that direction, roughly how far away", not "the cursor is at pixel coordinates on screen".&lt;/p&gt;
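&lt;p&gt;The four features can be derived from a recorded path in a few lines. This is a sketch, not the project's feature code: the dictionary keys are hypothetical, and defining progress as the fraction of steps taken is an assumption.&lt;/p&gt;

```python
import math


def relative_features(points, target):
    """Turn absolute (x, y, t) samples into per-step features relative to the target."""
    total_steps = len(points) - 1
    rows = []
    for i in range(1, len(points)):
        (x0, y0, t0), (x1, y1, t1) = points[i - 1], points[i]
        rows.append({
            "remaining_px": math.hypot(target[0] - x1, target[1] - y1),
            "step_px": math.hypot(x1 - x0, y1 - y0),
            "step_ms": t1 - t0,
            "progress": i / total_steps,  # assumption: fraction of steps taken
        })
    return rows


rows = relative_features([(0, 0, 0), (3, 4, 10), (6, 8, 25)], target=(6, 8))
```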

&lt;p&gt;There's a gotcha with time handling: in the raw data, time differences between adjacent points range from 8ms to 3494ms (extreme outliers). Direct training would be dominated by these outliers. The fix is a &lt;strong&gt;log1p transform&lt;/strong&gt; -- compressing the range to roughly [2.2, 8.2] -- with the inverse expm1 transform applied to the model's output at inference time.&lt;/p&gt;

&lt;p&gt;Another practical finding: the model tends to predict slower times than actual. Adding a &lt;strong&gt;0.70 scaling factor&lt;/strong&gt; after training makes the generated total duration match the real median.&lt;/p&gt;
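&lt;p&gt;The transform pair and the duration correction can be sketched with the standard &lt;code&gt;math&lt;/code&gt; module. The function names are hypothetical, and applying the 0.70 factor per interval rather than to the total duration is an assumption; either way the total shrinks by the same factor.&lt;/p&gt;

```python
import math

DURATION_SCALE = 0.70  # post-hoc correction factor quoted in the text


def encode_interval(delta_ms):
    """Compress a raw inter-event gap (milliseconds) with log1p before training."""
    return math.log1p(delta_ms)


def decode_interval(encoded, scale=DURATION_SCALE):
    """Invert the log1p transform with expm1, then apply the duration scale."""
    return math.expm1(encoded) * scale


gaps_ms = [8, 16, 31, 3494]
encoded = [encode_interval(g) for g in gaps_ms]
# With scale=1.0 the round trip recovers the original gaps.
decoded = [decode_interval(e, scale=1.0) for e in encoded]
```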

&lt;h3&gt;
  
  
  Why Not Transformer
&lt;/h3&gt;

&lt;p&gt;You might ask, why GRU when everyone's using Transformers now? Three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Trajectories are strongly continuous &lt;strong&gt;unidirectional&lt;/strong&gt; sequences — GRU's inductive bias naturally matches this&lt;/li&gt;
&lt;li&gt;Inference requires autoregressive generation, each step depends only on the previous one — GRU is much lighter than Transformer&lt;/li&gt;
&lt;li&gt;Small dataset (hundreds of samples) — GRU generalizes better on small data&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Training: Are a Few Hundred Samples Enough?
&lt;/h2&gt;

&lt;p&gt;Honest answer: a few hundred is barely enough. The ideal is thousands per person.&lt;/p&gt;

&lt;p&gt;In practice, both models are trained entirely on real data, no synthetic data dependency. The training strategy is to upweight the few hundred real samples (default ×10), telling the optimizer "prioritize fitting these real samples first".&lt;/p&gt;

&lt;p&gt;Training speed is also decent: a single GRU epoch on GPU takes under a second, so 200 epochs finish in under 3 minutes. CPU works too, just takes a bit longer.&lt;/p&gt;

&lt;p&gt;Final output: NoiseModel ~166KB, GRU ~2MB. Two files totaling under 3MB, CPU inference latency under 5ms.&lt;/p&gt;




&lt;h2&gt;
  
  
  An Overlooked Detail: Event Rate
&lt;/h2&gt;

&lt;p&gt;GRU-generated trajectories typically have only ~30 points, with ~27ms intervals — roughly 37Hz sampling rate.&lt;/p&gt;

&lt;p&gt;But what's the real mouse sampling rate? Browser-captured &lt;code&gt;mousemove&lt;/code&gt; is usually around 60Hz (~16ms). Actual mouse hardware sampling rate is typically 125Hz (~8ms).&lt;/p&gt;

&lt;p&gt;30 points looks too sparse. More critically: &lt;strong&gt;if your automation always produces exactly 30 points, that's an obvious artificial fingerprint&lt;/strong&gt;. Real human mouse movements have variable event counts — a 700px movement might generate 20 effective events when moving fast, or 80 when moving slowly with fine adjustments.&lt;/p&gt;

&lt;p&gt;So a &lt;strong&gt;resampling layer&lt;/strong&gt; is added after GRU output — "translating" the model's predicted timing contour into real mouse event rates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adaptive intervals: sparse during fast phases (~14ms), dense during deceleration (~4ms)&lt;/li&gt;
&lt;li&gt;±3ms time jitter: simulates hardware sampling noise&lt;/li&gt;
&lt;li&gt;±0.3px spatial jitter: simulates sensor accuracy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Variable point count&lt;/strong&gt;: same start/end generates 20-80+ events each time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This step only does "translation", it doesn't change the model's predicted velocity curve. When the model says "this action takes 800ms, accelerates in the middle, decelerates at the end", resampling just breaks that into irregularly-spaced events.&lt;/p&gt;
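&lt;p&gt;A rough sketch of such a resampling step is shown below. It is not the project's implementation: &lt;code&gt;resample&lt;/code&gt; is a hypothetical name, the interval is interpolated linearly between the slow and fast values by relative speed, and the 0.8 to 1.2 random factor used to vary the event count is an assumption. The model's own points are kept unchanged so that its timing contour survives.&lt;/p&gt;

```python
import math
import random


def resample(points, rng, fast_ms=14.0, slow_ms=4.0, time_jitter=3.0, pos_jitter=0.3):
    """Split model segments (x, y, t) into irregularly spaced mouse events."""
    segments = list(zip(points, points[1:]))
    speeds = [
        math.hypot(b[0] - a[0], b[1] - a[1]) / max(b[2] - a[2], 1e-6)
        for a, b in segments
    ]
    top_speed = max(max(speeds), 1e-6)
    events = [points[0]]
    for (a, b), speed in zip(segments, speeds):
        # Fast segments get sparse events, slow segments get dense ones.
        interval = slow_ms + (fast_ms - slow_ms) * (speed / top_speed)
        interval *= rng.uniform(0.8, 1.2)  # varies the event count run to run
        pieces = max(1, round((b[2] - a[2]) / interval))
        # Jittered timestamps, clamped to the segment and sorted to stay in order.
        times = sorted(
            min(b[2], max(a[2], a[2] + (b[2] - a[2]) * k / pieces
                          + rng.uniform(-time_jitter, time_jitter)))
            for k in range(1, pieces)
        )
        for k, t in zip(range(1, pieces), times):
            u = k / pieces
            events.append((
                a[0] + (b[0] - a[0]) * u + rng.uniform(-pos_jitter, pos_jitter),
                a[1] + (b[1] - a[1]) * u + rng.uniform(-pos_jitter, pos_jitter),
                t,
            ))
        events.append(b)  # keep the model's own point and timestamp untouched
    return events


model_points = [(0.0, 0.0, 0.0), (200.0, 50.0, 27.0), (600.0, 150.0, 54.0), (700.0, 170.0, 120.0)]
events = resample(model_points, random.Random(7))
```

&lt;p&gt;Calling &lt;code&gt;resample&lt;/code&gt; again with a different seed yields a different number of events for the same start and end points.&lt;/p&gt;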




&lt;h2&gt;
  
  
  Results: How Does It Look
&lt;/h2&gt;

&lt;p&gt;Let's look at the comparison directly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsl3jhl72iiopty12p3t5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsl3jhl72iiopty12p3t5.png" alt=" " width="800" height="301"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Left: three-stage pipeline generated trajectory (Bezier skeleton + NoiseModel spatial deviation + GRU timing). Right: pure Bezier curves. The pipeline trajectory isn't a smooth mathematical curve -- it has personalized arc deviations, uneven speed, and subtle hesitation near the end.&lt;/p&gt;

&lt;p&gt;The dynamic comparison is more intuitive:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphocypjbezlfy9ni7qaa.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphocypjbezlfy9ni7qaa.gif" alt=" " width="600" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mechanical&lt;/strong&gt; (gray): instant teleport to target, pure algorithm&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bezier&lt;/strong&gt; (blue): smooth uniform glide, still algorithmic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GRU Model&lt;/strong&gt; (gold): acceleration, hesitation, end-phase micro-adjustment — style learned from real data&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Install dependencies&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;torch numpy matplotlib

&lt;span class="c"&gt;# 2. Place your exported .jsonl files in the data/ directory&lt;/span&gt;

&lt;span class="c"&gt;# 3. Train NoiseModel (spatial personalization)&lt;/span&gt;
python training/generate_trajectories.py &lt;span class="nt"&gt;--train-noise&lt;/span&gt; &lt;span class="nt"&gt;--epochs&lt;/span&gt; 100

&lt;span class="c"&gt;# 4. Train GRU (timing personalization)&lt;/span&gt;
python training/train_mouse_model.py &lt;span class="nt"&gt;--epochs&lt;/span&gt; 200

&lt;span class="c"&gt;# 5. Visualize results&lt;/span&gt;
python examples/demo.py &lt;span class="nt"&gt;--save&lt;/span&gt; output.png
python examples/animate_demo.py &lt;span class="nt"&gt;--save&lt;/span&gt; output.gif
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Integrate into browser automation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mouse_controller&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;move_to_humanized&lt;/span&gt;

&lt;span class="c1"&gt;# Before clicking
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;move_to_humanized&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BUTTON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Then execute click
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the model fails to load, the framework automatically falls back to Bezier curves — no breaking changes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;p&gt;A few limitations worth being transparent about:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data volume.&lt;/strong&gt; A few hundred samples is barely sufficient; the ideal is thousands per person. The good news is this is a positive flywheel — more data means a better clone of you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NoiseModel's assumption.&lt;/strong&gt; The current "real = Bezier + residual" assumption is strong. A better approach would use a generative model (diffusion or VAE) to generate full trajectories directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multimodal.&lt;/strong&gt; Mouse trajectory is just one dimension of behavioral fingerprinting. Keyboard rhythm, scroll patterns, mouse dwell time aren't modeled yet. For keyboard rhythm, add a &lt;code&gt;keyboard.jsonl&lt;/code&gt; under &lt;code&gt;data/&lt;/code&gt; (same format as trajectory) and adapt the scripts under &lt;code&gt;training/&lt;/code&gt; to reuse the existing GRU pipeline.&lt;/p&gt;

&lt;p&gt;These limitations are also directions for future iteration.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;The AI browser agent ecosystem in 2026 is maturing rapidly. Operation accuracy is no longer the core bottleneck. The next bottleneck is &lt;strong&gt;trust&lt;/strong&gt; — platforms don't trust AI traffic, users don't trust mechanical operations.&lt;/p&gt;

&lt;p&gt;The future of AI operating computers shouldn't be "a generic robot simulating a generic human" — it should be &lt;strong&gt;your AI assistant, operating in your style, with your decision preferences, and behavioral characteristics that match you&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;All code and data in this project are for technical research and personalized AI research use only.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Project code: &lt;a href="https://github.com/YuBing-link/mouse-behavioral-clone" rel="noopener noreferrer"&gt;https://github.com/YuBing-link/mouse-behavioral-clone&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Paper reference: FP-Agent: Fingerprinting AI Browsing Agents, arXiv:2605.01247, May 2026&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>python</category>
      <category>privacy</category>
      <category>automation</category>
    </item>
    <item>
      <title>Hexagonal Architecture is Not a Layered Architecture: Topology, Safety, and When to Walk Away</title>
      <dc:creator>bing yu</dc:creator>
      <pubDate>Mon, 11 May 2026 18:33:06 +0000</pubDate>
      <link>https://forem.com/bing_yu/hexagonal-architecture-is-not-a-layered-architecture-topology-safety-and-when-to-walk-away-4dln</link>
      <guid>https://forem.com/bing_yu/hexagonal-architecture-is-not-a-layered-architecture-topology-safety-and-when-to-walk-away-4dln</guid>
      <description>&lt;h2&gt;
  
  
  Most introductory articles draw it as an onion
&lt;/h2&gt;

&lt;p&gt;Most introductory articles about Hexagonal Architecture end up with the same diagram: three concentric circles. Domain on the inside, Application in the middle, Infrastructure on the outside. Onion Architecture. Clean Architecture. Layered Architecture with a different hat.&lt;/p&gt;

&lt;p&gt;This is not what Ports &amp;amp; Adapters is.&lt;/p&gt;

&lt;p&gt;The difference is not in the number of layers—it's in the &lt;strong&gt;direction of coupling&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In a layered architecture, dependency flows vertically: Controller calls Service, Service calls Repository. Each layer depends on the one below it. The problem is not that this doesn't work—it works fine for many projects. The problem is that business logic and infrastructure still couple in the same direction. Your UserService calls UserRepository (an interface), but the implementation carries JPA annotations, transaction annotations, and caching annotations. Testing the Service layer still means dealing with these implicit dependencies.&lt;/p&gt;

&lt;p&gt;In Hexagonal Architecture, dependency flows &lt;strong&gt;radially inward&lt;/strong&gt;: Adapters depend on Ports, Ports belong to the Domain. There is no "upper layer" and "lower layer"—only "inside" and "outside."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Layered Architecture: dependency flows downward
Controller → Service → Repository (impl) → JPA/MySQL

// Hexagonal Architecture: dependency flows inward
Controller → [Port In] ← ApplicationService → [Port Out] ← Adapter
                            ↓
                    DomainService → [Port Out] ← Adapter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The key difference: in layered architecture, the Service layer depends on the Repository implementation directly (even through an interface, the impl lives in the same layer). In hexagonal architecture, the business logic &lt;strong&gt;has no knowledge&lt;/strong&gt; of who implements &lt;code&gt;Port Out&lt;/code&gt;. It could be MyBatis, an in-memory Map, or a gRPC call. This "not knowing" is Dependency Inversion applied at the architectural level, not just the class level.&lt;/p&gt;

&lt;p&gt;Alistair Cockburn's original 2005 article described two kinds of ports (inbound and outbound), with "adapters" translating between the port protocol and whatever technology sits outside. There were two sides, inside and outside, not three, four, or five layers. The proliferation of inner rings in modern interpretations comes from Clean Architecture's influence, not from Cockburn's original model. This matters because each additional ring creates a new boundary where objects must be mapped, and each mapping is a cost.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why not another architecture?
&lt;/h2&gt;

&lt;p&gt;I evaluated three alternatives before settling on Ports &amp;amp; Adapters. Each had a different cost structure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layered architecture&lt;/strong&gt; is the natural starting point. The project began this way and shipped in three days. But as the codebase grew, the problem wasn't "does it work"—it was "how much code must I read to make a change." In a layered architecture, a feature crosses three layers with no formal contract between them. Adding a field means verifying the Controller DTO, the Service parameter, and the Mapper mapping all changed. Nothing prevents a Controller from injecting a Mapper directly—and code review missed it multiple times. Layered architecture's constraints are soft constraints. They depend on team discipline in an environment where frameworks like Spring actively encourage cross-layer injection through &lt;code&gt;@Autowired&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Clean Architecture&lt;/strong&gt; shares the same inward-dependency topology as Hexagonal, but introduces more named layers (Entities, Use Cases, Interface Adapters, Frameworks). The naming imposes a cognitive cost: "why is this class an Entity but that one isn't?" "What's the boundary between a Use Case and a Domain Service?" Every team member needs to internalize these distinctions before they can make consistent architectural decisions. For a project that was already running, introducing Clean Architecture meant introducing terminology debates &lt;em&gt;before&lt;/em&gt; we could fix the actual coupling problems. Hexagonal's vocabulary is smaller—port, adapter, domain—and leaves the internal structure of the hexagon as an implementation detail rather than prescribing it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CQRS + Event Sourcing&lt;/strong&gt; is appropriate when the read/write ratio is extreme or when audit requirements demand a complete event log. Neither condition held for this project. The read and write models share the same shape, and the cost of maintaining a separate event store would have exceeded its value.&lt;/p&gt;

&lt;p&gt;Hexagonal Architecture won for three project-specific reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multiple inbound protocols, multiple outbound technologies.&lt;/strong&gt; The project doesn't just serve HTTP requests—it streams via SSE, receives Stripe webhooks, authenticates Chrome extension devices. It doesn't just use a database—it calls translation engines (HTTP), searches vectors (Redis), processes payments (Stripe), sends emails (SMTP). Hexagonal treats "inbound" and "outbound" as first-class architectural concepts. Adding a protocol means adding an adapter, not modifying business logic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The translation engine is the most volatile dependency.&lt;/strong&gt; Translation providers change frequently—vendor switches, A/B tests, local vs cloud deployment. If engine selection leaks into business code, every switch becomes surgery across multiple files. Port Out + Adapter turns "switch engine" into "swap adapter."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Testing business logic doesn't need infrastructure.&lt;/strong&gt; The translation pipeline depends on Python services, vector search depends on Redis, payments depend on Stripe SDK. In layered architecture, testing the Service layer means starting or mocking all of these. In hexagonal architecture, domain tests mock Port Out interfaces—no Spring context, no Redis, no WireMock. This matters in CI: domain tests run without any external service.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
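&lt;p&gt;The second and third points can be illustrated with a small sketch. It is written in Python for brevity rather than in the project's own stack, and every name in it (&lt;code&gt;TranslationEnginePort&lt;/code&gt;, &lt;code&gt;TranslationService&lt;/code&gt;, &lt;code&gt;FakeEngineAdapter&lt;/code&gt;) is hypothetical. It shows a domain-owned outbound port, domain logic that depends only on that port, and an in-memory adapter standing in for a real engine.&lt;/p&gt;

```python
from typing import Protocol


class TranslationEnginePort(Protocol):
    """Outbound port, owned by the domain: what the domain needs from an engine."""

    def translate(self, text: str, target_lang: str):
        ...


class TranslationService:
    """Domain logic; it only knows the port, never a concrete engine."""

    def __init__(self, engine: TranslationEnginePort):
        self._engine = engine

    def translate_paragraphs(self, paragraphs, target_lang):
        # Business rule (hypothetical): skip blank paragraphs, translate the rest.
        return [
            self._engine.translate(p, target_lang)
            for p in paragraphs
            if p.strip()
        ]


class FakeEngineAdapter:
    """In-memory adapter used in tests; a real one would call an HTTP engine."""

    def translate(self, text: str, target_lang: str):
        return f"[{target_lang}] {text}"


service = TranslationService(FakeEngineAdapter())
result = service.translate_paragraphs(["hello", "  ", "world"], "fr")
```

&lt;p&gt;Switching engines means passing a different adapter to the constructor; the service and its tests stay untouched.&lt;/p&gt;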

&lt;p&gt;This is not a universal recommendation. If your project has one inbound protocol (a standard HTTP API) and one outbound technology (a single database), layered architecture has a lower overhead cost and clearer payoff. Hexagonal's interface count and adapter classes are pure overhead in this scenario.&lt;/p&gt;
&lt;h2&gt;
  
  
  The direction of a Port determines the architecture's nature
&lt;/h2&gt;

&lt;p&gt;Many articles describe Ports as "just interfaces" where Inbound means "called by Controller" and Outbound means "implemented by Repository."&lt;/p&gt;

&lt;p&gt;The critical property of a Port is not "it's an interface"—it's &lt;strong&gt;who owns it&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Port In&lt;/code&gt; belongs to the &lt;strong&gt;Application layer&lt;/strong&gt;—it defines "what the system can do," implemented by ApplicationService.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Port Out&lt;/code&gt; belongs to the &lt;strong&gt;Domain layer&lt;/strong&gt;—it defines "what the domain needs," implemented by Adapters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is Dependency Inversion rendered at the architectural level. The domain defines the interfaces it needs; infrastructure implements them.&lt;/p&gt;

&lt;p&gt;I misunderstood this twice. The first time, I treated Port as "a place to define interfaces" without distinguishing who defined them. The second time, I placed Port Out inside the adapter package, thinking "domain-owned outbound interfaces" was DDD dogma. The practical consequence: when an interface needed to change, I couldn't tell whether the business requirement had changed (Port Out change means the domain's contract with the outside world changed) or the implementation had changed (adapter-only change, no business impact).&lt;/p&gt;

&lt;p&gt;Port ownership determines change impact scope. Changing a Port Out signature means the domain's expectation of the outside world has changed—not just that a repository implementation has been swapped.&lt;/p&gt;
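
&lt;p&gt;The ownership rule can be sketched in a few lines. The package names in the comments and all type names (&lt;code&gt;ResolveTermUseCase&lt;/code&gt;, &lt;code&gt;GlossaryPort&lt;/code&gt;, and so on) are assumptions made for illustration; the project's actual layout may differ.&lt;/p&gt;

```java
// Hypothetical sketch: the package names in the comments mirror the ownership
// described above and are not copied from the NovelTrans repository.
class PortOwnershipDemo {
    public static void main(String[] args) {
        GlossaryPort glossary = new InMemoryGlossaryAdapter();
        ResolveTermUseCase useCase = new ResolveTermService(glossary);
        System.out.println(useCase.resolve("  sword saint "));
    }
}

// application/port/in: what the system can do. Inbound adapters call this.
interface ResolveTermUseCase {
    String resolve(String rawTerm);
}

// domain/port/out: what the domain needs. The domain owns this contract,
// so a signature change here is a business change, not an implementation detail.
interface GlossaryPort {
    String preferredTerm(String term);
}

// application/service: implements Port In and depends on Port Out.
final class ResolveTermService implements ResolveTermUseCase {
    private final GlossaryPort glossary;

    ResolveTermService(GlossaryPort glossary) {
        this.glossary = glossary;
    }

    @Override
    public String resolve(String rawTerm) {
        return glossary.preferredTerm(rawTerm.strip());
    }
}

// adapter/out/glossary: implements Port Out. Replacing this class with a
// database-backed one would leave both interfaces untouched.
final class InMemoryGlossaryAdapter implements GlossaryPort {
    @Override
    public String preferredTerm(String term) {
        if (term.equalsIgnoreCase("sword saint")) {
            return "Sword Saint";
        }
        return term; // unknown terms pass through unchanged
    }
}
```

&lt;p&gt;Read this way, a diff that touches only &lt;code&gt;InMemoryGlossaryAdapter&lt;/code&gt; is an implementation change, while a diff that touches &lt;code&gt;GlossaryPort&lt;/code&gt; means the domain's contract moved.&lt;/p&gt;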
&lt;h2&gt;
  
  
  Domain layer "zero dependency" is a matter of degree
&lt;/h2&gt;

&lt;p&gt;Hexagonal Architecture advocates often say "the domain layer has zero framework dependencies." Strictly speaking, this is a matter of degree, not an absolute.&lt;/p&gt;

&lt;p&gt;Does using &lt;code&gt;java.time.LocalDateTime&lt;/code&gt; count as a framework dependency? Technically it is an API dependency (JSR 310, part of the JDK since Java 8), but nobody objects at this level. Does using Lombok &lt;code&gt;@Data&lt;/code&gt; count? It's a compile-time annotation processor with no runtime impact, and most projects accept it.&lt;/p&gt;

&lt;p&gt;But using Spring &lt;code&gt;@Transactional&lt;/code&gt;, Jackson &lt;code&gt;@JsonProperty&lt;/code&gt;, or Hibernate &lt;code&gt;@Entity&lt;/code&gt; introduces runtime behavior changes that require framework context for testing. These should not appear in the domain package.&lt;/p&gt;

&lt;p&gt;Our domain layer uses Lombok and nothing else. No &lt;code&gt;@Service&lt;/code&gt;, no &lt;code&gt;@Entity&lt;/code&gt;, no &lt;code&gt;@Value&lt;/code&gt;. The test is not "does it have annotations" but "can the domain logic still compile and test if the framework is removed from the classpath."&lt;/p&gt;
&lt;h2&gt;
  
  
  Adapters are not just translators—they are security boundaries
&lt;/h2&gt;

&lt;p&gt;The conventional view of an Adapter is protocol translation: HTTP request to Java call, Java object to SQL statement, Java exception to HTTP status code.&lt;/p&gt;

&lt;p&gt;Adapters have an implicit second responsibility: &lt;strong&gt;input sanitization and exception classification&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;External systems are untrusted. A translation engine can be DNS-hijacked. Stripe callbacks can be replayed. Cached data in Redis can have the wrong format (cross-version deployment).&lt;/p&gt;

&lt;p&gt;If an adapter only translates protocols without validating inputs, the problem propagates to the domain layer. And the domain layer, by design, knows nothing about the outside world—making it poorly equipped to distinguish "valid data" from "data that happens to be valid-looking."&lt;/p&gt;

&lt;p&gt;The boundary between adapter validation and domain validation follows a useful heuristic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Format validation&lt;/strong&gt; (JSON syntax, field type, non-empty string) — adapter layer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic validation&lt;/strong&gt; (order amount must be positive, status transition must be legal, user cannot delete self) — domain layer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the validation rule remains valid after replacing the technology implementation, it belongs in the domain layer.&lt;/p&gt;
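
&lt;p&gt;A minimal sketch of that split, using the order-amount rule from the list above. &lt;code&gt;OrderWebAdapter&lt;/code&gt;, &lt;code&gt;Order&lt;/code&gt;, and the status strings are hypothetical; a real inbound adapter would sit behind a web framework rather than return plain strings.&lt;/p&gt;

```java
import java.math.BigDecimal;

// Hypothetical names; the status strings stand in for real HTTP responses.
class ValidationBoundaryDemo {
    public static void main(String[] args) {
        OrderWebAdapter adapter = new OrderWebAdapter();
        System.out.println(adapter.handle("42.50")); // passes both checks
        System.out.println(adapter.handle("abc"));   // stopped by the adapter
        System.out.println(adapter.handle("-1"));    // stopped by the domain
    }
}

// Domain: a semantic rule that holds no matter which protocol delivered the value.
final class Order {
    private final BigDecimal amount;

    Order(BigDecimal amount) {
        if (amount.signum() != 1) {
            throw new IllegalArgumentException("order amount must be positive");
        }
        this.amount = amount;
    }

    BigDecimal amount() {
        return amount;
    }
}

// Inbound adapter: format rules only (present, parseable), then exception classification.
final class OrderWebAdapter {
    String handle(String rawAmount) {
        if (rawAmount == null || rawAmount.isBlank()) {
            return "400 amount is required";
        }
        BigDecimal amount;
        try {
            amount = new BigDecimal(rawAmount.strip());
        } catch (NumberFormatException e) {
            return "400 amount is not a number";
        }
        try {
            Order order = new Order(amount);
            return "201 order created for " + order.amount();
        } catch (IllegalArgumentException e) {
            // Domain rejection is classified separately from malformed input.
            return "422 " + e.getMessage();
        }
    }
}
```

&lt;p&gt;If the HTTP adapter were replaced by a message-queue consumer, the parsing checks would be rewritten for the new format, while the positivity rule in &lt;code&gt;Order&lt;/code&gt; would stay where it is.&lt;/p&gt;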
&lt;h2&gt;
  
  
  Code boundaries are not deployment boundaries
&lt;/h2&gt;

&lt;p&gt;Hexagonal Architecture solves a code organization problem: the dependency topology between different concerns.&lt;/p&gt;

&lt;p&gt;It does not solve a deployment problem: all adapters and the domain layer compile into one JAR, run in one JVM, and share memory and thread pools.&lt;/p&gt;

&lt;p&gt;If one adapter leaks memory (for example, by loading the entire translation cache into the process), it affects the entire application. If one adapter exhausts a shared HTTP connection pool, requests from other adapters block as well.&lt;/p&gt;

&lt;p&gt;This is not a flaw in Hexagonal Architecture—it never promised to solve deployment isolation. But in a containerized environment, this is a separate concern that must be designed independently. Code boundaries and process boundaries can align or diverge depending on deployment strategy, not code structure.&lt;/p&gt;
&lt;h2&gt;
  
  
  The security semantics of Converters
&lt;/h2&gt;

&lt;p&gt;Domain models and persistence entities are separate classes, connected by Converters. This is not redundancy—it's isolation.&lt;/p&gt;

&lt;p&gt;Domain models reflect business concepts; persistence entities reflect database structure. They can legitimately differ—the entity may store concatenated values that the domain splits apart, or the domain may use enums while the database stores integers.&lt;/p&gt;

&lt;p&gt;Converters have a property that is easy to overlook: &lt;strong&gt;different directions have different safety constraints.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Converting from persistence entity to domain model requires validation—the data may have been written by an older version that accepted values now considered invalid. Converting from domain model to persistence entity assumes domain constraints have already been enforced—validation should not be delegated to the converter layer.&lt;/p&gt;
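
&lt;p&gt;The asymmetry might look like the sketch below. The &lt;code&gt;Chapter&lt;/code&gt; model, the &lt;code&gt;ChapterEntity&lt;/code&gt; class, and the integer status codes are invented for illustration, following the enum-versus-integer example above; they are not the project's actual classes.&lt;/p&gt;

```java
// Hypothetical model, entity, and converter; not taken from the NovelTrans code.
class ConverterDirectionDemo {
    public static void main(String[] args) {
        Chapter chapter = ChapterConverter.toDomain(new ChapterEntity("Chapter 1", 1));
        System.out.println(chapter.status());
        try {
            // A row written by an older version, with a status code that no longer exists.
            ChapterConverter.toDomain(new ChapterEntity("Chapter 2", 9));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}

enum ChapterStatus {
    DRAFT(0), TRANSLATED(1), PUBLISHED(2);

    final int code;

    ChapterStatus(int code) {
        this.code = code;
    }
}

// Domain model: can only be constructed in a valid state.
final class Chapter {
    private final String title;
    private final ChapterStatus status;

    Chapter(String title, ChapterStatus status) {
        if (title == null || title.isBlank()) {
            throw new IllegalArgumentException("title must not be blank");
        }
        if (status == null) {
            throw new IllegalArgumentException("status is required");
        }
        this.title = title;
        this.status = status;
    }

    String title() { return title; }

    ChapterStatus status() { return status; }
}

// Persistence entity: mirrors the table, with the status stored as an integer.
final class ChapterEntity {
    final String title;
    final int statusCode;

    ChapterEntity(String title, int statusCode) {
        this.title = title;
        this.statusCode = statusCode;
    }
}

final class ChapterConverter {
    // Entity to domain: stored data is untrusted, so unknown codes are rejected.
    static Chapter toDomain(ChapterEntity entity) {
        for (ChapterStatus status : ChapterStatus.values()) {
            if (status.code == entity.statusCode) {
                return new Chapter(entity.title, status);
            }
        }
        throw new IllegalStateException("unknown status code " + entity.statusCode);
    }

    // Domain to entity: the Chapter constructor already enforced the rules,
    // so this direction only maps fields.
    static ChapterEntity toEntity(Chapter chapter) {
        return new ChapterEntity(chapter.title(), chapter.status().code);
    }
}
```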
&lt;h2&gt;
  
  
  How to tell when Hexagonal Architecture is overkill
&lt;/h2&gt;

&lt;p&gt;Small projects don't need it. Not because it's wrong for small projects, but because the ROI scales with complexity.&lt;/p&gt;

&lt;p&gt;When your system has 3 Controllers, 5 Services, and 2 tables, Hexagonal Architecture's indirection is a concrete cost with an abstract benefit. Maintaining six files (Port interface, ApplicationService, Domain Model, Adapter, Converter, Entity) is measurably more expensive than maintaining two files (Controller, Service). The benefit—future-proofing against change—is speculative.&lt;/p&gt;

&lt;p&gt;When the system grows to multiple inbound protocols, multiple outbound technologies, and a complex domain, the clear dependency boundaries start paying back. Not because there's more code, but because &lt;strong&gt;the cost of reasoning about a change decreases&lt;/strong&gt;. When you know "translation engine changes go to &lt;code&gt;adapter/out/translate&lt;/code&gt;" and "billing logic changes go to &lt;code&gt;domain/service&lt;/code&gt;," you don't need to read the entire codebase to assess impact.&lt;/p&gt;

&lt;p&gt;This is a threshold problem. The threshold depends on team size, change frequency, and technology stack complexity. There is no universal answer.&lt;/p&gt;
&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Hexagonal Architecture is not layered architecture rearranged. It's a different dependency topology—radially inward instead of vertically downward.&lt;/p&gt;

&lt;p&gt;Why choose it? Not because layered architecture is wrong, but because this project has:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;More than one inbound protocol and more than one outbound technology.&lt;/li&gt;
&lt;li&gt;A translation engine that changes frequently—the most volatile external dependency.&lt;/li&gt;
&lt;li&gt;Business logic that needs to be testable without infrastructure dependencies.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What it solves: &lt;strong&gt;business logic should state what it needs from technology, but should not depend on any specific technology.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;What it doesn't solve: deployment isolation, automatic input validation, or security hardening. These are separate concerns that must be designed independently.&lt;/p&gt;

&lt;p&gt;The most useful thought to take away from this is not about Hexagonal Architecture specifically—it's about what architecture is for. Architecture patterns serve change management. The question to ask of any pattern is not "is this pattern good" but "what does changing a requirement cost in this structure—how many files must I read, how many tests must I run, what risks am I taking."&lt;/p&gt;

&lt;p&gt;On that measure, Hexagonal Architecture has both advantages and costs. Knowing both clearly matters more than picking the right label.&lt;/p&gt;



&lt;p&gt;Project source: &lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/YuBing-link" rel="noopener noreferrer"&gt;
        YuBing-link
      &lt;/a&gt; / &lt;a href="https://github.com/YuBing-link/noveltrans" rel="noopener noreferrer"&gt;
        noveltrans
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      AI-powered SaaS translation platform for web novels — batch translate with RAG memory, multi-agent collaboration, Stripe billing, Chrome extension &amp;amp; REST API. Java 21 + React 19 + Python.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;NovelTrans&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;A production SaaS backend for AI-powered novel/document translation — multi-engine orchestration, RAG translation memory, Stripe subscription management, team collaboration, and multi-tenant data isolation.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://github.com/YuBing-link/noveltrans/README.zh.md" rel="noopener noreferrer"&gt;中文版&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://github.com/YuBing-link/noveltrans/actions/workflows/ci.yml" rel="noopener noreferrer"&gt;&lt;img src="https://github.com/YuBing-link/noveltrans/actions/workflows/ci.yml/badge.svg" alt="CI"&gt;&lt;/a&gt;
&lt;a href="https://openjdk.org/" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/1f6d6fc09797f6c4e51089421c8f56a3a1319516f61524ae475e9d67c23480a0/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4a6176612d32312d6f72616e67653f6c6f676f3d6f70656e6a646b" alt="Java"&gt;&lt;/a&gt;
&lt;a href="https://spring.io/" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/24f5ebe56aa4eee6224d3f5325d54c6cf5fa8a3c0ff5dd7b9b71686818c4b213/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f537072696e675f426f6f742d332e322d677265656e3f6c6f676f3d737072696e67" alt="Spring Boot"&gt;&lt;/a&gt;
&lt;a href="https://github.com/YuBing-link/noveltrans/docs/coverage-report-summary.md" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/b8910d0bbd628ef95348ff84ec006cc84e4f333a25d2687aa44c13e2e38a315b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f7665726167652d38302e352532352d627269676874677265656e" alt="Coverage"&gt;&lt;/a&gt;
&lt;a href="https://github.com/YuBing-link/noveltrans/LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/8174925d009b42074d50ab5cc7e29fcb1aa613b0d9cb2e43097697a40cf90fa4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e73652d4d49542d79656c6c6f77" alt="License"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Overview&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;NovelTrans is a full-stack translation platform built for web novel authors and translators. It replaces the traditional "copy-paste into Google Translate" workflow with an intelligent pipeline that understands context, preserves character name consistency, and learns from past translations — all while reducing LLM API costs through RAG-based semantic reuse.&lt;/p&gt;
&lt;p&gt;Three client channels share the same backend:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;React web dashboard&lt;/strong&gt; — DeepL-style interface with real-time chapter preview&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chrome extension (MV3)&lt;/strong&gt; — three modes: full-page, reader mode, text selection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External REST API&lt;/strong&gt; — API-key authenticated, for third-party integrations&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Features&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-engine AI orchestration&lt;/strong&gt; — Routes translation requests across LLM (Python FastAPI + OpenAI SDK) and local engines (MTranServer) with probability-based load balancing using rolling 60-second performance windows; MTranServer serves dual purpose — fast translation mode for instant…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/YuBing-link/noveltrans" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;I am available for backend architecture and payment system design work.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>java</category>
      <category>webdev</category>
      <category>designpatterns</category>
    </item>
  </channel>
</rss>
