<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Deen Muaz</title>
    <description>The latest articles on Forem by Deen Muaz (@itxdeeni).</description>
    <link>https://forem.com/itxdeeni</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F388946%2Ffc218cf0-bf4f-4e38-a5aa-e2e4360e6743.jpg</url>
      <title>Forem: Deen Muaz</title>
      <link>https://forem.com/itxdeeni</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/itxdeeni"/>
    <language>en</language>
    <item>
      <title>I Built an Autonomous Multi-Agent Security Auditor using Gemini 3 and TypeScript</title>
      <dc:creator>Deen Muaz</dc:creator>
      <pubDate>Sat, 25 Apr 2026 15:10:32 +0000</pubDate>
      <link>https://forem.com/itxdeeni/-i-built-an-autonomous-multi-agent-security-auditor-using-gemini-3-and-typescript-2ach</link>
      <guid>https://forem.com/itxdeeni/-i-built-an-autonomous-multi-agent-security-auditor-using-gemini-3-and-typescript-2ach</guid>
      <description>&lt;p&gt;Let's be direct: most security tooling in CI pipelines is theater. Your SAST scanner fires off 400 warnings per week, your team mutes the Slack channel, and the one real IDOR that could've let an attacker read every customer's order history slips into production because it was buried on page three of a report full of false positives.&lt;/p&gt;

&lt;p&gt;We are all tired of this. So I built &lt;code&gt;sentinai-core&lt;/code&gt;, an open-source npm package that runs an autonomous, three-agent AI pipeline against your GitHub pull request diffs and produces context-aware, validated security findings, each with a step-by-step proof-of-concept exploit.&lt;/p&gt;

&lt;p&gt;Here's exactly how it works.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem with Traditional SAST
&lt;/h2&gt;

&lt;p&gt;Static analysis tools operate on pattern matching. They're fast and cheap, which is why they're everywhere — but they have a structural limitation: they are &lt;strong&gt;context-blind&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Consider an IDOR (Insecure Direct Object Reference). A scanner can tell you that a route handler accepts a dynamic &lt;code&gt;:id&lt;/code&gt; parameter. It cannot tell you that the controller skips the &lt;code&gt;req.user.id === resource.userId&lt;/code&gt; ownership check that exists on every other route in the file. It doesn't understand that the new middleware you added to &lt;code&gt;app.use()&lt;/code&gt; globally &lt;em&gt;doesn't actually apply&lt;/em&gt; to this specific sub-router because it was mounted before the middleware was registered.&lt;/p&gt;
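&lt;p&gt;To make the missing check concrete, here is a minimal sketch of the two variants a scanner cannot tell apart. The &lt;code&gt;SessionUser&lt;/code&gt; and &lt;code&gt;Order&lt;/code&gt; shapes and both helper functions are hypothetical names for illustration; they are not part of &lt;code&gt;sentinai-core&lt;/code&gt; or of any real codebase.&lt;/p&gt;

```typescript
// Hypothetical shapes, for illustration only.
interface SessionUser { id: number; }
interface Order { id: number; userId: number; total: number; }

// Vulnerable variant: looks the order up by ID only, so any
// authenticated user can read any other user's order.
function getOrderUnsafe(orders: Order[], orderId: number): Order | undefined {
  return orders.find((o) => o.id === orderId);
}

// Safer variant: the same lookup plus the ownership comparison
// (the equivalent of req.user.id === resource.userId).
function getOrderSafe(orders: Order[], orderId: number, user: SessionUser): Order | undefined {
  const order = orders.find((o) => o.id === orderId);
  if (order === undefined) return undefined;
  return order.userId === user.id ? order : undefined;
}
```

&lt;p&gt;Both functions sit behind the same route signature, which is why a pattern match on &lt;code&gt;:id&lt;/code&gt; alone cannot separate them.&lt;/p&gt;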

&lt;p&gt;The result: &lt;strong&gt;an 80% false positive rate&lt;/strong&gt; on modern Node.js/Express codebases is common. Teams tune their noise thresholds so aggressively that real findings disappear with the noise.&lt;/p&gt;

&lt;p&gt;RBAC flaws are even worse. A scanner has no idea that &lt;code&gt;role: "admin"&lt;/code&gt; is set client-side and trusted on the backend without re-verification. That requires reading business logic.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: A Multi-Agent AI Pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;sentinai-core&lt;/code&gt; treats security review as a multi-step adversarial reasoning problem — not a pattern match.&lt;/p&gt;

&lt;p&gt;The library ships as a single function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;runOrchestrator&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sentinai-core&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;findings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runOrchestrator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;diff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It takes a raw PR diff string (up to &lt;strong&gt;80,000 characters&lt;/strong&gt;, a hard guard against context window overflow) and a logger callback, then internally dispatches three specialized AI agents in sequence. Each agent has its own thinking budget and adversarial role. The Architect runs on a lighter model, while the Adversary and the Guardian share the same one.&lt;/p&gt;
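&lt;p&gt;A length guard like this can be sketched in a few lines. The 80,000 figure is the one quoted above; whether the library rejects or truncates an oversized diff is not covered here, so the reject-on-overflow behaviour and the names &lt;code&gt;MAX_DIFF_CHARS&lt;/code&gt; and &lt;code&gt;assertDiffWithinLimit&lt;/code&gt; are assumptions.&lt;/p&gt;

```typescript
// Assumption: rejecting (rather than truncating) is an illustrative choice,
// and these names are hypothetical, not the library's API.
const MAX_DIFF_CHARS = 80000;

function assertDiffWithinLimit(diff: string, limit: number = MAX_DIFF_CHARS): string {
  if (diff.length > limit) {
    throw new RangeError(`Diff is ${diff.length} characters; limit is ${limit}.`);
  }
  return diff; // unchanged when it fits
}
```

&lt;p&gt;Rejecting keeps the analysis honest: a silently truncated diff could hide the one hunk that matters.&lt;/p&gt;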




&lt;h2&gt;
  
  
  The Architecture: Three Agents, One Pipeline
&lt;/h2&gt;

&lt;p&gt;Here's the full data flow from a GitHub PR event to a validated security report:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart LR
    A(["GitHub PR\nDiff"]) --&amp;gt; B["🏗️ Architect\ngemini-3.1-flash-lite-preview\nthinking: low"]
    B --&amp;gt;|"Access control map\n+ vulnerability surface"| C["🥷 Adversary\ngemini-3-flash-preview\nthinking: medium"]
    C --&amp;gt;|"Up to 3 exploit\nwalkthrough reports"| D["🛡️ Guardian\ngemini-3-flash-preview\nthinking: high"]
    D --&amp;gt;|"Confidence scored\n+ OWASP mapped findings"| E(["PR Review\nComment"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each stage is sequential by design — the Adversary needs the Architect's typed map as context, and the Guardian needs both to arbitrate. The cost of sequencing is latency; the benefit is that no agent is flying blind.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent 1: The Architect — &lt;code&gt;gemini-3.1-flash-lite-preview&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The Architect runs first, on a &lt;strong&gt;low thinking budget&lt;/strong&gt; (&lt;code&gt;thinkingLevel: 'low'&lt;/code&gt;). Its job is cheap, fast, and structural — not deep reasoning.&lt;/p&gt;

&lt;p&gt;It reads the diff and produces a machine-readable map:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"endpoints"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"GET /api/orders/:id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PATCH /api/orders/:id/status"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"auth_middleware"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"isAuthenticated on /api/orders/:id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"MISSING ownership check"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rbac_mapping"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Admin can set any status; User role should only read own orders"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vulnerability_surface"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Route ID parameter flows directly into DB query with no userId assertion"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn't just for human readability — it becomes the &lt;strong&gt;structured intelligence report&lt;/strong&gt; fed into the next agent. You're not passing raw text; you're passing typed, reasoned context.&lt;/p&gt;

&lt;p&gt;Critically, the system prompt includes a &lt;strong&gt;prompt injection defence&lt;/strong&gt;: the diff is wrapped in &lt;code&gt;&amp;lt;source_diff_for_analysis&amp;gt;&lt;/code&gt; XML tags, and the model is explicitly instructed to treat everything inside those tags as untrusted raw data. If an attacker embeds &lt;code&gt;IGNORE ALL PREVIOUS INSTRUCTIONS&lt;/code&gt; in a code comment in the PR, the model is meant to analyse that text as data rather than obey it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent 2: The Adversary — &lt;code&gt;gemini-3-flash-preview&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The Adversary is a red teamer. It receives the Architect's map plus the full diff and its sole objective is to find exploitable paths and prove them.&lt;/p&gt;

&lt;p&gt;It runs on a &lt;strong&gt;medium thinking budget&lt;/strong&gt; and is instructed to find up to three distinct vulnerabilities, ordered by severity, and produce a step-by-step exploit walkthrough for each — actual HTTP requests and predicted responses included:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"attack_vector"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Parameter tampering — IDOR on order endpoint"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"exploit_steps"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Attacker authenticates as User A (ID: 101)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"request"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"POST /api/auth/login {&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;email&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;userA@test.com&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;password&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"expected_response"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"200 OK, JWT token issued"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Attacker requests User B's order by guessing sequential ID"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"request"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GET /api/orders/202 (with User A's JWT)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"expected_response"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"200 OK, full order details for User B returned"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"bypass_technique"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Controller queries DB by req.params.id only. No userId comparison against req.user.id."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"affected_endpoint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GET /api/orders/:id"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 3-finding cap is a deliberate resource exhaustion guard — not an arbitrary limit. An adversarial PR with hundreds of endpoints could otherwise drive up token costs and latency to the point of making the tool unusable in CI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent 3: The Guardian — &lt;code&gt;gemini-3-flash-preview&lt;/code&gt; (High Thinking Budget)
&lt;/h3&gt;

&lt;p&gt;This is where the false positive filtering happens. The Guardian uses the same &lt;code&gt;gemini-3-flash-preview&lt;/code&gt; model as the Adversary, but is given a &lt;strong&gt;high thinking budget&lt;/strong&gt; — the most expensive reasoning level in the pipeline. The distinction isn't about a different model; it's about giving the validation step the longest chain-of-thought to scrutinise the Adversary's work before anything reaches the output.&lt;/p&gt;

&lt;p&gt;It receives both the Architect's structural map and the Adversary's exploit report, then cross-examines the diff against a specific validation checklist:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Does any &lt;code&gt;app.use()&lt;/code&gt; global middleware apply to this route that would block the attack?&lt;/li&gt;
&lt;li&gt;Does the ORM/framework provide implicit ownership filtering (e.g., Prisma's &lt;code&gt;where: { userId }&lt;/code&gt; in a shared query)?&lt;/li&gt;
&lt;li&gt;Is the exploit logically consistent with the actual code — not just the route signature?&lt;/li&gt;
&lt;li&gt;Assign a &lt;strong&gt;confidence score&lt;/strong&gt; from 0–100.&lt;/li&gt;
&lt;li&gt;Map to the appropriate OWASP Top 10 category.&lt;/li&gt;
&lt;li&gt;Produce a concrete code fix.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The output is a &lt;code&gt;GuardianReport&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;GuardianReport&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;vulnerability&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// "IDOR on Order Endpoint"&lt;/span&gt;
  &lt;span class="nl"&gt;severity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LOW&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MEDIUM&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HIGH&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CRITICAL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;confidence_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// 0–100&lt;/span&gt;
  &lt;span class="nl"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;             &lt;span class="c1"&gt;// Why this is real, not noise&lt;/span&gt;
  &lt;span class="nl"&gt;false_positive_risk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// What could make this wrong&lt;/span&gt;
  &lt;span class="nl"&gt;owasp_category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// "A01:2021 – Broken Access Control"&lt;/span&gt;
  &lt;span class="nl"&gt;exploit_simulation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ExploitStep&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="nl"&gt;affected_endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;suggested_fix&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// Actual corrected code snippet&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any finding whose confidence score falls below the &lt;code&gt;MIN_CONFIDENCE&lt;/code&gt; threshold (configurable, defaulting to 40%) is suppressed before results are returned. On a clean diff, the pipeline returns &lt;strong&gt;no findings at all&lt;/strong&gt;.&lt;/p&gt;
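&lt;p&gt;The suppression step amounts to a simple filter. This is a sketch only: &lt;code&gt;ScoredFinding&lt;/code&gt; and &lt;code&gt;filterByConfidence&lt;/code&gt; are hypothetical names, and the default of 40 mirrors the &lt;code&gt;MIN_CONFIDENCE&lt;/code&gt; default mentioned above.&lt;/p&gt;

```typescript
// Only the fields the filter needs; the full GuardianReport carries more.
interface ScoredFinding { vulnerability: string; confidence_score: number; }

// Assumption: a finding exactly at the threshold is kept, since only
// findings below it are described as suppressed.
function filterByConfidence(findings: ScoredFinding[], minConfidence: number = 40): ScoredFinding[] {
  return findings.filter((f) => f.confidence_score >= minConfidence);
}
```

&lt;p&gt;An empty input, or one where every finding scores below the threshold, yields an empty array, which is the clean-diff case.&lt;/p&gt;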




&lt;h2&gt;
  
  
  The Tech: Vercel AI SDK + Robust JSON Extraction
&lt;/h2&gt;

&lt;p&gt;The stack is TypeScript with the &lt;strong&gt;Vercel AI SDK&lt;/strong&gt; (&lt;code&gt;ai&lt;/code&gt; and &lt;code&gt;@ai-sdk/google&lt;/code&gt;). The SDK's &lt;code&gt;generateText()&lt;/code&gt; abstraction handles the API surface cleanly, and the &lt;code&gt;providerOptions.google.thinkingConfig.thinkingLevel&lt;/code&gt; parameter controls the reasoning depth per-agent.&lt;/p&gt;

&lt;p&gt;The gnarliest engineering challenge wasn't the prompting; it was making JSON parsing bulletproof. LLMs are inconsistent about how they return structured data. Sometimes you get clean JSON. Sometimes you get it inside a &lt;code&gt;```json&lt;/code&gt; fence. Sometimes you get two paragraphs of explanation followed by the JSON buried in the middle.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sentinai-core&lt;/code&gt; uses a three-strategy fallback:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;extractJSON&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Strategy 1: Direct parse — ideal path&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

  &lt;span class="c1"&gt;// Strategy 2: Strip a single outermost ```json ... ``` fence&lt;/span&gt;
  const fenceMatch = raw.trim().match(/^```(?:json)?\s*\n?([\s\S]*?)\n?```\s*$/i);
  if (fenceMatch) {
    try { JSON.parse(fenceMatch[1].trim()); return fenceMatch[1].trim(); } catch {}
  }

  &lt;span class="c1"&gt;// Strategy 3: Walk the string to find the first balanced { } or [ ] block&lt;/span&gt;
  const startIdx = raw.trim().search(/[{[]/);
  if (startIdx !== -1) {
    &lt;span class="c1"&gt;// ... depth counter to find matching close bracket ...&lt;/span&gt;
  }

  return 'null'; &lt;span class="c1"&gt;// Caller handles graceful degradation&lt;/span&gt;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
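&lt;p&gt;The depth counter for Strategy 3 is elided in the snippet above. One way it could look is sketched below; &lt;code&gt;findBalancedBlock&lt;/code&gt; is a hypothetical helper written for this post, not the code that ships in the package. It skips brackets inside JSON string literals so they do not disturb the depth count.&lt;/p&gt;

```typescript
// Returns the first balanced {...} or [...] block in `text`, or null if the
// opening bracket never closes. Bracket kinds are not matched against each
// other, so the caller should still run JSON.parse on the result.
function findBalancedBlock(text: string): string | null {
  const start = text.search(/[{[]/);
  if (start === -1) return null;

  let depth = 0;
  let inString = false; // inside a "..." literal
  let escaped = false;  // previous character was a backslash inside a string

  for (let i = start; i !== text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{' || ch === '[') depth++;
    else if (ch === '}' || ch === ']') {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null; // ran out of input before the block closed
}
```

&lt;p&gt;Because it returns the first balanced block, prose such as "[note]" ahead of the real payload would be picked up first. Validating the candidate with &lt;code&gt;JSON.parse&lt;/code&gt; before returning it covers that case.&lt;/p&gt;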



&lt;p&gt;If all three strategies fail, each agent has its own graceful fallback — either a safe empty result or a low-confidence rejection — so a single bad LLM response never crashes the pipeline.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deploying in a Real CI Pipeline
&lt;/h2&gt;

&lt;p&gt;The library is built to slot into a GitHub App webhook handler. When a PR is opened or synchronized, you fetch the diff from the GitHub API, pass it through &lt;code&gt;runOrchestrator&lt;/code&gt;, and post the results as a PR review comment with severity badges.&lt;/p&gt;
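&lt;p&gt;The last step, turning findings into a review comment, might look like the sketch below. The field names come from the &lt;code&gt;GuardianReport&lt;/code&gt; interface earlier in this post; the &lt;code&gt;formatReviewComment&lt;/code&gt; function, the Markdown layout, and the plain-text severity badge are assumptions for illustration, not the production platform's format.&lt;/p&gt;

```typescript
// Subset of the GuardianReport fields that the comment uses.
interface ReviewFinding {
  vulnerability: string;
  severity: 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL';
  confidence_score: number;
  owasp_category: string;
  affected_endpoint: string;
  suggested_fix: string;
}

// Builds a Markdown body for a PR review comment (hypothetical layout).
function formatReviewComment(findings: ReviewFinding[]): string {
  if (findings.length === 0) return 'No confirmed vulnerabilities found.';
  return findings
    .map((f) => [
      `### [${f.severity}] ${f.vulnerability}`,
      `- Endpoint: ${f.affected_endpoint}`,
      `- Confidence: ${f.confidence_score}%`,
      `- OWASP: ${f.owasp_category}`,
      `- Suggested fix: ${f.suggested_fix}`,
    ].join('\n'))
    .join('\n\n');
}
```

&lt;p&gt;Keeping the formatter a pure function of the findings array makes it easy to unit test without touching the GitHub API.&lt;/p&gt;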

&lt;p&gt;The production SentinAI platform runs on &lt;strong&gt;Cloud Run&lt;/strong&gt; (for cost-effective auto-scaling to zero) with &lt;strong&gt;Vertex AI&lt;/strong&gt; as the model backend, which gives you enterprise data residency guarantees. The &lt;code&gt;getModel()&lt;/code&gt; function in the core automatically switches between Google AI Studio and Vertex AI based on environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Development: GEMINI_API_KEY → Google AI Studio&lt;/span&gt;
&lt;span class="c1"&gt;// Production:  USE_VERTEX=true → Vertex AI (Cloud Run service account auth)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
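&lt;p&gt;The switch described by those two comments can be sketched as a small pure function. The environment variable names are the ones shown above; &lt;code&gt;resolveBackend&lt;/code&gt;, the precedence of &lt;code&gt;USE_VERTEX&lt;/code&gt; over the API key, and the error on missing configuration are assumptions, and the real &lt;code&gt;getModel()&lt;/code&gt; may behave differently.&lt;/p&gt;

```typescript
type Backend = 'vertex' | 'ai-studio';
interface EnvLike { [key: string]: string | undefined; }

// Assumption: USE_VERTEX=true takes priority; otherwise a non-empty
// GEMINI_API_KEY selects Google AI Studio. Hypothetical helper.
function resolveBackend(env: EnvLike): Backend {
  if (env['USE_VERTEX'] === 'true') return 'vertex';
  if (env['GEMINI_API_KEY']) return 'ai-studio';
  throw new Error('Set USE_VERTEX=true or provide GEMINI_API_KEY.');
}
```

&lt;p&gt;In a Node.js service this would be called as &lt;code&gt;resolveBackend(process.env)&lt;/code&gt; once at startup.&lt;/p&gt;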



&lt;p&gt;The architecture handles multi-tenancy cleanly — each GitHub App installation gets its own analysis scope, and Supabase RLS enforces tenant isolation at the database level.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;sentinai-core
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;runOrchestrator&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sentinai-core&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`diff --git a/src/routes/orders.ts ...`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// your raw PR diff&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;findings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runOrchestrator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;diff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;findings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;✅ No confirmed vulnerabilities found.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;findings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;severity&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;] &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vulnerability&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; — Confidence: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence_score&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;%`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`OWASP: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;owasp_category&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Fix: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;suggested_fix&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set your &lt;code&gt;GEMINI_API_KEY&lt;/code&gt; environment variable and you're running a three-agent security pipeline in minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;On execution time:&lt;/strong&gt; the full pipeline runs sequentially, so wall-clock time is the sum of three LLM round trips. On a standard PR diff (a few hundred lines), expect roughly &lt;strong&gt;15–30 seconds&lt;/strong&gt; end-to-end. For a large diff near the 80,000-character cap with multiple findings for the Guardian to validate independently, budget up to &lt;strong&gt;45–60 seconds&lt;/strong&gt;. That's acceptable for a post-push CI check; it would be too slow for a pre-commit hook.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The current pipeline is sequential — Architect → Adversary → Guardian. The next architectural evolution is parallel Adversary runs: spin up three independent red-team agents simultaneously, each primed with a different attack category (access control, injection, business logic), and let the Guardian arbitrate across all findings.&lt;/p&gt;

&lt;p&gt;I'm also working on SARIF output support so findings can be ingested directly by GitHub's Security tab as code scanning alerts — no custom UI required.&lt;/p&gt;

&lt;p&gt;The full source, test suite, and a demo diff are available at &lt;strong&gt;&lt;a href="https://github.com/itxDeeni/SentinAI-Core" rel="noopener noreferrer"&gt;github.com/itxDeeni/SentinAI-Core&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;⭐ &lt;strong&gt;Star the repo&lt;/strong&gt; to track progress on SARIF output and parallel Adversary runs. Try it on your next PR, and if you want to contribute patterns to the vulnerability database — the threat model evolves faster than any one team can keep up with — open an issue. The more patterns in the library, the sharper the Architect's initial surface map gets.&lt;/p&gt;

</description>
      <category>security</category>
      <category>typescript</category>
      <category>ai</category>
      <category>devops</category>
    </item>
    <item>
      <title>My Goals for HNG I8</title>
      <dc:creator>Deen Muaz</dc:creator>
      <pubDate>Tue, 17 Aug 2021 00:47:16 +0000</pubDate>
      <link>https://forem.com/itxdeeni/my-goals-for-hng-i8-1643</link>
      <guid>https://forem.com/itxdeeni/my-goals-for-hng-i8-1643</guid>
      <description>&lt;p&gt;Hi There,&lt;/p&gt;

&lt;p&gt;How are you doing?&lt;/p&gt;

&lt;p&gt;As part of my goal this year to build up my skills as a software engineer, I enrolled in an annual, intensive 8-week internship organised by Hotels.ng, a Nigerian tech company owned by Mark Essien.&lt;/p&gt;

&lt;p&gt;MY GOALS FOR THE 8-WEEK HNG INTERNSHIP PROGRAMME&lt;/p&gt;

&lt;p&gt;Build up my skillset by working on more projects outside my job&lt;br&gt;
Improve on my problem-solving skills, efficiency and speed.&lt;br&gt;
Collaborate and grow team spirit.&lt;/p&gt;

&lt;p&gt;Learn Figma Today&lt;br&gt;
&lt;a href="https://www.youtube.com/watch?v=g6rQFP9zCAM"&gt;https://www.youtube.com/watch?v=g6rQFP9zCAM&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Learn Git Today!&lt;br&gt;
&lt;a href="https://www.youtube.com/watch?v=8JJ101D3knE"&gt;https://www.youtube.com/watch?v=8JJ101D3knE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Learn HTML Today!&lt;br&gt;
&lt;a href="https://www.youtube.com/watch?v=qz0aGYrrlhU"&gt;https://www.youtube.com/watch?v=qz0aGYrrlhU&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Learn Nodejs Today!&lt;br&gt;
&lt;a href="https://www.youtube.com/watch?v=TlB_eWDSMt4"&gt;https://www.youtube.com/watch?v=TlB_eWDSMt4&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Adding Bootstrap to an existing Angular Project using angular-cli</title>
      <dc:creator>Deen Muaz</dc:creator>
      <pubDate>Tue, 17 Aug 2021 00:19:47 +0000</pubDate>
      <link>https://forem.com/itxdeeni/adding-bootstrap-to-an-exisiting-angular-project-using-angular-cli-1ge3</link>
      <guid>https://forem.com/itxdeeni/adding-bootstrap-to-an-exisiting-angular-project-using-angular-cli-1ge3</guid>
      <description>&lt;p&gt;Hello there,&lt;/p&gt;

&lt;p&gt;I assume you have an Angular project and want to find out how to install Bootstrap and add it to your project.&lt;/p&gt;

&lt;p&gt;If so, let's get right into it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1 - Installing the Angular CLI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;First, install the Angular CLI, if you haven't already, by running the following command:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;npm install -g @angular/cli&lt;/code&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This installs the Angular CLI globally on your system.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Step 2 - Starting a new Angular project&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you already have a project, skip this step. Otherwise, start a new Angular project by typing:&lt;br&gt;
&lt;code&gt;ng new &amp;lt;ProjectName&amp;gt;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3 - Installing Bootstrap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In your project's root directory, install Bootstrap by typing:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;npm install bootstrap&lt;/code&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You can pin a version by appending it, e.g. &lt;code&gt;npm install bootstrap@5.0&lt;/code&gt;; without one, npm installs the latest version.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Step 4 - Importing Bootstrap into the project&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For the final step, import Bootstrap into the project by adding an import statement to its global stylesheet.&lt;/p&gt;

&lt;p&gt;Open the src/styles.css file of your Angular project and import the bootstrap.css file as follows:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;@import "~bootstrap/dist/css/bootstrap.css";&lt;/code&gt;&lt;/p&gt;

</description>
      <category>angular</category>
      <category>javascript</category>
      <category>css</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
