Forem: Muhammad ALhilali

Automating AI Red Teaming: From Manual Prompts to Fuzzing Pipelines 🧪

Muhammad ALhilali — Thu, 05 Feb 2026 14:19:49 +0000

Manual red teaming is dead.

If you are still copy-pasting "DAN" prompts into ChatGPT to test your agent's security, you have already lost.

The speed of AI development means new vulnerabilities emerge daily. You patch one prompt injection, and tomorrow a new "jailbreak" variant bypasses your filters.

The Problem: Static Defense vs. Dynamic Offense

Most security tools (WAFs, static analysis) look for known signatures. But LLM attacks are semantic. They depend on context.

To secure an agent, you need to think like an attacker who never sleeps. You need Continuous Automated Red Teaming.

Building a Fuzzing Pipeline

We need to move from "testing" to "fuzzing".

Generate Payloads: Use an adversarial LLM to generate thousands of attack variations.
Inject: Feed these into your target agent automatically.
Evaluate: Check if the agent performed the forbidden action (e.g., executing code, revealing PII).

Enter ExaAiAgent

I built ExaAiAgent to wrap this entire workflow into a single CLI.

It doesn't just run a list of bad words. It uses an "Attacker LLM" to mutate prompts dynamically until it finds a crack in your defenses.

# example-scan.yaml
target: "http://my-agent-api/v1/chat"
attacks:
  - "prompt-injection"
  - "pii-leakage"
  - "rce-attempt"
fuzzing_depth: 50

Security as Code

Your AI security policy shouldn't be a PDF document. It should be a CI/CD step that fails the build if your agent is vulnerable.

Stop guessing. Start fuzzing.

Check out the repo: github.com/hleliofficiel/ExaAiAgent

Why Your AI Agent is a Security Nightmare (And How to Fix It) 🛡️

Muhammad ALhilali — Thu, 05 Feb 2026 14:09:16 +0000

Everyone is building AI Agents.
But almost nobody is securing them.

We give them long-term memory, access to APIs, and permission to execute code. Then we act surprised when a simple Prompt Injection tricks them into leaking keys or running malicious commands.

This isn't just a "bug"—it's an architectural vulnerability.

The Agentic Web Needs an Immune System

Standard security tools (WAFs, static scanners) don't work here. They don't understand context. They can't see that a user asking to "ignore previous instructions" is an attack.

That's why I built ExaAiAgent.

It's not just a scanner. It's a real-time security layer for AI agents.

What's New in v2.1.2? 🚀

We just shipped a massive update:

Real-time Injection Detection: catches attacks before the LLM processes them.
Payload Fuzzing: tests your agent against thousands of known jailbreaks.
Reporting: gives you a clear view of your agent's risk posture.

Don't Wait for the Hack

If you're deploying agents to production, you need to test them.

👉 Star the repo & try it out: github.com/hleliofficiel/ExaAiAgent

Let's build a secure Agentic Web, together. 🦞

AI #CyberSecurity #OpenSource #DevSecOps

🚀 ExaAiAgent v2.1.2 is OFFICIALLY LIVE!

Muhammad ALhilali — Tue, 03 Feb 2026 14:19:19 +0000

🚀 ExaAiAgent v2.1.2 is OFFICIALLY LIVE!
The ultimate AI-Powered Security Agent just got a massive upgrade:
We don't just find bugs; we secure the future of AI agents.
Star the repo & join the revolution:🔗
#AI #OpenSource @GithubProjects

Stop trusting LLMs: I built an Open Source Prompt Injection Scanner 🤖🛡️

Muhammad ALhilali — Tue, 03 Feb 2026 10:33:07 +0000

We are rushing to integrate LLMs into everything. But we are forgetting one thing: LLMs are gullible.

If you connect an LLM to your database or internal APIs, a simple prompt injection can leak your data or delete your production DB.

So I built a tool to fix this.

Meet ExaAiAgent v2.1 🛡️

I just released a major update to ExaAiAgent, my open-source AI pentesting framework.

It now includes a dedicated AI Prompt Injection Scanner that tests for:

💉 Direct Injection: Overriding system instructions.
🔓 Jailbreaks: DAN, Developer Mode, Roleplay attacks.
📝 Data Extraction: Leaking system prompts and configuration.
🏃 Exfiltration: Sending data to external servers via markdown/URLs.

How it works (Python)

The scanner uses a library of 50+ payloads to probe your LLM application.


python
from exaaiagnt.tools.prompt_injection import PromptInjectionScanner

# Define your target
def chat_with_my_app(prompt):
    return client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )

# Scan it
scanner = PromptInjectionScanner()
results = scanner.scan(chat_with_my_app)

print(f"Vulnerabilities found: {results['vulnerabilities_found']}")

**
New: Kubernetes Security Scanner ⚓**

Because AI apps run on the cloud, I also added a K8s scanner to check for:

• Risky RBAC permissions (wildcard verbs)
• Privileged containers
• Missing Network Policies
Try it out

It's 100% open source. I'd love your feedback!

👉 [GitHub Repo](https://github.com/hleliofficiel/ExaAiAgent)

Let me know if you find any vulnerabilities in your own apps! 😈