<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Grumpy Sage</title>
    <description>The latest articles on Forem by Grumpy Sage (@grumpysage).</description>
    <link>https://forem.com/grumpysage</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3924029%2F833bdb60-b1d0-4613-89f9-725cf91ae3f3.png</url>
      <title>Forem: Grumpy Sage</title>
      <link>https://forem.com/grumpysage</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/grumpysage"/>
    <language>en</language>
    <item>
      <title>Why Every CISO Needs an AIBOM in 2026 — And What Vendors Miss</title>
      <dc:creator>Grumpy Sage</dc:creator>
      <pubDate>Mon, 11 May 2026 15:56:58 +0000</pubDate>
      <link>https://forem.com/grumpysage/why-every-ciso-needs-an-aibom-in-2026-and-what-vendors-miss-1hmp</link>
      <guid>https://forem.com/grumpysage/why-every-ciso-needs-an-aibom-in-2026-and-what-vendors-miss-1hmp</guid>
      <description>&lt;p&gt;A friend of mine runs security at a mid-market fintech. Last month she got asked a question by her board that should have been trivial: "How many AI models are in production at our company, and where did they come from?"&lt;/p&gt;

&lt;p&gt;She had a vendor-provided AIBOM. A real one. Generated by a well-known platform you've heard of. She pulled it up on the projector during the board meeting.&lt;/p&gt;

&lt;p&gt;The AIBOM listed 14 models. She knew there were more.&lt;/p&gt;

&lt;p&gt;After the meeting she spent two days with her platform team running their own inventory. The real number was 47. Some were embedded in SaaS tools her business teams had bought without telling her. Some were running locally on engineering workstations — llama.cpp instances developers had spun up to avoid the OpenAI rate limits. Two were fine-tuned variants of Llama 3 that a data science team had deployed inside a Kubernetes namespace nobody was scanning. One was a vLLM server somebody stood up on a GPU node six months ago and forgot about.&lt;/p&gt;

&lt;p&gt;The vendor AIBOM had captured the API-based stuff. Anthropic. OpenAI. Bedrock. Easy targets. Everything that left a billing trail.&lt;/p&gt;

&lt;p&gt;What it missed was the actual AI surface area. The part that sits inside her perimeter, runs on her hardware, processes her data, and has no rate limit or vendor SOC 2 to fall back on. The part that, if compromised, doesn't ring an alarm at a third party.&lt;/p&gt;

&lt;p&gt;This is the AIBOM problem in 2026. The artifact exists. The compliance checkbox gets ticked. And the inventory is still wrong.&lt;/p&gt;

&lt;h2&gt;The thesis&lt;/h2&gt;

&lt;p&gt;An AIBOM is not an SBOM with a "model" row added. It's a fundamentally different artifact because AI systems have a fundamentally different supply chain — one that includes weights, prompts, embeddings, retrieval indexes, fine-tuning datasets, inference runtimes, and the agent scaffolding that ties them together. If your AIBOM doesn't capture all of those, what you have is a marketing document. And most of what's being shipped right now is exactly that.&lt;/p&gt;

&lt;h2&gt;What an AIBOM actually has to contain&lt;/h2&gt;

&lt;p&gt;Let me be specific, because the vendor space has gotten lazy about this.&lt;/p&gt;

&lt;p&gt;A real AIBOM tracks the model itself — name, version, weights hash, license, provenance. That's the easy part. The part everyone gets right.&lt;/p&gt;

&lt;p&gt;Then it has to track the inference runtime. This is where the wheels start coming off. Are you running Ollama? vLLM? TGI? LocalAI? Triton? LM Studio? llama.cpp? Each of those has its own CVEs, its own auth model, its own default configurations, and its own attack surface. A Llama 3 8B running on vLLM behind proper auth is a different risk than the same weights running on a default Ollama install with the API exposed on 0.0.0.0. The AIBOM has to know the difference.&lt;/p&gt;

&lt;p&gt;Then the data lineage. What did the model get trained on? What does it get fine-tuned on? What sits in the retrieval index it's pulling from at inference time? An AIBOM that doesn't capture the RAG corpus is missing maybe 40% of the actual attack surface, because that's where prompt injection lives now. The model is fine. The PDFs your sales team uploaded last Tuesday are the threat.&lt;/p&gt;

&lt;p&gt;Then the prompt layer. System prompts, tool definitions, agent loops, MCP server bindings. If your model has access to ten tools through an MCP server, those ten tools are part of the bill of materials. If one of them is a "send_email" tool with no human approval gate, that's a fact your AIBOM should be screaming about. Not buried in an appendix.&lt;/p&gt;

&lt;p&gt;Finally, the runtime context. What network does this thing live on? What service account does it run under? What does it have IAM access to? You cannot reason about AI risk without that context, because the same model is a different risk profile depending on whether it can read your S3 buckets.&lt;/p&gt;

&lt;p&gt;If you accept that list, you've already disqualified maybe 80% of the AIBOM tooling on the market. Most of it stops at "model name + version + license."&lt;/p&gt;
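&lt;p&gt;To make that list concrete, here is a minimal sketch of what one entry could look like as a data structure. Everything in it (the &lt;code&gt;AIBOMEntry&lt;/code&gt; class, the field names, the example values) is hypothetical and not any published standard's schema; the point is that all five layers live in one record, and the red flags fall out of it mechanically:&lt;/p&gt;

```python
from dataclasses import dataclass, field

# Illustrative sketch only: class, field names, and values are hypothetical,
# not taken from any published AIBOM standard.

@dataclass
class AIBOMEntry:
    # Layer 1: the model itself (the easy part)
    model_name: str
    model_version: str
    weights_sha256: str
    license: str
    # Layer 2: the inference runtime and its configuration
    runtime: str              # e.g. "vllm", "ollama", "llama.cpp"
    runtime_version: str
    auth_enabled: bool
    listen_addr: str          # "0.0.0.0" is a different risk than "127.0.0.1"
    # Layer 3: data lineage, including the RAG corpus
    finetune_datasets: list = field(default_factory=list)
    rag_corpora: list = field(default_factory=list)
    # Layer 4: the prompt and tool layer
    system_prompt_hash: str = ""
    tools: list = field(default_factory=list)   # MCP tool bindings, etc.
    # Layer 5: runtime context
    service_account: str = ""
    iam_scopes: list = field(default_factory=list)

    def risk_flags(self) -> list:
        """Derive the obvious red flags the AIBOM should surface, not bury."""
        flags = []
        if not self.auth_enabled and self.listen_addr == "0.0.0.0":
            flags.append("unauthenticated API exposed on all interfaces")
        if "send_email" in self.tools:
            flags.append("ungated outbound-email tool attached")
        return flags

entry = AIBOMEntry(
    model_name="Llama-3-8B-Instruct", model_version="3.0",
    weights_sha256="...", license="llama3",
    runtime="ollama", runtime_version="0.5.7",
    auth_enabled=False, listen_addr="0.0.0.0",
    tools=["send_email", "search_docs"],
)
print(entry.risk_flags())
```

&lt;p&gt;On that example the sketch flags both the exposed unauthenticated API and the ungated email tool, which is exactly the kind of fact that belongs at the top of the document rather than in an appendix.&lt;/p&gt;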

&lt;h2&gt;Where vendors go wrong, specifically&lt;/h2&gt;

&lt;p&gt;I want to name patterns, not vendors, because the patterns will outlive the vendors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern one: the SBOM-with-extra-columns approach.&lt;/strong&gt; Some vendor took their existing software composition analysis tool and added a "model" detection rule. They find references to &lt;code&gt;openai&lt;/code&gt; in your package.json and call that an AIBOM entry. This catches nothing self-hosted, nothing embedded in vendor SaaS, and nothing running outside the codebase you happen to be scanning. It's a checkbox.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern two: the API-trail approach.&lt;/strong&gt; Vendor watches your egress traffic or your cloud billing and infers AI usage. Better than nothing — catches shadow Anthropic accounts. But useless for anything inside the perimeter. A vLLM server on your internal GPU cluster generates zero egress traffic. It also generates zero AIBOM entries in this model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern three: the survey approach.&lt;/strong&gt; Vendor sends a questionnaire to your dev teams. "List all AI systems in production." This is governance theater. The teams that fill it out conscientiously are not the teams you're worried about.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern four: the model-registry approach.&lt;/strong&gt; Vendor integrates with MLflow or SageMaker Model Registry and treats that as ground truth. Great if your entire organization uses one model registry. Nobody's entire organization uses one model registry. The shadow Ollama instance isn't in MLflow.&lt;/p&gt;

&lt;p&gt;What all four of these share is that they're trying to generate an AIBOM from one perspective — the codebase, the network, the people, or the registry. AI systems live across all of those. You need detection that lives across all of those too.&lt;/p&gt;

&lt;h2&gt;The detection problem is a code problem first&lt;/h2&gt;

&lt;p&gt;Here's an opinionated take. The single highest-leverage place to build AI inventory is the codebase itself. Not because that's where everything lives, but because that's where most of the self-hosted, embedded, and shadow stuff originates. Somebody, somewhere, wrote an import statement.&lt;/p&gt;

&lt;p&gt;This is what cyscan does in our platform. We've got 1,815 detection rules across 75+ languages, and a meaningful chunk of those are AI-specific patterns — runtime imports, model loading calls, agent framework usage, embedding library references, MCP client instantiations. If a developer imported &lt;code&gt;vllm&lt;/code&gt; or instantiated an &lt;code&gt;Ollama&lt;/code&gt; client or wired up a LangChain agent with a tool list, we want to know.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cyscan ai-inventory &lt;span class="nt"&gt;--repo&lt;/span&gt; ./monorepo &lt;span class="nt"&gt;--output&lt;/span&gt; aibom.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output isn't a list of models. It's a graph. Here's a service that loads Llama-3-8B-Instruct, runs it on vLLM, exposes it on port 8000, and is called by these three other services, one of which has an MCP server attached with these four tools. That's an AIBOM entry that you can actually reason about.&lt;/p&gt;
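&lt;p&gt;A hypothetical shape for that graph (node and edge fields invented for illustration, not cyscan's actual output format) shows why a graph answers questions a flat model list can't:&lt;/p&gt;

```python
# Hypothetical graph shape: node/edge field names are invented for
# illustration and are not cyscan's actual output format.
aibom_graph = {
    "nodes": [
        {"id": "svc-api", "kind": "service"},
        {"id": "llama3-8b", "kind": "model", "runtime": "vllm", "port": 8000},
        {"id": "mcp-1", "kind": "mcp_server",
         "tools": ["scan_repo", "fingerprint", "fuzz", "query_rules"]},
    ],
    "edges": [
        {"src": "svc-api", "dst": "llama3-8b", "rel": "calls"},
        {"src": "svc-api", "dst": "mcp-1", "rel": "binds"},
    ],
}

def reachable_capabilities(graph, service_id):
    """Walk one hop out from a service and collect what it can touch."""
    caps = []
    for edge in graph["edges"]:
        if edge["src"] == service_id:
            node = next(n for n in graph["nodes"] if n["id"] == edge["dst"])
            if node["kind"] == "mcp_server":
                caps.extend(node["tools"])   # an MCP binding is 4 capabilities here
            else:
                caps.append(node["id"])
    return caps

print(reachable_capabilities(aibom_graph, "svc-api"))
```

&lt;p&gt;"Which services can reach a tool-bearing MCP server" is a one-hop walk on the graph. On a flat list of models, the question can't even be asked.&lt;/p&gt;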

&lt;p&gt;But code scanning alone isn't enough — that's the lesson I keep watching CISOs learn the expensive way. Code tells you what should exist. It doesn't tell you what's actually running on the GPU node nobody documented.&lt;/p&gt;

&lt;h2&gt;The runtime side: scanning what's actually live&lt;/h2&gt;

&lt;p&gt;This is where cyradar comes in, and where the architectural choice we made matters. cyradar specifically targets the self-hosted inference layer — Ollama, vLLM, TGI, LocalAI, Triton, LM Studio, llama.cpp. We picked those seven because they cover almost everything self-hosted in 2026. If you've got a GPU running an LLM, it's almost certainly one of those.&lt;/p&gt;

&lt;p&gt;The point isn't just to find them. The point is to fingerprint them. What model is loaded? What version of the runtime? Is the auth configured? Is the API exposed on the management network or the data network? What's the context window, the max tokens, the system prompt baked in at startup?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cyradar discover &lt;span class="nt"&gt;--cidr&lt;/span&gt; 10.0.0.0/8 &lt;span class="nt"&gt;--runtimes&lt;/span&gt; all
cyradar fingerprint &lt;span class="nt"&gt;--target&lt;/span&gt; 10.4.12.88:11434
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That second command tells you not just "there's an Ollama at this IP" but "there's an Ollama 0.5.7 with &lt;code&gt;llama3:70b&lt;/code&gt; and &lt;code&gt;nomic-embed-text&lt;/code&gt; loaded, the API is open, no auth, last queried 14 minutes ago." That's an AIBOM entry the code scanner can't produce because the code that spun this up may not exist in any repo you scan. Someone ran &lt;code&gt;ollama pull&lt;/code&gt; on a server.&lt;/p&gt;

&lt;p&gt;Combine the code-side inventory with the runtime-side inventory, reconcile them, and now you have something that looks like a real AIBOM. The reconciliation is the hard part. The code says service X should be talking to Ollama. The runtime says Ollama is running on host Y. Are those the same instance? You need topology.&lt;/p&gt;
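&lt;p&gt;The reconciliation itself can be sketched as a join over the two inventories. The data and field names below are invented; the three buckets are the interesting part, because the "shadow" bucket is where the 47-versus-14 gap lives:&lt;/p&gt;

```python
# Sketch of the reconciliation step: join code-declared endpoints against
# runtime-discovered instances. All data and field names are invented.

code_inventory = [
    {"service": "order-svc", "declared_endpoint": "10.4.12.88:11434"},
    {"service": "search-svc", "declared_endpoint": "10.4.12.90:8000"},
]
runtime_inventory = [
    {"endpoint": "10.4.12.88:11434", "runtime": "ollama", "model": "llama3:70b"},
    {"endpoint": "10.9.3.17:11434", "runtime": "ollama", "model": "mistral"},
]

declared = {c["declared_endpoint"]: c["service"] for c in code_inventory}
observed = {r["endpoint"]: r for r in runtime_inventory}

matched = [e for e in observed if e in declared]      # code and runtime agree
shadow = [e for e in observed if e not in declared]   # running, no owner in code
phantom = [e for e in declared if e not in observed]  # in code, nothing live

print("matched:", matched)
print("shadow (investigate):", shadow)
print("phantom (stale, or a host you didn't scan):", phantom)
```

&lt;p&gt;Note that the phantom bucket is ambiguous on its own: it means either dead code or a live instance on a network segment the runtime scan never reached. That ambiguity is why the topology matters.&lt;/p&gt;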

&lt;h2&gt;The agent and tool layer&lt;/h2&gt;

&lt;p&gt;I said earlier that tools are part of the bill of materials. I want to push on that, because it's where I see the most magical thinking in current AIBOM standards.&lt;/p&gt;

&lt;p&gt;In 2024 you had models. In 2025 you had models with tools. In 2026 you have agents with toolchains that span MCP servers, traditional APIs, and other agents. The "thing" you're inventorying isn't really a model anymore. It's a capability graph.&lt;/p&gt;

&lt;p&gt;Our own MCP server exposes 10 tools. Each one represents a capability — scan a repo, fingerprint a runtime, pull a fuzz template, query the rule database. Any agent that connects to our MCP server inherits those 10 capabilities. If your AIBOM lists "Claude" as one entry, you've underspecified the system by an order of magnitude. The relevant entry is "Claude + these MCP servers + these tool permissions + this system prompt + this RAG corpus."&lt;/p&gt;

&lt;p&gt;That's a mouthful. It's also reality. Any AIBOM standard that can't express that — and most of the current ones can't, cleanly — is going to be obsolete within a year.&lt;/p&gt;
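&lt;p&gt;A toy example of the difference, with an invented deployment and invented tool names: expand the single "Claude" row into the capability set the agent actually inherits from its MCP bindings:&lt;/p&gt;

```python
# Sketch: why "Claude" as a single AIBOM row underspecifies the system.
# The deployment below is hypothetical; server and tool names are invented.

deployment = {
    "model": "claude",
    "system_prompt_id": "support-bot-v7",
    "rag_corpus": "zendesk-tickets-2025",
    "mcp_servers": {
        "cybrium": ["scan_repo", "fingerprint_runtime", "pull_fuzz_template"],
        "internal-ops": ["send_email", "create_jira_ticket"],
    },
}

def capability_set(dep):
    """The inventory unit is the full capability set, not the model name."""
    caps = set()
    for server, tools in dep["mcp_servers"].items():
        for tool in tools:
            caps.add(f"{server}/{tool}")
    return caps

# A naive AIBOM records one entry ("claude"). The real surface is five
# capabilities plus a prompt and a corpus, each changing independently.
print(sorted(capability_set(deployment)))
```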

&lt;h2&gt;Web-facing AI surface, which everyone forgets&lt;/h2&gt;

&lt;p&gt;The other gap I see constantly: AI in the web tier. Chatbots embedded in marketing sites. AI search bars. Internal admin tools with an LLM assistant bolted on. Customer support widgets backed by some RAG pipeline somebody set up in a hurry.&lt;/p&gt;

&lt;p&gt;These rarely show up in model registries. They rarely show up in code scans of the main monorepo because they live in their own little frontend repo. They almost never show up in network discovery because they call out to a vendor, not in.&lt;/p&gt;

&lt;p&gt;cyweb's 22 fuzz categories include LLM-specific ones — prompt injection across the wire, jailbreak attempts via input fields, system prompt extraction, tool invocation abuse. When we scan a web property, we're not just looking for SQLi anymore. We're testing whether the friendly chatbot in the bottom corner can be talked into revealing the system prompt or executing tool calls it shouldn't. If it can, that goes into the AIBOM as a finding, attached to the model and runtime entry for that chatbot.&lt;/p&gt;

&lt;p&gt;Our 95% conversion rate for upstream community templates matters here because that community moves fast: new prompt injection payloads land daily, and the gap between "known technique" and "we can test for it" needs to stay small. An AIBOM that catalogs your AI systems but can't test them is a museum exhibit.&lt;/p&gt;

&lt;h2&gt;Why one platform&lt;/h2&gt;

&lt;p&gt;I keep getting asked why we built all of this — cyscan, cyradar, cyweb, the MCP server — instead of just picking one and going deep. The answer is exactly the AIBOM problem we've been talking about.&lt;/p&gt;

&lt;p&gt;You cannot generate a real AI bill of materials from one vantage point. Code-only misses runtime. Runtime-only misses provenance. Network-only misses the SaaS-embedded stuff. Survey-only misses everything anyone forgot. To get an inventory that's actually correct, you have to triangulate from at least three of those.&lt;/p&gt;

&lt;p&gt;If those three tools are bought from three vendors with three data models, the reconciliation happens in a spreadsheet maintained by an exhausted security engineer. I've watched this fail in real organizations. The spreadsheet drifts. The board gets the wrong number.&lt;/p&gt;

&lt;p&gt;When the inventory comes from one platform with one data model, reconciliation is a join, not a meeting. That's the architectural choice. It's not about wanting to sell more SKUs. It's that the AIBOM problem is fundamentally a correlation problem, and correlation across vendor boundaries doesn't work.&lt;/p&gt;

&lt;h2&gt;The recomposition&lt;/h2&gt;

&lt;p&gt;Here's what I think is actually happening, beyond the AIBOM specifically.&lt;/p&gt;

&lt;p&gt;The security industry spent twenty years building tools for a world where software was deployed by humans, ran in known places, and changed on quarterly release cycles. Every tool category — SAST, DAST, SCA, EDR, CSPM — assumes that model.&lt;/p&gt;

&lt;p&gt;AI broke the model. Software is now partly deployed by agents, runs in places nobody documented, and changes when a developer types &lt;code&gt;ollama pull&lt;/code&gt;. The asset isn't a server or a service anymore. It's a capability graph that includes weights, prompts, tools, data, and runtime. The discovery problem isn't "what hosts do I have" but "what can my systems do, and who taught them to do it."&lt;/p&gt;

&lt;p&gt;The AIBOM is the first artifact that tries to express this. The current versions of it are bad because the standards bodies are still thinking in SBOM terms. The good versions, the ones that will actually matter when regulators start asking for them — and they will, by end of 2026 in at least three jurisdictions I'm tracking — those are going to look like capability graphs, not parts lists.&lt;/p&gt;

&lt;p&gt;The vendors who get this right are the ones rebuilding their data model around the AI supply chain rather than retrofitting their old one. Everyone else is going to spend 2027 explaining to their customers why the AIBOM they shipped missed half the surface area.&lt;/p&gt;

&lt;h2&gt;What to do Monday&lt;/h2&gt;

&lt;p&gt;If you're a CISO reading this and your current AIBOM came from a vendor demo, do one experiment. Run your own inventory — survey the engineering teams, scan the internal network for the seven self-hosted runtimes, grep the monorepo for AI imports. Compare your number to the vendor's number.&lt;/p&gt;
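&lt;p&gt;For the "grep the monorepo" step, even a toy scanner gets you a first number to compare. This sketch only looks at Python files and a starter list of packages, so treat it as a lower bound, not an inventory:&lt;/p&gt;

```python
# A quick-and-dirty version of the "grep the monorepo" step: walk a repo
# and flag files importing common AI runtime/client libraries.
# The package list is a starting point, not exhaustive.
import os
import re

AI_IMPORT_PATTERN = re.compile(
    r"^\s*(?:import|from)\s+(vllm|ollama|langchain|transformers|"
    r"llama_cpp|openai|anthropic)\b",
    re.MULTILINE,
)

def find_ai_imports(root):
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                try:
                    text = open(path, encoding="utf-8", errors="ignore").read()
                except OSError:
                    continue
                for match in AI_IMPORT_PATTERN.finditer(text):
                    hits.append((path, match.group(1)))
    return hits

# Example: scan the current directory.
for path, lib in find_ai_imports("."):
    print(f"{path}: imports {lib}")
```

&lt;p&gt;Anything this finds that isn't in the vendor AIBOM is your delta, before you've even touched the network scan.&lt;/p&gt;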

&lt;p&gt;If they match, congratulations, you picked well. If they don't, you have a problem that no compliance report will surface until something goes wrong.&lt;/p&gt;

&lt;p&gt;We can help with the inventory side. cyscan handles the code, cyradar handles the runtime, cyweb handles the web-facing surface, and the MCP server lets your own agents query the AIBOM directly — which is, in a meta way, how I think AIBOMs will mostly get consumed by 2027 anyway. By other agents.&lt;/p&gt;

&lt;p&gt;If you want to talk through yours, find me at &lt;code&gt;anand@cybrium.ai&lt;/code&gt;.&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>devsecops</category>
      <category>governance</category>
    </item>
    <item>
      <title>Why I Stopped Letting Claude Shell Out for Security Scans</title>
      <dc:creator>Grumpy Sage</dc:creator>
      <pubDate>Mon, 11 May 2026 04:44:53 +0000</pubDate>
      <link>https://forem.com/grumpysage/why-i-stopped-letting-claude-shell-out-for-security-scans-o7d</link>
      <guid>https://forem.com/grumpysage/why-i-stopped-letting-claude-shell-out-for-security-scans-o7d</guid>
      <description>&lt;p&gt;A founder I know spent last Tuesday night debugging what he thought was a Claude bug. He'd wired up Claude Code to his repo with the default shell tool, asked it to "scan this codebase for secrets and SQL injection," and watched it confidently produce a clean report. Zero findings. He shipped to staging. Twelve hours later his Datadog alert fired on a Postgres error trace that exposed a hardcoded service account key in a config file Claude had supposedly scanned.&lt;/p&gt;

&lt;p&gt;He called me at 11pm. We screen-shared. The problem was almost funny once we saw it. Claude had run &lt;code&gt;cyscan&lt;/code&gt; — correctly, with the right flags — against the wrong directory. It had &lt;code&gt;cd&lt;/code&gt;'d into a subfolder earlier in the conversation to read a file, never &lt;code&gt;cd&lt;/code&gt;'d back, and then run the scan from there. The scan completed in 400ms because there were six files in scope. Claude wrote up a confident summary of those six files, called it a codebase audit, and moved on.&lt;/p&gt;

&lt;p&gt;That's not a Claude failure. That's a tool design failure. Shell is a terrible interface for a security scanner when the caller is a probabilistic agent with no model of working directory state, no schema for what "done" looks like, and no way to know if the tool it just invoked actually understood the request. The whole exchange was vibes. The agent produced confident output because shell tools produce stdout and stdout looks like an answer.&lt;/p&gt;

&lt;p&gt;I've been building Cybrium for two years now, and the single most important architectural decision we made in the last six months was to stop telling people to invoke our scanners via shell. Today everything routes through an MCP server. Ten tools. Typed inputs. Structured outputs. No working directory drift. Let me explain why this matters and what we learned along the way.&lt;/p&gt;

&lt;h2&gt;The thesis&lt;/h2&gt;

&lt;p&gt;If your agent talks to security tooling over a shell, you've built a system where the agent's confidence is decoupled from the scanner's actual coverage. MCP fixes this by making the contract between agent and tool explicit, machine-checkable, and inspectable after the fact. This isn't a UX upgrade. It's the difference between a security pipeline you can audit and one you cannot.&lt;/p&gt;

&lt;h2&gt;What "default shell tool" actually gives you&lt;/h2&gt;

&lt;p&gt;When Claude Code, Cursor, or any agent runs &lt;code&gt;cyscan --path . --format json&lt;/code&gt; through a bash tool, here's what's actually happening. The agent constructs a string. The string goes to a shell. The shell maintains its own state — working directory, environment variables, prior exit codes — that the agent only partially observes. The scanner runs, writes to stdout, maybe also stderr, exits with a code, and the agent reads it all back as a single blob of text that it then has to parse.&lt;/p&gt;

&lt;p&gt;Every step there is a place where things break in ways the agent can't see.&lt;/p&gt;

&lt;p&gt;The agent doesn't know if &lt;code&gt;cyscan&lt;/code&gt; was the binary it thought it was, or some alias, or a different version on PATH. It doesn't know if the path it passed was a symlink, got expanded by shell globbing, or got truncated. It doesn't know if stderr contained warnings that materially change the meaning of stdout. It doesn't know if the exit code maps to "clean scan" or "scanner crashed after partial run." It just sees text.&lt;/p&gt;

&lt;p&gt;And here's the part that haunts me as someone shipping a security product: the agent doesn't know how many rules ran. If &lt;code&gt;cyscan&lt;/code&gt; ran 1,815 rules across 75+ languages on a 200-engineer monorepo, that's one outcome. If it ran 12 rules because it only found two file types in the subdirectory it was actually pointed at, that's a completely different outcome. Stdout looks similar in both cases — a JSON array of findings, possibly empty. The agent summarizes "no findings." The CISO sleeps poorly.&lt;/p&gt;

&lt;p&gt;Shell tools optimize for human flexibility. Humans cross-reference, notice anomalies, get suspicious when a scan finishes too fast. Agents don't, at least not reliably, and certainly not under pressure when they're four turns deep in a conversation about something else.&lt;/p&gt;

&lt;h2&gt;What MCP changes structurally&lt;/h2&gt;

&lt;p&gt;Model Context Protocol is, at its core, a typed RPC layer between agents and tools. That sounds dry. The implications are not.&lt;/p&gt;

&lt;p&gt;When Claude calls &lt;code&gt;cyscan_repository&lt;/code&gt; through our MCP server, it isn't writing a shell string. It's calling a function with a typed schema. The schema declares that &lt;code&gt;path&lt;/code&gt; is required, that it must be an absolute path, that &lt;code&gt;language_filter&lt;/code&gt; is an optional enum, that &lt;code&gt;rule_packs&lt;/code&gt; defaults to "all 1,815." The MCP server validates the call before our scanner ever runs. If the agent forgets a required arg, the call fails with a structured error the agent can actually reason about — not a bash error that says "missing argument" in some format the agent has to text-match against.&lt;/p&gt;

&lt;p&gt;The response is structured too. Not stdout. A JSON object with fixed fields: &lt;code&gt;scan_id&lt;/code&gt;, &lt;code&gt;files_scanned&lt;/code&gt;, &lt;code&gt;rules_executed&lt;/code&gt;, &lt;code&gt;findings&lt;/code&gt;, &lt;code&gt;coverage_metadata&lt;/code&gt;, &lt;code&gt;duration_ms&lt;/code&gt;. The agent doesn't have to parse anything. It just reads &lt;code&gt;files_scanned&lt;/code&gt;. If that number is 6 when the repo has 4,000 files, the agent has a fighting chance of noticing, because &lt;code&gt;files_scanned&lt;/code&gt; is a first-class field that the agent's system prompt can be told to check.&lt;/p&gt;

&lt;p&gt;This is what I mean by making the contract machine-checkable. With shell, "did the scan actually scan the thing" is a vibes question. With MCP, it's a field.&lt;/p&gt;
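&lt;p&gt;The guardrail that follows from that is a few lines of code. The response object below is invented for illustration, reusing the field names described above:&lt;/p&gt;

```python
# Sketch of the check a structured response makes possible: the agent (or a
# guardrail wrapped around it) compares files_scanned against the repo's
# actual size before trusting "no findings". The response object is invented
# for illustration, reusing the field names described in the text.

def coverage_looks_sane(scan_result, repo_file_count, min_ratio=0.8):
    scanned = scan_result["files_scanned"]
    if repo_file_count == 0:
        return False
    return scanned / repo_file_count >= min_ratio

result = {
    "scan_id": "scn-1842",
    "files_scanned": 6,          # the founder's failure mode
    "rules_executed": 12,
    "findings": [],
    "duration_ms": 400,
}

if not coverage_looks_sane(result, repo_file_count=4000):
    print("refusing to summarize: scan covered a fraction of the repo")
```

&lt;p&gt;Six files scanned against a four-thousand-file repo fails the check, and the clean report never gets written. With stdout, there's nothing to compare.&lt;/p&gt;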

&lt;h2&gt;The ten tools and why ten&lt;/h2&gt;

&lt;p&gt;Our MCP server exposes exactly ten tools right now. I get asked sometimes why so few — surely a security platform has more surface area than that. The answer is that ten is the result of a lot of arguing about granularity.&lt;/p&gt;

&lt;p&gt;Too few tools and each tool becomes a god-function with twenty parameters and the agent has to learn a sub-language to drive it. Too many tools and the agent's context window fills with tool descriptions before it's done a single useful thing. Ten was where we landed after watching agents actually use the server for three months.&lt;/p&gt;

&lt;p&gt;The tools split roughly into three families. Code and repo scanning lives in &lt;code&gt;cyscan&lt;/code&gt;-backed tools that handle static analysis across 75+ languages. AI-specific scanning lives in &lt;code&gt;cyradar&lt;/code&gt;-backed tools that probe local inference endpoints — Ollama, vLLM, TGI, LocalAI, Triton, LM Studio, llama.cpp — for the kinds of misconfigurations that don't show up in any conventional vuln scanner. Web and API fuzzing lives in &lt;code&gt;cyweb&lt;/code&gt;-backed tools that drive our 22 fuzz categories with 95% template-conversion fidelity against upstream community signatures.&lt;/p&gt;

&lt;p&gt;Each tool does one thing. The agent composes them. That composition is where the real power lives, and it's also what shell tools fundamentally can't do, because shell composition happens through pipes and string parsing instead of through structured data the agent actually understands.&lt;/p&gt;

&lt;h2&gt;A concrete example&lt;/h2&gt;

&lt;p&gt;Here's the kind of workflow that's trivial with MCP and miserable with shell. Suppose I want my agent to do a full security pass on a new microservice: scan the source for vulns and secrets, then if any of the findings touch an AI inference path, probe the running inference endpoint for those specific issues, then if any of the findings touch an HTTP route, fuzz that route with relevant templates.&lt;/p&gt;

&lt;p&gt;With shell, this is a small program. The agent has to invoke &lt;code&gt;cyscan&lt;/code&gt;, parse the output, build a follow-up command, invoke that, parse, build another. Every parsing step is a place where the agent can hallucinate field names, miss findings, or get tripped up by formatting changes between versions. I've seen agents miss findings because they expected &lt;code&gt;severity&lt;/code&gt; and got &lt;code&gt;risk_level&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;With MCP, here's roughly what it looks like from the agent's side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. call cyscan_repository(path=/repo/order-service)
   -&amp;gt; returns findings[] with structured types
2. for findings where category == "ai_inference":
     call cyradar_probe(endpoint=finding.endpoint, checks=["prompt_injection","model_extraction"])
3. for findings where category == "http_route":
     call cyweb_fuzz(target=finding.url, template_packs=relevant_packs)
4. call generate_report(scan_ids=[...])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent doesn't write parsing code. It doesn't construct strings. It calls functions on objects. When &lt;code&gt;cyradar_probe&lt;/code&gt; finds a prompt injection vector against a llama.cpp endpoint, that finding is a typed object with a CVE-style identifier, a severity, a remediation hint, and a pointer back to the originating &lt;code&gt;cyscan&lt;/code&gt; finding. The lineage is preserved. The audit trail is automatic.&lt;/p&gt;

&lt;p&gt;You can build something similar with shell. People do. It involves jq, bash heredocs, and a lot of prayer. It is not robust to scanner version updates, scanner output changes, or agent context drift across turns. I have watched these pipelines work flawlessly for two weeks and then silently start dropping findings because someone added a field to the JSON output and the jq filter didn't match anymore. Nobody noticed for a month.&lt;/p&gt;

&lt;h2&gt;The state problem&lt;/h2&gt;

&lt;p&gt;This is the one I care about most, and it's the one that bit the founder I mentioned at the top. Shell sessions have state. Agents have an imperfect model of that state.&lt;/p&gt;

&lt;p&gt;Working directory is the obvious one but it's not the only one. There are environment variables, which the agent often sets early in a conversation and then forgets about. There's PATH ordering, which can change which binary gets executed. There's shell history affecting tab completion if the agent uses it. There's locale settings affecting how filenames with non-ASCII characters get handled. There's &lt;code&gt;umask&lt;/code&gt; affecting permissions on output files. Every one of these is a state surface the agent has to track or risk getting wrong.&lt;/p&gt;

&lt;p&gt;MCP tools are stateless by default; the protocol is designed that way. Each call is a self-contained, fully-specified invocation. If you want state — say, a long-running scan whose results you want to retrieve later — that state is explicit and addressable. Our &lt;code&gt;scan_id&lt;/code&gt; is a first-class thing. The agent passes it in, gets the same results back, can hand it to another tool. There's no "where am I in the filesystem" question because the filesystem isn't part of the protocol. Paths are arguments. Arguments are typed. The scanner resolves them against a known, fixed base.&lt;/p&gt;

&lt;p&gt;This eliminates a whole class of failure modes that I genuinely believe is responsible for most agent-driven security incidents I've seen in the last year. Not zero-days. Not novel attacks. Agents scanning the wrong directory and confidently reporting clean.&lt;/p&gt;

&lt;h2&gt;Why one server and not ten CLI wrappers&lt;/h2&gt;

&lt;p&gt;I get the architectural question a lot: why does Cybrium ship one MCP server that exposes scanners, instead of three separate MCP servers wrapping &lt;code&gt;cyscan&lt;/code&gt;, &lt;code&gt;cyradar&lt;/code&gt;, and &lt;code&gt;cyweb&lt;/code&gt;? Why couple them?&lt;/p&gt;

&lt;p&gt;Because the findings need to talk to each other. A &lt;code&gt;cyscan&lt;/code&gt; SAST finding about an unsafe deserialization in an LLM-output handler is interesting on its own. It becomes urgent when &lt;code&gt;cyradar&lt;/code&gt; finds that the upstream inference endpoint accepts prompts from untrusted users. It becomes a P0 when &lt;code&gt;cyweb&lt;/code&gt; confirms that the HTTP route exposing that handler is reachable without auth. None of those tools, in isolation, can tell you you have a critical incident chain. The MCP server holds the cross-tool context that makes correlation possible.&lt;/p&gt;
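&lt;p&gt;As a sketch (the finding shapes and the escalation rule are invented), the correlation is a join over three scanners' outputs, and the P0 only exists after the join:&lt;/p&gt;

```python
# Sketch of the cross-tool correlation described above. Finding shapes and
# the escalation rule are invented; the point is that the P0 only emerges
# from the join, never from any single scanner's output.

sast = {"id": "CS-101", "category": "unsafe_deserialization",
        "handler": "llm_output_handler", "endpoint": "10.4.12.88:11434"}
probe = {"endpoint": "10.4.12.88:11434", "untrusted_prompts": True}
fuzz = {"route": "/v1/handle", "handler": "llm_output_handler",
        "auth_required": False, "reachable": True}

def escalate(sast_finding, runtime_probe, web_result):
    severity = "interesting"                 # the SAST finding alone
    linked_runtime = (runtime_probe["endpoint"] == sast_finding["endpoint"]
                      and runtime_probe["untrusted_prompts"])
    if linked_runtime:
        severity = "urgent"                  # untrusted input reaches the handler
    linked_web = (web_result["handler"] == sast_finding["handler"]
                  and web_result["reachable"]
                  and not web_result["auth_required"])
    if linked_runtime and linked_web:
        severity = "P0"                      # full incident chain confirmed
    return severity

print(escalate(sast, probe, fuzz))
```

&lt;p&gt;Flip &lt;code&gt;auth_required&lt;/code&gt; to true and the same three findings are merely urgent. The severity lives in the relationships, which is exactly the context a single-scanner wrapper can't hold.&lt;/p&gt;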

&lt;p&gt;You could rebuild this on top of three separate MCP servers if you put a coordinator agent in front of them. People will try. I've tried. The coordinator agent has to know the semantics of findings from each scanner well enough to correlate them, which means baking scanner-specific knowledge into the agent's prompts, which means every scanner version bump becomes a prompt-engineering exercise. We did this. It was bad. Centralizing the correlation in the MCP server itself — where it can be versioned, tested, and updated alongside the scanners — is the better factoring.&lt;/p&gt;

&lt;p&gt;The same logic, by the way, is why I don't believe in "bring your own scanner" MCP servers as a long-term architecture. Generic shells over arbitrary security tools sound great in a slide deck. In practice, the semantic gap between tools is where all the value lives, and a generic shell can't bridge it.&lt;/p&gt;

&lt;h2&gt;The recomposition&lt;/h2&gt;

&lt;p&gt;What's actually happening across the security tooling industry right now is a quiet recomposition. For fifteen years, the unit of integration was the CLI. You wrote a scanner that emitted SARIF or some custom JSON, and CI systems plumbed it together with bash. That worked when the orchestrator was a human writing YAML.&lt;/p&gt;

&lt;p&gt;The orchestrator now is an agent. The agent doesn't write YAML. The agent makes decisions turn-by-turn based on what it just saw. The unit of integration for that world is not CLI output. It's a typed protocol that lets the agent reason about tools the same way a human reasons about a library. MCP is the first credible attempt at that protocol, and the products that win the next five years of security tooling will be the ones that ship native MCP surfaces, not the ones that bolt an MCP wrapper around their existing CLI as an afterthought.&lt;/p&gt;

&lt;p&gt;The reason this is recomposition and not just integration is that once you have MCP-native tooling, the right unit of work changes. You stop thinking about "the scan" as a CI step and start thinking about "the security question" as an agent conversation. What did this PR change that touches PII? Did any of those changes introduce new attack surface that wasn't there yesterday? Are the inference endpoints we just deployed exposed to the same prompt injection that bit us last quarter? Those questions don't have YAML-shaped answers. They have agent-shaped answers, and they need tools the agent can actually drive.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd do tomorrow if I were you
&lt;/h2&gt;

&lt;p&gt;If you're using Claude Code or any agentic dev tool with shell access to security scanners right now, I'd do two things this week. The first: try our MCP server end-to-end on a real repo. The setup is one config block in your MCP client. Compare the findings count against whatever you're getting via shell. I would bet money you find a delta, and I'd bet that delta is in the direction of "MCP found things shell missed because shell was scanning the wrong scope."&lt;/p&gt;
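&lt;p&gt;For reference, that config block has roughly this shape. The &lt;code&gt;npx&lt;/code&gt; invocation is the common pattern for npm-published MCP servers, and the &lt;code&gt;"cybrium"&lt;/code&gt; key is my choice here; check the docs for the exact keys your client expects. The package name &lt;code&gt;@cybrium-ai/mcp-server&lt;/code&gt; is the published one.&lt;/p&gt;

```python
# Hypothetical MCP client config block (the shape most MCP clients,
# e.g. Claude Desktop, accept). The "cybrium" entry name and the npx
# invocation are assumptions; verify against the official docs.
import json

config = {
    "mcpServers": {
        "cybrium": {
            "command": "npx",
            "args": ["-y", "@cybrium-ai/mcp-server"],
        }
    }
}
print(json.dumps(config, indent=2))
```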

&lt;p&gt;The second thing: audit one of your agent conversations from last week. Pick a security-related one. Read the transcript. Count the number of places the agent made an assumption about shell state that it had no way to verify. Then ask yourself how many of those assumptions would still be assumptions if the tool had a typed schema.&lt;/p&gt;

&lt;p&gt;You can pull the MCP server from &lt;code&gt;cybrium.ai/mcp&lt;/code&gt;. The ten tools are documented there. Source for &lt;code&gt;cyscan&lt;/code&gt;, &lt;code&gt;cyradar&lt;/code&gt;, and &lt;code&gt;cyweb&lt;/code&gt; lives in the same place. If you want to talk through your setup — especially if you're running local inference at scale and worrying about what your agents are actually seeing when they scan it — find me at &lt;code&gt;anand@cybrium.ai&lt;/code&gt;.&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>devsecops</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Four Pillars, One Platform: How Cybrium Unifies Code, Cloud, AI, and GRC</title>
      <dc:creator>Grumpy Sage</dc:creator>
      <pubDate>Mon, 11 May 2026 04:05:22 +0000</pubDate>
      <link>https://forem.com/grumpysage/four-pillars-one-platform-how-cybrium-unifies-code-cloud-ai-and-grc-jff</link>
      <guid>https://forem.com/grumpysage/four-pillars-one-platform-how-cybrium-unifies-code-cloud-ai-and-grc-jff</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31vdp0w0pjq4vrlerzly.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31vdp0w0pjq4vrlerzly.png" alt="Cybrium — four pillars, one platform: Code, Cloud, AI, GRC" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A friend of mine runs security at a 200-engineer SaaS company. Last winter she got paged at 2 a.m. for an exposed S3 bucket. Customer PII. The bucket had been flagged by their cloud scanner three weeks earlier. The ticket sat in a Jira board owned by the platform team, who had been waiting on an IAM change from the cloud team, who needed sign-off from compliance, who were busy preparing for their SOC 2 audit. By the time the breach was contained, the marketing email had already gone out announcing their new Series B.&lt;/p&gt;

&lt;p&gt;She told me later that the part that haunted her was not the breach. It was that the finding had existed. The scanner had done its job. The system around the scanner had not.&lt;/p&gt;

&lt;p&gt;I keep coming back to that story because it explains almost every modern breach I have seen. The signal exists. The fix is known. The owners are identifiable. But the four pieces of the puzzle — code, cloud, AI, and governance — live in four separate tools owned by four separate teams, each pretending the others do not exist. The breach is the gap between them.&lt;/p&gt;

&lt;p&gt;This is the case I want to make: those four pieces should be one product. Not four products that talk to each other through APIs. One product, one asset graph, one workflow. I am going to use &lt;a href="https://cybrium.ai" rel="noopener noreferrer"&gt;Cybrium&lt;/a&gt; as the worked example because it is what my team builds, but the architectural argument generalises.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the four pillars actually are
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Funlooxdc27es0bny31hr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Funlooxdc27es0bny31hr.png" alt="Cybrium covers all four security pillars: Code, Cloud, AI, GRC" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I keep these labels short because everyone in security uses them but rarely defines them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code&lt;/strong&gt; is everything that happens before a deploy. SAST, SCA, secrets in repos, infrastructure-as-code, container images, Kubernetes manifests. The unit of work is a pull request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cloud&lt;/strong&gt; is everything that happens after the deploy. Posture in AWS / Azure / GCP, identity, drift, runtime config. The unit of work is a resource.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI&lt;/strong&gt; is the new pillar that nobody had three years ago. Who is running what model, where, with what data, calling which tools, exposed how. The unit of work is an asset that did not exist in the old asset taxonomy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GRC&lt;/strong&gt; is the layer that turns all of the above into auditable evidence. Frameworks, controls, risk register, trust center. The unit of work is a control.&lt;/p&gt;

&lt;p&gt;Now look at the market. Snyk does code very well and reaches into cloud weakly. Wiz does cloud very well and barely touches code. The AI security startups each take one slice — runtime guardrails, prompt injection scanning, model inventory — and assume someone else is doing the other three pillars. Vanta and Drata collect evidence from everything and generate nothing.&lt;/p&gt;

&lt;p&gt;This is a feature map, not a strategy. The customer pays for four tools and assumes glue code will make them coherent. It does not. It never does.&lt;/p&gt;




&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;I will start with code because it is the best-understood pillar and that makes the gap between best-in-class and standard practice the most visible.&lt;/p&gt;

&lt;p&gt;Most SAST tools produce a number that I think of as the friendship-ending number. The CI pipeline says "we found 10,000 issues in your repo this morning," and the developer either ignores it forever or quits Slack. Neither is the response you want.&lt;/p&gt;

&lt;p&gt;The fix is reachability. A CVE in a transitive dependency only matters if your code actually reaches it at runtime. Most don't. If you can rank findings by whether a real call path touches them, the friendship-ending 10,000 collapses to something like 12. Twelve is a number a human can act on.&lt;/p&gt;
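&lt;p&gt;A toy model of that filter, assuming each CVE record names its vulnerable function. The real reachability analysis in &lt;code&gt;cyscan&lt;/code&gt; is interprocedural and far more involved; this only shows why the number collapses.&lt;/p&gt;

```python
# Toy reachability filter: keep only CVEs whose vulnerable function
# sits on a real call path from an entry point. Field names are
# illustrative, not cyscan's actual schema.
def reachable_functions(call_graph, entry):
    """DFS over a {caller: [callees]} graph from the entry point."""
    seen, stack = set(), [entry]
    while stack:
        fn = stack.pop()
        if fn in seen:
            continue
        seen.add(fn)
        stack.extend(call_graph.get(fn, []))
    return seen

def rank_by_reachability(cves, call_graph, entry="main"):
    live = reachable_functions(call_graph, entry)
    return [c for c in cves if c["vuln_function"] in live]

call_graph = {"main": ["parse"], "parse": ["yaml.load"],
              "unused": ["pickle.loads"]}
cves = [
    {"id": "CVE-A", "vuln_function": "yaml.load"},     # on a live path
    {"id": "CVE-B", "vuln_function": "pickle.loads"},  # dead code
]
print([c["id"] for c in rank_by_reachability(cves, call_graph)])  # ['CVE-A']
```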

&lt;p&gt;In Cybrium the code engine is a Rust binary called &lt;code&gt;cyscan&lt;/code&gt;. It runs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SAST across 75-plus languages with 1,815 hand-curated rules&lt;/li&gt;
&lt;li&gt;SCA with reachability — only CVEs your code can actually reach&lt;/li&gt;
&lt;li&gt;Secrets detection (entropy + format + context)&lt;/li&gt;
&lt;li&gt;IaC: Terraform, CloudFormation, Bicep, Pulumi, plus Kubernetes manifests&lt;/li&gt;
&lt;li&gt;Span-based autofix, so the scanner does not just point at the problem; it produces a code edit you can apply or open as a PR&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can run it locally without ever signing up for anything:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;cybrium-ai/cli/cyscan
cyscan &lt;span class="nb"&gt;.&lt;/span&gt;
cyscan supply &lt;span class="nb"&gt;.&lt;/span&gt;                   &lt;span class="c"&gt;# SCA with reachability&lt;/span&gt;
cyscan fix &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--apply&lt;/span&gt;              &lt;span class="c"&gt;# write the autofixes&lt;/span&gt;
cyscan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt; sarif &lt;span class="nt"&gt;--output&lt;/span&gt; cyscan.sarif
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The SARIF output drops straight into GitHub Code Scanning or any CI that reads SARIF. For web apps where SAST is not enough, the companion binary is &lt;code&gt;cyweb&lt;/code&gt; — same Rust core, but DAST: spider, headless-Chrome AJAX spider, fuzzer, template engine, OAST callbacks for blind SSRF and RCE detection. It replaced ZAP, Nikto, and Nuclei in our pipeline; around 95 percent of the upstream Nuclei templates converted cleanly to its template engine.&lt;/p&gt;
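&lt;p&gt;Because SARIF is plain JSON, you can sanity-check a run before uploading it anywhere. The &lt;code&gt;runs[].results[]&lt;/code&gt; structure below is the SARIF 2.1.0 standard, nothing Cybrium-specific; the rule IDs in the sample are invented.&lt;/p&gt;

```python
# Count results per rule in a SARIF 2.1.0 file, e.g. cyscan.sarif.
# runs[].results[].ruleId is standard SARIF; sample rule IDs invented.
import json
from collections import Counter

def sarif_rule_counts(sarif_text):
    doc = json.loads(sarif_text)
    counts = Counter()
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "unknown")] += 1
    return counts

sample = json.dumps({"runs": [{"results": [
    {"ruleId": "secrets/aws-key"},
    {"ruleId": "secrets/aws-key"},
    {"ruleId": "iac/open-s3-bucket"},
]}]})
print(sarif_rule_counts(sample))
```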




&lt;h2&gt;
  
  
  Cloud
&lt;/h2&gt;

&lt;p&gt;Cloud is where the market is most fragmented because every cloud has its own posture-management API surface and most vendors specialise in one.&lt;/p&gt;

&lt;p&gt;We cover AWS, Azure, and GCP plus M365 and Active Directory under a single connector. The customer adds a cloud account once with a least-privilege read role, and the platform produces CSPM, ISPM (identity posture), ASPM (the wiring from repos to deployed services to cloud resources), container scanning via image-registry hooks, full Kubernetes scanning with the seven phases CIS calls out, and an M365 baseline that includes the DMARC/SPF/DKIM check from &lt;code&gt;cymail&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;What makes a cloud security tool useful versus useful-looking is the fix. Cybrium generates a Terraform pull request for every cloud finding. Behind a feature gate, there is a direct-apply mode for low-blast-radius changes. The developer sees the same shape of work whether the finding came from code or cloud — a PR, a diff, a CI pipeline running. They do not have to context-switch into a separate UI to fix a cloud problem versus a code problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  AI
&lt;/h2&gt;

&lt;p&gt;This is the pillar I think most vendors are getting wrong, and the one that explains why I think the next two years in this market will be a recomposition.&lt;/p&gt;

&lt;p&gt;Almost every "AI security" company you can name right now sells a runtime gateway. A proxy between your developer and the model. That is one slice of one problem. It is the slice that demos well — you can stand in front of an audience and watch a prompt-injection attempt get blocked in real time. But it does not answer the question that actually keeps CISOs awake: "what AI is running in my company that I do not know about?"&lt;/p&gt;

&lt;p&gt;You cannot govern what you cannot see. Cybrium's AI inventory has five channels:&lt;/p&gt;

&lt;p&gt;The first is an &lt;strong&gt;active probe&lt;/strong&gt;. A Rust binary called &lt;code&gt;cyradar&lt;/code&gt; sweeps network ranges and identifies self-hosted inference servers: Ollama, vLLM, TGI, LocalAI, Triton, LM Studio, llama.cpp, OpenAI-compatible endpoints. It fingerprints each match against a YAML signature catalogue. We ship the catalogue versioned; new model servers are a config update, not a code release.&lt;/p&gt;
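&lt;p&gt;The catalogue approach is worth a sketch, because it explains why new model servers are a config update rather than a code release. A toy matcher, with a dict standing in for the YAML and invented signature fields:&lt;/p&gt;

```python
# Toy signature matching in the style of a versioned catalogue.
# The real catalogue is YAML; a dict stands in here, and the
# signature fields (path, body_marker) are illustrative assumptions.
CATALOGUE = [
    {"name": "ollama", "path": "/api/tags", "body_marker": '"models"'},
    {"name": "vllm", "path": "/v1/models", "body_marker": '"vllm"'},
]

def fingerprint(probe_responses):
    """probe_responses: {path: response_body} gathered from one host."""
    matches = []
    for sig in CATALOGUE:
        body = probe_responses.get(sig["path"])
        if body is not None and sig["body_marker"] in body:
            matches.append(sig["name"])
    return matches

host = {"/api/tags": '{"models": [{"name": "llama3"}]}'}
print(fingerprint(host))  # ['ollama']
```

&lt;p&gt;Adding support for the next inference server means appending one entry to the catalogue and shipping it as data.&lt;/p&gt;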

&lt;p&gt;The second is &lt;strong&gt;cloud API&lt;/strong&gt;. We ingest Bedrock usage from AWS billing, Azure OpenAI from the Azure activity log, Vertex AI from GCP audit logs. Whatever model invocations are going through the sanctioned cloud accounts, we see.&lt;/p&gt;

&lt;p&gt;The third is &lt;strong&gt;endpoint&lt;/strong&gt;. A host-posture agent called &lt;code&gt;cydevice&lt;/code&gt; runs on machines outside MDM coverage and reports which AI CLIs are installed (&lt;code&gt;ollama&lt;/code&gt;, the OpenAI CLI, &lt;code&gt;claude&lt;/code&gt;), which IDE extensions are active (Copilot, Continue, Cline, Cursor's local model use), which desktop apps are running (LM Studio, Anything LLM), and which model files are on disk (GGUF, safetensors, ONNX). This is the channel that catches shadow AI on developer laptops.&lt;/p&gt;

&lt;p&gt;The fourth is &lt;strong&gt;traffic&lt;/strong&gt; inspection — passive observation of egress to flag cloud-API calls to AI providers that did not go through SSO.&lt;/p&gt;

&lt;p&gt;The fifth is &lt;strong&gt;SCM/SAST&lt;/strong&gt;. The &lt;code&gt;cyscan&lt;/code&gt; engine recognises imports of langchain, llama-index, transformers, the anthropic SDK, the openai SDK, and surfaces them as AI usage. If you have an LLM call in your code, we know about it from the repo before it ever hits production.&lt;/p&gt;

&lt;p&gt;All five channels write into the same &lt;code&gt;AIAsset&lt;/code&gt; row in the platform. The AI governance team can run a single query — "show me every AI surface in the company" — and get the union across channels. Policy then layers on top: no inference servers in &lt;code&gt;corp/&lt;/code&gt; subnet without TLS, no Bedrock model invocation without a &lt;code&gt;sanctioned&lt;/code&gt; tag, no production code path that takes LLM output and pipes it into a tool call without sanitisation.&lt;/p&gt;
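&lt;p&gt;The merge itself is an upsert keyed on the asset's identity. A simplified sketch, with field names that are my invention rather than the platform's actual schema:&lt;/p&gt;

```python
# Simplified channel merge: each discovery channel reports sightings,
# which upsert into one AIAsset-style row keyed by endpoint.
# Field names are illustrative, not Cybrium's actual schema.
def merge_channels(sightings):
    assets = {}
    for s in sightings:
        row = assets.setdefault(s["endpoint"],
                                {"endpoint": s["endpoint"],
                                 "channels": set(), "kind": None})
        row["channels"].add(s["channel"])
        row["kind"] = row["kind"] or s.get("kind")
    return assets

sightings = [
    {"channel": "probe",    "endpoint": "10.0.0.7:11434", "kind": "ollama"},
    {"channel": "endpoint", "endpoint": "10.0.0.7:11434"},
    {"channel": "traffic",  "endpoint": "api.openai.com", "kind": "cloud-api"},
]
assets = merge_channels(sightings)
print(sorted(assets["10.0.0.7:11434"]["channels"]))  # ['endpoint', 'probe']
```

&lt;p&gt;"Show me every AI surface in the company" is then a walk over one table, and an asset seen by two channels is one row, not two tickets.&lt;/p&gt;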

&lt;p&gt;The prompt-injection point is worth dwelling on for a second. We do not have a separate scanner for it. The same &lt;code&gt;cyscan&lt;/code&gt; engine that does SAST recognises the patterns: unsanitised LLM output flowing into a tool-call argument, hidden-character-aware string handling, RAG ingestion that does not strip control characters from untrusted documents. The AI pillar is not a separate product. It is a set of new questions asked by engines we already had.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;cybrium-ai/cli/cyradar
cyradar discover &lt;span class="nt"&gt;--targets&lt;/span&gt; 10.0.0.0/24    &lt;span class="c"&gt;# find AI servers on the LAN&lt;/span&gt;
cyradar local-scan                         &lt;span class="c"&gt;# inventory local AI tooling&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
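&lt;p&gt;That first prompt-injection pattern, unsanitised LLM output flowing into a tool-call argument, is checkable with a small taint walk over the syntax tree. A toy version using Python's &lt;code&gt;ast&lt;/code&gt; module; the source and sink names are stand-ins, and the real rules are far richer:&lt;/p&gt;

```python
# Toy taint check: flag shell calls whose argument came straight from
# an LLM call with no sanitisation in between. Source and sink names
# are illustrative stand-ins for real SAST rules.
import ast

SOURCES = {"llm_complete"}              # functions returning model output
SINKS = {"os.system", "subprocess.run"}

def dotted(node):
    """Render a call target like subprocess.run as a dotted string."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return dotted(node.value) + "." + node.attr
    return ""

def find_llm_to_shell(code):
    tree = ast.parse(code)
    tainted, findings = set(), []
    for node in ast.walk(tree):
        # taint: x = llm_complete(...)
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
            if dotted(node.value.func) in SOURCES:
                for t in node.targets:
                    if isinstance(t, ast.Name):
                        tainted.add(t.id)
        # sink: subprocess.run(x) where x is tainted
        if isinstance(node, ast.Call) and dotted(node.func) in SINKS:
            for arg in node.args:
                if isinstance(arg, ast.Name) and arg.id in tainted:
                    findings.append((node.lineno, dotted(node.func)))
    return findings

snippet = """
reply = llm_complete(prompt)
subprocess.run(reply)
"""
print(find_llm_to_shell(snippet))  # [(3, 'subprocess.run')]
```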



&lt;p&gt;For AI coding agents that should reach into the platform directly, we ship an MCP server — &lt;code&gt;@cybrium-ai/mcp-server&lt;/code&gt; on npm — with ten tools. Claude Desktop, Cursor, Windsurf, Cline can call any of them by name. I will come back to this in a minute.&lt;/p&gt;




&lt;h2&gt;
  
  
  GRC
&lt;/h2&gt;

&lt;p&gt;Most security platforms wave their hands here. The GRC team gets handed a CSV export and told to "make the audit work."&lt;/p&gt;

&lt;p&gt;A serious GRC implementation has three components that have to be wired into the other three pillars, not bolted on after.&lt;/p&gt;

&lt;p&gt;The first is &lt;strong&gt;framework mapping&lt;/strong&gt;. Every finding from code, cloud, and AI must map to a control in SOC 2, ISO 27001, HIPAA, PCI, EU AI Act, NIST AI RMF, and whatever industry-specific frameworks apply. Without this mapping, a finding is operational noise; with it, the same finding becomes audit evidence. We do the mapping at rule-authoring time — every &lt;code&gt;cyscan&lt;/code&gt; rule and every cloud check carries the relevant control IDs.&lt;/p&gt;
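&lt;p&gt;What authoring-time mapping buys you is that audit evidence becomes a group-by instead of a spreadsheet exercise. A sketch, with rule names and control IDs invented for illustration:&lt;/p&gt;

```python
# Rules carry control IDs at authoring time, so a pile of findings
# becomes per-control evidence with one group-by. Rule names and
# control IDs are illustrative, not real rule metadata.
from collections import defaultdict

RULES = {
    "secrets/aws-key":    {"controls": ["SOC2:CC6.1", "ISO27001:A.9.4"]},
    "iac/open-s3-bucket": {"controls": ["SOC2:CC6.1", "SOC2:CC6.6"]},
}

def evidence_by_control(findings):
    out = defaultdict(list)
    for f in findings:
        for control in RULES[f["rule"]]["controls"]:
            out[control].append(f["id"])
    return dict(out)

findings = [
    {"id": "F-1", "rule": "secrets/aws-key"},
    {"id": "F-2", "rule": "iac/open-s3-bucket"},
]
print(evidence_by_control(findings)["SOC2:CC6.1"])  # ['F-1', 'F-2']
```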

&lt;p&gt;The second is &lt;strong&gt;evidence collection&lt;/strong&gt;. When an auditor asks "show me that control CC6.1 is enforced," the answer cannot be a screenshot. It has to be a query that runs against the live asset graph and returns a count, a list, and a timestamped attestation. The compliance engine in the platform does this nightly, automatically, against the same asset graph the other pillars write into.&lt;/p&gt;
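&lt;p&gt;The shape of an auditor-ready answer, sketched with invented field names: a live query result with a count, a list, and a timestamp, not a screenshot.&lt;/p&gt;

```python
# The shape an evidence query should return: count, list, timestamped
# attestation, generated from live posture. Field names are illustrative.
import json
from datetime import datetime, timezone

def attest(control_id, matching_assets):
    return {
        "control": control_id,
        "count": len(matching_assets),
        "assets": sorted(matching_assets),
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }

# e.g. CC6.1: every storage bucket blocks public access
buckets = {"logs-bucket": False, "assets-bucket": False}  # name: is_public
passing = [name for name, public in buckets.items() if not public]
print(json.dumps(attest("SOC2:CC6.1", passing), indent=2))
```

&lt;p&gt;Run nightly against the same asset graph the other pillars write into, and the auditor gets a fresh attestation rather than last Tuesday's copy.&lt;/p&gt;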

&lt;p&gt;The third is &lt;strong&gt;the Trust Center&lt;/strong&gt;. Your customers' procurement teams are asking the same security-questionnaire questions of every vendor. A Trust Center that exposes your controls publicly — with continuous, auto-collected evidence — cuts months off the sales cycle. Ours is at &lt;a href="https://trust.cybrium.ai" rel="noopener noreferrer"&gt;https://trust.cybrium.ai&lt;/a&gt; and updates from the same store as everything else.&lt;/p&gt;

&lt;p&gt;We also ship a vCISO module — engagements, risk register, policy library, treatment tracking — for teams that do not have a full-time CISO but need to look like they do for a Series B raise. The risk register is keyed on the same asset graph, so a risk row is always traceable to specific findings and specific controls. Not narrative text in a Word document.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why one platform, not four
&lt;/h2&gt;

&lt;p&gt;If the only argument for unification were "fewer dashboards," you could ignore it. The actual argument is structural, and it lives in three properties that one asset graph makes possible.&lt;/p&gt;

&lt;p&gt;A finding in one pillar becomes an enforcement signal for another. A reachable CVE in code creates a deployment-gate policy in cloud. A new AI inference server discovered on the LAN auto-creates a risk row in the GRC register. An auditor's evidence query pulls from the live posture, not a copy of it from last Tuesday.&lt;/p&gt;

&lt;p&gt;A fix in one pillar resolves the corresponding finding in the others. Close an IAM mis-scoping in cloud, the related SOC 2 finding in GRC closes automatically. The compliance team stops chasing the cloud team for evidence.&lt;/p&gt;

&lt;p&gt;Coverage gaps become visible. "What is not covered" becomes a query. Three repos have full code coverage, twelve have partial. Two clouds are scanned, one is not. The AI inventory has four channels but the fifth is unconfigured. You can see the holes before someone else finds them.&lt;/p&gt;

&lt;p&gt;These three properties cannot be retrofitted by integration. Every API integration between four point tools is a translation layer that loses data and a workflow boundary that delays the response. The only architecturally clean approach is to start with one asset graph and build outward from there.&lt;/p&gt;
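&lt;p&gt;A miniature of the second of those properties makes the structural point. On a shared graph the GRC finding is an edge away from the cloud finding, so resolving one closes the other; all identifiers here are invented for illustration.&lt;/p&gt;

```python
# Miniature of cross-pillar resolution on a shared asset graph:
# a GRC finding is linked to a cloud finding, so closing the cloud
# finding closes the GRC one too. All identifiers are illustrative.
class AssetGraph:
    def __init__(self):
        self.findings = {}   # id: {"pillar", "status"}
        self.links = {}      # cloud finding id: grc finding id

    def add(self, fid, pillar):
        self.findings[fid] = {"pillar": pillar, "status": "open"}

    def link(self, cloud_id, grc_id):
        self.links[cloud_id] = grc_id

    def resolve(self, fid):
        self.findings[fid]["status"] = "closed"
        linked = self.links.get(fid)
        if linked:
            # the compliance team never has to chase this evidence
            self.findings[linked]["status"] = "closed"

g = AssetGraph()
g.add("cloud/iam-overscope", "cloud")
g.add("grc/soc2-cc6.3", "grc")
g.link("cloud/iam-overscope", "grc/soc2-cc6.3")
g.resolve("cloud/iam-overscope")
print(g.findings["grc/soc2-cc6.3"]["status"])  # closed
```

&lt;p&gt;With four point tools, that second &lt;code&gt;resolve&lt;/code&gt; is a webhook, a translation layer, and a delay. On one graph it is a pointer dereference.&lt;/p&gt;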




&lt;h2&gt;
  
  
  The new buyer is an AI agent
&lt;/h2&gt;

&lt;p&gt;There is one more reason this matters now that I want to end on, because I think most security vendors have not internalised it yet.&lt;/p&gt;

&lt;p&gt;A year ago, when a developer needed a security tool, they searched Stack Overflow, asked a colleague, or read a blog post. Today, increasingly, the developer asks Claude or Cursor. The agent reads the project state, parses the question, and picks a tool. The agent does not see ads. It does not have a procurement team. It reads documentation.&lt;/p&gt;

&lt;p&gt;This is going to recompose the market. The vendors who ship coherent, AI-agent-readable tooling — with intent-mapped documentation, clean MCP integrations, READMEs that describe when to use the tool versus when to use something else — will absorb workloads that used to be spread across a long tail of point tools. The vendors who write press releases about "AI-powered security" and hope the AI does not look too closely will lose their seat at the table.&lt;/p&gt;

&lt;p&gt;We have made our bet on the first model. The CLIs are open source and Apache-2.0. The MCP server is published on npm. The VS Code extension is on the Marketplace (&lt;code&gt;cybrium-ai.cybrium&lt;/code&gt;). Every public repo has an &lt;code&gt;AGENTS.md&lt;/code&gt; that tells an AI coding agent when to invoke which tool. The website has an &lt;code&gt;llms.txt&lt;/code&gt; at the root that explains the same thing to any agent fetching the domain for the first time. The OpenAPI schema is public. The Trust Center is public.&lt;/p&gt;

&lt;p&gt;If you are building anything that touches code, cloud, AI, or compliance, you can start with the pieces you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code: &lt;a href="https://github.com/cybrium-ai/cyscan" rel="noopener noreferrer"&gt;https://github.com/cybrium-ai/cyscan&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Cloud: &lt;a href="https://app.cybrium.ai" rel="noopener noreferrer"&gt;https://app.cybrium.ai&lt;/a&gt; (14-day trial)&lt;/li&gt;
&lt;li&gt;AI inventory: &lt;a href="https://github.com/cybrium-ai/cyradar" rel="noopener noreferrer"&gt;https://github.com/cybrium-ai/cyradar&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;MCP for agents: &lt;code&gt;npm install -g @cybrium-ai/mcp-server&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Trust Center: &lt;a href="https://trust.cybrium.ai" rel="noopener noreferrer"&gt;https://trust.cybrium.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Docs: &lt;a href="https://docs.cybrium.ai" rel="noopener noreferrer"&gt;https://docs.cybrium.ai&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The four pillars are not optional anymore. The breach my friend stayed up for came from a gap between them. The question for every security team this year is whether they want one platform that closes those gaps or four that hold them open.&lt;/p&gt;

&lt;p&gt;We have made our choice. If you want to talk through yours, find me at &lt;code&gt;hello@cybrium.ai&lt;/code&gt;.&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>devsecops</category>
      <category>governance</category>
    </item>
  </channel>
</rss>
