Forem: Pavel Espitia

The Provider Pattern: How I Added Ollama Support in 50 Lines

Pavel Espitia — Tue, 21 Apr 2026 12:03:03 +0000

When I started building spectr-ai, it only worked with Claude. The Anthropic SDK was hardcoded everywhere — in the analysis function, the prompt formatting, the response parsing. It worked, but it meant every user needed an Anthropic API key and an internet connection.

I wanted to add Ollama support so developers could run audits locally, completely offline, using open-source models. The naive approach would have been scattering if (useOllama) checks throughout the codebase. Instead, I used the Provider pattern, and the entire Ollama integration took about 50 lines of code.

The Interface

The core idea is simple: define what a "provider" does, not how it does it.

interface Provider {
  analyze(systemPrompt: string, userContent: string): Promise<string>;
  readonly name: string;
  readonly model: string;
}

That's it. Three members. A provider takes a system prompt and user content, returns a string. It has a name and a model identifier. Practically every chat-style LLM API can satisfy this contract: they all accept text and return text.

The interface deliberately returns a raw string, not a parsed object. Parsing and validation happen in a separate layer (the Zod schemas from yesterday's post). The provider's only job is to talk to the model and give back its response.

The Anthropic Provider

import Anthropic from "@anthropic-ai/sdk";

function createAnthropicProvider(
  apiKey: string,
  model: string,
): Provider {
  const client = new Anthropic({ apiKey });

  return {
    name: "anthropic",
    model,
    async analyze(systemPrompt, userContent) {
      const response = await client.messages.create({
        model,
        max_tokens: 8192,
        system: systemPrompt,
        messages: [{ role: "user", content: userContent }],
      });

      const block = response.content[0];
      if (block.type !== "text") {
        throw new Error(
          `Unexpected response type: ${block.type}`
        );
      }
      return block.text;
    },
  };
}

The Anthropic-specific details — the SDK client, the message format, the content block extraction — are all encapsulated. Nothing outside this function knows or cares about Anthropic's API shape.

The Ollama Provider

function createOllamaProvider(
  model: string,
  baseUrl: string = "http://localhost:11434",
): Provider {
  return {
    name: "ollama",
    model,
    async analyze(systemPrompt, userContent) {
      const response = await fetch(
        `${baseUrl}/api/chat`,
        {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({
            model,
            stream: false,
            messages: [
              { role: "system", content: systemPrompt },
              { role: "user", content: userContent },
            ],
          }),
        },
      );

      if (!response.ok) {
        throw new Error(
          `Ollama returned ${response.status}: ${await response.text()}`
        );
      }

      const data = await response.json();
      return data.message.content;
    },
  };
}

No SDK dependency. Just a fetch call to Ollama's local API. The provider returns the same raw string that the Anthropic provider returns. The rest of the application can't tell the difference.

The Factory

interface ProviderConfig {
  provider: "anthropic" | "ollama";
  model: string;
  apiKey?: string;
  baseUrl?: string;
}

function createProvider(config: ProviderConfig): Provider {
  switch (config.provider) {
    case "anthropic": {
      if (!config.apiKey) {
        throw new Error(
          "Anthropic provider requires an API key. " +
          "Set ANTHROPIC_API_KEY or pass --api-key."
        );
      }
      return createAnthropicProvider(
        config.apiKey,
        config.model,
      );
    }
    case "ollama": {
      return createOllamaProvider(
        config.model,
        config.baseUrl,
      );
    }
  }
}

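The factory reads from configuration and returns the right provider. The switch is exhaustive: with strict null checks enabled, TypeScript will error if you add a new provider to the union type without handling it here, because the function could then finish without returning a Provider.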
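The exhaustiveness guarantee depends on compiler settings, so it can help to make it explicit with a `never` check in a default branch. The following is a minimal sketch, not spectr-ai's code; `ProviderKind`, `assertNever`, and `describeProvider` are names assumed for illustration.

```typescript
// Hypothetical names for illustration; not taken from spectr-ai.
type ProviderKind = "anthropic" | "ollama";

// Accepts only `never`, so the call in the default branch stops compiling
// as soon as a union member reaches that branch unhandled.
function assertNever(value: never): never {
  throw new Error(`Unhandled provider: ${String(value)}`);
}

function describeProvider(kind: ProviderKind): string {
  switch (kind) {
    case "anthropic":
      return "cloud, needs an API key";
    case "ollama":
      return "local, no API key";
    default:
      return assertNever(kind);
  }
}
```

Adding `"openai"` to `ProviderKind` without a matching case makes `assertNever(kind)` a type error, and the runtime throw covers values that slip past the type system, such as unvalidated config.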

Using It

The analysis pipeline doesn't know which provider it's using:

async function runAudit(
  provider: Provider,
  contract: string,
): Promise<AuditResult> {
  console.log(
    `Analyzing with ${provider.name} (${provider.model})...`
  );

  const raw = await provider.analyze(SYSTEM_PROMPT, contract);
  return parseAuditResult(raw);
}

From the CLI, the user switches with a flag:

# Use Claude (default)
spectr-ai analyze contract.sol

# Use a local Ollama model
spectr-ai analyze contract.sol --provider ollama --model llama3

# Use a specific Anthropic model
spectr-ai analyze contract.sol --model claude-sonnet-4-20250514

Why This Matters

Testing becomes trivial. You can create a mock provider that returns predetermined responses:

function createMockProvider(
  response: string,
): Provider {
  return {
    name: "mock",
    model: "test",
    async analyze() {
      return response;
    },
  };
}

// In tests
const provider = createMockProvider(
  JSON.stringify({
    vulnerabilities: [],
    summary: "No issues found",
    riskScore: 0,
  }),
);
const result = await runAudit(provider, sampleContract);

No HTTP mocking, no SDK stubs, no environment variables. Just a function that returns a string.

Adding new providers is isolated. Want to add OpenAI? Write a createOpenAIProvider function, add "openai" to the union type, handle it in the factory. Zero changes to the analysis pipeline, the CLI, the web frontend, or the tests.

Users choose their tradeoffs. Claude gives better audit quality. Ollama gives privacy, offline access, and zero API costs. The application doesn't need to have an opinion — it just needs a string back from the model.

The Pattern Beyond LLMs

This isn't a new idea. The Provider pattern is just the Strategy pattern with a more descriptive name. You see it everywhere:

Database drivers: same query interface, different backends (Postgres, MySQL, SQLite)
Storage: same read/write interface, different destinations (local disk, S3, GCS)
Auth: same verify interface, different mechanisms (JWT, session, API key)
Logging: same log interface, different transports (console, file, remote service)

The principle is always the same: define the smallest interface that captures what you need, then implement it for each backend. The consuming code depends on the interface, never the implementation.

What Makes a Good Provider Interface

Keep it minimal. My first draft of the Provider interface had methods for streamAnalyze, countTokens, getModelInfo, and estimateCost. I deleted all of them. The only method the application actually needed was analyze. Everything else was speculative — features I might want someday but didn't need today.

If you need streaming later, add a StreamingProvider interface that extends Provider. If you need token counting, add it to the providers that support it. Don't pollute the base interface with capabilities that not every implementation can satisfy.
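As a rough sketch of that idea, an optional capability can live in an extending interface, with a type guard so callers opt in only when it is present. `StreamingProvider`, `streamAnalyze`, `isStreamingProvider`, and `analyzeWithChunks` are hypothetical here; nothing below is from spectr-ai itself.

```typescript
interface Provider {
  analyze(systemPrompt: string, userContent: string): Promise<string>;
  readonly name: string;
  readonly model: string;
}

// Hypothetical extension: only providers that can stream implement it.
interface StreamingProvider extends Provider {
  streamAnalyze(
    systemPrompt: string,
    userContent: string,
  ): AsyncIterable<string>;
}

// Runtime check so callers can use streaming when it is available.
function isStreamingProvider(p: Provider): p is StreamingProvider {
  return typeof (p as Partial<StreamingProvider>).streamAnalyze === "function";
}

// Streams chunks when possible, otherwise falls back to a single call.
async function analyzeWithChunks(
  provider: Provider,
  systemPrompt: string,
  userContent: string,
  onChunk: (chunk: string) => void,
): Promise<string> {
  if (!isStreamingProvider(provider)) {
    const full = await provider.analyze(systemPrompt, userContent);
    onChunk(full);
    return full;
  }
  let full = "";
  for await (const chunk of provider.streamAnalyze(systemPrompt, userContent)) {
    onChunk(chunk);
    full += chunk;
  }
  return full;
}
```

The base `Provider` stays untouched, so existing providers and the mock keep compiling without changes.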

The 50-line Ollama provider worked because the interface was small enough that any LLM API could implement it. That's the goal: an interface so simple that adding a new provider is boring. Boring is good. Boring means your abstraction is right.

Zod + LLMs: How to Validate AI Responses Without Losing Your Mind

Pavel Espitia — Mon, 20 Apr 2026 12:02:08 +0000

You ask an LLM a carefully crafted question with a system prompt demanding JSON output. You get back a beautifully formatted response wrapped in triple backticks, prefixed with "Here's the JSON you requested:", and trailing with "Let me know if you need any changes!" The actual JSON is buried somewhere in the middle. Sometimes it's valid. Sometimes it's not.

This is the fundamental challenge of building tools on top of LLMs: they're probabilistic text generators, not API endpoints. And if you're using smaller local models through Ollama, the problem gets worse. Much worse.

Here's how I solved it in spectr-ai, an AI-powered smart contract auditor, using Zod for runtime validation.

The Schema Is Your Contract

Every structured response from the LLM passes through a Zod schema. The schema defines exactly what shape the data must have, what types each field must be, and what values are acceptable.

import { z } from "zod";

const SeveritySchema = z.enum([
  "critical",
  "high",
  "medium",
  "low",
  "informational",
]);

const VulnerabilitySchema = z.object({
  id: z.string(),
  title: z.string(),
  severity: SeveritySchema,
  description: z.string(),
  lineStart: z.number().int().positive(),
  lineEnd: z.number().int().positive(),
  recommendation: z.string(),
});

const AuditResultSchema = z.object({
  vulnerabilities: z.array(VulnerabilitySchema),
  summary: z.string(),
  riskScore: z.number().min(0).max(100),
});

type AuditResult = z.infer<typeof AuditResultSchema>;

The z.infer at the bottom is the magic — your runtime validation and your TypeScript types are derived from the same source. No drift between what you validate and what you type-check.

Extracting JSON from LLM Chaos

LLMs love wrapping their JSON in markdown fences, adding explanatory text, or returning partial objects. The first step is extracting the actual JSON from whatever the model sends back.

function extractJson(raw: string): string {
  // Strip markdown code fences
  const fencePattern = /```(?:json)?\s*\n?([\s\S]*?)\n?\s*```/;
  const match = raw.match(fencePattern);
  if (match?.[1]) {
    return match[1].trim();
  }

  // Try to find a JSON object directly
  const objectStart = raw.indexOf("{");
  const objectEnd = raw.lastIndexOf("}");
  if (objectStart !== -1 && objectEnd > objectStart) {
    return raw.slice(objectStart, objectEnd + 1);
  }

  // Last resort: return the raw string and let Zod handle the error
  return raw.trim();
}

This function handles the three most common cases: JSON wrapped in code fences, JSON with surrounding text, and bare JSON. The key insight is that lastIndexOf("}") grabs the outermost closing brace, so even if there's trailing text, you still get the complete object.
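One limitation of the `lastIndexOf("}")` fallback: if the trailing prose itself contains a `}`, the slice over-captures and `JSON.parse` fails. A possible alternative is to scan for the brace that balances the first `{`, skipping braces inside string literals. This is a sketch under that assumption, with `extractFirstJsonObject` as a hypothetical name; it still assumes no stray `{` appears in the prose before the JSON.

```typescript
// Hypothetical helper: returns the first balanced {...} span, or null.
function extractFirstJsonObject(raw: string): string | null {
  const start = raw.indexOf("{");
  if (start === -1) return null;

  let depth = 0;
  let inString = false;
  let escaped = false;

  for (let i = start; i < raw.length; i++) {
    const ch = raw[i];
    if (inString) {
      // Inside a string literal: only track escapes and the closing quote.
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) return raw.slice(start, i + 1);
    }
  }
  // Braces never balanced, e.g. a truncated response.
  return null;
}
```

A truncated response returns `null` instead of a broken slice, which makes the failure easy to report before `JSON.parse` runs.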

safeParse Over parse, Every Time

Zod offers two parsing methods: parse throws on invalid input, safeParse returns a discriminated union. For LLM responses, always use safeParse.

function parseAuditResult(raw: string): AuditResult {
  const json = extractJson(raw);

  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    throw new ParseError(
      `LLM returned invalid JSON. ` +
      `First 200 chars: ${json.slice(0, 200)}`
    );
  }

  const result = AuditResultSchema.safeParse(parsed);

  if (!result.success) {
    const issues = result.error.issues
      .map((i) => `  ${i.path.join(".")}: ${i.message}`)
      .join("\n");
    throw new ParseError(
      `LLM response failed schema validation:\n${issues}`
    );
  }

  return result.data;
}

Why safeParse? Because parse throws a ZodError with a stack trace and internal formatting that's useless for debugging LLM behavior. With safeParse, you control the error message. You can log exactly which fields failed and why, include a preview of the raw response, and surface something actionable to the user.

The Error Messages Matter

When a local model returns garbage, you need to know why it failed. Zod's error issues tell you exactly what went wrong:

LLM response failed schema validation:
  vulnerabilities.0.severity: Invalid enum value.
    Expected 'critical' | 'high' | 'medium' | 'low' | 'informational',
    received 'Critical'
  riskScore: Expected number, received string

That first error is incredibly common with smaller models — they capitalize enum values, use "High" instead of "high", or invent new severity levels like "moderate". The fix is either to normalize the data before validation or to make your schema more forgiving:

const SeveritySchema = z
  .string()
  .transform((s) => s.toLowerCase())
  .pipe(
    z.enum([
      "critical",
      "high",
      "medium",
      "low",
      "informational",
    ])
  );

The transform + pipe pattern lets you preprocess the value before validating it. The input is any string, the transform lowercases it, and the pipe validates the transformed value against the enum. Clean and composable.

Handling the riskScore Problem

Models frequently return "85" instead of 85 — a string instead of a number. You can handle this with z.coerce:

const AuditResultSchema = z.object({
  vulnerabilities: z.array(VulnerabilitySchema),
  summary: z.string(),
  riskScore: z.coerce.number().min(0).max(100),
});

z.coerce.number() calls Number() on the input first. So "85" becomes 85, and "not a number" becomes NaN, which Zod rejects. This is the right tradeoff: be lenient on types the model frequently gets wrong, strict on values.
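One caveat worth knowing: because coercion goes through `Number()`, an empty string and `null` both convert to 0, which would pass a `min(0)` check. If that matters for your data, a stricter pre-check is one option. The sketch below is plain TypeScript without Zod, and `parseRiskScore` is a hypothetical helper, not part of spectr-ai.

```typescript
// Hypothetical helper: stricter than Number() alone, which maps "" and null to 0.
function parseRiskScore(value: unknown): number {
  if (typeof value !== "number" && typeof value !== "string") {
    throw new Error(
      `riskScore must be a number or numeric string, got ${typeof value}`,
    );
  }
  if (typeof value === "string" && value.trim() === "") {
    throw new Error("riskScore is an empty string");
  }
  const score = Number(value);
  // Rejects NaN, Infinity, and anything outside the 0-100 range.
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new Error(`riskScore out of range or not numeric: ${String(value)}`);
  }
  return score;
}
```

The same check could be wired into a schema as a preprocessing step; the point is only that "lenient on representation" still needs a decision about empty and missing values.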

Retry With Context

Sometimes the model just fails. When it does, retry with the error message injected into the prompt:

async function auditWithRetry(
  provider: Provider,
  contract: string,
  maxAttempts: number = 3,
): Promise<AuditResult> {
  let lastError = "";

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const prompt = lastError
      ? `${basePrompt}\n\nYour previous response had errors:\n${lastError}\nPlease fix and respond with valid JSON only.`
      : basePrompt;

    const raw = await provider.analyze(prompt, contract);

    try {
      return parseAuditResult(raw);
    } catch (err) {
      lastError = err instanceof ParseError ? err.message : String(err);
    }
  }

  throw new Error(
    `Failed to get valid response after ${maxAttempts} attempts. Last error: ${lastError}`
  );
}

This works surprisingly well. Most models self-correct when you tell them what went wrong. The key is including the specific Zod error — "severity must be one of critical, high, medium, low, informational" gives the model enough context to fix its output.

What I Learned

Never trust LLM output. Validate everything at the boundary, just like you would with user input or API responses.
safeParse is non-negotiable. You need control over error formatting to debug model behavior.
Be lenient on representation, strict on semantics. Use z.coerce and transform for type mismatches. Keep enum validation tight.
Extract JSON defensively. Models wrap, prefix, suffix, and annotate their JSON output in creative ways.
Retry with error context. Models are good at self-correction when you tell them exactly what failed.

The combination of Zod's runtime validation and TypeScript's static types gives you a safety net that catches model failures before they propagate through your application. Your schema becomes the contract between your code and the LLM — and unlike the LLM, Zod never hallucinates.

I Made a CLI That Talks to Any Smart Contract in Plain English

Pavel Espitia — Sun, 19 Apr 2026 23:31:44 +0000

What if you could just ask a smart contract questions in plain English?

"What's the total supply?" → calls totalSupply() → "1,000,000 USDC"
"Who is the owner?" → calls owner() → "0x1234...abcd"
"How many holders are there?" → "This contract doesn't have a holder count function, but you could check Transfer events."

I built AbiLens — a chat interface for EVM smart contracts. Paste an address, pick a chain, start asking.

How It Works

The architecture is simple — four steps:

1. Resolve the ABI

When you paste a contract address, AbiLens tries two approaches:

Etherscan API → verified ABI (best case)
        ↓ (if not verified)
whatsabi → reconstruct ABI from bytecode

whatsabi is the secret weapon here. It reads the deployed bytecode, detects function selectors, follows proxy patterns (EIP-1967), and looks up signatures in the 4byte directory. You get a usable ABI even for unverified contracts.

2. Build Context for the LLM

The system prompt tells the LLM what functions are available:

You are AbiLens. This contract is USDC at 0xA0b8...eB48 on Ethereum.

Available read functions:
  name() → string
  symbol() → string
  decimals() → uint8
  totalSupply() → uint256
  balanceOf(address account) → uint256
  allowance(address owner, address spender) → uint256

The LLM now knows exactly what it can call.
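For illustration, a function list like the one above could be generated from the ABI along these lines. This is a sketch, not AbiLens's actual code: `AbiFunction` is a minimal assumed subset of the JSON ABI entry shape, and `formatReadFunctions` is a hypothetical name.

```typescript
// Minimal, assumed subset of a JSON ABI function entry.
interface AbiParam {
  name: string;
  type: string;
}

interface AbiFunction {
  type: "function";
  name: string;
  stateMutability: "pure" | "view" | "nonpayable" | "payable";
  inputs: AbiParam[];
  outputs: AbiParam[];
}

// Formats read-only functions as "name(type arg, ...) → type" lines.
function formatReadFunctions(abi: AbiFunction[]): string {
  return abi
    .filter(
      (fn) => fn.stateMutability === "view" || fn.stateMutability === "pure",
    )
    .map((fn) => {
      const inputs = fn.inputs
        .map((p) => (p.name ? `${p.type} ${p.name}` : p.type))
        .join(", ");
      const outputs = fn.outputs.map((p) => p.type).join(", ");
      return `  ${fn.name}(${inputs}) → ${outputs}`;
    })
    .join("\n");
}
```

Filtering on `stateMutability` keeps state-changing functions out of the prompt, so the model is never offered a call the tool can't safely make.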

3. LLM Decides What to Call

When you ask "what's the total supply?", the LLM responds with:

{"calls": [{"functionName": "totalSupply", "args": []}]}

AbiLens executes the call using viem:

const result = await client.call({
  to: contractAddress,
  data: encodeFunctionData({ abi, functionName: "totalSupply", args: [] }),
});
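Since the call plan comes from a model, it is worth checking it against the ABI before executing anything. A minimal sketch of such a check follows; `PlannedCall`, `ReadFunction`, and `validateCallPlan` are assumed names, and this is not necessarily how AbiLens does it.

```typescript
// Shape of one entry in the model's {"calls": [...]} response.
interface PlannedCall {
  functionName: string;
  args: unknown[];
}

// Assumed minimal view of the read functions offered in the prompt.
interface ReadFunction {
  name: string;
  inputs: { type: string }[];
}

// Returns a list of problems; an empty list means the plan can be executed.
function validateCallPlan(
  calls: PlannedCall[],
  readFunctions: ReadFunction[],
): string[] {
  const problems: string[] = [];
  for (const call of calls) {
    // Overloads share a name, so collect every candidate.
    const candidates = readFunctions.filter(
      (fn) => fn.name === call.functionName,
    );
    if (candidates.length === 0) {
      problems.push(`${call.functionName}: not an available read function`);
    } else if (!candidates.some((fn) => fn.inputs.length === call.args.length)) {
      problems.push(
        `${call.functionName}: unexpected argument count ${call.args.length}`,
      );
    }
  }
  return problems;
}
```

The returned messages can be fed back to the model on a retry, the same way schema errors are.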

4. LLM Explains the Result

The raw result goes back to the LLM: totalSupply() = 43941622816877670. The LLM knows USDC has 6 decimals (it checked decimals() first) and responds:

"The total supply of USDC is approximately 43.94 billion tokens."

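The decimal shift in that step is simple integer arithmetic, and doing it in code rather than leaving it to the model avoids rounding surprises. Here is a sketch using `bigint`; `formatTokenAmount` is a hypothetical helper (viem ships its own `formatUnits` for the same job), and it assumes a non-negative amount.

```typescript
// Converts a raw non-negative integer amount into a decimal string without
// going through floating point, so no precision is lost.
function formatTokenAmount(raw: bigint, decimals: number): string {
  const base = 10n ** BigInt(decimals);
  const whole = raw / base;
  const fraction = raw % base;
  if (fraction === 0n) return whole.toString();
  // Left-pad to keep leading zeros, then drop trailing zeros.
  const fractionText = fraction
    .toString()
    .padStart(decimals, "0")
    .replace(/0+$/, "");
  return `${whole}.${fractionText}`;
}
```

With the value from the example, `formatTokenAmount(43941622816877670n, 6)` returns `"43941622816.87767"`, which is the 43.94 billion figure quoted above.
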
Supported Chains

AbiLens works with any EVM chain. Currently configured:

Ethereum
Base
Arbitrum
Polygon
Optimism
Sepolia (testnet)

Adding a new chain is one object in the config.

Unverified Contracts

This is where AbiLens gets interesting. Most tools require a verified ABI from Etherscan. AbiLens doesn't.

For unverified contracts, whatsabi reconstructs an approximate ABI. The function names might be generic (function_0x1a2b3c4d), but the types are correct. The LLM adapts:

"This contract has an unverified ABI. I can see a function at selector 0x1a2b3c4d that takes an address and returns a uint256 — this is likely a balance lookup."

The Stack

viem — EVM interaction (lighter than ethers.js, fully typed)
whatsabi — ABI reconstruction from bytecode
Next.js 15 — Web UI with App Router
Claude / Ollama — LLM provider (works with both)

Try It

git clone https://github.com/pavelEspitia/abilens
cd abilens
cp .env.example .env
# Add your ETHERSCAN_API_KEY to .env
pnpm install && pnpm dev

Paste the USDC address: 0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48

Ask: "What is this contract and what can I do with it?"

What's Next

Write support (with wallet connection)
Event log querying ("show me the last 10 transfers")
Multi-contract conversations ("compare the TVL of these two pools")

The code is open source at github.com/pavelEspitia/abilens.

How to Run LLMs Locally with Ollama — A Developer's Guide

Pavel Espitia — Fri, 17 Apr 2026 15:10:50 +0000

You don't need an API key or a cloud subscription to use LLMs. Ollama lets you run models locally on your machine — completely free, completely private. Here's how to set it up and start building with it.

What is Ollama?

Ollama is a tool that downloads, manages, and serves LLMs locally. Alongside its native API, it exposes OpenAI-compatible endpoints under localhost:11434/v1, so most code written against the OpenAI chat API works with Ollama once you point the base URL at it and pick a local model name.

Installation

# Linux / WSL
curl -fsSL https://ollama.com/install.sh | sh

# macOS
brew install ollama

# Windows
# Download from https://ollama.com/download

Start the server:

ollama serve

Pick a Model

# Code-focused (best for dev tools)
ollama pull qwen2.5-coder:7b      # 4.7GB, good balance
ollama pull qwen2.5-coder:1.5b    # 1.0GB, fast, good enough for many tasks
ollama pull deepseek-coder-v2      # 8.9GB, top quality

# General purpose
ollama pull llama3.1:8b            # 4.7GB, Meta's latest
ollama pull mistral:7b             # 4.1GB, fast and capable

My recommendation: start with qwen2.5-coder:1.5b for speed, upgrade to 7b when you need quality.

Your First API Call

Ollama serves an OpenAI-compatible endpoint. Here's a call with plain fetch:

const response = await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5-coder:7b",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Explain what a closure is in JavaScript." },
    ],
    temperature: 0,
    stream: false,
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);

That's it. No API key, no SDK, no account.

Structured Output (JSON Mode)

The key to building real tools with LLMs is getting structured output. Tell the model to respond with JSON:

const response = await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5-coder:7b",
    messages: [
      {
        role: "system",
        content: `Respond with ONLY valid JSON matching this schema:
        { "summary": "string", "topics": ["string"], "difficulty": "beginner|intermediate|advanced" }`,
      },
      {
        role: "user",
        content: "Analyze this article topic: Building REST APIs with Express.js",
      },
    ],
    temperature: 0,
    stream: false,
  }),
});

Tip: always validate the response with Zod or a similar schema validator. Smaller models sometimes return invalid JSON.
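As a dependency-free illustration of that tip, the response content can be parsed and checked with a hand-written type guard. This is a sketch for the schema in the prompt above; `TopicAnalysis`, `isTopicAnalysis`, and `parseTopicAnalysis` are hypothetical names, and a schema library would replace the guard in a real project.

```typescript
interface TopicAnalysis {
  summary: string;
  topics: string[];
  difficulty: "beginner" | "intermediate" | "advanced";
}

const DIFFICULTIES = ["beginner", "intermediate", "advanced"];

// Checks the parsed value field by field before trusting it.
function isTopicAnalysis(value: unknown): value is TopicAnalysis {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["summary"] === "string" &&
    Array.isArray(v["topics"]) &&
    v["topics"].every((t) => typeof t === "string") &&
    typeof v["difficulty"] === "string" &&
    DIFFICULTIES.includes(v["difficulty"])
  );
}

// Takes the message content string returned by the model.
function parseTopicAnalysis(content: string): TopicAnalysis {
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    throw new Error("Model did not return valid JSON");
  }
  if (!isTopicAnalysis(parsed)) {
    throw new Error("Model JSON does not match the expected shape");
  }
  return parsed;
}
```

The input would be `data.choices[0].message.content` from the response to the request above.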

Building a Provider Abstraction

If you want your app to work with both Ollama (local) and Claude/OpenAI (cloud), create a simple interface:

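// Assumed message shape; adjust to match your app's chat history type
type Message = { role: "user" | "assistant"; content: string };

interface LlmProvider {
  chat(system: string, messages: Message[]): Promise<string>;
}

class OllamaProvider implements LlmProvider {
  constructor(private model: string) {}

  async chat(system: string, messages: Message[]): Promise<string> {
    const response = await fetch("http://localhost:11434/v1/chat/completions", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: this.model,
        messages: [{ role: "system", content: system }, ...messages],
        temperature: 0,
        stream: false,
      }),
    });
    // Surface HTTP failures (for example, a model that hasn't been pulled)
    if (!response.ok) {
      throw new Error(`Ollama returned ${response.status}: ${await response.text()}`);
    }
    const data = await response.json();
    return data.choices[0].message.content;
  }
}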

Now your code doesn't care where the model runs. Swap OllamaProvider for AnthropicProvider with a flag.

Performance Tips

First call is slow — the model loads into memory. Subsequent calls are fast.
Keep the server running — don't start/stop per request.
Use smaller models for dev — 1.5b for iteration, 7b for production quality.
Set temperature: 0 for more repeatable output (important for structured responses). It reduces sampling randomness but doesn't guarantee identical results across runs or hardware.
Add a timeout — local models on CPU can take minutes for long prompts.
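The timeout tip can be sketched with the standard `AbortController`. `fetchWithTimeout` and `FetchLike` are hypothetical names; the injectable `fetchImpl` parameter is there only to make the helper easy to test without a running server.

```typescript
// Hypothetical type so a fake fetch can be passed in tests.
type FetchLike = (url: string, init: RequestInit) => Promise<Response>;

// Aborts the request if it has not settled within timeoutMs.
async function fetchWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs: number,
  fetchImpl: FetchLike = fetch,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer);
  }
}
```

For CPU-only machines a generous limit of several minutes is probably more realistic than the few seconds you would use for a cloud API.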

When to Use Local vs Cloud

Use Case	Local (Ollama)	Cloud (Claude/GPT)
Development	Great	Expensive
Privacy-sensitive data	Required	Risky
Production quality	Good (7b+)	Best
Speed	Depends on hardware	Fast
Cost	Free	Per-token

What I Built With It

spectr-ai — an AI smart contract auditor that works with both Claude and Ollama. The --model ollama:qwen2.5-coder:1.5b flag runs everything locally, free, no API key.

Local LLMs are good enough for real developer tools. The quality gap is closing fast.

5 Smart Contract Vulnerabilities That AI Catches Better Than Static Analyzers

Pavel Espitia — Thu, 16 Apr 2026 12:46:45 +0000

Static analysis tools like Slither and Mythril are essential for smart contract security. But they work by pattern matching — they can only find what they've been programmed to look for. LLMs reason about code differently. They understand intent, context, and business logic.

Here are 5 vulnerability classes where AI consistently outperforms traditional static analyzers.

1. Business Logic Flaws

Static analyzers check for known patterns: reentrancy, integer overflow, unchecked return values. But they can't understand what your contract is supposed to do.

function withdraw(uint256 amount) external {
    require(balances[msg.sender] >= amount);
    balances[msg.sender] -= amount;
    payable(msg.sender).transfer(amount);
}

A static analyzer sees this as safe — checks-effects-interactions pattern is followed. But an AI auditor can ask: "Should there be a minimum withdrawal? A cooldown period? A daily limit?" It reasons about the business context, not just the code pattern.

2. Access Control Gaps Across Multiple Functions

Slither will flag a function that's missing onlyOwner. But it won't notice that setFeeRecipient() and withdrawFees() together create a privilege escalation path — even if each function individually looks fine.

AI can analyze the interaction between functions:

// AI catches: anyone can set themselves as fee recipient, then withdraw
function setFeeRecipient(address _recipient) external {
    feeRecipient = _recipient;
}

function withdrawFees() external {
    require(msg.sender == feeRecipient);
    payable(feeRecipient).transfer(address(this).balance);
}

The AI output: "These two functions together allow any address to drain the contract. setFeeRecipient has no access control, and withdrawFees only checks the caller matches the recipient — which they just set to themselves."

3. Incorrect Event Parameters

Static analyzers verify that events exist. They don't verify that the emitted values are correct.

event Transfer(address indexed from, address indexed to, uint256 amount);

function transfer(address to, uint256 amount) external {
    balances[msg.sender] -= amount;
    balances[to] += amount;
    emit Transfer(msg.sender, to, balances[to]); // Bug: emits balance, not amount
}

An AI catches this because it understands that the Transfer event should emit the amount transferred, not the resulting balance. No static rule covers this — it requires understanding what the event means.

4. Inconsistent Decimal Handling

DeFi protocols interact with tokens that have different decimal places (USDC has 6, WETH has 18). Static analyzers don't track decimal context across function calls.

function swap(uint256 usdcAmount) external {
    uint256 ethAmount = usdcAmount * getEthPrice() / 1e18;
    // Bug: usdcAmount is 6 decimals, but division assumes 18
}

AI recognizes that USDC uses 6 decimals and flags the math: "The division by 1e18 assumes 18-decimal precision, but USDC has 6 decimals. This will return values 1e12 times smaller than expected."

5. Missing Edge Case Handlers

What happens when the array is empty? When the balance is zero? When the deadline has already passed? Static analyzers check for specific known edge cases. AI reasons about all of them.

function getAveragePrice(uint256[] memory prices) public pure returns (uint256) {
    uint256 sum;
    for (uint256 i = 0; i < prices.length; i++) {
        sum += prices[i];
    }
    return sum / prices.length; // Division by zero if empty array
}

Beyond the obvious division-by-zero, AI also asks: "What if one price is extremely large and causes sum to overflow? Should there be a maximum array length to prevent gas exhaustion?"

The Bottom Line

Static analyzers are necessary — they're fast, deterministic, and catch the obvious stuff. But AI auditors add a layer that reasons about intent, context, and cross-function interactions.

The best approach: run both. Use Slither/Mythril for deterministic checks, then use an AI auditor for the things only reasoning can catch.

If you want to try this yourself:

# Free, local, no API key needed
ollama pull qwen2.5-coder:1.5b
npx spectr-ai --model ollama:qwen2.5-coder:1.5b your-contract.sol

spectr-ai is open source and works with Claude or local models via Ollama.

I Built an AI Smart Contract Auditor in a Weekend — Here's How

Pavel Espitia — Tue, 14 Apr 2026 18:20:10 +0000

Smart contract audits cost $5K-$50K and take weeks. I built a CLI tool that catches the same classes of vulnerabilities in seconds, using AI — and it works with free local models too.

What is spectr-ai?

spectr-ai is a command-line tool that analyzes Solidity and Vyper smart contracts for security vulnerabilities, gas optimizations, and best practice violations. It uses Claude (Anthropic's API) or local models via Ollama.

spectr-ai contracts/Vault.sol

Output:

   CRITICAL  — 2 issues

  ● Reentrancy vulnerability in withdraw()
    #1 withdraw() at contracts/Vault.sol:20

    External call via msg.sender.call() before updating balances.
    → Apply checks-effects-interactions pattern.

    ┌─ suggested fix
    │ function withdraw() public {
    │     uint256 amount = balances[msg.sender];
    │     balances[msg.sender] = 0;
    │     (bool success, ) = msg.sender.call{value: amount}("");
    │     require(success, "Transfer failed");
    │ }
    └─

  ┌────────────────────────────────────────┐
  │ Summary                                │
  │ ● critical     2  ████████████████     │
  │ ● high         1  ████████             │
  │ ▲ medium       1  ████████             │
  │  RISK: CRITICAL                        │
  └────────────────────────────────────────┘

Why I Built It

I'm a fullstack TypeScript developer getting deeper into blockchain and AI. The intersection of these two fields has a clear gap: security tooling that's accessible to individual developers.

Static analyzers like Slither and Mythril are powerful but limited to pattern matching. They can't reason about business logic or explain why something is dangerous. LLMs can.

The question was: can an LLM reliably audit smart contracts and produce structured, actionable output?

The Architecture

spectr-ai is intentionally simple — ~800 lines of TypeScript across 12 source files:

src/
  cli.ts          → Arg parsing, orchestration
  analyzer.ts     → Sends contract to provider, parses response
  provider.ts     → Anthropic + Ollama abstraction
  schema.ts       → Zod validation of model responses
  prompts.ts      → Language-specific system prompts
  validator.ts    → Input validation (Solidity + Vyper)
  formatter.ts    → Color terminal output
  sarif.ts        → SARIF format for GitHub Code Scanning
  html.ts         → Self-contained HTML reports
  files.ts        → Recursive file finder
  diff.ts         → Git diff integration
  watcher.ts      → File watch mode

Key Design Decisions

1. Provider abstraction over SDK lock-in

Instead of coupling to the Anthropic SDK, I created a Provider interface:

interface Provider {
  complete(system: string, userMessage: string): Promise<CompletionResult>;
}

This let me add Ollama support in ~50 lines. The OllamaProvider uses the OpenAI-compatible endpoint at localhost:11434 — zero additional dependencies.

# Free, local, no API key
spectr-ai --model ollama:qwen2.5-coder:7b contracts/

2. Structured output with Zod validation

LLMs sometimes return malformed JSON, especially smaller models. Instead of blindly JSON.parse-ing, every response is validated against a Zod schema:

const issueSchema = z.object({
  severity: z.enum(["critical", "high", "medium", "low", "info"]),
  title: z.string(),
  location: z.string(),
  description: z.string(),
  recommendation: z.string(),
  codefix: z.string().optional(),
});

When validation fails, the error message tells you exactly what the model got wrong — instead of a cryptic undefined is not an object deep in the formatter.

3. Multiple output formats for different workflows

Text (default): Color-coded terminal output grouped by severity
JSON: Structured data for scripting
SARIF: GitHub Code Scanning integration
HTML: Self-contained audit report you can share

This means spectr-ai fits into CI pipelines, PR reviews, and manual audits.

4. Language-specific prompts

Solidity and Vyper have different vulnerability profiles. The system prompt adapts:

Solidity: reentrancy, tx.origin, delegatecall, selfdestruct
Vyper: raw_call misuse, storage collisions, default visibility, @nonreentrant limitations

What I Learned

LLMs are surprisingly good at security analysis

The model consistently catches the OWASP-equivalent vulnerabilities in smart contracts — reentrancy, access control, integer handling, input validation. For a contract like the classic "VulnerableVault", it finds every intentional vulnerability and suggests correct fixes.

Smaller models are usable but not great

I tested with qwen2.5-coder:1.5b (runs on CPU, free). It finds the right vulnerabilities but the code fixes are generic ("add access control" instead of actual code). The 7B model is better but needs a GPU or patience. Claude Sonnet produces the best output by far.

Structured output is the hard part

Getting the model to return valid JSON with the exact schema you want is the main engineering challenge. The combination of a strict system prompt + Zod validation + markdown fence stripping handles 99% of cases.

CI integration is the killer feature

The --fail-on flag with exit codes makes spectr-ai a CI gate:

# Fail the pipeline if medium+ issues are found
spectr-ai --fail-on medium --json contracts/ || exit 1

Combined with --diff HEAD~1, you only analyze changed contracts per PR — saving tokens and time.
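The gate logic behind a flag like this is small. The sketch below shows one way a severity threshold could map to an exit code; it uses the severity names from the schema shown earlier in this post, but `SEVERITY_ORDER` and `exitCodeFor` are hypothetical and not spectr-ai's actual implementation.

```typescript
// Severities ordered from least to most serious.
const SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

// Returns 1 when any finding is at or above the threshold, else 0.
function exitCodeFor(found: Severity[], failOn: Severity): number {
  const threshold = SEVERITY_ORDER.indexOf(failOn);
  return found.some((s) => SEVERITY_ORDER.indexOf(s) >= threshold) ? 1 : 0;
}
```

With a `medium` threshold, findings of `low` and `high` produce exit code 1, while `low` alone produces 0.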

Try It

# With Claude
export ANTHROPIC_API_KEY=sk-ant-...
npx spectr-ai examples/vulnerable.sol

# With Ollama (free)
ollama pull qwen2.5-coder:1.5b
npx spectr-ai --model ollama:qwen2.5-coder:1.5b examples/vulnerable.sol

The full source is at github.com/pavelEspitia/spectr-ai. MIT licensed.

What's Next

Rate limit retry with exponential backoff for multi-file analysis
Streaming output (see results as the model generates)
Comparative mode (before/after analysis)
Support for more chains (Cairo for StarkNet, Move for Aptos)

If you're building with smart contracts and want to catch vulnerabilities before deployment, give spectr-ai a try. And if you have ideas or find bugs, open an issue.