<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: mtdevworks</title>
    <description>The latest articles on Forem by mtdevworks (@mtdevworks).</description>
    <link>https://forem.com/mtdevworks</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3754405%2F4b83ae71-d048-4617-ac87-cb7723404cf4.png</url>
      <title>Forem: mtdevworks</title>
      <link>https://forem.com/mtdevworks</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mtdevworks"/>
    <language>en</language>
    <item>
      <title>I Built an API for LLM JSON Validation in Rust — Here’s What I Learned</title>
      <dc:creator>mtdevworks</dc:creator>
      <pubDate>Mon, 09 Feb 2026 14:41:05 +0000</pubDate>
      <link>https://forem.com/mtdevworks/i-built-an-api-for-llm-json-validation-in-rust-heres-what-i-learned-36nc</link>
      <guid>https://forem.com/mtdevworks/i-built-an-api-for-llm-json-validation-in-rust-heres-what-i-learned-36nc</guid>
      <description>&lt;p&gt;I kept hitting the same wall: LLM outputs breaking production. Trailing commas, unquoted keys, JSON wrapped in markdown—every deploy felt like a game of whack-a-mole. Prompt engineering helped a bit, but I didn’t want to spend the next year tuning “please return valid JSON” for every new feature. So I built &lt;strong&gt;JSON Guardian&lt;/strong&gt;: an API that validates, repairs, and enforces JSON from LLM outputs. This post is about the technical choices, the hard parts, and what I’d tell myself on day one.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why an API instead of “just fix the prompts”?
&lt;/h2&gt;

&lt;p&gt;Prompts can improve things, but they don’t fix the underlying issue: LLMs are trained on messy data and they’re not deterministic. You can ask for JSON and still get prose, markdown, or invalid syntax. I wanted a single place that sits between the model and my app—something that always returns either valid, schema-conforming data or a clear error. That’s easier to reason about than scattering retries and regex across the codebase.&lt;/p&gt;

&lt;p&gt;I also wanted it to work from &lt;strong&gt;any&lt;/strong&gt; stack: Node, Python, n8n, whatever. So I built an API, not a library. You get a key, you send HTTP requests, you get back structured success or failure. No language lock-in, no “install this SDK first.”&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Rust?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Performance.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
This layer runs on every LLM response. If it adds 50–100ms, users notice. I was aiming for &lt;strong&gt;sub-10ms&lt;/strong&gt; p99 so that validation feels free compared to the model call. Rust made that realistic: no GC pauses, predictable latency, and the ability to tune hot paths without fighting a runtime.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory safety and no runtime surprises.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
No null-pointer or type surprises at runtime. The compiler catches a lot of bugs before they hit production. For a service that parses untrusted LLM output, that’s a big deal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trust and reliability.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Rust’s ecosystem gave me what I needed without having to build a JSON parser or a web server from scratch. I could focus on the validation and repair logic instead of fighting the runtime. For a closed-source service, that means we can keep the implementation details in-house while still being able to talk about why we chose Rust: speed, safety, and predictable behaviour.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture in a nutshell
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API layer:&lt;/strong&gt; HTTP, async, built for the “one request → validate/repair → response” model. Kept simple so latency stays predictable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validation:&lt;/strong&gt; JSON Schema Draft 7. Same standard everyone knows; easy to document and reuse.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage:&lt;/strong&gt; A database for API keys and usage tracking. The service stays stateless per request; state lives in the backend.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Endpoints:&lt;/strong&gt; Validate, repair, enforce (repair + schema + type coercion), extract (strip JSON from prose/markdown), partial (complete streaming JSON), batch (any operation on many items). One concern per endpoint so callers can compose what they need.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Endpoint&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Validate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Check JSON against schema&lt;/td&gt;
&lt;td&gt;Strict validation only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Repair&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fix common syntax errors&lt;/td&gt;
&lt;td&gt;Malformed JSON cleanup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Enforce&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Repair + schema + type coercion&lt;/td&gt;
&lt;td&gt;Fixing and conforming LLM output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Extract&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Strip JSON from prose/markdown&lt;/td&gt;
&lt;td&gt;Raw model responses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Partial&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Complete streaming JSON&lt;/td&gt;
&lt;td&gt;Real-time UI updates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Batch&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bulk operation on many items&lt;/td&gt;
&lt;td&gt;High-volume processing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I kept deployment simple: single region, no Kubernetes. Latency stays low because the service is small and the runtime is predictable.&lt;/p&gt;




&lt;h2&gt;
  
  
  The hard parts
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Parsing malformed JSON.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
You can’t validate what you can’t parse. So the repair step had to come first: fix trailing commas, single quotes, unquoted keys, and then feed the result into the schema validator. Getting the repair logic right without over-correcting (or breaking valid edge cases) took most of the early iteration. I had to draw a line: fix obvious syntax, don’t guess at semantics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to repair vs reject.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Too strict and you reject fixable output; too loose and you “fix” things into wrong data. I ended up with a small, well-defined set of repairs (trailing commas, quote normalization, etc.) and left the rest to validation. If repair can’t produce valid JSON, we return an error with a clear message instead of guessing.&lt;/p&gt;
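&lt;p&gt;A minimal Python sketch of that contract (the service itself is Rust and closed-source; these regexes are simplified stand-ins for the real repair rules, and they can misfire on commas or colons inside string values):&lt;/p&gt;

```python
import json
import re

def conservative_repair(text):
    """Apply a small, fixed set of syntax repairs, then parse.

    Returns (parsed, error); exactly one of the two is None.
    """
    fixed = text
    # 1. Quote bare object keys: {name: ...} -> {"name": ...}
    fixed = re.sub(r'([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)\s*:', r'\1"\2":', fixed)
    # 2. Normalize single-quoted strings (naive: assumes no apostrophes inside)
    fixed = re.sub(r"'([^']*)'", r'"\1"', fixed)
    # 3. Strip trailing commas before a closing brace or bracket
    fixed = re.sub(r",\s*([}\]])", r"\1", fixed)
    try:
        return json.loads(fixed), None
    except json.JSONDecodeError as e:
        # Don't guess at semantics: return a clear error instead
        return None, "unrepairable JSON: {} at position {}".format(e.msg, e.pos)
```

&lt;p&gt;Fixing obvious syntax and refusing everything else keeps the behaviour predictable: callers get either a parsed value or an explicit failure, never a silent guess.&lt;/p&gt;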

&lt;p&gt;&lt;strong&gt;Type coercion.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The enforce endpoint can coerce types—e.g. &lt;code&gt;"twenty-five"&lt;/code&gt; → &lt;code&gt;25&lt;/code&gt; for integer fields. Useful for LLM output; dangerous if overdone. I limited it to a few patterns (numbers, booleans, missing required with defaults) and made the response include a &lt;code&gt;changes_made&lt;/code&gt; list so callers see what was altered.&lt;/p&gt;
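&lt;p&gt;A sketch of what limited coercion with an audit trail can look like, in Python. The rules and response shape here are illustrative assumptions; only the &lt;code&gt;changes_made&lt;/code&gt; idea comes from the actual API:&lt;/p&gt;

```python
def coerce(value, target_type, path, changes):
    """Coerce a value toward a schema type, recording every change made."""
    if target_type == "integer" and isinstance(value, str):
        s = value.strip()
        # numeric strings only; word forms like "twenty-five" need a
        # richer parser than this sketch provides
        if s.lstrip("-").isdigit():
            changes.append("{}: {!r} coerced to integer".format(path, value))
            return int(s)
    if target_type == "boolean" and isinstance(value, str):
        if value.lower() in ("true", "false"):
            changes.append("{}: {!r} coerced to boolean".format(path, value))
            return value.lower() == "true"
    return value  # anything else is left alone for validation to flag

def enforce(obj, properties):
    """Apply coercion per field and report what was altered."""
    changes = []
    data = {key: coerce(val, properties.get(key, {}).get("type"), key, changes)
            for key, val in obj.items()}
    return {"data": data, "changes_made": changes}
```

&lt;p&gt;Keeping the rule set small and the change list explicit is what makes coercion safe to use: nothing is altered silently.&lt;/p&gt;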

&lt;p&gt;&lt;strong&gt;Partial/streaming JSON.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Completing half-written JSON (e.g. for real-time UI during streaming) is a different problem from “is this string valid?” I added a dedicated partial endpoint that closes strings and brackets in a minimal way. It’s best-effort: great for display, but I don’t use it as the source of truth until the stream is done.&lt;/p&gt;
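&lt;p&gt;The core trick is tracking open strings and brackets while scanning, then appending the minimal closers. A best-effort Python sketch of the idea (an assumption about the approach, not the service’s actual algorithm):&lt;/p&gt;

```python
def complete_partial_json(text):
    """Close any open string, then any open brackets, innermost first.

    Best-effort: good for display, not a source of truth.
    """
    stack = []          # closers we still owe, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if escaped:
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch in "{[":
                stack.append("}" if ch == "{" else "]")
            elif ch in "}]" and stack:
                stack.pop()
    closer = ('"' if in_string else "") + "".join(reversed(stack))
    return text + closer
```

&lt;p&gt;A dangling key like &lt;code&gt;{"name":&lt;/code&gt; still won’t parse after closing, which is one more reason to treat the output as display-only until the stream finishes.&lt;/p&gt;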




&lt;h2&gt;
  
  
  Launch: direct API + RapidAPI
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Direct API:&lt;/strong&gt; &lt;a href="https://api.jsonguardian.com" rel="noopener noreferrer"&gt;https://api.jsonguardian.com&lt;/a&gt;&lt;br&gt;&lt;br&gt;
For speed and control. Sub-10ms when you call us directly; no proxy in the middle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RapidAPI:&lt;/strong&gt; &lt;a href="https://rapidapi.com/mtdevworks2025/api/json-guardian" rel="noopener noreferrer"&gt;https://rapidapi.com/mtdevworks2025/api/json-guardian&lt;/a&gt;&lt;br&gt;&lt;br&gt;
For distribution and discovery. Same API, same behaviour; we accept RapidAPI headers and track those calls separately. Good for reaching people who already browse the marketplace.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;th&gt;Speed&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Discovery&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Direct API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Sub-10ms p99&lt;/td&gt;
&lt;td&gt;Maximum control &amp;amp; lowest latency&lt;/td&gt;
&lt;td&gt;Manual signup at jsonguardian.com&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RapidAPI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Same core speed, via proxy&lt;/td&gt;
&lt;td&gt;Marketplace reach &amp;amp; existing users&lt;/td&gt;
&lt;td&gt;Browse RapidAPI marketplace&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Free tier 10k requests/month (no card), then Starter (100k) and Pro (1M). Batch requests count each item as one request. I kept the free tier generous so people can try it in real workflows without worrying about the first mistake.&lt;/p&gt;

&lt;p&gt;Early feedback so far: people care about &lt;strong&gt;speed&lt;/strong&gt; (“is it really under 10ms?”) and &lt;strong&gt;docs&lt;/strong&gt; (“can I see the exact request/response?”). So I doubled down on the public API docs and the OpenAPI spec. The dashboard at &lt;a href="https://jsonguardian.com" rel="noopener noreferrer"&gt;jsonguardian.com&lt;/a&gt; handles signup and usage so you can see your own numbers.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I’d do again (and what I’d change)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Ship fast, then iterate.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I could have spent months polishing the repair heuristics. Instead I shipped a small set of repairs and added more as real payloads showed up. That was the right call. You learn more from real usage than from hypothetical edge cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Documentation is part of the product.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The API is useless if people can’t figure out the body shape and error format. PUBLIC_README, OpenAPI, and a few copy-paste examples (Node, Python, curl) took real time but made the API feel like it “just works.” I’d do that from day one next time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First 10 users &amp;gt; perfect product.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I’m optimising for the first 10–20 teams that actually use it. Their feedback (what broke, what they expected, what they’d pay for) is worth more than another week of tuning type coercion. So: ship, share in communities, and listen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I’d change:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I’d add integration examples (e.g. n8n, LangChain) earlier. A lot of interest comes from “I use X, how do I plug this in?” A short “use with n8n” or “use with OpenAI function calling” post would have saved back-and-forth. I’m doing that now.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;If you’re tired of LLM JSON breaking your app, you can try JSON Guardian with no credit card:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Site and signup:&lt;/strong&gt; &lt;a href="https://jsonguardian.com" rel="noopener noreferrer"&gt;jsonguardian.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct API:&lt;/strong&gt; &lt;a href="https://api.jsonguardian.com" rel="noopener noreferrer"&gt;api.jsonguardian.com&lt;/a&gt; (health: &lt;a href="https://api.jsonguardian.com/health" rel="noopener noreferrer"&gt;api.jsonguardian.com/health&lt;/a&gt;)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RapidAPI:&lt;/strong&gt; &lt;a href="https://rapidapi.com/mtdevworks2025/api/json-guardian" rel="noopener noreferrer"&gt;rapidapi.com/mtdevworks2025/api/json-guardian&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Free tier: 10,000 requests/month. If you build something with it, I’d love to hear what worked and what didn’t—drop a comment or reach out.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;JSON Guardian: &lt;a href="https://jsonguardian.com" rel="noopener noreferrer"&gt;jsonguardian.com&lt;/a&gt; · &lt;a href="https://rapidapi.com/mtdevworks2025/api/json-guardian" rel="noopener noreferrer"&gt;RapidAPI&lt;/a&gt; · Free tier: 10k requests/month&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>webdev</category>
      <category>rust</category>
    </item>
    <item>
      <title>Building Reliable AI Applications: A Validation Strategy</title>
      <dc:creator>mtdevworks</dc:creator>
      <pubDate>Sun, 08 Feb 2026 10:53:49 +0000</pubDate>
      <link>https://forem.com/mtdevworks/building-reliable-ai-applications-a-validation-strategy-3b0f</link>
      <guid>https://forem.com/mtdevworks/building-reliable-ai-applications-a-validation-strategy-3b0f</guid>
      <description>&lt;p&gt;AI is unreliable by design. It hallucinates, drifts off-prompt, and—when you ask for structured output—often returns JSON that doesn’t parse or doesn’t match your schema. In production, one bad response can break a flow, confuse a user, or trigger a support ticket. So if you’re building with LLMs, validation isn’t optional; it’s part of the architecture.&lt;/p&gt;

&lt;p&gt;This post is a practical validation strategy: why it matters, what kinds of validation you need, where to put it, and how to keep it fast and maintainable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why validation matters
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Production failures are expensive.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A parsing error in a critical path can mean a failed checkout, a broken dashboard, or a silent wrong answer. Downtime and rollbacks cost time and trust.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User trust is fragile.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
One visible bug—“Sorry, something went wrong”—undermines confidence. Users don’t care that the model returned a trailing comma; they care that your app didn’t handle it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Examples of what goes wrong:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Parse error:&lt;/strong&gt; The LLM returns &lt;code&gt;{"name": "John",}&lt;/code&gt;. Your code does &lt;code&gt;JSON.parse(llmOutput)&lt;/code&gt; and crashes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wrong type:&lt;/strong&gt; The schema says &lt;code&gt;age&lt;/code&gt; is a number; the LLM returns &lt;code&gt;"age": "twenty-five"&lt;/code&gt;. Your validator fails or your downstream logic breaks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing field:&lt;/strong&gt; You require &lt;code&gt;email&lt;/code&gt;. The model omits it. You either fail the request or pass incomplete data into your system.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Validation catches these &lt;em&gt;before&lt;/em&gt; they reach business logic. The goal is to either get valid, schema-conforming data or fail in a controlled, debuggable way.&lt;/p&gt;




&lt;h2&gt;
  
  
  Types of validation
&lt;/h2&gt;

&lt;p&gt;Not all validation is the same. It helps to separate three layers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Syntax validation — Is it valid JSON?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Can you parse it? No trailing commas, no unquoted keys, no single quotes. If this fails, you don’t have a data structure at all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Schema validation — Does it match the shape?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Given a JSON Schema (or similar), does the parsed object have the right properties and types? Are required fields present? This is where you enforce the contract between the LLM and your app.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Semantic validation — Does it make sense?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Domain rules: “email must look like an email,” “date must be in the future,” “status must be one of these enums.” This is usually custom logic in your code, after syntax and schema are satisfied.&lt;/p&gt;

&lt;p&gt;For most LLM pipelines, &lt;strong&gt;syntax + schema&lt;/strong&gt; are the foundation. Get those right first; add semantic checks where needed.&lt;/p&gt;
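&lt;p&gt;The three layers compose naturally as a short-circuiting pipeline. A standard-library Python sketch (real projects would use a full Draft 7 validator such as Ajv or &lt;code&gt;jsonschema&lt;/code&gt; for layer 2; this hand-rolled check covers only required fields and basic types):&lt;/p&gt;

```python
import json

def validate_layers(raw, schema, semantic_checks=()):
    """Run syntax, schema, then semantic checks; stop at the first failure.

    Returns (data, error); exactly one of the two is None.
    """
    # Layer 1: syntax - is this parseable JSON at all?
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return None, "syntax: {} at position {}".format(e.msg, e.pos)
    # Layer 2: schema - right fields and types? (tiny subset of JSON Schema)
    for field in schema.get("required", []):
        if field not in data:
            return None, "schema: missing required field {!r}".format(field)
    types = {"string": str, "integer": int, "number": (int, float),
             "boolean": bool, "object": dict, "array": list}
    for field, spec in schema.get("properties", {}).items():
        if field in data and not isinstance(data[field], types[spec["type"]]):
            return None, "schema: {!r} should be {}".format(field, spec["type"])
    # Layer 3: semantic - caller-supplied domain rules
    for name, check in semantic_checks:
        if not check(data):
            return None, "semantic: {}".format(name)
    return data, None
```

&lt;p&gt;Each layer only runs once the layer below it has passed, so error messages always point at the first real problem.&lt;/p&gt;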




&lt;h2&gt;
  
  
  Where to validate: the middleware layer
&lt;/h2&gt;

&lt;p&gt;The right place for syntax and schema validation is &lt;strong&gt;between the LLM and your application&lt;/strong&gt;—a thin middleware layer that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Receives the raw LLM output (string).&lt;/li&gt;
&lt;li&gt;Optionally &lt;strong&gt;extracts&lt;/strong&gt; JSON from prose or markdown.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repairs&lt;/strong&gt; syntax if possible (trailing commas, quotes, etc.).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validates&lt;/strong&gt; (or &lt;strong&gt;enforces&lt;/strong&gt;) against your JSON Schema.&lt;/li&gt;
&lt;li&gt;Returns either valid data or a clear error.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Your app then consumes only validated, typed data. You don’t scatter &lt;code&gt;try/catch&lt;/code&gt; and regex across the codebase; you have one place that either succeeds or fails with a structured error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance matters.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
This layer runs on every LLM response. If it adds 100ms, users notice. Aim for &lt;strong&gt;sub-10ms&lt;/strong&gt; so that validation is effectively free compared to the LLM call. That usually means a dedicated service (or a very fast library) rather than a heavyweight script.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error handling.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Decide your policy: &lt;strong&gt;retry&lt;/strong&gt; (e.g. once with a “fix your JSON” prompt), &lt;strong&gt;repair&lt;/strong&gt; (if the service can fix syntax and enforce schema), or &lt;strong&gt;fail fast&lt;/strong&gt; with a clear error to the user or to your monitoring. A repair step often gives the best balance: fewer failed requests, fewer retries, happier users.&lt;/p&gt;
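&lt;p&gt;The policy choice can live in one small function so the rest of the codebase never touches raw LLM output. A Python sketch (the retry prompt and the repair rule are placeholders, not recommendations of specific wording or rules):&lt;/p&gt;

```python
import json
import re

def handle_llm_output(raw, call_llm_again=None, policy="repair"):
    """One decision point for every LLM response: retry, repair, or fail fast."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as e:
        if policy == "retry" and call_llm_again is not None:
            # One retry with an explicit instruction, then let errors surface
            fixed = call_llm_again("Return only valid JSON. Fix this: " + raw)
            return json.loads(fixed)
        if policy == "repair":
            # Stand-in for a real repair step: strip trailing commas
            return json.loads(re.sub(r",\s*([}\]])", r"\1", raw))
        raise ValueError("fail fast: invalid JSON ({})".format(e.msg))
```

&lt;p&gt;Whatever policy you pick, keeping it in one place means you can change it once instead of hunting down scattered &lt;code&gt;try/catch&lt;/code&gt; blocks.&lt;/p&gt;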




&lt;h2&gt;
  
  
  Tools and approaches
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;DIY: regex and custom parsers.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
You can strip trailing commas, normalize quotes, and even extract JSON from markdown with regex and a bit of parsing. It works for simple cases but gets brittle: edge cases, nested structures, and escape sequences. Maintenance cost is high.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Libraries: JSON Schema validators.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Use a solid validator (e.g. Ajv, &lt;code&gt;jsonschema&lt;/code&gt;) to check parsed JSON against a schema. Great for “is this valid?” They don’t repair; they don’t extract. So you still need to fix syntax and strip prose yourself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API services: validation + repair + extraction.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Offload the whole pipeline to a service that accepts raw LLM output and returns valid, schema-conforming data (or a clear error). You keep your code simple and get repair, extraction, and enforcement in one place. Latency stays low if the API is built for speed (e.g. Rust, edge deployment).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://jsonguardian.com" rel="noopener noreferrer"&gt;JSON Guardian&lt;/a&gt;&lt;/strong&gt; is built for this flow. You send the raw string; you get back:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Validate&lt;/strong&gt; — Check against JSON Schema (Draft 7).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repair&lt;/strong&gt; — Fix trailing commas, single quotes, unquoted keys, and similar.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enforce&lt;/strong&gt; — Repair first, then enforce schema with type coercion (e.g. &lt;code&gt;"twenty-five"&lt;/code&gt; → &lt;code&gt;25&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extract&lt;/strong&gt; — Pull JSON out of markdown or prose.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partial&lt;/strong&gt; — Complete partial/streaming JSON for real-time UI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch&lt;/strong&gt; — Run any of the above on multiple items in one request.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So you can do: &lt;em&gt;extract → repair → enforce&lt;/em&gt; in sequence, or call the single endpoint that fits your use case. Free tier: 10,000 requests/month at &lt;a href="https://jsonguardian.com" rel="noopener noreferrer"&gt;jsonguardian.com&lt;/a&gt;; also on &lt;a href="https://rapidapi.com/mtdevworks2025/api/json-guardian" rel="noopener noreferrer"&gt;RapidAPI&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concern&lt;/th&gt;
&lt;th&gt;What to do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Why&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Avoid production failures and protect user trust; one validation layer catches parse and schema errors.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;What&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Syntax validation (valid JSON), schema validation (shape + types), then semantic rules if needed.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Between LLM and app—a thin middleware that extracts, repairs, validates/enforces.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;How&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Keep it fast (sub-10ms); choose retry, repair, or fail-fast; prefer a dedicated service over DIY for repair + schema.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Validation is non-negotiable for production AI. Start with syntax and schema; add extraction and streaming support as your use cases grow. Your future self (and your users) will thank you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Try JSON Guardian: &lt;a href="https://jsonguardian.com" rel="noopener noreferrer"&gt;jsonguardian.com&lt;/a&gt; · &lt;a href="https://rapidapi.com/mtdevworks2025/api/json-guardian" rel="noopener noreferrer"&gt;RapidAPI&lt;/a&gt; · Free tier: 10k requests/month&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>llm</category>
      <category>developers</category>
    </item>
    <item>
      <title>5 Ways LLMs Break JSON in Production (And How to Fix It)</title>
      <dc:creator>mtdevworks</dc:creator>
      <pubDate>Sat, 07 Feb 2026 08:29:09 +0000</pubDate>
      <link>https://forem.com/mtdevworks/5-ways-llms-break-json-in-production-and-how-to-fix-it-252e</link>
      <guid>https://forem.com/mtdevworks/5-ways-llms-break-json-in-production-and-how-to-fix-it-252e</guid>
<description>&lt;p&gt;You’ve wired up GPT function calling or hooked LangChain into your app. Everything works in testing, until you deploy. Suddenly you’re seeing &lt;code&gt;Unexpected token in JSON at position 42&lt;/code&gt;, or your schema validator rejects half the responses. Sound familiar?&lt;/p&gt;

&lt;p&gt;LLMs are great at &lt;em&gt;meaning&lt;/em&gt;, but they’re surprisingly bad at &lt;em&gt;syntax&lt;/em&gt;. Training data is full of inconsistent JSON, and models often mix it with JavaScript, YAML, or plain prose. The result is broken JSON that breaks your app.&lt;/p&gt;

&lt;p&gt;Here are the five most common ways LLMs break JSON—and practical ways to fix them, including an API that handles all of these automatically.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Trailing commas
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What you get:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"John"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"age"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That comma after &lt;code&gt;30&lt;/code&gt; is invalid in JSON. In JavaScript it’s fine; in JSON it’s not. &lt;code&gt;JSON.parse()&lt;/code&gt; throws.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it happens:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Models see both valid JSON and JavaScript in training data. They don’t always distinguish. Trailing commas also appear in arrays: &lt;code&gt;[1, 2, 3,]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to fix:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Strip trailing commas before the closing &lt;code&gt;}&lt;/code&gt; or &lt;code&gt;]&lt;/code&gt;, or run the string through a repair step that normalizes this. If you use a validation layer, choose one that can &lt;em&gt;repair&lt;/em&gt; as well as validate—so you get valid JSON out instead of just an error.&lt;/p&gt;
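&lt;p&gt;For reference, the naive regex version looks like this in Python; note that it will also rewrite a literal &lt;code&gt;,}&lt;/code&gt; inside a string value, which is exactly why a string-aware repair layer is safer:&lt;/p&gt;

```python
import re

def strip_trailing_commas(text):
    # Remove a comma that sits directly before a closing brace or bracket.
    # Naive: also matches inside string literals, so treat as best-effort.
    return re.sub(r",\s*([}\]])", r"\1", text)
```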


&lt;h2&gt;
  
  
  2. Unquoted keys
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What you get:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;name:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"John"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;age:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Valid in JavaScript; invalid in JSON. Keys must be double-quoted strings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it happens:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
LLMs are heavily trained on JavaScript/TypeScript. Object literals with unquoted keys are everywhere. The model reproduces that style.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to fix:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A repair step can wrap unquoted keys in double quotes: &lt;code&gt;name&lt;/code&gt; → &lt;code&gt;"name"&lt;/code&gt;. Regex can handle simple cases; for nested structures and edge cases, a dedicated parser/repairer is safer.&lt;/p&gt;
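&lt;p&gt;The simple-case regex looks like this in Python (it will also match a &lt;code&gt;key:&lt;/code&gt;-shaped substring inside a string value, which is the kind of edge case that needs a real parser):&lt;/p&gt;

```python
import re

def quote_bare_keys(text):
    # Wrap identifiers used as object keys: {name: ...} -> {"name": ...}
    return re.sub(r'([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)\s*:', r'\1"\2":', text)
```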


&lt;h2&gt;
  
  
  3. Missing required fields
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What you get:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Your schema says &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;age&lt;/code&gt; are required. The LLM returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"John"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No &lt;code&gt;age&lt;/code&gt;. Your validator fails, and your app doesn’t know whether to retry, default, or show an error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it happens:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Context limits, vague instructions, or the model “forgetting” part of the schema. It’s a semantic/schema problem, not just syntax.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to fix:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Two approaches: (1) &lt;strong&gt;Strict validation&lt;/strong&gt; — reject and retry or show a clear error. (2) &lt;strong&gt;Enforcement&lt;/strong&gt; — fix what you can (e.g. repair syntax), then enforce the schema with defaults for missing required fields. Enforcement is useful when you’d rather have a best-effort result than a hard failure.&lt;/p&gt;
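&lt;p&gt;The enforcement side of option (2) can be sketched like this in Python; &lt;code&gt;default&lt;/code&gt; is a standard JSON Schema keyword, though which fields get defaults is a per-schema decision:&lt;/p&gt;

```python
def enforce_required(data, schema):
    """Fill missing required fields from schema defaults; report the rest."""
    filled = dict(data)
    still_missing = []
    for field in schema.get("required", []):
        if field not in filled:
            spec = schema.get("properties", {}).get(field, {})
            if "default" in spec:
                filled[field] = spec["default"]
            else:
                still_missing.append(field)
    return filled, still_missing
```

&lt;p&gt;Fields with no default still surface as missing, so you never invent data the schema didn’t sanction.&lt;/p&gt;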


&lt;h2&gt;
  
  
  4. Mixed or single quotes
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What you get:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'name':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'John'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'John'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;JSON allows only double quotes for strings and keys. Single quotes are invalid.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it happens:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Training data includes Python dicts, shell-style strings, and other formats. The model mixes quote styles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to fix:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Normalize to double quotes. Be careful with apostrophes inside strings (e.g. &lt;code&gt;"John's car"&lt;/code&gt;) so you don’t break them when converting. A repair layer that understands string boundaries handles this correctly.&lt;/p&gt;
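&lt;p&gt;“Understands string boundaries” means scanning character by character rather than substituting globally. A Python sketch of that idea (an illustration, not the service’s implementation):&lt;/p&gt;

```python
def normalize_quotes(text):
    """Convert single-quoted strings to double-quoted ones, without
    touching apostrophes that live inside double-quoted strings."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == '"':
            out.append(ch)
            for c in chars:  # copy a double-quoted string verbatim
                out.append(c)
                if c == "\\":
                    out.append(next(chars, ""))  # keep the escaped char
                elif c == '"':
                    break
        elif ch == "'":
            out.append('"')
            for c in chars:  # rewrite a single-quoted string
                if c == "'":
                    out.append('"')
                    break
                if c == "\\":
                    nxt = next(chars, "")
                    # \' becomes a plain apostrophe; other escapes pass through
                    out.append("'" if nxt == "'" else c + nxt)
                elif c == '"':
                    out.append('\\"')  # escape an embedded double quote
                else:
                    out.append(c)
        else:
            out.append(ch)
    return "".join(out)
```

&lt;p&gt;Because double-quoted strings are copied verbatim, &lt;code&gt;"John's car"&lt;/code&gt; survives conversion untouched.&lt;/p&gt;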


&lt;h2&gt;
  
  
  5. JSON buried in prose or markdown
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What you get:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Sure! Here's the data you asked for:

'```

json
{"name": "John", "age": 30}


```'

Hope that helps!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your code expects a raw JSON string. Instead you get a paragraph, markdown fences, and maybe extra text before or after. &lt;code&gt;JSON.parse()&lt;/code&gt; on the whole thing fails.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it happens:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
LLMs are conversational. They explain, wrap code in markdown, and add pleasantries. That’s helpful for readability and terrible for parsing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to fix:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Extract the JSON first: strip markdown code fences, find the first &lt;code&gt;{&lt;/code&gt; or &lt;code&gt;[&lt;/code&gt;, then parse to the matching &lt;code&gt;}&lt;/code&gt; or &lt;code&gt;]&lt;/code&gt;, or use a dedicated “extract JSON from prose” step. Only then validate or repair. Doing extraction before validation keeps your pipeline robust.&lt;/p&gt;
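&lt;p&gt;In Python, &lt;code&gt;json.JSONDecoder().raw_decode&lt;/code&gt; does the “parse to the matching close” part for you: it reads one complete JSON value and reports where it stopped, ignoring trailing prose. A sketch of the whole extraction step:&lt;/p&gt;

```python
import json
import re

def extract_json(text):
    """Strip markdown fences, then parse the first JSON value found."""
    # Remove ``` and ```json fence markers so the payload is reachable
    cleaned = re.sub(r"```(?:json)?", "", text)
    # Find the first plausible start of a JSON object or array
    match = re.search(r"[{\[]", cleaned)
    if match is None:
        return None
    try:
        # raw_decode stops after one complete value; trailing text is ignored
        value, _end = json.JSONDecoder().raw_decode(cleaned[match.start():])
        return value
    except json.JSONDecodeError:
        return None
```

&lt;p&gt;Only after extraction succeeds does validation or repair run, which keeps each stage of the pipeline simple.&lt;/p&gt;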


&lt;h2&gt;
  
  
  A single layer that handles all five
&lt;/h2&gt;

&lt;p&gt;Fixing each of these by hand (regex, custom parsers, retry logic) gets messy fast. A cleaner approach is to put a small validation-and-repair layer between your LLM and your app.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://jsonguardian.com" rel="noopener noreferrer"&gt;JSON Guardian&lt;/a&gt;&lt;/strong&gt; is an API built for exactly this: it validates, repairs, and enforces JSON from LLM outputs in under 10ms.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trailing commas / unquoted keys / single quotes&lt;/strong&gt; → &lt;code&gt;POST /api/v1/repair&lt;/code&gt; returns valid JSON.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing required fields&lt;/strong&gt; → &lt;code&gt;POST /api/v1/enforce&lt;/code&gt; with your JSON Schema repairs syntax, then enforces the schema (including defaults for missing required fields when applicable).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON in prose or markdown&lt;/strong&gt; → &lt;code&gt;POST /api/v1/extract&lt;/code&gt; strips fences and surrounding text and returns the extracted JSON.&lt;/li&gt;
&lt;/ul&gt;
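&lt;p&gt;To make “defaults for missing required fields” concrete, here’s a local Python sketch of what that enforcement step does conceptually (my illustration; field names in the schema are made up, and this is not how the API implements it):&lt;/p&gt;

```python
def enforce_required(data: dict, schema: dict) -> dict:
    """Fill missing required fields from schema defaults, or fail loudly."""
    props = schema.get("properties", {})
    for field in schema.get("required", []):
        if field in data:
            continue
        if "default" in props.get(field, {}):
            data[field] = props[field]["default"]
        else:
            raise ValueError(f"missing required field with no default: {field}")
    return data

# A hypothetical schema requiring "name" and "active", with a default for "active".
schema = {
    "type": "object",
    "required": ["name", "active"],
    "properties": {
        "name": {"type": "string"},
        "active": {"type": "boolean", "default": True},
    },
}
```

&lt;p&gt;With that schema, &lt;code&gt;enforce_required({"name": "John"}, schema)&lt;/code&gt; fills in &lt;code&gt;active: true&lt;/code&gt;; a missing &lt;code&gt;name&lt;/code&gt; raises instead, since the schema gives it no default.&lt;/p&gt;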

&lt;p&gt;You send the raw LLM response; you get back something you can safely parse and pass to the rest of your app. Built in Rust, so latency stays low—important when you’re calling it on every LLM response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick example — repair:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.jsonguardian.com/api/v1/repair &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: YOUR_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"data": "{\"name\": \"John\", \"age\": 30,}"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response includes a &lt;code&gt;repaired&lt;/code&gt; string and &lt;code&gt;repaired_data&lt;/code&gt; object—ready to use.&lt;/p&gt;
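&lt;p&gt;For the request above, a successful response looks roughly like this (the &lt;code&gt;repaired&lt;/code&gt; and &lt;code&gt;repaired_data&lt;/code&gt; fields are the ones described above; any other metadata fields are omitted here):&lt;/p&gt;

```json
{
  "repaired": "{\"name\": \"John\", \"age\": 30}",
  "repaired_data": {
    "name": "John",
    "age": 30
  }
}
```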

&lt;p&gt;&lt;strong&gt;Quick example — extract from markdown:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.jsonguardian.com/api/v1/extract &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: YOUR_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"data": "Here is the result: ```

json\n{\"name\": \"John\"}\n

``` Hope that helps!"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You get &lt;code&gt;extracted&lt;/code&gt; and &lt;code&gt;extracted_data&lt;/code&gt; without the surrounding text.&lt;/p&gt;

&lt;p&gt;Free tier: 10,000 requests/month. No credit card required. You can try it at &lt;strong&gt;&lt;a href="https://jsonguardian.com" rel="noopener noreferrer"&gt;jsonguardian.com&lt;/a&gt;&lt;/strong&gt; or via &lt;strong&gt;&lt;a href="https://rapidapi.com/mtdevworks2025/api/json-guardian" rel="noopener noreferrer"&gt;RapidAPI&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fespumae5p0kzsk8u7nqh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fespumae5p0kzsk8u7nqh.png" alt=" " width="686" height="224"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Handling these in one place—between the LLM and your business logic—keeps your app stable and your code simple. If you’re tired of debugging JSON parse errors in production, give a validation layer a try.&lt;/p&gt;

&lt;p&gt;What JSON issues are you running into with your LLM projects? I’d love to hear in the comments.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Try JSON Guardian: &lt;a href="https://jsonguardian.com" rel="noopener noreferrer"&gt;jsonguardian.com&lt;/a&gt; · &lt;a href="https://rapidapi.com/mtdevworks2025/api/json-guardian" rel="noopener noreferrer"&gt;RapidAPI&lt;/a&gt; · Free tier: 10k requests/month&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>json</category>
      <category>developer</category>
    </item>
  </channel>
</rss>
