<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Akhona Eland</title>
    <description>The latest articles on Forem by Akhona Eland (@akhona_eland_072dac9e0c2c).</description>
    <link>https://forem.com/akhona_eland_072dac9e0c2c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3857770%2F3a45453f-4618-4d5d-b8ec-16606097be8b.png</url>
      <title>Forem: Akhona Eland</title>
      <link>https://forem.com/akhona_eland_072dac9e0c2c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/akhona_eland_072dac9e0c2c"/>
    <language>en</language>
    <item>
      <title>Your LLM Passes Type Checks but Fails the "Vibe Check": How I Fixed AI Reliability</title>
      <dc:creator>Akhona Eland</dc:creator>
      <pubDate>Sun, 05 Apr 2026 10:40:11 +0000</pubDate>
      <link>https://forem.com/akhona_eland_072dac9e0c2c/your-llm-passes-type-checks-but-fails-the-vibe-check-how-i-fixed-ai-reliability-38ac</link>
      <guid>https://forem.com/akhona_eland_072dac9e0c2c/your-llm-passes-type-checks-but-fails-the-vibe-check-how-i-fixed-ai-reliability-38ac</guid>
      <description>&lt;h1&gt;Your LLM Passes Type Checks but Fails the "Vibe Check": How I Fixed AI Reliability&lt;/h1&gt;

&lt;p&gt;You validate your LLM outputs with Pydantic. The JSON is well-formed. The fields are correct. Life is good.&lt;/p&gt;

&lt;p&gt;Then your model returns a "polite decline" that says &lt;em&gt;"I'd rather gouge my eyes out."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It passes your type checks. It fails the vibe check.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is the Semantic Gap&lt;/strong&gt; — the space between &lt;em&gt;structural correctness&lt;/em&gt; and &lt;em&gt;actual meaning&lt;/em&gt;. Every team shipping LLM-powered features hits it eventually. I got tired of hitting it, so I built &lt;a href="https://github.com/labrat-akhona/semantix-ai" rel="noopener noreferrer"&gt;Semantix&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;The Semantic Gap: Shape vs. Meaning&lt;/h2&gt;

&lt;p&gt;Here's what most validation looks like today:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;tone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;polite&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;neutral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;firm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells you the &lt;em&gt;shape&lt;/em&gt; is right. It tells you nothing about whether the &lt;em&gt;meaning&lt;/em&gt; is right. Your model can return &lt;code&gt;{"message": "Go away.", "tone": "polite"}&lt;/code&gt; and Pydantic will happily accept it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semantix flips the script.&lt;/strong&gt; Instead of validating structure, you validate intent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;semantix&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;validate_intent&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProfessionalDecline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;The text must politely decline an invitation
    without being rude or aggressive.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="nd"&gt;@validate_intent&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decline_invite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ProfessionalDecline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;call_my_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The docstring &lt;em&gt;is&lt;/em&gt; the contract. A judge (LLM-based, NLI, or embedding) reads the output, reads the requirement, and decides: does this text actually do what it claims?&lt;/p&gt;
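&lt;p&gt;To make that concrete, here is a minimal, self-contained sketch (not the Semantix source) of how a decorator can recover the contract from the return annotation:&lt;/p&gt;

```python
import inspect
from typing import get_type_hints

class Intent:
    """Base class; a subclass's docstring is its semantic contract."""

class ProfessionalDecline(Intent):
    """The text must politely decline an invitation
    without being rude or aggressive."""

def contract_for(fn):
    # The return type hint names the Intent subclass...
    intent_cls = get_type_hints(fn)["return"]
    # ...and its docstring is the requirement the judge evaluates against.
    return inspect.getdoc(intent_cls)

def decline_invite(event: str) -> ProfessionalDecline:
    return "..."  # imagine an LLM call here

requirement = contract_for(decline_invite)
```

The judge never sees your function body, only its output and this requirement string.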




&lt;h2&gt;What's New in v0.1.3: The Self-Healing Update&lt;/h2&gt;

&lt;h3&gt;Informed Self-Healing&lt;/h3&gt;

&lt;p&gt;The biggest feature in v0.1.3 is &lt;strong&gt;informed retries&lt;/strong&gt;. When an LLM output fails validation, the decorator doesn't just retry blindly — it tells the LLM &lt;em&gt;exactly what went wrong&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Declare a &lt;code&gt;semantix_feedback&lt;/code&gt; parameter in your function, and the decorator injects a structured Markdown report on each retry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;semantix&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;validate_intent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;semantix.judges.nli&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;NLIJudge&lt;/span&gt;

&lt;span class="nd"&gt;@validate_intent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;judge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;NLIJudge&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;semantix_feedback&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ProfessionalDecline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Decline this invite: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;semantix_feedback&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;semantix_feedback&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;call_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the first call, &lt;code&gt;semantix_feedback&lt;/code&gt; is &lt;code&gt;None&lt;/code&gt;. If validation fails, the next call receives something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Semantix Self-Healing Feedback&lt;/span&gt;

Attempt &lt;span class="gs"&gt;**1**&lt;/span&gt; failed validation.

&lt;span class="gu"&gt;### What went wrong&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Intent:**&lt;/span&gt; &lt;span class="sb"&gt;`ProfessionalDecline`&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Score:**&lt;/span&gt; 0.3210 (threshold not met)
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Judge reason:**&lt;/span&gt; too vague

&lt;span class="gu"&gt;### What is required&lt;/span&gt;
The text must politely decline an invitation without being rude or aggressive.

&lt;span class="gu"&gt;### Your previous output (rejected)&lt;/span&gt;
Go away.

Please generate a new response that satisfies the requirement above.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM gets the score, the reason, the requirement, and its own rejected output. It can &lt;em&gt;learn from the failure&lt;/em&gt; in real time.&lt;/p&gt;

&lt;h3&gt;NLI as the Default Judge&lt;/h3&gt;

&lt;p&gt;We moved from &lt;code&gt;LLMJudge&lt;/code&gt; to &lt;code&gt;NLIJudge&lt;/code&gt; as the default. Why?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No API key required&lt;/strong&gt; — runs fully locally using a cross-encoder model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Entailment &amp;gt; Cosine similarity&lt;/strong&gt; — NLI asks "does A entail B?" which is fundamentally the right question for intent validation. Cosine similarity asks "are A and B &lt;em&gt;about&lt;/em&gt; the same thing?" which is a weaker signal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fast enough&lt;/strong&gt; — the default &lt;code&gt;nli-MiniLM2-L6-H768&lt;/code&gt; model is ~85MB and runs in milliseconds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can still use any judge you want — &lt;code&gt;LLMJudge&lt;/code&gt;, &lt;code&gt;EmbeddingJudge&lt;/code&gt;, or your own custom &lt;code&gt;Judge&lt;/code&gt; subclass.&lt;/p&gt;

&lt;h3&gt;Granular Scoring&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;LLMJudge&lt;/code&gt; no longer returns a binary Yes/No. It now returns a &lt;strong&gt;0.0-1.0 confidence score&lt;/strong&gt; and a &lt;strong&gt;text reason&lt;/strong&gt;, giving the self-healing system richer feedback to work with.&lt;/p&gt;
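&lt;p&gt;One way to picture such a verdict (the field names here are illustrative, not the library's actual type):&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    score: float  # 0.0-1.0 confidence that the intent is satisfied
    reason: str   # short explanation, reusable as retry feedback

    def passed(self, threshold: float = 0.5) -> bool:
        return self.score >= threshold

v = Verdict(score=0.32, reason="too vague")
```

A score-plus-reason verdict is what lets the retry prompt say *why* the last attempt was rejected, not just that it was.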




&lt;h2&gt;The Proof: Benchmark Results&lt;/h2&gt;

&lt;p&gt;Talk is cheap. Here are the real numbers from &lt;code&gt;tools/benchmark.py&lt;/code&gt;, comparing single-shot validation (no retries) against Semantix self-healing (2 retries with feedback injection):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;No Healing&lt;/th&gt;
&lt;th&gt;Self-Healing&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Professional Tone&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;13.3%&lt;/td&gt;
&lt;td&gt;56.7%&lt;/td&gt;
&lt;td&gt;+43.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Technical Explanation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;36.7%&lt;/td&gt;
&lt;td&gt;96.7%&lt;/td&gt;
&lt;td&gt;+60.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Actionable Summary&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;13.3%&lt;/td&gt;
&lt;td&gt;56.7%&lt;/td&gt;
&lt;td&gt;+43.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Overall&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;21.1%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;70.0%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+48.9%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Self-healing more than &lt;strong&gt;triples&lt;/strong&gt; the overall success rate. For technical explanations specifically, it pushes reliability from 36.7% to 96.7%.&lt;/p&gt;

&lt;p&gt;These numbers come from a simulated LLM with a 40% baseline quality rate. Real LLMs start from a higher baseline, so the absolute numbers should be better, but the &lt;em&gt;relative improvement&lt;/em&gt; from self-healing holds.&lt;/p&gt;




&lt;h2&gt;How It Works Under the Hood&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your Function
     |
     v
@validate_intent
     |
     v
Call function -&amp;gt; Get raw string
     |
     v
Judge.evaluate(output, intent_description, threshold)
     |
     +-- PASS --&amp;gt; return Intent(output)
     |
     +-- FAIL --&amp;gt; SemanticIntentError
                    |
                    v
              retries left?
                    |
                    +-- YES --&amp;gt; inject semantix_feedback -&amp;gt; retry
                    |
                    +-- NO  --&amp;gt; raise error
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The decorator resolves the &lt;code&gt;Intent&lt;/code&gt; subclass from your return type annotation, calls the judge, and manages the retry loop. The &lt;code&gt;semantix_feedback&lt;/code&gt; injection is zero-boilerplate — just add the parameter and it works.&lt;/p&gt;
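&lt;p&gt;For intuition, the whole loop fits in a few lines of plain Python. This is a hand-rolled sketch with a stand-in judge, not the Semantix implementation:&lt;/p&gt;

```python
import functools
import inspect
from typing import get_type_hints

class Intent:
    pass

class Polite(Intent):
    """The text must be polite."""

class SemanticIntentError(Exception):
    pass

def validate_intent(judge, retries=1, threshold=0.5):
    def decorator(fn):
        # Resolve the Intent subclass from the return annotation.
        requirement = inspect.getdoc(get_type_hints(fn)["return"])
        wants_feedback = "semantix_feedback" in inspect.signature(fn).parameters

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            feedback = None
            for attempt in range(retries + 1):
                if wants_feedback:
                    kwargs["semantix_feedback"] = feedback
                output = fn(*args, **kwargs)
                score, reason = judge(output, requirement)
                if score >= threshold:
                    return output  # PASS
                # FAIL: build feedback for the next attempt, if any.
                feedback = f"Attempt {attempt + 1} failed ({score:.2f}): {reason}"
            raise SemanticIntentError(feedback)
        return wrapper
    return decorator

def toy_judge(output, requirement):
    # Stand-in judge; real judges use NLI, embeddings, or an LLM.
    return (0.9, "ok") if "thank" in output.lower() else (0.2, "not polite")

@validate_intent(judge=toy_judge, retries=2)
def reply(msg: str, semantix_feedback=None) -> Polite:
    # A real function would call an LLM; this "model" only improves
    # once it sees feedback from the failed first attempt.
    return "Go away." if semantix_feedback is None else "Thank you, but no."

result = reply("join us?")
```

The first attempt fails the toy judge, the second succeeds because the feedback parameter is populated.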




&lt;h2&gt;Get Started in 30 Seconds&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"semantix-ai[nli]"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;semantix&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;validate_intent&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PositiveSentiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;The text must express a clearly positive, optimistic,
    or encouraging sentiment.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="nd"&gt;@validate_intent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;encourage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;semantix_feedback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;PositiveSentiment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write an encouraging message for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;semantix_feedback&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;semantix_feedback&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;call_your_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Your LLM output is now semantically typed and self-healing.&lt;/p&gt;




&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/labrat-akhona/semantix-ai" rel="noopener noreferrer"&gt;github.com/labrat-akhona/semantix-ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;a href="https://pypi.org/project/semantix-ai/" rel="noopener noreferrer"&gt;pypi.org/project/semantix-ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Install:&lt;/strong&gt; &lt;code&gt;pip install semantix-ai&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Star the repo if this is useful. Open an issue if it isn't — I want to know what's missing.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by &lt;a href="https://github.com/labrat-akhona" rel="noopener noreferrer"&gt;Akhona Eland&lt;/a&gt; in South Africa.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>agenticai</category>
      <category>programming</category>
    </item>
    <item>
      <title>Your LLM Passes Type Checks but Fails the Vibe Check — Here's How to Fix It</title>
      <dc:creator>Akhona Eland</dc:creator>
      <pubDate>Thu, 02 Apr 2026 14:08:34 +0000</pubDate>
      <link>https://forem.com/akhona_eland_072dac9e0c2c/your-llm-passes-type-checks-but-fails-the-vibe-check-heres-how-to-fix-it-1dkm</link>
      <guid>https://forem.com/akhona_eland_072dac9e0c2c/your-llm-passes-type-checks-but-fails-the-vibe-check-heres-how-to-fix-it-1dkm</guid>
      <description>&lt;p&gt;You ask your LLM to write a polite decline to a meeting invite. It returns:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I appreciate the invitation, but I would rather set myself on fire than attend your team-building retreat."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You run it through your Pydantic model. It passes. It's a string. The right length. Valid UTF-8. Technically a "response."&lt;/p&gt;

&lt;p&gt;But it's not a polite decline. It's a career-ending email.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is the gap nobody's filling.&lt;/strong&gt; We have type systems for data structures — &lt;code&gt;int&lt;/code&gt;, &lt;code&gt;str&lt;/code&gt;, Pydantic models. We validate &lt;em&gt;shape&lt;/em&gt; obsessively. But we have nothing for &lt;em&gt;meaning&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Until now.&lt;/p&gt;

&lt;h2&gt;Introducing Semantix&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/labrat-akhona/semantix-ai" rel="noopener noreferrer"&gt;Semantix&lt;/a&gt; is a semantic type system for LLM outputs. Instead of checking "is this a string?", it checks "does this string actually say what it's supposed to say?"&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;semantix&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;validate_intent&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProfessionalDecline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;The text must politely decline an invitation 
    without being rude or aggressive.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="nd"&gt;@validate_intent&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decline_invite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ProfessionalDecline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;call_my_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;decline_invite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;the company retreat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# ✓ Validated — the output actually IS a polite decline
# ✗ Raises SemanticIntentError if the LLM went off the rails
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three lines of setup. One decorator. Your LLM output is now semantically typed.&lt;/p&gt;

&lt;h2&gt;How It Works&lt;/h2&gt;

&lt;p&gt;The core idea is simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;You define an Intent&lt;/strong&gt; — a class whose docstring describes the semantic contract.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You decorate your LLM function&lt;/strong&gt; — the return type hint tells Semantix what to validate against.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A Judge evaluates the output&lt;/strong&gt; — comparing what the LLM said against what it was supposed to mean.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Judge is the interesting part. Semantix ships with three:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;EmbeddingJudge&lt;/strong&gt; — compares sentence embeddings using cosine similarity. Fast, runs locally, no API key. Good for clear-cut intents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;semantix&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;validate_intent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EmbeddingJudge&lt;/span&gt;

&lt;span class="nd"&gt;@validate_intent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;judge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;EmbeddingJudge&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ConciseSummary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;call_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;LLMJudge&lt;/strong&gt; — asks GPT-4o-mini "does this text satisfy this requirement? Yes or No." More accurate, needs an API key, costs fractions of a cent per call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NLIJudge&lt;/strong&gt; — uses a cross-encoder NLI model to check if the output &lt;em&gt;entails&lt;/em&gt; the intent. Best of both worlds: accurate like an LLM judge, local like an embedding judge.&lt;/p&gt;

&lt;p&gt;You pick the speed/accuracy tradeoff that fits your use case. And you can swap judges without changing any other code.&lt;/p&gt;

&lt;h2&gt;The Feature That Made Me Build This&lt;/h2&gt;

&lt;p&gt;Here's what pushed me over the edge. I was building an AI agent for a client that needed to generate customer-facing responses. The responses had to be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Professional in tone&lt;/li&gt;
&lt;li&gt;Factually grounded in the company's data&lt;/li&gt;
&lt;li&gt;Free of any promises or commitments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pydantic could check that the response was a non-empty string under 500 characters. Great. But the LLM kept slipping in phrases like "I guarantee this will be resolved" — structurally valid, semantically dangerous.&lt;/p&gt;

&lt;p&gt;So I built Semantix. And the feature I'm most proud of is &lt;strong&gt;smart retries&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;semantix&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;validate_intent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_last_failure&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EmbeddingJudge&lt;/span&gt;

&lt;span class="nd"&gt;@validate_intent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;judge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;EmbeddingJudge&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;respond&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;SafeCustomerResponse&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;hint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;failure&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_last_failure&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;hint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Your previous attempt scored &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;failure&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Remove any promises or guarantees.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;call_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Respond to: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;hint&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;get_last_failure()&lt;/code&gt; gives your LLM function access to the &lt;em&gt;reason&lt;/em&gt; the previous attempt failed. So each retry isn't just "try again" — it's "try again, but here's what went wrong." The LLM gets smarter with each attempt.&lt;/p&gt;

&lt;h2&gt;Composable Intents&lt;/h2&gt;

&lt;p&gt;Real-world requirements are rarely one-dimensional. Semantix lets you combine intents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;semantix&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AllOf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AnyOf&lt;/span&gt;

&lt;span class="c1"&gt;# Must satisfy ALL — polite AND positive
&lt;/span&gt;&lt;span class="n"&gt;SafeResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ProfessionalTone&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;NoPromises&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;FactuallyGrounded&lt;/span&gt;

&lt;span class="c1"&gt;# Must satisfy AT LEAST ONE — either formal or casual decline
&lt;/span&gt;&lt;span class="n"&gt;FlexibleDecline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnyOf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FormalDecline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CasualDecline&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@validate_intent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;judge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;EmbeddingJudge&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;respond&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;SafeResponse&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;call_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;&amp;amp;&lt;/code&gt; and &lt;code&gt;|&lt;/code&gt; operators work on Intent classes directly. Under the hood, &lt;code&gt;AllOf&lt;/code&gt; concatenates the docstrings with "AND" and uses the minimum threshold. &lt;code&gt;AnyOf&lt;/code&gt; uses "OR" and the maximum threshold.&lt;/p&gt;
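&lt;p&gt;A toy sketch of that composition rule, with intents modeled as plain dicts purely for illustration:&lt;/p&gt;

```python
# Mirrors the rule described above: AllOf joins requirements with "AND"
# and keeps the minimum threshold; AnyOf joins with "OR" and keeps the
# maximum. Dict-based intents are a stand-in, not the library's types.
def all_of(*intents):
    return {
        "doc": " AND ".join(i["doc"] for i in intents),
        "threshold": min(i["threshold"] for i in intents),
    }

def any_of(*intents):
    return {
        "doc": " OR ".join(i["doc"] for i in intents),
        "threshold": max(i["threshold"] for i in intents),
    }

tone = {"doc": "The text must be professional.", "threshold": 0.7}
promises = {"doc": "The text must make no promises.", "threshold": 0.8}

safe = all_of(tone, promises)
flexible = any_of(tone, promises)
```

The combined docstring is what the judge reads, so composition costs nothing extra at validation time.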

&lt;h2&gt;Streaming Support&lt;/h2&gt;

&lt;p&gt;If you're streaming LLM responses (and you probably should be), Semantix validates once the full stream is assembled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;semantix&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StreamCollector&lt;/span&gt;

&lt;span class="n"&gt;collector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StreamCollector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ProfessionalDecline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;judge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_judge&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;collector&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;llm_stream&lt;/span&gt;&lt;span class="p"&gt;()):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# stream to user in real-time
&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collector&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# validate the complete output
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your users see the response streaming in. Behind the scenes, Semantix is collecting chunks. The moment the stream ends, it validates. If it fails, you catch the error and handle it — retry, fall back to a template, or flag for human review.&lt;/p&gt;
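&lt;p&gt;Here's a self-contained sketch of that collect-then-validate pattern, including the fallback path. &lt;code&gt;FakeCollector&lt;/code&gt; and &lt;code&gt;IntentViolation&lt;/code&gt; are placeholders I made up for illustration, not the real &lt;code&gt;StreamCollector&lt;/code&gt; API:&lt;/p&gt;

```python
# Stub illustrating the collect-stream-then-validate pattern.
# FakeCollector and IntentViolation are placeholders for demonstration,
# not the real semantix StreamCollector API.

class IntentViolation(Exception):
    """Raised when the assembled output fails the semantic check."""

class FakeCollector:
    def __init__(self, required_phrase):
        self.required_phrase = required_phrase  # crude stand-in for a judge
        self.chunks = []

    def wrap(self, stream):
        for chunk in stream:
            self.chunks.append(chunk)  # buffer while passing chunks through
            yield chunk

    def result(self):
        text = "".join(self.chunks)
        if self.required_phrase not in text.lower():
            raise IntentViolation(f"not a polite decline: {text!r}")
        return text

def llm_stream():
    yield "I'd rather "
    yield "gouge my eyes out."

collector = FakeCollector(required_phrase="unfortunately")
for chunk in collector.wrap(llm_stream()):
    pass  # in production: forward each chunk to the user in real time

try:
    result = collector.result()
except IntentViolation:
    # fallback strategy: retry, canned template, or flag for human review
    result = "Unfortunately, we can't help with that request."
```

&lt;p&gt;The user experience stays snappy because chunks flow through untouched; the semantic check only runs once, on the assembled text.&lt;/p&gt;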

&lt;h2&gt;
  
  
  How It Compares
&lt;/h2&gt;

&lt;p&gt;I built Semantix because the existing tools solve a different problem:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Semantix&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;Guardrails AI&lt;/th&gt;
&lt;th&gt;NeMo Guardrails&lt;/th&gt;
&lt;th&gt;Instructor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Validates meaning&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌ Schema-focused&lt;/td&gt;
&lt;td&gt;✅ Dialogue rails&lt;/td&gt;
&lt;td&gt;❌ Schema-focused&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zero required deps&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Works with any LLM&lt;/td&gt;
&lt;td&gt;✅ Any function&lt;/td&gt;
&lt;td&gt;⚠️ Wrappers&lt;/td&gt;
&lt;td&gt;⚠️ Config files&lt;/td&gt;
&lt;td&gt;⚠️ Patched clients&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pluggable backends&lt;/td&gt;
&lt;td&gt;✅ 3 built-in + custom&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lines to validate&lt;/td&gt;
&lt;td&gt;~5&lt;/td&gt;
&lt;td&gt;~20+&lt;/td&gt;
&lt;td&gt;~30+&lt;/td&gt;
&lt;td&gt;~10&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Semantix isn't a replacement for Pydantic or Guardrails. It's the &lt;strong&gt;layer above&lt;/strong&gt; them. After you know the shape is right, verify the meaning is right too.&lt;/p&gt;
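&lt;p&gt;A toy sketch of that layering, with a plain-Python shape check standing in for Pydantic and a crude keyword check standing in for a Semantix judge (both stand-ins are mine, for illustration only):&lt;/p&gt;

```python
# Layer 1 checks shape (what Pydantic/Guardrails cover); layer 2 checks
# meaning (what Semantix covers). Both functions here are simplified
# stand-ins for illustration.

def shape_ok(payload: dict) -> bool:
    # Structural validation: right fields, right types, right enum values
    return isinstance(payload.get("reply"), str) and payload.get("tone") in {"formal", "casual"}

BANNED = {"gouge", "hate", "stupid"}

def meaning_ok(payload: dict) -> bool:
    # Semantic validation: a crude keyword filter standing in for an
    # embedding or LLM judge
    return not any(word in payload["reply"].lower() for word in BANNED)

good = {"reply": "Unfortunately we can't help with that.", "tone": "formal"}
bad = {"reply": "I'd rather gouge my eyes out.", "tone": "formal"}

assert shape_ok(good) and meaning_ok(good)
assert shape_ok(bad) and not meaning_ok(bad)  # passes the type check, fails the vibe check
```

&lt;p&gt;Both payloads are structurally identical; only the second layer can tell them apart.&lt;/p&gt;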

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;semantix-ai

&lt;span class="c"&gt;# With embedding judge (fast, local)&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"semantix-ai[embeddings]"&lt;/span&gt;

&lt;span class="c"&gt;# With OpenAI judge (accurate)&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"semantix-ai[openai]"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check out the repo: &lt;a href="https://github.com/labrat-akhona/semantix-ai" rel="noopener noreferrer"&gt;github.com/labrat-akhona/semantix-ai&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's MIT licensed, Python 3.10+, and the core has zero dependencies. I'd love feedback — open an issue or drop a comment below.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Akhona, an automation engineer based in South Africa. I build AI-powered tools and integrations. You can find me on &lt;a href="https://github.com/labrat-akhona" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>llm</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
