<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: ZiLing</title>
    <description>The latest articles on Forem by ZiLing (@ziling-failcore).</description>
    <link>https://forem.com/ziling-failcore</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3678173%2F93143944-c0ca-49d5-9007-851ffe10bb9c.png</url>
      <title>Forem: ZiLing</title>
      <link>https://forem.com/ziling-failcore</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ziling-failcore"/>
    <language>en</language>
    <item>
      <title>Why Execution Boundaries Matter More Than AI Guardrails</title>
      <dc:creator>ZiLing</dc:creator>
      <pubDate>Wed, 14 Jan 2026 18:18:39 +0000</pubDate>
      <link>https://forem.com/ziling-failcore/why-execution-boundaries-matter-more-than-ai-guardrails-2421</link>
      <guid>https://forem.com/ziling-failcore/why-execution-boundaries-matter-more-than-ai-guardrails-2421</guid>
      <description>&lt;h2&gt;
  
  
  Why Execution Boundaries Matter More Than AI Guardrails
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Probabilistic Prompts vs. Deterministic Runtime Safety&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The problem isn’t that AI models are “careless”
&lt;/h3&gt;

&lt;p&gt;Over the past year, we’ve seen rapid improvements in AI guardrails built directly into models — better refusals, safer completions, and increasingly aggressive alignment tuning.&lt;/p&gt;

&lt;p&gt;And yet, something still feels fundamentally off.&lt;/p&gt;

&lt;p&gt;When an AI agent is allowed to read files, make network requests, or spawn processes, we are no longer dealing with a purely conversational system.&lt;/p&gt;

&lt;p&gt;We are dealing with &lt;strong&gt;code execution&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;At that point, the question is no longer:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Will the model behave responsibly?”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The real question becomes:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Where does responsibility actually live?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Guardrails inside the model are probabilistic by design
&lt;/h3&gt;

&lt;p&gt;Model-level guardrails operate on probabilities.&lt;/p&gt;

&lt;p&gt;They rely on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pattern recognition,&lt;/li&gt;
&lt;li&gt;learned safety heuristics,&lt;/li&gt;
&lt;li&gt;statistical correlations between inputs and “safe” outputs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This works reasonably well for tasks like text generation or summarization.&lt;/p&gt;

&lt;p&gt;But probabilistic systems have an unavoidable property:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;They can never guarantee correctness on a single execution.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;“Most of the time” is not good enough when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a wrong file path deletes data,&lt;/li&gt;
&lt;li&gt;a misinterpreted URL triggers SSRF,&lt;/li&gt;
&lt;li&gt;a subtle prompt variation bypasses a refusal.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can prompt better.&lt;br&gt;&lt;br&gt;
You can fine-tune more.&lt;br&gt;&lt;br&gt;
You can stack system messages.&lt;/p&gt;

&lt;p&gt;But in the end, you are still asking a probabilistic system to police itself.&lt;/p&gt;
&lt;h3&gt;
  
  
  Execution changes everything
&lt;/h3&gt;

&lt;p&gt;The moment an agent can &lt;em&gt;act&lt;/em&gt;, not just respond, the safety model must change.&lt;/p&gt;

&lt;p&gt;Execution has characteristics that language does not:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it is stateful,&lt;/li&gt;
&lt;li&gt;it has side effects,&lt;/li&gt;
&lt;li&gt;it is often irreversible.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once a process is spawned or a file is deleted, there is no “retry with a better prompt”.&lt;/p&gt;

&lt;p&gt;This is where the concept of an &lt;strong&gt;execution boundary&lt;/strong&gt; becomes critical.&lt;/p&gt;

&lt;p&gt;An execution boundary is the point where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;intent becomes action,&lt;/li&gt;
&lt;li&gt;language becomes effect,&lt;/li&gt;
&lt;li&gt;probability must give way to determinism.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Deterministic safety belongs at the execution boundary
&lt;/h3&gt;

&lt;p&gt;Execution boundaries are enforced by code, not by intent.&lt;/p&gt;

&lt;p&gt;They answer binary questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Is this file path allowed?&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Is this network address private or public?&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Is this process permitted under the current policy?&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These checks are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;explicit,&lt;/li&gt;
&lt;li&gt;repeatable,&lt;/li&gt;
&lt;li&gt;and free of ambiguity.&lt;/li&gt;
&lt;/ul&gt;
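&lt;p&gt;The second question, for example, collapses into a few lines of standard-library code. A minimal sketch using Python’s &lt;code&gt;ipaddress&lt;/code&gt; module (illustrative only, not FailCore’s implementation):&lt;/p&gt;

```python
import ipaddress

def is_private_address(host):
    """Deterministic check: the same input yields the same answer, every time."""
    addr = ipaddress.ip_address(host)
    # Covers RFC 1918 ranges, loopback, and link-local (e.g. 169.254.0.0/16)
    return addr.is_private or addr.is_link_local or addr.is_loopback

print(is_private_address("169.254.169.254"))  # True: a link-local metadata endpoint
print(is_private_address("140.82.112.3"))     # False: a public address
```

&lt;p&gt;There is no temperature, no sampling, no phrasing that can change the verdict.&lt;/p&gt;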

&lt;p&gt;This is not about distrusting AI models.&lt;/p&gt;

&lt;p&gt;It is about placing guarantees &lt;strong&gt;where guarantees are actually possible&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  What a deterministic boundary looks like
&lt;/h2&gt;

&lt;p&gt;Here is a simplified, conceptual example of what a deterministic execution boundary might look like in practice.&lt;/p&gt;

&lt;p&gt;This example is not about how the model reasons — it’s about what the runtime enforces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;A&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;deterministic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;policy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;does&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;not&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"think"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;—&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;it&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;enforces.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"policy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"enforce"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fs_write_limit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"pattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/app/data/temp/*"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"block_sensitive_paths"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"deny"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"pattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/etc/*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/usr/bin/*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A model cannot reliably allow access to&lt;br&gt;&lt;br&gt;
&lt;code&gt;/app/data/temp/file.txt&lt;/code&gt;&lt;br&gt;&lt;br&gt;
while blocking&lt;br&gt;&lt;br&gt;
&lt;code&gt;/etc/passwd&lt;/code&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;100% of the time&lt;/strong&gt; via prompts alone.&lt;/p&gt;

&lt;p&gt;A runtime execution boundary can.&lt;/p&gt;
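&lt;p&gt;To see why, here is a hedged sketch of how a runtime could evaluate the policy above. The deny-before-allow ordering and the default deny are my assumptions for illustration, not FailCore’s documented semantics:&lt;/p&gt;

```python
from fnmatch import fnmatch

# Rules mirroring the JSON policy above (patterns normalized to lists)
RULES = [
    {"id": "fs_write_limit", "action": "allow", "patterns": ["/app/data/temp/*"]},
    {"id": "block_sensitive_paths", "action": "deny", "patterns": ["/etc/*", "/usr/bin/*"]},
]

def evaluate(path):
    """Deny rules win; anything unmatched is denied by default."""
    for action in ("deny", "allow"):
        for rule in RULES:
            if rule["action"] == action and any(fnmatch(path, p) for p in rule["patterns"]):
                return (action, rule["id"])
    return ("deny", "default")

print(evaluate("/app/data/temp/file.txt"))  # ('allow', 'fs_write_limit')
print(evaluate("/etc/passwd"))              # ('deny', 'block_sensitive_paths')
```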

&lt;h3&gt;
  
  
  Why “fail-fast” matters more than “self-correct”
&lt;/h3&gt;

&lt;p&gt;A common argument is that agents can detect and fix their own mistakes.&lt;/p&gt;

&lt;p&gt;In practice, this breaks down quickly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the agent may not realize it crossed a boundary,&lt;/li&gt;
&lt;li&gt;the context explaining the violation may be lost,&lt;/li&gt;
&lt;li&gt;retries may amplify damage instead of preventing it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Fail-fast systems behave differently:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;unsafe actions are rejected immediately,&lt;/li&gt;
&lt;li&gt;no partial side effects occur,&lt;/li&gt;
&lt;li&gt;the system state remains consistent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not an AI-specific idea.&lt;/p&gt;

&lt;p&gt;We don’t let databases “try their best” to enforce constraints.&lt;br&gt;&lt;br&gt;
We don’t let operating systems “probably” respect permissions.&lt;/p&gt;

&lt;p&gt;Agent runtimes should not be an exception.&lt;/p&gt;

&lt;h2&gt;
  
  
  Auditability is not optional
&lt;/h2&gt;

&lt;p&gt;When something goes wrong, you need clear answers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What was attempted?&lt;/li&gt;
&lt;li&gt;Why was it blocked?&lt;/li&gt;
&lt;li&gt;Which rule triggered the decision?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Probabilistic refusals are hard to audit.&lt;br&gt;&lt;br&gt;
They often explain &lt;em&gt;what&lt;/em&gt; was refused, but not &lt;em&gt;why&lt;/em&gt; at a system level.&lt;/p&gt;

&lt;p&gt;Deterministic execution boundaries produce artifacts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;traces,&lt;/li&gt;
&lt;li&gt;decision logs,&lt;/li&gt;
&lt;li&gt;rule evaluations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These artifacts matter for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;debugging,&lt;/li&gt;
&lt;li&gt;compliance,&lt;/li&gt;
&lt;li&gt;incident response.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If an agent operates in a real environment, its actions must be explainable &lt;strong&gt;after the fact&lt;/strong&gt;, not just “well-intended” at runtime.&lt;/p&gt;
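&lt;p&gt;A decision log does not need to be complicated to be useful. As a sketch (these field names are illustrative, not FailCore’s actual trace schema), one JSON line per decision already answers all three questions above:&lt;/p&gt;

```python
import io
import json
import time

def log_decision(logfile, tool, params, decision, rule_id):
    """Append one structured, machine-readable record per attempted action."""
    record = {
        "ts": time.time(),      # when the action was attempted
        "tool": tool,           # what was attempted
        "params": params,       # with which arguments
        "decision": decision,   # "allow" or "deny"
        "rule": rule_id,        # which rule triggered the decision
    }
    logfile.write(json.dumps(record) + "\n")

# Usage: an in-memory buffer standing in for a real log file
buf = io.StringIO()
log_decision(buf, "fs.delete", {"path": "/etc/passwd"}, "deny", "block_sensitive_paths")
print(buf.getvalue().strip())
```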

&lt;h3&gt;
  
  
  Closing thoughts
&lt;/h3&gt;

&lt;p&gt;As AI agents gain more autonomy, the cost of a single mistake increases.&lt;/p&gt;

&lt;p&gt;At that scale, safety cannot live entirely inside the model.&lt;/p&gt;

&lt;p&gt;It must live at the execution boundary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;enforced by deterministic code,&lt;/li&gt;
&lt;li&gt;observable through audit logs,&lt;/li&gt;
&lt;li&gt;designed to fail fast rather than recover late.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a philosophical position.&lt;br&gt;&lt;br&gt;
It is a systems engineering one.&lt;/p&gt;

&lt;p&gt;And systems tend to punish us quickly when we ignore their boundaries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Epilogue
&lt;/h3&gt;

&lt;p&gt;This line of thinking is what led me to build&lt;br&gt;
&lt;a href="https://github.com/zi-ling/failcore" rel="noopener noreferrer"&gt;FailCore&lt;/a&gt; —&lt;br&gt;
an open-source, fail-fast execution boundary for AI agents.&lt;/p&gt;

&lt;p&gt;The project is still evolving, but its core goal is simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Make unsafe actions impossible to execute, regardless of how they are generated.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>agents</category>
      <category>devops</category>
    </item>
    <item>
      <title>I Thought It Was Refactoring My Code. It Actually Wiped It Out.</title>
      <dc:creator>ZiLing</dc:creator>
      <pubDate>Wed, 31 Dec 2025 13:34:51 +0000</pubDate>
      <link>https://forem.com/ziling-failcore/i-thought-it-was-refactoring-my-code-it-actually-wiped-it-out-25m3</link>
      <guid>https://forem.com/ziling-failcore/i-thought-it-was-refactoring-my-code-it-actually-wiped-it-out-25m3</guid>
      <description>&lt;h2&gt;
  
  
  3 Months of Code, Gone in 5 Seconds
&lt;/h2&gt;

&lt;p&gt;I’m still a bit shaky as I type this.&lt;/p&gt;

&lt;p&gt;A few weeks ago, I was using an LLM-based automation to refactor a project’s directory structure.&lt;br&gt;&lt;br&gt;
The goal was simple: clean things up, reorganize a few core modules — nothing risky.&lt;/p&gt;

&lt;p&gt;During the planning stage, everything looked perfect.&lt;br&gt;&lt;br&gt;
Clear reasoning. Careful steps. It even reassured me:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“For safety, I will scan the directories first.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I let it run in the background and went to do something else.&lt;/p&gt;

&lt;p&gt;When I came back, my code was gone.&lt;/p&gt;

&lt;p&gt;Not moved.&lt;br&gt;&lt;br&gt;
Not misplaced.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Physically deleted.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because of a subtle path hallucination, the model interpreted my project root as a temporary directory.&lt;br&gt;&lt;br&gt;
There was no warning. No error. Nothing suspicious before execution.&lt;/p&gt;

&lt;p&gt;In about &lt;strong&gt;5 seconds&lt;/strong&gt;, it “optimized” &lt;strong&gt;3 months of my work&lt;/strong&gt; into a blank screen.&lt;/p&gt;

&lt;p&gt;That was the moment I realized:&lt;br&gt;&lt;br&gt;
the word &lt;strong&gt;“refactor”&lt;/strong&gt; in the title was doing a lot of lying.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Prompt Engineering Isn’t Enough
&lt;/h2&gt;

&lt;p&gt;This accident taught me a hard lesson:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;AI failures don’t usually happen during “thinking” —&lt;br&gt;&lt;br&gt;
they happen during “doing.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We spend an enormous amount of time designing prompt guardrails, trying to &lt;em&gt;convince&lt;/em&gt; models to behave safely.&lt;br&gt;&lt;br&gt;
But in practice:&lt;/p&gt;

&lt;h3&gt;
  
  
  Hallucinations are inevitable
&lt;/h3&gt;

&lt;p&gt;A model can promise safety in text, then hallucinate a destructive path at the exact millisecond it generates a tool call.&lt;/p&gt;

&lt;h3&gt;
  
  
  Execution is irreversible
&lt;/h3&gt;

&lt;p&gt;Once an AI has filesystem or network access, every action produces real-world side effects.&lt;br&gt;&lt;br&gt;
There is no “undo” button.&lt;/p&gt;

&lt;p&gt;Running AI automation without execution-time protection is basically&lt;br&gt;&lt;br&gt;
&lt;strong&gt;running barefoot on broken glass.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  FailCore: Not a Framework, Just a Safety Belt
&lt;/h2&gt;

&lt;p&gt;I didn’t want to build another heavy framework.&lt;/p&gt;

&lt;p&gt;FailCore exists for one reason:&lt;br&gt;&lt;br&gt;
that incident made it obvious what I was missing.&lt;/p&gt;

&lt;p&gt;After the failure, I realized I needed three very concrete things.&lt;/p&gt;




&lt;h3&gt;
  
  
  1. Execution-Time Interception
&lt;/h3&gt;

&lt;p&gt;That path hallucination made one thing clear:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;safety checks can’t stop at the prompt layer.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;FailCore hooks into tool calls at the Python runtime level.&lt;br&gt;&lt;br&gt;
If an automated process tries to touch an unauthorized directory or a dangerous network target&lt;br&gt;&lt;br&gt;
(for example, an internal IP that could trigger SSRF),&lt;br&gt;&lt;br&gt;
the circuit is broken &lt;strong&gt;before&lt;/strong&gt; the side effect happens.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. A “Black Box” Audit Trail
&lt;/h3&gt;

&lt;p&gt;During those 5 seconds, I had no idea what the system was actually doing.&lt;/p&gt;

&lt;p&gt;So I needed evidence.&lt;/p&gt;

&lt;p&gt;FailCore turns raw execution traces into an HTML audit report, showing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;when an action happened
&lt;/li&gt;
&lt;li&gt;what parameters were used
&lt;/li&gt;
&lt;li&gt;which resource was targeted
&lt;/li&gt;
&lt;li&gt;and why it was allowed or blocked
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This was the first time I could actually see what the AI did, step by step.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. Deterministic Replay
&lt;/h3&gt;

&lt;p&gt;I didn’t want to burn tokens or risk my environment just to reproduce a failure.&lt;/p&gt;

&lt;p&gt;With FailCore, you can take a recorded execution trace and replay it locally —&lt;br&gt;&lt;br&gt;
&lt;strong&gt;without re-running dangerous operations&lt;/strong&gt; —&lt;br&gt;&lt;br&gt;
to pinpoint exactly where the logic went wrong.&lt;/p&gt;
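&lt;p&gt;The idea, sketched in miniature (this trace format is invented for illustration, not FailCore’s actual schema): feed the recorded calls back through the policy check instead of through the real tools.&lt;/p&gt;

```python
from fnmatch import fnmatch

# A recorded trace: what the agent tried, in order (illustrative format)
trace = [
    {"step": 1, "tool": "fs.write", "path": "/app/data/temp/out.txt"},
    {"step": 2, "tool": "fs.delete", "path": "/home/me/project/src"},
]

DENY_PATTERNS = ["/home/*", "/etc/*"]

def replay(recorded):
    """Re-evaluate every recorded step; execute nothing."""
    verdicts = []
    for step in recorded:
        denied = any(fnmatch(step["path"], p) for p in DENY_PATTERNS)
        verdicts.append((step["step"], "deny" if denied else "allow"))
    return verdicts

print(replay(trace))  # [(1, 'allow'), (2, 'deny')]: step 2 is where it went wrong
```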




&lt;h2&gt;
  
  
  Opening the Black Box
&lt;/h2&gt;

&lt;p&gt;Below is a prototype of the HTML audit report generated by FailCore:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzvy70ixh8ba8cod4c05a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzvy70ixh8ba8cod4c05a.png" alt="FailCore execution audit report showing blocked unsafe actions and risk summary" width="681" height="922"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This isn’t just about preventing accidents.&lt;br&gt;&lt;br&gt;
It’s about &lt;strong&gt;observability&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For developers&lt;/strong&gt;: debugging non-deterministic failures with 100% replay accuracy
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For teams&lt;/strong&gt;: maintaining an auditable trail of automated actions
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For AI systems&lt;/strong&gt;: operating within explicit, enforceable boundaries
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;AI is incredibly good at writing code.&lt;br&gt;&lt;br&gt;
But we shouldn’t let it be the &lt;strong&gt;judge, jury, and executioner&lt;/strong&gt; of our local file systems.&lt;/p&gt;

&lt;p&gt;FailCore is still a work in progress, but it’s what allows me to keep running AI automation on my own machine without fear.&lt;/p&gt;

&lt;p&gt;If you’re letting AI touch the real world,&lt;br&gt;&lt;br&gt;
&lt;strong&gt;execution safety deserves its own layer.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;👉 GitHub: &lt;a href="https://github.com/Zi-Ling/failcore" rel="noopener noreferrer"&gt;https://github.com/Zi-Ling/failcore&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If there’s interest, I can write a follow-up post explaining how the Python runtime hooks actually work.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>security</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I Monkey-Patched Python to Stop AI Agents from Accessing Private Networks</title>
      <dc:creator>ZiLing</dc:creator>
      <pubDate>Thu, 25 Dec 2025 11:31:20 +0000</pubDate>
      <link>https://forem.com/ziling-failcore/i-monkey-patched-python-to-stop-ai-agents-from-accessing-private-networks-24h2</link>
      <guid>https://forem.com/ziling-failcore/i-monkey-patched-python-to-stop-ai-agents-from-accessing-private-networks-24h2</guid>
      <description>&lt;p&gt;Most AI agent failures aren’t caused by bad plans.&lt;/p&gt;

&lt;p&gt;They’re caused by &lt;strong&gt;unsafe execution&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;After building and debugging multiple agent systems, I kept running into the same problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tools being called with unexpected arguments&lt;/li&gt;
&lt;li&gt;Network or filesystem side effects happening too early&lt;/li&gt;
&lt;li&gt;Agents “succeeding” while silently doing the wrong thing&lt;/li&gt;
&lt;li&gt;Failures that were impossible to reproduce after the fact&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I built &lt;strong&gt;FailCore&lt;/strong&gt; — a small execution-time safety runtime for AI agents.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is FailCore?
&lt;/h2&gt;

&lt;p&gt;FailCore is &lt;strong&gt;not&lt;/strong&gt; an agent framework.&lt;br&gt;
It doesn’t plan, reason, or store memory.&lt;/p&gt;

&lt;p&gt;Instead, it focuses on one thing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Enforcing safety at the Python execution boundary.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Rather than relying on better prompts or smarter planning, FailCore intercepts tool execution &lt;em&gt;before&lt;/em&gt; side effects happen.&lt;/p&gt;

&lt;p&gt;This allows it to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Block unsafe filesystem access&lt;/li&gt;
&lt;li&gt;Prevent private network / SSRF-style calls&lt;/li&gt;
&lt;li&gt;Validate tool inputs and outputs&lt;/li&gt;
&lt;li&gt;Record deterministic execution traces for replay and audit&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Why execution-time safety?
&lt;/h2&gt;

&lt;p&gt;Most agent systems try to solve safety &lt;em&gt;upstream&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Better prompts&lt;/li&gt;
&lt;li&gt;More constraints&lt;/li&gt;
&lt;li&gt;More planning logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, that’s brittle.&lt;/p&gt;

&lt;p&gt;Execution is where real damage happens — file writes, HTTP calls, system commands.&lt;br&gt;
Once those occur, it’s already too late.&lt;/p&gt;

&lt;p&gt;FailCore takes a different approach:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Assume plans can be wrong.&lt;br&gt;&lt;br&gt;
Make execution boring, strict, and observable.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  A quick demo
&lt;/h2&gt;

&lt;p&gt;Below is a short demo showing FailCore blocking a real tool-use attempt &lt;strong&gt;before any side effect occurs&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The agent believes the call succeeded.&lt;br&gt;
The system never lets the unsafe action run.&lt;/p&gt;

&lt;p&gt;This is hard to achieve with prompt-level constraints alone,&lt;br&gt;
because by the time the model realizes it was wrong, the side effect has already been triggered.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyg6vhljtbqovbtejhywl.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyg6vhljtbqovbtejhywl.gif" alt="FailCore Demo: Blocking an SSRF attack in the terminal" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Show me the code
&lt;/h2&gt;

&lt;p&gt;Instead of wrapping every tool manually, FailCore lets you define a secure &lt;code&gt;Session&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;failcore&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;presets&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ToolMetadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RiskLevel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;SideEffect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;DefaultPolicy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;SecurityError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Initialize a secure session
# We enforce a strict policy: No private IPs, No local file access
&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;validator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;presets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;net_safe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strict&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Register a tool with explicit risk metadata
# This tells FailCore: "This function touches the network, watch it closely."
&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http_get&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;http_get&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# assuming this is your wrapper function
&lt;/span&gt;    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;ToolMetadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;risk_level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;RiskLevel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HIGH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;side_effect&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SideEffect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NETWORK&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;default_policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DefaultPolicy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BLOCK&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Scenario: The Agent tries an SSRF Attack
# Target: AWS Metadata Endpoint (169.254.169.254)
&lt;/span&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http_get&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://169.254.169.254/latest/meta-data/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;SecurityError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# 🛡️ FailCore intercepts the call BEFORE the socket opens
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Attack Neutralized: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="c1"&gt;# Output: "SecurityError: Access to private IP range 169.254.0.0/16 is blocked."
&lt;/span&gt;
&lt;span class="c1"&gt;# 4. Scenario: Legitimate Traffic
# Target: Public Internet
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http_get&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.github.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Success:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How it works (high level)
&lt;/h2&gt;

&lt;p&gt;FailCore hooks into the Python execution layer and wraps tool calls with:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Pre-execution validation&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Policy-based permission checks&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Side-effect interception&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Structured trace recording&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
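&lt;p&gt;In plain Python, those four steps amount to a wrapper along these lines (a conceptual sketch of the pattern, not FailCore’s internals; all the names here are mine):&lt;/p&gt;

```python
def guarded_call(tool_name, func, policy, trace, **params):
    """Validate, check policy, record, and only then execute."""
    # 1. Pre-execution validation: reject malformed input outright
    if not params:
        raise ValueError("no parameters supplied")
    # 2. Policy-based permission check, decided before any side effect
    allowed, rule = policy(tool_name, params)
    # 4. Structured trace recording, for allowed AND blocked calls
    trace.append({"tool": tool_name, "params": params, "allowed": allowed, "rule": rule})
    # 3. Side-effect interception: the underlying function runs only if allowed
    if not allowed:
        raise PermissionError(tool_name + " blocked by rule " + rule)
    return func(**params)

# Usage: a toy policy that only permits paths under /app/data/temp/
def demo_policy(tool_name, params):
    if params.get("path", "").startswith("/app/data/temp/"):
        return True, "fs_write_limit"
    return False, "default_deny"

trace = []
guarded_call("fs.touch", lambda path: "touched", demo_policy, trace, path="/app/data/temp/a.txt")
try:
    guarded_call("fs.touch", lambda path: "touched", demo_policy, trace, path="/etc/passwd")
except PermissionError as e:
    print(e)
print(len(trace))  # 2: both attempts were recorded
```

&lt;p&gt;Recording the trace before the permission check raises is deliberate: blocked attempts are exactly the ones you want in the audit log.&lt;/p&gt;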

&lt;p&gt;The trace format is deterministic and replayable, which makes it possible to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Debug agent failures after the fact&lt;/li&gt;
&lt;li&gt;Audit what &lt;em&gt;would&lt;/em&gt; have happened&lt;/li&gt;
&lt;li&gt;Re-run executions without re-triggering side effects&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Design details are documented here:&lt;br&gt;
👉 &lt;a href="https://github.com/zi-ling/failcore/blob/main/DESIGN.md" rel="noopener noreferrer"&gt;https://github.com/zi-ling/failcore/blob/main/DESIGN.md&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What FailCore is &lt;em&gt;not&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;To set expectations clearly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ Not a sandbox&lt;/li&gt;
&lt;li&gt;❌ Not a VM or container&lt;/li&gt;
&lt;li&gt;❌ Not a replacement for OS-level security&lt;/li&gt;
&lt;li&gt;❌ Not an agent framework&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s a &lt;strong&gt;small, composable execution safety layer&lt;/strong&gt; that can sit underneath existing agent stacks.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why open source?
&lt;/h2&gt;

&lt;p&gt;I’m sharing this because execution-time safety feels like a missing layer in many agent systems.&lt;/p&gt;

&lt;p&gt;If you’ve ever dealt with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Non-reproducible agent bugs&lt;/li&gt;
&lt;li&gt;“It worked yesterday” failures&lt;/li&gt;
&lt;li&gt;Unsafe tool calls slipping through&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You might find this useful.&lt;br&gt;
If not today, maybe later.&lt;/p&gt;




&lt;h2&gt;
  
  
  Source code
&lt;/h2&gt;

&lt;p&gt;GitHub:&lt;br&gt;&lt;br&gt;
👉 &lt;a href="https://github.com/zi-ling/failcore" rel="noopener noreferrer"&gt;https://github.com/zi-ling/failcore&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve run into similar execution-layer issues in agent systems, I’d love to hear how you handled them.&lt;br&gt;
And if this sounds like a problem you’ll hit again,&lt;br&gt;
you might want to star the repo and come back to it later.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>security</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
