<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: lakshmipathi</title>
    <description>The latest articles on Forem by lakshmipathi (@laksmipathig).</description>
    <link>https://forem.com/laksmipathig</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3841004%2F6ac02e93-d39d-47f6-94f5-3abf18cc91d8.jpg</url>
      <title>Forem: lakshmipathi</title>
      <link>https://forem.com/laksmipathig</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/laksmipathig"/>
    <language>en</language>
    <item>
      <title>Why AI tools guess from CI logs (and how to give them real data instead)</title>
      <dc:creator>lakshmipathi</dc:creator>
      <pubDate>Wed, 25 Mar 2026 16:15:15 +0000</pubDate>
      <link>https://forem.com/laksmipathig/why-ai-tools-guess-from-ci-logs-and-how-to-give-them-real-data-instead-4cj4</link>
      <guid>https://forem.com/laksmipathig/why-ai-tools-guess-from-ci-logs-and-how-to-give-them-real-data-instead-4cj4</guid>
      <description>&lt;p&gt;&lt;em&gt;Your CI failed. The AI read the log. It guessed. Here's why that's not good enough.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The log says "Segmentation fault." Now what?
&lt;/h2&gt;

&lt;p&gt;You push code. CI runs. It fails. You get a log:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;Round  7/12: num =  58  *** BUG DETECTED (crash next round) ***
Round  8/12: num =  50  *** CRASHING: previous round 7 had bad value 58 ***
Segmentation fault (core dumped)
Exit code: 139
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You paste this into your AI coding tool. It reads the log, sees "Segmentation fault" and exit code 139, assumes a NULL pointer dereference, and suggests a fix.&lt;/p&gt;

&lt;p&gt;But did it find the &lt;em&gt;root cause&lt;/em&gt;? Or did it just patch the symptom?&lt;/p&gt;

&lt;h2&gt;
  
  
  We tested this. Two AIs, same crash, very different fixes.
&lt;/h2&gt;

&lt;p&gt;We took a real CI failure — a C program that crashes with SIGSEGV at round 8 — and gave it to two AI models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI #1 (from logs only):&lt;/strong&gt; Read the CI output. Saw the NULL pointer dereference at line 34. Generated this fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt;&lt;span class="gd"&gt;-        int *p = NULL;
-        *p = g_prev_num;
&lt;/span&gt;&lt;span class="gi"&gt;+        return -1;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Removes the crash. CI passes. Job done?&lt;/p&gt;

&lt;p&gt;No. The program still silently corrupts data. The &lt;em&gt;real&lt;/em&gt; bug is five lines below the crash site: the code forces &lt;code&gt;num = 58&lt;/code&gt; at round 7, which arms the crash path for round 8. Removing the crash mechanism just hides the problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI #2 (from runtime data):&lt;/strong&gt; Saw the exact variable values at every step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Round 7: getrandom() returned 91
         → code overwrote num from 91 to 58
         → is_even=1, in_range=1
         → g_prev_was_bad set to 1

Round 8: entered crash path because g_prev_was_bad=1
         → p = NULL
         → *p = g_prev_num (58) → SIGSEGV
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It identified lines 38-42 as the root cause and generated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt;&lt;span class="gd"&gt;-    if (round &amp;gt;= 7 &amp;amp;&amp;amp; !g_prev_was_bad) {
-        num = 58;
-        is_even = 1;
-        in_range = 1;
-    }
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The actual bug removed. Not the symptom — the cause.&lt;/p&gt;

&lt;h2&gt;
  
  
  The difference? Variable values.
&lt;/h2&gt;

&lt;p&gt;AI #1 saw: "line 34 crashes with NULL pointer."&lt;br&gt;
AI #2 saw: "line 39 overwrites num from 91 to 58, which triggers the crash one round later."&lt;/p&gt;

&lt;p&gt;The second AI didn't guess. It traced the variable evolution step by step, saw where the value changed, and identified the injection point. That's what real runtime data gives you that a log never will.&lt;/p&gt;
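
&lt;p&gt;To make "tracing the variable evolution" concrete, here is a minimal Python sketch built on the standard &lt;code&gt;sys.settrace&lt;/code&gt; hook. It is not how our tool is implemented. The names &lt;code&gt;watch_variable&lt;/code&gt; and &lt;code&gt;play_round&lt;/code&gt; are hypothetical, and &lt;code&gt;play_round&lt;/code&gt; is a simplified stand-in for the C program above.&lt;/p&gt;

```python
import sys


def watch_variable(func, name, *args):
    """Run func(*args) and log every change to the local variable `name`.

    Returns (result, changes). Each change is (line_offset, old, new), where
    line_offset counts lines from the `def` line of func.
    """
    changes = []
    unset = object()            # sentinel: the variable has no value yet
    last_value = unset
    prev_line = None            # line that ran just before the current event

    def tracer(frame, event, arg):
        nonlocal last_value, prev_line
        if frame.f_code is not func.__code__:
            return None         # do not trace other functions
        if event == "call":
            prev_line = frame.f_lineno
            return tracer
        if event in ("line", "return"):
            if name in frame.f_locals:
                value = frame.f_locals[name]
                if last_value is unset or value != last_value:
                    old = None if last_value is unset else last_value
                    # A "line" event fires before the line runs, so the
                    # change was made by the previously executed line.
                    offset = prev_line - func.__code__.co_firstlineno
                    changes.append((offset, old, value))
                    last_value = value
            prev_line = frame.f_lineno
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, changes


def play_round(round_no, drawn):
    num = drawn                 # offset 1: value from the random source
    if round_no == 7:           # offset 2
        num = 58                # offset 3: hypothetical injected overwrite
    is_even = num % 2 == 0      # offset 4
    return num, is_even         # offset 5


result, changes = watch_variable(play_round, "num", 7, 91)
for offset, old, new in changes:
    print(f"line +{offset}: num changed from {old} to {new}")
```

&lt;p&gt;The second recorded change points at the overwrite itself, not at the later crash. That is the kind of signal a log line cannot carry.&lt;/p&gt;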

&lt;h2&gt;
  
  
  Logs show symptoms. Runtime data shows causes.
&lt;/h2&gt;

&lt;p&gt;Here's what a typical CI log contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The test name&lt;/li&gt;
&lt;li&gt;stdout/stderr output&lt;/li&gt;
&lt;li&gt;An exit code&lt;/li&gt;
&lt;li&gt;Maybe a stack trace&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's what deep runtime tracking captures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every variable's value at the crash point&lt;/li&gt;
&lt;li&gt;The call stack with arguments&lt;/li&gt;
&lt;li&gt;Thread state and interleaving order&lt;/li&gt;
&lt;li&gt;The exact line where a value changed unexpectedly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For simple bugs, logs are enough. For the bugs that actually waste your time — race conditions, intermittent crashes, "works on my machine" failures — you need the runtime data.&lt;/p&gt;
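
&lt;p&gt;As an illustration of how little the exit code carries: shells such as Bash report 128 plus the signal number when a signal kills a process, so 139 decodes to signal 11. The Python sketch below does that decoding with the standard &lt;code&gt;signal&lt;/code&gt; module. The helper name &lt;code&gt;describe_exit_code&lt;/code&gt; is made up for this example.&lt;/p&gt;

```python
import signal


def describe_exit_code(code):
    """Decode a shell-style exit code, following the 128 + signal convention."""
    if code == 0:
        return "success"
    if code in range(129, 256):
        try:
            return "killed by " + signal.Signals(code - 128).name
        except ValueError:
            pass                # no signal with that number on this platform
    return "exited with status " + str(code)


print(describe_exit_code(139))
```

&lt;p&gt;On Linux this names SIGSEGV, which is everything the exit code knows. It says nothing about which variable held which value.&lt;/p&gt;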

&lt;h2&gt;
  
  
  What this looks like in practice
&lt;/h2&gt;

&lt;p&gt;We built a tool that re-runs your failed CI test with deep runtime tracking. Here's what the output looks like — a replay showing the exact variable evolution leading to the crash:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2nmemkm93y85lpa416v.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2nmemkm93y85lpa416v.gif" alt="Debug replay showing variable values changing step by step, ending in crash"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The yellow arrow steps through the code. Variables update in real time on the left. When &lt;code&gt;num&lt;/code&gt; suddenly changes from 91 to 58, that is the bug, highlighted in orange. When &lt;code&gt;p = NULL&lt;/code&gt; and the program writes to it, the replay flashes red: SIGSEGV.&lt;/p&gt;

&lt;p&gt;This isn't a simulation. These are actual values captured from the running process.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Your CI fails on your runner (GitHub Actions, GitLab CI — doesn't matter)&lt;/li&gt;
&lt;li&gt;We re-run only the failed test with deep runtime tracking&lt;/li&gt;
&lt;li&gt;We capture exact variable values, thread state, and stack trace at the crash point&lt;/li&gt;
&lt;li&gt;AI analyzes the captured data — not logs — and opens a fix PR&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Zero overhead on your passing builds. We only run on failures.&lt;/p&gt;
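
&lt;p&gt;Steps 2 and 3 above can be sketched in a few lines of Python, purely as an analogy. A Python exception stands in for the native crash, and the names &lt;code&gt;run_and_capture&lt;/code&gt; and &lt;code&gt;crash_round&lt;/code&gt; are hypothetical. The real capture for C or Go binaries works at a different level, so treat this as an illustration of the idea and not of our implementation.&lt;/p&gt;

```python
def run_and_capture(test_func, *args):
    """Run test_func and, if it raises, snapshot the innermost frame."""
    try:
        return {"passed": True, "result": test_func(*args)}
    except Exception as exc:
        tb = exc.__traceback__
        while tb.tb_next is not None:   # walk to the frame that raised
            tb = tb.tb_next
        frame = tb.tb_frame
        return {
            "passed": False,
            "error": type(exc).__name__,
            "function": frame.f_code.co_name,
            "locals": dict(frame.f_locals),
        }


def crash_round(prev_was_bad, prev_num):
    slot = None if prev_was_bad else [0]    # None stands in for the NULL pointer
    slot[0] = prev_num                      # raises TypeError when slot is None
    return slot


report = run_and_capture(crash_round, True, 58)
print(report["error"], "in", report["function"], report["locals"])
```

&lt;p&gt;The report holds the values that were live at the moment of failure, here the bad flag and the value 58, and that is the input the analysis step works from.&lt;/p&gt;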

&lt;h2&gt;
  
  
  The hard bugs aren't the ones with good error messages
&lt;/h2&gt;

&lt;p&gt;The bugs that waste engineering hours are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Thread races&lt;/strong&gt; — "passes 9 out of 10 runs"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timing-dependent crashes&lt;/strong&gt; — works locally, fails in CI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intermittent NULL derefs&lt;/strong&gt; — depends on random values or scheduling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These bugs don't leave useful logs. They leave "Segmentation fault" and a prayer. The only way to diagnose them reliably is to see the actual state at the moment of failure.&lt;/p&gt;

&lt;p&gt;That's what deep runtime tracking does. That's what we capture. That's the proof.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;We're looking for beta testers — especially teams with flaky tests they can't figure out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;neverbreak.ai&lt;/strong&gt; — CI broke. AI fixed. With proof.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Currently supports C, C++, Go, Python, Node.js, and Java. Linux CI only.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>ci</category>
      <category>github</category>
      <category>gitlab</category>
    </item>
  </channel>
</rss>
