<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Tomer Lihovetsky</title>
    <description>The latest articles on Forem by Tomer Lihovetsky (@tomerli).</description>
    <link>https://forem.com/tomerli</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3827294%2Fbfa4090a-5935-4d39-8b4f-0bdb3c095a40.jpg</url>
      <title>Forem: Tomer Lihovetsky</title>
      <link>https://forem.com/tomerli</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/tomerli"/>
    <language>en</language>
    <item>
      <title>We stopped leaving GitHub to debug test failures. Here's how.</title>
      <dc:creator>Tomer Lihovetsky</dc:creator>
      <pubDate>Fri, 01 May 2026 12:32:23 +0000</pubDate>
      <link>https://forem.com/tomerli/we-stopped-leaving-github-to-debug-test-failures-heres-how-g29</link>
      <guid>https://forem.com/tomerli/we-stopped-leaving-github-to-debug-test-failures-heres-how-g29</guid>
      <description>&lt;p&gt;CI is red. You open the PR. Now what?&lt;/p&gt;

&lt;p&gt;You click the failing workflow. You read the logs. You open the trace viewer in a separate tab. You cross-reference the error with the code. You search Slack to see if this happened before. You go back to GitHub to leave a comment.&lt;/p&gt;

&lt;p&gt;Every time. For every failure.&lt;/p&gt;

&lt;p&gt;The problem isn't that debugging is hard. The problem is that you keep leaving GitHub to do it — even though GitHub is where you make the merge decision.&lt;/p&gt;

&lt;p&gt;We built &lt;a href="https://github.com/marketplace/actions/qai-test-intelligence" rel="noopener noreferrer"&gt;QAI Agent&lt;/a&gt; to fix that. This post is about two things that changed how we work: &lt;strong&gt;asking QAI questions directly from a PR comment&lt;/strong&gt;, and &lt;strong&gt;getting the exact code fix inline on the PR&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem with CI feedback
&lt;/h2&gt;

&lt;p&gt;Your tests fail. The PR comment tells you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;8 failures&lt;/li&gt;
&lt;li&gt;3 unique clusters&lt;/li&gt;
&lt;li&gt;Risk: High&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Useful. But it doesn't answer the question every developer actually asks: &lt;em&gt;"Is this worse than last week?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That question requires history. It requires context across runs. It requires something that remembers.&lt;/p&gt;




&lt;h2&gt;
  
  
  Ask QAI from the PR
&lt;/h2&gt;

&lt;p&gt;Once QAI is connected to your repo, you can mention &lt;code&gt;@qai-agent&lt;/code&gt; in any PR comment and ask anything about your test failures.&lt;/p&gt;

&lt;p&gt;Real example — I typed this on a PR with 18 failing tests across 4 frameworks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@qai-agent What are the key fixes that would resolve roughly 80% of the test failures across all suites?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response came back instantly:&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;QAI Agent 🪄&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Fixes for ~80% of Failures
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Fix 1: Login flow missing load state wait (covers ~50% of failures)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;All checkout and user-account &lt;code&gt;beforeEach&lt;/code&gt; timeouts across all 4 suites trace to the same root cause: login completes but the page isn't fully loaded before navigation/assertions proceed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Playwright (&lt;code&gt;login()&lt;/code&gt; helper):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;login&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/login&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[name="username"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;demo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[name="password"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;demo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[type="submit"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForLoadState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;networkidle&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// ADD THIS&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Selenium Java (&lt;code&gt;@BeforeEach&lt;/code&gt; setup):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;WebDriverWait&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;driver&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofSeconds&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;until&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ExpectedConditions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;visibilityOfElementLocated&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;By&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"username"&lt;/span&gt;&lt;span class="o"&gt;)));&lt;/span&gt;
&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;WebDriverWait&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;driver&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofSeconds&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;until&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;webDriver&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;JavascriptExecutor&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;webDriver&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;executeScript&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"return document.readyState"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"complete"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Selenium Python (&lt;code&gt;setup_checkout&lt;/code&gt; fixture):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;wait&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;WebDriverWait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;driver&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# increase from 30
&lt;/span&gt;&lt;span class="n"&gt;wait&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EC&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;visibility_of_element_located&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;By&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;username&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Fix 2: Inverted/stale assertions in search &amp;amp; cart tests (covers ~15% of failures)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fullCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[data-testid="search"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bamboo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;filteredCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filteredCount&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeGreaterThan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fullCount&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeGreaterThan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filteredCount&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// was backwards&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Fix 3: Empty cart text locator mismatch (covers ~2% of failures)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Change from:&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;//*[contains(text(),"Your cart is empty")]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// To:&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;//*[contains(text(),"Your cart is empty.")]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;th&gt;Tests resolved&lt;/th&gt;
&lt;th&gt;Suites&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Login load state + timeout&lt;/td&gt;
&lt;td&gt;~13 tests&lt;/td&gt;
&lt;td&gt;All 4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inverted/stale assertions&lt;/td&gt;
&lt;td&gt;~4 tests&lt;/td&gt;
&lt;td&gt;Playwright, WebdriverIO&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cart empty text locator&lt;/td&gt;
&lt;td&gt;1 test&lt;/td&gt;
&lt;td&gt;WebdriverIO&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;Three fixes. One question. ~18 tests resolved.&lt;/p&gt;

&lt;p&gt;It didn't just list what broke — it told us which fix to do first, how many tests each one resolves, and gave the exact code per framework. If you're running Playwright, Selenium Java, Selenium Python, and WebdriverIO on the same PR, the fix isn't the same in each one. QAI knows which frameworks are affected and generates the right code for each.&lt;/p&gt;




&lt;h2&gt;
  
  
  Ask about history — the question no test runner can answer
&lt;/h2&gt;

&lt;p&gt;The most powerful use case isn't "why did this fail." It's "is this getting worse?"&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@qai-agent is there an increase in failures comparing to previous PRs?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Yes, there's a dramatic increase in failures compared to previous PRs.&lt;br&gt;
Previous PRs: 7 occurrences of these patterns. This PR: 20 occurrences each — nearly 3x increase.&lt;/p&gt;

&lt;p&gt;Root causes:&lt;br&gt;
→ UI_CHANGED failures (4 tests) — 85% confidence&lt;br&gt;
→ TIMING_FLAKE failures (4 tests) — 70% confidence&lt;br&gt;
→ 0% flaky score — consistent, reproducible failures&lt;/p&gt;

&lt;p&gt;Verdict: This PR introduced systematic failures. Block merge until UI locator issue and timing problems are resolved.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's not a test runner. That's a senior QA engineer reviewing your PR.&lt;/p&gt;

&lt;p&gt;A single failure is noise. A 3x increase in failures across PRs is a signal. QAI can answer that in seconds because it has the history. Your team doesn't.&lt;/p&gt;

&lt;p&gt;Some other questions you can ask:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@qai-agent why is this test failing?
@qai-agent is this flaky or a real regression?
@qai-agent how long has this been broken?
@qai-agent what's the fastest fix for the cart failures?
@qai-agent is this the same failure we saw last week?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each answer includes historical context, severity classification, confidence score, and a fix suggestion.&lt;/p&gt;




&lt;h2&gt;
  
  
  The code fix — already on the PR, without asking
&lt;/h2&gt;

&lt;p&gt;The second feature shows up automatically. When QAI analyzes a PR, the comment includes an inline code fix for high-confidence failures. You don't need to ask. It's already there.&lt;/p&gt;

&lt;p&gt;For a &lt;code&gt;TEST BUG&lt;/code&gt; cluster at 70% confidence:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The test "search narrows results to matching products" has inverted logic on line 23. [View fix →]&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;search narrows results&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;a[href^="/products/"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;initialCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByPlaceholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/search/i&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bamboo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/bamboo/i&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;toBeHidden&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;filteredCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;a[href^="/products/"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByPlaceholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/search/i&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;a[href^="/products/"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toBeGreaterThanOrEqual&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;initialCount&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ready to copy and apply. No dashboard. No trace viewer. No tab switching.&lt;/p&gt;

&lt;p&gt;The PR comment also breaks results down by suite:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Suite&lt;/th&gt;
&lt;th&gt;✅ Pass&lt;/th&gt;
&lt;th&gt;❌ Fail&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;th&gt;Pass rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Selenium Python&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;71%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Selenium Java&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;69%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WebdriverIO&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;80%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;23&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;9&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;32&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;72%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And at the bottom of every comment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;💬 Ask QAI anything about this PR:
Comment @qai-agent &amp;lt;your question&amp;gt; — examples:
• @qai-agent why is this failing?
• @qai-agent is this flaky or a real regression?
• @qai-agent what's the fastest fix?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;Most test tools are read-only. You look at them. They don't talk back.&lt;/p&gt;

&lt;p&gt;Ask QAI flips this. Instead of navigating to a dashboard, opening a report, filtering by date, comparing runs manually — you just ask. In the same place you're already working. The context stays in the PR. The team sees the answer.&lt;/p&gt;

&lt;p&gt;The PR is where you decide whether to merge. That's where the analysis should live.&lt;/p&gt;




&lt;h2&gt;
  
  
  Setup — two steps
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Add the Action to your workflow:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;QAI Agent&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;useqai/qai-agent@v1&lt;/span&gt;
  &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;junit-path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test-results/results.xml'&lt;/span&gt;
    &lt;span class="na"&gt;qai-url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://ingest.useqai.dev&lt;/span&gt;
    &lt;span class="na"&gt;qai-api-key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.QAI_API_KEY }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2 — Install the QAI GitHub App on your repo&lt;/strong&gt; (required for &lt;code&gt;@qai-agent&lt;/code&gt; replies)&lt;/p&gt;

&lt;p&gt;Get your free API key at &lt;a href="https://useqai.dev" rel="noopener noreferrer"&gt;useqai.dev&lt;/a&gt; — 30 seconds, no credit card.&lt;/p&gt;
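
&lt;p&gt;For orientation, here is a minimal sketch of a complete workflow around that step, assuming a Playwright suite running on Node. The workflow name, trigger, &lt;code&gt;permissions&lt;/code&gt; block, and the install and test commands are assumptions for illustration and are not taken from the QAI docs; only the &lt;code&gt;QAI Agent&lt;/code&gt; step mirrors Step 1. The JUnit output variable is Playwright's, and its name may differ between Playwright versions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;name: e2e                  # hypothetical workflow name
on: pull_request

permissions:
  contents: read
  pull-requests: write     # assumption: lets the run post a PR comment

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx playwright install --with-deps
      # Write a JUnit report to the path the QAI step reads
      - run: npx playwright test --reporter=junit
        env:
          PLAYWRIGHT_JUNIT_OUTPUT_NAME: test-results/results.xml
      # if: always() keeps this step running when the tests above fail
      - name: QAI Agent
        uses: useqai/qai-agent@v1
        if: always()
        with:
          junit-path: 'test-results/results.xml'
          qai-url: https://ingest.useqai.dev
          qai-api-key: ${{ secrets.QAI_API_KEY }}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;if: always()&lt;/code&gt; condition is the important part: without it, GitHub Actions skips the step as soon as the test step fails, which is exactly when the analysis is needed.&lt;/p&gt;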




&lt;h2&gt;
  
  
  Try it before connecting anything
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Zero setup:&lt;/strong&gt; Paste your JUnit XML at &lt;a href="https://useqai.dev/try" rel="noopener noreferrer"&gt;useqai.dev/try&lt;/a&gt; — no account, no GitHub, no secrets. See exactly what QAI posts on a PR in 30 seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fork and see:&lt;/strong&gt; Fork &lt;a href="https://github.com/useqai/demo-shop" rel="noopener noreferrer"&gt;useqai/demo-shop&lt;/a&gt; — QAI is already wired up across 4 frameworks. Open a PR, comment &lt;code&gt;@qai-agent&lt;/code&gt;, and see it respond.&lt;/p&gt;




&lt;p&gt;🔧 GitHub Action: &lt;a href="https://github.com/marketplace/actions/qai-test-intelligence" rel="noopener noreferrer"&gt;useqai/qai-agent&lt;/a&gt;&lt;br&gt;
📦 Source: &lt;a href="https://github.com/useqai/qai-agent" rel="noopener noreferrer"&gt;github.com/useqai/qai-agent&lt;/a&gt;&lt;br&gt;
📊 Dashboard + Ask QAI: &lt;a href="https://useqai.dev" rel="noopener noreferrer"&gt;useqai.dev&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you try it and hit any edge cases — unusual JUnit variants, frameworks not listed — open an issue or drop a comment here.&lt;/p&gt;

</description>
      <category>github</category>
      <category>testing</category>
      <category>playwright</category>
      <category>devops</category>
    </item>
    <item>
      <title>From CI Failure to Fix in Under a Minute — QAI Agent Now Closes the Full Loop</title>
      <dc:creator>Tomer Lihovetsky</dc:creator>
      <pubDate>Tue, 07 Apr 2026 13:54:33 +0000</pubDate>
      <link>https://forem.com/tomerli/from-ci-failure-to-fix-in-under-a-minute-qai-agent-now-closes-the-full-loop-49nf</link>
      <guid>https://forem.com/tomerli/from-ci-failure-to-fix-in-under-a-minute-qai-agent-now-closes-the-full-loop-49nf</guid>
      <description>&lt;p&gt;&lt;strong&gt;QAI Agent now does more than cluster failures and score PR risk. It alerts your team in Slack, explains why tests failed with AI-powered RCA, and generates the fix — all without leaving the PR.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A few weeks ago I wrote about &lt;a href="https://dev.to/tomerli/stop-drowning-in-ci-noise-qai-agent-clusters-your-test-failures-and-tells-you-what-actually-broke-923"&gt;QAI Agent&lt;/a&gt; — a GitHub Action that clusters test failures and scores PR risk.&lt;/p&gt;

&lt;p&gt;The feedback was clear: clustering is useful. But developers wanted more. Not just what broke — why it broke, and how to fix it.&lt;/p&gt;

&lt;p&gt;So we closed the loop.&lt;/p&gt;

&lt;p&gt;Here's what happens now when your CI fails.&lt;/p&gt;




&lt;h4&gt;
  
  
  Step 1 — Slack finds you
&lt;/h4&gt;

&lt;p&gt;You don't check CI. CI comes to you.&lt;br&gt;
The moment a high-risk PR is detected, your team gets a Slack alert:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxmgogzviit9coooc8ro.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxmgogzviit9coooc8ro.png" alt=" " width="464" height="172"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🔴 High Risk PR #28 — Do not merge&lt;br&gt;
Risk: 0.60 · 8 test failures · 8 clusters&lt;br&gt;
Recommendation: investigate failures first&lt;/p&gt;

&lt;p&gt;No polling. No tab switching. It just appears in your team channel.&lt;/p&gt;


&lt;h4&gt;
  
  
  Step 2 — The PR comment tells you everything
&lt;/h4&gt;

&lt;p&gt;You click "View in QAI Platform." The PR already has a full breakdown:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9l1hk496k3v0h8n8q6kf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9l1hk496k3v0h8n8q6kf.png" alt=" " width="662" height="1255"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;8 failures listed with exact errors&lt;/li&gt;
&lt;li&gt;8 unique failure clusters identified&lt;/li&gt;
&lt;li&gt;RCA Analysis table — cause, confidence score, suggestion&lt;/li&gt;
&lt;li&gt;💡 AI fix suggestions available&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The RCA table is new. For each Playwright trace, QAI detects the failure category and assigns a confidence score.&lt;/p&gt;

&lt;p&gt;Rule-based detection, no cloud required. It runs locally on the GitHub Actions runner.&lt;/p&gt;


&lt;h4&gt;
  
  
  Step 3 — AI generates the fix
&lt;/h4&gt;

&lt;p&gt;Click "View in QAI" → open the failing test → hit "Suggest fix."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4yh3ttioyyk1ei5uks5q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4yh3ttioyyk1ei5uks5q.png" alt=" " width="800" height="393"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The AI explains exactly what went wrong:&lt;/p&gt;

&lt;p&gt;The price locator matches multiple elements (paragraph and span both showing $54.95), causing a strict mode violation that prevents the visibility check.&lt;/p&gt;

&lt;p&gt;Then generates the fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;span&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;hasText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\$\d&lt;/span&gt;&lt;span class="sr"&gt;+/&lt;/span&gt; &lt;span class="p"&gt;})).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One click. Ready to apply.&lt;/p&gt;




&lt;h4&gt;
  
  
  The full loop
&lt;/h4&gt;

&lt;p&gt;CI fails&lt;br&gt;
  → Slack alert fires to your team channel&lt;br&gt;
  → PR comment posts: clusters + RCA + confidence scores&lt;br&gt;
  → AI fix suggestion on demand&lt;br&gt;
  → Merge verdict: go or no-go&lt;/p&gt;

&lt;p&gt;Before QAI: open CI → read logs → guess → fix → repeat.&lt;br&gt;
After QAI: open Slack → click link → apply fix → merge.&lt;/p&gt;


&lt;h4&gt;
  
  
  Setup — still one step
&lt;/h4&gt;

&lt;p&gt;Nothing changed on the setup side. Add one step after your tests run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;QAI Agent&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;useqai/qai-agent@v1&lt;/span&gt;
  &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;junit-path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test-results/results.xml'&lt;/span&gt;
    &lt;span class="na"&gt;trace-path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test-results/**/*.zip'&lt;/span&gt;   &lt;span class="c1"&gt;# optional, enables RCA&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Slack alerts and AI fix suggestions, connect the cloud platform by adding two more lines under &lt;code&gt;with:&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="na"&gt;qai-url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://ingest.useqai.dev&lt;/span&gt;
    &lt;span class="na"&gt;qai-api-key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.QAI_API_KEY }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
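
&lt;p&gt;Put together, the complete step would look roughly like this. It is only the two snippets above merged, and it assumes the key is stored as a repository secret named &lt;code&gt;QAI_API_KEY&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;- name: QAI Agent
  uses: useqai/qai-agent@v1
  if: always()                            # run even when the test step failed
  with:
    junit-path: 'test-results/results.xml'
    trace-path: 'test-results/**/*.zip'   # optional, enables RCA
    qai-url: https://ingest.useqai.dev    # cloud platform endpoint
    qai-api-key: ${{ secrets.QAI_API_KEY }}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;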



&lt;p&gt;Get your free API key at &lt;a href="https://useqai.dev/" rel="noopener noreferrer"&gt;useqai.dev&lt;/a&gt; — 30 seconds, no credit card.&lt;/p&gt;




&lt;h4&gt;
  
  
  GitHub Action — fully open source:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;PR comment with risk score ✅&lt;/li&gt;
&lt;li&gt;Failure clustering ✅&lt;/li&gt;
&lt;li&gt;Playwright trace RCA (rule-based, runs locally) ✅&lt;/li&gt;
&lt;li&gt;Block merges on high risk ✅&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Cloud platform (useqai.dev):
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;AI fix suggestions (LLM-powered) ✅&lt;/li&gt;
&lt;li&gt;Slack alerts for high-risk PRs ✅&lt;/li&gt;
&lt;li&gt;Historical trends + flakiness tracking ✅&lt;/li&gt;
&lt;li&gt;Cross-repo visibility ✅&lt;/li&gt;
&lt;/ul&gt;




&lt;h4&gt;
  
  
  Try it
&lt;/h4&gt;

&lt;p&gt;🔧 GitHub Action: &lt;a href="https://github.com/marketplace/actions/qai-test-intelligence" rel="noopener noreferrer"&gt;useqai/qai-agent&lt;/a&gt; on the Marketplace&lt;br&gt;
📦 Source: &lt;a href="https://github.com/useqai/qai-agent" rel="noopener noreferrer"&gt;github.com/useqai/qai-agent&lt;/a&gt;&lt;br&gt;
📊 Dashboard: &lt;a href="https://useqai.dev/" rel="noopener noreferrer"&gt;useqai.dev&lt;/a&gt;&lt;br&gt;
💬 Live PR comment demo: &lt;a href="https://github.com/useqai/qai-agent/pull/2" rel="noopener noreferrer"&gt;PR #2&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you try it and hit any issues, open an issue or drop a comment here. I'm especially interested in non-Playwright frameworks and edge cases in JUnit parsing.&lt;/p&gt;

</description>
      <category>github</category>
      <category>testing</category>
      <category>devops</category>
      <category>playwright</category>
    </item>
    <item>
      <title>Stop Drowning in CI Noise: QAI Agent Clusters Your Test Failures and Tells You What Actually Broke</title>
      <dc:creator>Tomer Lihovetsky</dc:creator>
      <pubDate>Mon, 16 Mar 2026 13:16:17 +0000</pubDate>
      <link>https://forem.com/tomerli/stop-drowning-in-ci-noise-qai-agent-clusters-your-test-failures-and-tells-you-what-actually-broke-923</link>
      <guid>https://forem.com/tomerli/stop-drowning-in-ci-noise-qai-agent-clusters-your-test-failures-and-tells-you-what-actually-broke-923</guid>
      <description>&lt;p&gt;You open a PR. CI is red. There are 47 failed tests.&lt;/p&gt;

&lt;p&gt;Now what?&lt;/p&gt;

&lt;p&gt;You scroll through a wall of test names. Some look related. Some look flaky. Some are probably the same root cause repeated across 20 test cases. You don't know which to fix first, or whether it's even safe to merge.&lt;/p&gt;

&lt;p&gt;This is CI noise — and it's eating engineering time every single day.&lt;/p&gt;




&lt;h2&gt;
  
  
  What QAI Agent does
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/marketplace/actions/qai-test-intelligence" rel="noopener noreferrer"&gt;QAI Agent&lt;/a&gt; is a GitHub Action that runs after your tests and posts an intelligent summary directly on the pull request.&lt;/p&gt;

&lt;p&gt;It does three things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Clusters failures by root cause&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of showing you 47 test names, it groups tests that failed for the same underlying reason. If 30 tests all hit the same null pointer, that's one cluster — one thing to fix.&lt;/p&gt;

&lt;p&gt;It works by normalizing error messages: stripping timestamps, line numbers, UUIDs, memory addresses, file paths, and variable values, then hashing the result. Tests with the same normalized signature are the same failure.&lt;/p&gt;
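
&lt;p&gt;A minimal sketch of that normalize-then-hash idea in Node.js. The regexes, the placeholder names, and the functions &lt;code&gt;normalize&lt;/code&gt;, &lt;code&gt;failureSignature&lt;/code&gt;, and &lt;code&gt;clusterFailures&lt;/code&gt; are illustrative assumptions, not QAI Agent's actual implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hypothetical sketch of signature-based clustering (not the real QAI code).
const { createHash } = require('node:crypto');

// Each rule swaps one kind of volatile detail for a fixed placeholder.
// Order matters: specific patterns run before the catch-all number rule.
const RULES = [
  [/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi, 'UUID'],
  [/\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(\.\d+)?Z?/g, 'TIMESTAMP'],
  [/0x[0-9a-f]+/gi, 'ADDR'],            // memory addresses
  [/(\/[\w.-]+)+(:\d+)*/g, 'PATH'],     // Unix-style paths, optional :line:col
  [/(["'])[^"']*\1/g, 'VALUE'],         // quoted variable values
  [/\d+/g, 'N'],                        // any remaining numbers
];

// Strip the volatile parts of an error message and collapse whitespace.
function normalize(message) {
  let text = message;
  for (const [pattern, placeholder] of RULES) {
    text = text.replace(pattern, placeholder);
  }
  return text.replace(/\s+/g, ' ').trim();
}

// Hash the normalized message into a short, stable signature.
function failureSignature(message) {
  return createHash('sha256').update(normalize(message)).digest('hex').slice(0, 12);
}

// Group failures ({ name, message }) by signature: one Map entry per cluster.
function clusterFailures(failures) {
  const clusters = new Map();
  for (const failure of failures) {
    const key = failureSignature(failure.message);
    if (!clusters.has(key)) clusters.set(key, []);
    clusters.get(key).push(failure.name);
  }
  return clusters;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With rules like these, two errors that differ only in file path and line number produce the same signature, so they fall into one cluster.&lt;/p&gt;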

&lt;p&gt;&lt;strong&gt;2. Scores PR risk&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Based on the failure rate and the number of unique failure patterns, it outputs a risk level: &lt;code&gt;low&lt;/code&gt;, &lt;code&gt;medium&lt;/code&gt;, or &lt;code&gt;high&lt;/code&gt;. You can use this to automatically block merges on high-risk PRs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Analyzes Playwright traces (optional)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're using Playwright and save traces on failure, QAI Agent will unzip and analyze them locally — no cloud required. It detects five failure categories:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cause&lt;/th&gt;
&lt;th&gt;How it's detected&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;UI Changed&lt;/td&gt;
&lt;td&gt;Locator not found, strict mode violation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend Error&lt;/td&gt;
&lt;td&gt;HTTP 5xx response during test&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test Bug&lt;/td&gt;
&lt;td&gt;Assertion errors in console logs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Timing / Flaky&lt;/td&gt;
&lt;td&gt;Timeout on step&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Environment Failure&lt;/td&gt;
&lt;td&gt;Network failures, ECONNREFUSED&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Setup in 60 seconds
&lt;/h2&gt;

&lt;p&gt;Add one step to your existing workflow, after your tests run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;QAI Agent&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;useqai/qai-agent@v1&lt;/span&gt;
  &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;junit-path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test-results/results.xml'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your workflow needs the &lt;code&gt;pull-requests: write&lt;/code&gt; permission:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;pull-requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run tests&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx playwright test --reporter=junit&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;QAI Agent&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;useqai/qai-agent@v1&lt;/span&gt;
        &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;junit-path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test-results/results.xml'&lt;/span&gt;
          &lt;span class="na"&gt;trace-path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test-results/**/*.zip'&lt;/span&gt;   &lt;span class="c1"&gt;# optional, for RCA&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
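
&lt;p&gt;One thing to check in the workflow above: with &lt;code&gt;--reporter=junit&lt;/code&gt; alone, Playwright's JUnit reporter prints the XML to standard output unless an output file is configured, and traces are only recorded when tracing is enabled. A sketch of the test step, assuming neither is already set in your &lt;code&gt;playwright.config&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;- name: Run tests
  # retain-on-failure keeps a trace.zip only for tests that failed
  run: npx playwright test --reporter=junit --trace retain-on-failure
  env:
    # Write the JUnit XML where junit-path expects it
    PLAYWRIGHT_JUNIT_OUTPUT_NAME: test-results/results.xml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;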



&lt;p&gt;That's it. No account. No API key. No configuration.&lt;/p&gt;

&lt;h2&gt;
  
  
  The PR comment it generates
&lt;/h2&gt;

&lt;p&gt;Every PR gets a comment like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh7wj8cgjznb63fo3arb1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh7wj8cgjznb63fo3arb1.png" alt=" " width="800" height="931"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Risk level and merge recommendation&lt;/li&gt;
&lt;li&gt;Failed tests with their error messages&lt;/li&gt;
&lt;li&gt;Failure clusters (grouped by root cause)&lt;/li&gt;
&lt;li&gt;RCA from Playwright traces (if provided)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The comment is upserted — it updates in place when you push new commits, so it doesn't spam your PR timeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Block merges on high risk
&lt;/h2&gt;

&lt;p&gt;QAI Agent exposes outputs you can use in subsequent steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;QAI Agent&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;qai&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;useqai/qai-agent@v1&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;junit-path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test-results/results.xml'&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Block merge on high risk&lt;/span&gt;
  &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;steps.qai.outputs.risk-level == 'high'&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;echo "High risk — investigate failures before merging"&lt;/span&gt;
    &lt;span class="s"&gt;exit 1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Available outputs: &lt;code&gt;risk-level&lt;/code&gt;, &lt;code&gt;risk-score&lt;/code&gt;, &lt;code&gt;failed-tests&lt;/code&gt;, &lt;code&gt;total-tests&lt;/code&gt;, &lt;code&gt;cluster-count&lt;/code&gt;.&lt;/p&gt;
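
&lt;p&gt;The other outputs can be read the same way. As a rough sketch, a follow-up step could write them to the job summary; the step name and environment variable names here are made up for illustration, and the step assumes the agent step has &lt;code&gt;id: qai&lt;/code&gt; as above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;- name: Summarize QAI result
  if: always()
  env:
    # Pass outputs through env instead of inlining them into the script
    RISK_LEVEL: ${{ steps.qai.outputs.risk-level }}
    RISK_SCORE: ${{ steps.qai.outputs.risk-score }}
    CLUSTERS: ${{ steps.qai.outputs.cluster-count }}
  run: |
    echo "Risk: $RISK_LEVEL (score $RISK_SCORE, $CLUSTERS failure clusters)" &amp;gt;&amp;gt; "$GITHUB_STEP_SUMMARY"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;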

&lt;h2&gt;
  
  
  Works with any JUnit-compatible framework
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Framework&lt;/th&gt;
&lt;th&gt;How to get JUnit output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Playwright&lt;/td&gt;
&lt;td&gt;&lt;code&gt;--reporter=junit&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jest&lt;/td&gt;
&lt;td&gt;&lt;code&gt;--reporters=jest-junit&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vitest&lt;/td&gt;
&lt;td&gt;&lt;code&gt;--reporter=junit&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pytest&lt;/td&gt;
&lt;td&gt;&lt;code&gt;--junitxml=results.xml&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maven/JUnit&lt;/td&gt;
&lt;td&gt;built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Go (gotestsum)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;--junitfile results.xml&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
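
&lt;p&gt;For example, a pytest job could feed its report to QAI Agent like this. The sketch assumes pytest runs from the repository root, so &lt;code&gt;results.xml&lt;/code&gt; lands there:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;- name: Run tests
  run: pytest --junitxml=results.xml
- name: QAI Agent
  uses: useqai/qai-agent@v1
  if: always()                 # still report when pytest exits non-zero
  with:
    junit-path: 'results.xml'  # must match the --junitxml path
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;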

&lt;h2&gt;
  
  
  What it doesn't do without the cloud
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;No historical context — without connecting a cloud backend, QAI Agent only sees the current run. It can't tell you "this failure has been flaky for 3 weeks."&lt;/li&gt;
&lt;li&gt;No LLM explanations — the RCA is rule-based, not AI-generated. It detects categories of failure, not the specific cause in your code.&lt;/li&gt;
&lt;li&gt;Playwright traces only — the RCA only works with Playwright trace zip files, not with other test frameworks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The cloud platform at &lt;a href="https://useqai.dev" rel="noopener noreferrer"&gt;useqai.dev&lt;/a&gt; adds historical trend charts across all runs, flakiness tracking, cross-repo visibility, and LLM-powered root cause analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;ul&gt;
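&lt;li&gt;GitHub Action: &lt;a href="https://github.com/marketplace/actions/qai-test-intelligence" rel="noopener noreferrer"&gt;useqai/qai-agent&lt;/a&gt; on the Marketplace&lt;/li&gt;
&lt;li&gt;Source: &lt;a href="https://github.com/useqai/qai-agent" rel="noopener noreferrer"&gt;github.com/useqai/qai-agent&lt;/a&gt;
&lt;/li&gt;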
&lt;li&gt;Live PR comment demo: &lt;a href="https://github.com/useqai/qai-agent/pull/2" rel="noopener noreferrer"&gt;https://github.com/useqai/qai-agent/pull/2&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Dashboard: &lt;a href="https://useqai.dev" rel="noopener noreferrer"&gt;https://useqai.dev&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you try it, open an issue or leave a comment here — especially if you run into a framework or JUnit variant that doesn't parse correctly. Happy to fix it.&lt;/p&gt;

</description>
      <category>github</category>
      <category>testing</category>
      <category>devops</category>
      <category>playwright</category>
    </item>
  </channel>
</rss>
