<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Let's Automate 🛡️</title>
    <description>The latest articles on Forem by Let's Automate 🛡️ (@letsautomate).</description>
    <link>https://forem.com/letsautomate</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3582938%2Fd47e0b42-428a-4790-af53-79366dc1e7fc.png</url>
      <title>Forem: Let's Automate 🛡️</title>
      <link>https://forem.com/letsautomate</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/letsautomate"/>
    <language>en</language>
    <item>
      <title>How to Generate Cypress, Playwright, and WebdriverIO Tests From Natural Language Using AI</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Mon, 04 May 2026 23:01:02 +0000</pubDate>
      <link>https://forem.com/qa-leaders/how-to-generate-cypress-playwright-and-webdriverio-tests-from-natural-language-using-ai-57d5</link>
      <guid>https://forem.com/qa-leaders/how-to-generate-cypress-playwright-and-webdriverio-tests-from-natural-language-using-ai-57d5</guid>
      <description>&lt;h4&gt;
  
  
  A step-by-step breakdown of an open-source platform that converts plain English requirements into runnable E2E tests — no manual coding required
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;Writing end-to-end tests is one of those things every developer knows they should do well and almost nobody actually enjoys. You spend an hour getting a Playwright spec to click the right button, another hour figuring out why the selector breaks in CI, and by then the feature has already been redesigned anyway.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsjx8qa9324xokwtrmoub.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsjx8qa9324xokwtrmoub.png" width="800" height="597"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;So when I came across a project that lets you describe what you want to test in plain English — and then generates the actual test code — I had to dig in.&lt;/p&gt;

&lt;p&gt;The project is called &lt;strong&gt;AI Natural Language Tests&lt;/strong&gt; , built under AI Quality Lab. It is open source on GitHub, has a published academic DOI on Zenodo, and as of this week just shipped v5.0.0. You can also try it right now in your browser on Hugging Face Spaces — no installation needed.&lt;/p&gt;

&lt;p&gt;Here is what it does, how it works, and why it deserves a spot in your QA toolkit.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Core Idea
&lt;/h3&gt;

&lt;p&gt;Instead of writing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#username&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;admin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nx"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#password&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;secret&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nx"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[type=submit]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nx"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Dashboard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;should&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;be.visible&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;You just say:&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Test login with valid credentials"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;The platform reads that sentence, visits the URL you point it at, analyzes the live HTML to find the actual form fields and selectors, then generates a complete runnable test — in Cypress, Playwright, or WebdriverIO, whichever you prefer.&lt;/p&gt;

&lt;p&gt;That is the pitch. But the internals are more interesting than the pitch.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  What Is Actually Happening Under the Hood
&lt;/h3&gt;

&lt;p&gt;This is not a thin wrapper around a ChatGPT call. It runs a structured five-step workflow built with LangGraph — and each step has a clear purpose.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Step 1 — Understand the page.&lt;/strong&gt; When you pass a --url, the system fetches the live HTML and extracts real selectors, form fields, and interactive elements. This is what prevents it from hallucinating IDs that do not exist on your page.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2 — Check memory.&lt;/strong&gt; The system keeps a vector database (FAISS + SQLite) of patterns from every test it has previously generated. Before writing anything new, it searches for similar past tests using semantic similarity. If it has seen a login flow before, it reuses what worked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3 — Generate with an LLM.&lt;/strong&gt; The actual test code is produced by your choice of LLM — OpenAI, Anthropic Claude, or Google Gemini. LangChain handles prompt templating and output parsing, while LangGraph turns the multi-step flow into a repeatable, auditable pipeline rather than a single prompt-and-pray call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4 — Optional human review.&lt;/strong&gt; There is a --approve flag that pauses execution before saving the generated test and asks a human to confirm. This Human-in-the-Loop gate is especially useful when running the tool against production-critical flows where you want a set of eyes before anything gets committed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5 — Run it.&lt;/strong&gt; Pass --run and the tool immediately executes the generated test through the framework runner. If it fails, an AI-assisted failure analyzer categorizes the error and suggests a fix — more on that below.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Getting Started Takes About Five Minutes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/aiqualitylab/ai-natural-language-tests.git
&lt;span class="nb"&gt;cd &lt;/span&gt;ai-natural-language-tests
python &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate &lt;span class="c"&gt;# macOS/Linux&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
npm ci
npx playwright &lt;span class="nb"&gt;install &lt;/span&gt;chromium
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Add your API key to a .env file:&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;=&lt;span class="n"&gt;your_key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Then generate and immediately run a test:&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python qa_automation.py &lt;span class="s2"&gt;"Test login with valid credentials"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; https://the-internet.herokuapp.com/login &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--framework&lt;/span&gt; playwright &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--run&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;That single command fetches the page, generates a .spec.ts file, and runs it through Playwright — without you writing a line of test code.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you just want to see it work before installing anything, the live Hugging Face Spaces demo lets you paste in a requirement and watch the generation happen in real time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Three Frameworks, One Workflow
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The tool supports all three major E2E frameworks with the same natural language interface. You switch between them with a single flag:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frttlwbvfb4ip038fe7ny.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frttlwbvfb4ip038fe7ny.png" width="800" height="484"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;3 Frameworks, One Workflow&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The Cypress integration is worth noting specifically — it supports two distinct modes. The traditional mode generates standard Cypress code. The prompt-powered mode uses cy.prompt() to keep natural language embedded directly in the test, which is useful for teams exploring the newer AI-native Cypress APIs.&lt;/p&gt;

&lt;p&gt;If your team is mid-migration from Cypress to Playwright, you can generate equivalent tests in both frameworks from the same requirement and compare them side by side.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkoezj8e24zy0ejetzzr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkoezj8e24zy0ejetzzr.png" width="760" height="827"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Writing Prompts That Actually Work
&lt;/h3&gt;

&lt;p&gt;The output quality depends heavily on how specific you are. A few patterns that work well:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Name the expected outcome.&lt;/strong&gt; “Test login fails with wrong password and shows an error message” produces a far more precise test than “Test login.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chain multiple requirements.&lt;/strong&gt; You can pass several prompts in one run: "Test login" "Test logout" --url  — each gets its own generated file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Always use&lt;/strong&gt;  &lt;strong&gt;--url.&lt;/strong&gt; Giving the tool a real page means it reads actual HTML instead of guessing selector names. This is the single biggest factor in test quality, because the generator extracts real element IDs and attributes from the live DOM.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Some practical examples:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz2gcbvpskpjoh9k7v41v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz2gcbvpskpjoh9k7v41v.png" width="800" height="305"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Usage :&lt;/em&gt;&lt;/strong&gt; &lt;a href="https://github.com/aiqualitylab/ai-natural-language-tests#usage" rel="noopener noreferrer"&gt;https://github.com/aiqualitylab/ai-natural-language-tests#usage&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  When Tests Fail: AI-Assisted Diagnosis
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;One of the more practical features is the failure analyzer. Instead of staring at a cryptic Cypress error, you pass it to the tool:&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python qa_automation.py &lt;span class="nt"&gt;--analyze&lt;/span&gt; &lt;span class="s2"&gt;"CypressError: Element not found"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;The analyzer categorizes the error into one of ten types — SELECTOR, TIMING, ASSERTION, NETWORK, STATE, NAVIGATION, INTERACTION, CONFIGURATION, ENVIRONMENT, or DYNAMIC_URL — then gives you a plain-English explanation of the root cause and a concrete suggestion for fixing it.&lt;/p&gt;

&lt;p&gt;You can also pipe in a full log file: python qa_automation.py --analyze -f error.log&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Quality Evaluation Layer
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;This is the part most people skip over in the README, but it is arguably the most important piece for teams that care about reliability.&lt;/p&gt;

&lt;p&gt;Generating test code is only valuable if the generated tests are actually correct. The project includes two evaluation scripts that measure whether the output is grounded in the real page content.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline evaluation (no API key needed).&lt;/strong&gt; The ragas_nlp_evaluator.py script compares generated output against a reference dataset using ROUGE and string similarity metrics. It runs entirely offline, exits with a non-zero code if quality drops below a configurable threshold, and is designed to run as a fast CI gate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LLM-based evaluation (requires OpenAI key).&lt;/strong&gt; The ragas_evaluator.py script goes further. It fetches the live page HTML, uses GPT-4o-mini to answer the test requirement using that HTML, then scores the generated test on four dimensions: faithfulness to the page, relevance to the requirement, context precision, and context recall.&lt;/p&gt;

&lt;p&gt;Both evaluators are wired into the GitHub Actions CI pipeline. The offline script runs first as a baseline check. If it passes, three parallel jobs spin up — one per framework — each generating tests, evaluating them with the LLM evaluator, and then executing them. If the score drops below threshold, the pipeline blocks before the tests even run.&lt;/p&gt;

&lt;p&gt;You are not shipping generated tests blindly. You have a measurable, automated quality signal at every stage.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Docker and CI/CD
&lt;/h3&gt;

&lt;p&gt;The project ships pre-built Docker images on GitHub Container Registry. You can skip the clone entirely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;docker pull ghcr.io/aiqualitylab/ai-natural-language-tests:latest

docker run --rm \
  -e OPENAI_API_KEY=your_key \
  ghcr.io/aiqualitylab/ai-natural-language-tests:latest \
  "Test login" --url [https://the-internet.herokuapp.com/login](https://the-internet.herokuapp.com/login)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;For CI/CD, pin to a specific release tag (v5.0.0) rather than latest for reproducibility. The recommended pipeline stages cover dependency installation, NLP baseline evaluation, test generation, LLM evaluation, test execution, and optional telemetry export to Grafana Tempo and Loki.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Observability (Optional but Thoughtful)
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;If your team runs Grafana, the project has native OpenTelemetry integration that exports traces to Grafana Tempo and ships logs to Loki. This is entirely optional — leaving the relevant environment variables unset disables it completely. But for teams that already operate a Grafana stack, having AI test generation traces alongside your application traces is a genuinely useful debugging surface.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  What It Does Not Do Yet
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;To be fair about the limits: the current CLI works through URL-driven generation. A --data flag for passing raw JSON specifications directly is not implemented yet. If your tests target APIs or non-rendered content, you will need to adapt. Given the active release cadence — nine releases with v5.0.0 landing this week — that gap may close soon.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Why This Matters Beyond the Tool Itself
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The bottleneck in most QA pipelines is not running tests — it is writing them. Engineers skip test authoring because it is slow, tedious, and breaks constantly as UIs change. This tool makes the first draft essentially free, which lowers the activation energy enough that more tests actually get written.&lt;/p&gt;

&lt;p&gt;The pattern memory design compounds the value over time. Every test the system generates gets stored as a vector embedding. Future generations for similar requirements pull from those patterns, so the output becomes more consistent and more project-specific as usage grows. It is not just generating tests in isolation — it is building institutional knowledge about how your application is structured.&lt;/p&gt;

&lt;p&gt;The Ragas evaluation layer means you can measure whether that knowledge is accurate, and block on it in CI if it is not.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Try It
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The project is open source at &lt;a href="https://github.com/aiqualitylab/ai-natural-language-tests" rel="noopener noreferrer"&gt;github.com/aiqualitylab/ai-natural-language-tests&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Want to experiment without installing anything? The live demo is on &lt;a href="https://huggingface.co/spaces/aiqualitylab/ai-natural-language-tests" rel="noopener noreferrer"&gt;Hugging Face Spaces&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx5d10sh04p46o7ezy9as.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx5d10sh04p46o7ezy9as.png" width="800" height="948"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Hugging Face Space&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;em&gt;Are you using AI-assisted test generation in your pipeline?&lt;/em&gt;
&lt;/h4&gt;

&lt;h4&gt;
  
  
  &lt;em&gt;Share what has worked — and what has not.&lt;/em&gt;
&lt;/h4&gt;




</description>
      <category>programming</category>
      <category>testautomation</category>
      <category>devops</category>
      <category>artificialintelligen</category>
    </item>
    <item>
      <title>QA Bug Triage Pipeline: From App Reviews to Searchable Bug Reports</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Tue, 28 Apr 2026 19:02:48 +0000</pubDate>
      <link>https://forem.com/qa-leaders/qa-bug-triage-pipeline-from-app-reviews-to-searchable-bug-reports-12f4</link>
      <guid>https://forem.com/qa-leaders/qa-bug-triage-pipeline-from-app-reviews-to-searchable-bug-reports-12f4</guid>
      <description>&lt;h4&gt;
  
  
  A simple Python project that turns messy user reviews into structured QA bug reports using an LLM and RAG.
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;📖&lt;/em&gt; &lt;strong&gt;&lt;em&gt;Full guide:&lt;/em&gt;&lt;/strong&gt; &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-qa-bug-triage-pipeline" rel="noopener noreferrer"&gt;&lt;em&gt;blog.aiqualitylab.org&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-qa-bug-triage-pipeline?id=why-this-project" rel="noopener noreferrer"&gt;Why this project&lt;/a&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Product teams get lots of feedback, but most of it is noisy and unstructured. This project helps QA teams convert that feedback into consistent bug records that are easy to search and summarize.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2A8Kl_zZ2FSBygGo_w" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2A8Kl_zZ2FSBygGo_w" width="1024" height="1365"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Guille B on Unsplash&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-qa-bug-triage-pipeline?id=what-it-does" rel="noopener noreferrer"&gt;What it does&lt;/a&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Collects reviews from Google Play&lt;/p&gt;

&lt;p&gt;Routes review text (bug report vs non-bug)&lt;/p&gt;

&lt;p&gt;Generates structured JSON bug reports with an LLM&lt;/p&gt;

&lt;p&gt;Stores bugs in ChromaDB for semantic retrieval&lt;/p&gt;

&lt;p&gt;Adds BM25 keyword matching for hybrid search&lt;/p&gt;

&lt;p&gt;Produces short AI summaries for triage&lt;/p&gt;

&lt;p&gt;Lets you clear the stored bugs from the UI&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-qa-bug-triage-pipeline?id=quick-start" rel="noopener noreferrer"&gt;Quick start&lt;/a&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv
.&lt;span class="se"&gt;\.&lt;/span&gt;venv&lt;span class="se"&gt;\S&lt;/span&gt;cripts&lt;span class="se"&gt;\A&lt;/span&gt;ctivate.ps1
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
python app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then open the local Gradio URL.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-qa-bug-triage-pipeline?id=api-key" rel="noopener noreferrer"&gt;API key&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;This app uses BYOK (Bring Your Own Key):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Paste your OpenAI API key in the UI&lt;/p&gt;

&lt;p&gt;The key is masked&lt;/p&gt;

&lt;p&gt;Do not commit keys to source control&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-qa-bug-triage-pipeline?id=main-files" rel="noopener noreferrer"&gt;Main files&lt;/a&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;app.py: Gradio app flows&lt;/p&gt;

&lt;p&gt;collect.py: review collection&lt;/p&gt;

&lt;p&gt;triage.py: routing and structured triage logic&lt;/p&gt;

&lt;p&gt;rag.py: storage and hybrid retrieval&lt;/p&gt;

&lt;p&gt;eval/eval.py: evaluation script&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-qa-bug-triage-pipeline?id=evaluation-sample" rel="noopener noreferrer"&gt;Evaluation sample&lt;/a&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Answer Relevancy: 0.868&lt;/p&gt;

&lt;p&gt;Faithfulness: 0.292&lt;/p&gt;

&lt;p&gt;Context Precision: 0.020&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-qa-bug-triage-pipeline?id=cost-target" rel="noopener noreferrer"&gt;Cost target&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;For a short demo session, the expected usage is typically under $0.50.&lt;/p&gt;

&lt;p&gt;Tips:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Keep review count low (5 to 10)&lt;/p&gt;

&lt;p&gt;Avoid repeated large collection runs&lt;/p&gt;

&lt;p&gt;Use short test inputs when validating triage&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-qa-bug-triage-pipeline?id=tech-stack" rel="noopener noreferrer"&gt;Tech stack&lt;/a&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Python&lt;/p&gt;

&lt;p&gt;Gradio&lt;/p&gt;

&lt;p&gt;OpenAI GPT-4o&lt;/p&gt;

&lt;p&gt;ChromaDB&lt;/p&gt;

&lt;p&gt;rank-bm25&lt;/p&gt;

&lt;p&gt;RAGAS&lt;/p&gt;

&lt;p&gt;google-play-scraper&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This project is useful for QA teams that want a lightweight bug triage assistant with searchable bug intelligence and fast summaries.&lt;/p&gt;

</description>
      <category>testautomation</category>
      <category>llm</category>
      <category>qaautomation</category>
      <category>artificialintelligen</category>
    </item>
    <item>
      <title>Prompt Injection Attacks Are Breaking AI Products — Here’s How to Stop Them</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Sat, 25 Apr 2026 12:23:02 +0000</pubDate>
      <link>https://forem.com/qa-leaders/prompt-injection-attacks-are-breaking-ai-products-heres-how-to-stop-them-4c76</link>
      <guid>https://forem.com/qa-leaders/prompt-injection-attacks-are-breaking-ai-products-heres-how-to-stop-them-4c76</guid>
      <description>&lt;h4&gt;
  
  
  The Simple, Non-Technical Guide to Defensive Prompting: How to Protect Your LLM-Powered App Before Someone Exploits It
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;📖&lt;/em&gt; &lt;strong&gt;&lt;em&gt;Full guide:&lt;/em&gt;&lt;/strong&gt; &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-Prompt-Injection-Attacks-Are-Breaking-AI-Products" rel="noopener noreferrer"&gt;&lt;em&gt;blog.aiqualitylab.org&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Your AI is only as safe as the thought you put into protecting it. Prompts aren’t just instructions — they’re the rules your AI lives by. Protect them like you’d protect any critical part of your product.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2A9qPV4Cq5MPfoEmEz" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2A9qPV4Cq5MPfoEmEz" width="1024" height="668"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Nik Shuliahin 💛💙 on Unsplash&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The teams winning at AI aren’t just the ones moving fast. They’re the ones moving fast &lt;em&gt;and&lt;/em&gt; thinking about this.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-Prompt-Injection-Attacks-Are-Breaking-AI-Products?id=ai-is-normal-now-the-problems-aren39t" rel="noopener noreferrer"&gt;AI Is Normal Now. The Problems Aren’t.&lt;/a&gt;
&lt;/h3&gt;

</description>
      <category>testautomation</category>
      <category>llm</category>
      <category>artificialintelligen</category>
      <category>aisecurity</category>
    </item>
    <item>
      <title>GitHub Copilot CLI Remote: Control Your AI Coding Agent From Phone and Web</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Fri, 17 Apr 2026 17:10:11 +0000</pubDate>
      <link>https://forem.com/qa-leaders/github-copilot-cli-remote-control-your-ai-coding-agent-from-phone-and-web-cki</link>
      <guid>https://forem.com/qa-leaders/github-copilot-cli-remote-control-your-ai-coding-agent-from-phone-and-web-cki</guid>
      <description>&lt;h4&gt;
  
  
  New copilot --remote preview lets you steer Copilot CLI sessions from GitHub.com and GitHub Mobile — here's what it does and why it matters
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;Full guide, team scenarios, and honest limitations:&lt;/strong&gt; &lt;a href="https://blog.aiqualitylab.org/#/blog/2026-04-github-copilot-cli-remote" rel="noopener noreferrer"&gt;blog.aiqualitylab.org&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;💻 &lt;strong&gt;Source on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/aiqualitylab/blog" rel="noopener noreferrer"&gt;aiqualitylab/blog&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Official GitHub changelog:&lt;/strong&gt; &lt;a href="https://github.blog/changelog/2026-04-13-remote-control-cli-sessions-on-web-and-mobile-in-public-preview/" rel="noopener noreferrer"&gt;Remote control CLI sessions on web and mobile&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;If you use AI coding tools in your terminal, you know the problem. You start a 20-minute task, step away, and come back to find the agent stalled — waiting for you to approve something ten minutes ago.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;On April 13, GitHub shipped a fix:&lt;/em&gt; &lt;em&gt;copilot --remote.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffs5mt0pom7m1587rjdt4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffs5mt0pom7m1587rjdt4.png" width="800" height="217"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;GitHub Copilot CLI Remote: Control Your AI Coding Agent From Phone and Web&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What it does
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Turn on remote mode and your CLI session streams to GitHub in real time. Your terminal shows a link and a QR code. Open it on any phone or browser, and you get a live, two-way view. You can send messages, approve permissions, switch modes, and stop the session — all from your phone.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  How to turn it on
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;copilot &lt;span class="nt"&gt;--remote&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;You need to be in a GitHub repo.&lt;/p&gt;

&lt;p&gt;Copilot Business and Enterprise users need an admin to enable the policy first.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>agents</category>
      <category>softwaredevelopment</category>
      <category>githubcopilotremote</category>
      <category>artificialintelligen</category>
    </item>
    <item>
      <title>AI-Assisted Testing vs AI Agents vs AI Agent Skills: A Practical Journey Through All Three</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Sat, 07 Mar 2026 13:08:54 +0000</pubDate>
      <link>https://forem.com/qa-leaders/ai-assisted-testing-vs-ai-agents-vs-ai-agent-skills-a-practical-journey-through-all-three-48dj</link>
      <guid>https://forem.com/qa-leaders/ai-assisted-testing-vs-ai-agents-vs-ai-agent-skills-a-practical-journey-through-all-three-48dj</guid>
      <description>&lt;h4&gt;
  
  
  Most teams are only using one layer of AI in testing. Here is what the full picture looks like — and how I built across all three.
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2AOHLYcxWt1ZlY-T2z" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2AOHLYcxWt1ZlY-T2z" width="1024" height="1383"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Possessed Photography on Unsplash&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Before any of this made sense, I had to answer a more basic question: what does AI QA Engineering actually mean?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://medium.com/ai-in-quality-assurance/what-is-ai-qa-engineering-and-why-qaes-sdets-and-qa-automation-engineers-should-pay-attention-e8d26e460153" rel="noopener noreferrer"&gt;What is AI QA Engineering — and Why QAEs, SDETs, and QA Automation Engineers Should Pay Attention&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;And before touching AI at all — the foundations still matter. Clean BDD tests. Reports that stakeholders can read.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://aiqualityengineer.com/how-to-add-beautiful-bdd-test-reports-to-your-reqnroll-project-using-expressium-livingdoc-aafaf799523d" rel="noopener noreferrer"&gt;How to Add Beautiful BDD Test Reports to Your Reqnroll Project Using Expressium LivingDoc&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Before you automate smarter, you have to know what good looks like.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Layer 1 — AI-Assisted Testing
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;AI speeds you up. You are still driving.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is where most teams start — and where most teams stay.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;You write a prompt, get a test, review it, ship it. AI is a productivity multiplier. GitHub Copilot suggests the next line. ChatGPT drafts your test cases. Claude rewrites a flaky selector. You are in control at every step.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The catch? A bad prompt gives you a bad test — and it will look convincing. Garbage in, confident garbage out.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://blog.gopenai.com/crafting-effective-prompts-for-genai-in-software-testing-e5f76d2ccbf6" rel="noopener noreferrer"&gt;Crafting Effective Prompts for GenAI in Software Testing&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I built &lt;a href="https://github.com/aiqualitylab/ai-natural-language-tests" rel="noopener noreferrer"&gt;&lt;strong&gt;ai-natural-language-tests&lt;/strong&gt;&lt;/a&gt; at this layer. Give it a plain English requirement, and it generates Cypress or Playwright tests using GPT-4, LangChain, and LangGraph. Every output still needs your eyes on it — but the heavy lifting is done.&lt;/p&gt;

&lt;p&gt;Same idea with &lt;a href="https://github.com/aiqualitylab/JIRA-QA-Automation-with-AI" rel="noopener noreferrer"&gt;&lt;strong&gt;JIRA-QA-Automation-with-AI&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;:&lt;/strong&gt; feed it a JIRA story with acceptance criteria, and BDD test scripts come out the other side. Human judgment still required at the end. You own every decision.&lt;/p&gt;

&lt;p&gt;That last part is the definition of this layer.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Layer 2 — AI Agents for Testing
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;You give the goal. The agent executes, adapts, and decides.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;At this layer, you stop steering and start delegating.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;You set the objective. The agent figures out how to get there — and when something breaks mid-run, it handles that too. No human in the loop for every step.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/aiqualitylab/selenium-selfhealing-mcp" rel="noopener noreferrer"&gt;&lt;strong&gt;selenium-selfhealing-mcp&lt;/strong&gt;&lt;/a&gt; is a good example of what this looks like in practice. A UI change breaks a Selenium locator mid-execution. The agent inspects the DOM, finds the updated element, and keeps going — without stopping to ask you what to do. I submitted this to the Docker MCP Registry, and watching it recover from failures on its own still feels like a step-change from Layer 1.&lt;/p&gt;

&lt;p&gt;For .NET teams, &lt;a href="https://github.com/aiqualitylab/SeleniumSelfHealing.Reqnroll" rel="noopener noreferrer"&gt;&lt;strong&gt;SeleniumSelfHealing.Reqnroll&lt;/strong&gt;&lt;/a&gt; does the same with C#, NUnit, Reqnroll, and Semantic Kernel. And &lt;a href="https://github.com/aiqualitylab/IntelliTest" rel="noopener noreferrer"&gt;&lt;strong&gt;IntelliTest&lt;/strong&gt;&lt;/a&gt; takes it further — write your assertions in plain English, and the agent decides whether the application behaviour actually matches the intent.&lt;/p&gt;

&lt;p&gt;But there is a trap at this layer. Agents move fast and look thorough. It is easy to trust the output and skip the checks. Coverage looks complete — but the agent may have tested the wrong thing entirely.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://medium.com/ai-in-quality-assurance/the-ai-qa-engineers-decision-framework-when-not-to-use-ai-in-testing-5be256108750" rel="noopener noreferrer"&gt;The AI QA Engineer’s Decision Framework: When NOT to Use AI in Testing&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;And if you are using AI agents to run tests, a harder question follows: how do you know the agent’s output is correct? That is the LLM evaluation problem, and it turns out to be one of the most interesting unsolved problems in this space.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://medium.com/ai-in-quality-assurance/llm-evaluation-explained-how-to-know-if-your-ai-is-actually-working-7c17ba59c3f4" rel="noopener noreferrer"&gt;LLM Evaluation Explained: How to Know If Your AI Is Actually Working&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 3 — AI Agent Skills
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Not a tool. Not an agent. Expertise that travels.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Layer 3 is the one most people have not thought about yet.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Here is the pattern I kept running into: every new agent project started from scratch. New codebase, new prompts, same underlying knowledge — how to read a requirement, what makes a test meaningful, when to flag a risk. The expertise was always being rebuilt. That seemed wrong.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A skill is a portable, encoded unit of expertise. It is not tied to one agent or one project. Any compatible agent can load it and apply it — without rebuilding the logic again. You build it once, and it travels.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://medium.com/ai-in-quality-assurance/github-copilot-agent-skills-teaching-ai-your-repository-patterns-01168b6d7a25" rel="noopener noreferrer"&gt;GitHub Copilot Agent Skills: Teaching AI Your Repository Patterns&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://github.com/aiqualitylab/vibe-coding-checklist" rel="noopener noreferrer"&gt;&lt;strong&gt;vibe-coding-checklist&lt;/strong&gt;&lt;/a&gt; applies the same idea to AI code review — a shared quality framework that any team or any agent can use consistently.&lt;/p&gt;

&lt;p&gt;The shift in thinking is subtle but significant. At Layer 1, you build prompts and tools. At Layer 2, you build goals and trust boundaries. At Layer 3, you build expertise itself — in a form that outlasts any single project or team.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Difference That Matters
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcctx1duwy2nixyo5ieop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcctx1duwy2nixyo5ieop.png" width="800" height="315"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;AI-Assisted Testing vs AI Agents vs AI Agent Skills&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Three layers. All called AI testing. Now you know which one you are actually in.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;All repos →&lt;/em&gt; &lt;a href="https://github.com/aiqualitylab" rel="noopener noreferrer"&gt;&lt;em&gt;github.com/aiqualitylab&lt;/em&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;More writing →&lt;/em&gt; &lt;a href="https://aiqualityengineer.com/" rel="noopener noreferrer"&gt;&lt;em&gt;aiqualityengineer.com&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




</description>
      <category>testautomation</category>
      <category>softwareengineering</category>
      <category>artificialintelligen</category>
      <category>agents</category>
    </item>
    <item>
      <title>The GitHub Copilot Features That Are Quietly Draining Your Premium Requests</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Thu, 19 Feb 2026 17:19:23 +0000</pubDate>
      <link>https://forem.com/qa-leaders/the-github-copilot-features-that-are-quietly-draining-your-premium-requests-i34</link>
      <guid>https://forem.com/qa-leaders/the-github-copilot-features-that-are-quietly-draining-your-premium-requests-i34</guid>
      <description>&lt;h4&gt;
  
  
  &lt;em&gt;10 optimisations most developers miss — including why the Copilot Coding Agent beats Agent Mode Chat every time&lt;/em&gt;
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Most developers hit their monthly limit in the first week. Here’s what’s actually happening under the hood — and how to work smarter before it happens to you.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2APnmZ7qNMCsXjh1RO" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2APnmZ7qNMCsXjh1RO" width="1024" height="683"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Resume Genius on Unsplash&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Before diving in, it helps to understand what GitHub Copilot actually counts as a premium request, because most developers don’t find out until it’s too late.&lt;/p&gt;

&lt;p&gt;Inline code completions on paid plans are unlimited and cost nothing. What drains your monthly allowance is everything else — Copilot Chat, Agent Mode, Copilot Code Review, Copilot CLI, and the Copilot Coding Agent.&lt;/p&gt;

&lt;p&gt;Each model also carries a multiplier. Some models are included free on paid plans. Once your allowance is gone, premium features are locked for the rest of the billing cycle.&lt;/p&gt;

&lt;p&gt;Knowing that, here’s how to make every request count.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;1. Name your functions like they’re instructions&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Inline autocomplete is unlimited on paid plans and costs nothing from your premium allowance. The more precisely you name a function, the more accurately Copilot completes the body without any Chat involved. This is your primary tool, not a fallback.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;2. Write your intent as a comment above the cursor&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A detailed comment placed directly before your cursor is treated by Copilot as an instruction. You get the same outcome as a Chat message at zero premium cost. Use this for any logic you would otherwise describe to Copilot in conversation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;3. Cycle through alternatives with&lt;/strong&gt; &lt;strong&gt;Alt+] before opening Chat&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When the first inline suggestion misses, most developers immediately reach for Chat. Before doing that, cycle through alternative suggestions. The second or third option is often exactly what’s needed — and one saved Chat message multiplies across a full day of work.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;4. Disable Agent Mode when you’re not actively using it&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Agent Mode runs in the background and silently runs even when you’re not directing it. GitHub’s official documentation explicitly flags this as a common cause of unexpected quota drain. Disable it in your repository settings when it isn’t part of your current workflow.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;5. Use the Copilot Coding Agent for complex tasks instead of Agent Mode Chat&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This is one of the least-known optimisations available. The Copilot Coding Agent — the one that creates and modifies pull requests asynchronously — counts as one premium request per full session regardless of how much work it does. Agent Mode Chat charges one premium request per message, multiplied by the model rate. For any task involving multiple files or significant implementation work, the Coding Agent is dramatically more efficient.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;6. Start a new Chat thread when switching topics&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;As a conversation grows, all prior messages remain in context and contribute to token consumption. GitHub’s documentation specifically calls this out as a driver of elevated usage. When you move to a new task or a different area of your codebase, start a fresh thread rather than continuing an existing one.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;7. Understand the model multiplier before choosing one&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Before switching to a powerful model, weigh whether the capability gain justifies the cost. For most day-to-day work, it doesn’t.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;8. Use auto model selection for a built-in discount&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When you enable auto model selection in Copilot Chat in VS Code, GitHub applies a 10% multiplier discount across all premium model usage. It requires no change to your workflow and the saving compounds quietly across a full month.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;9. Use&lt;/strong&gt;  &lt;strong&gt;#file references instead of&lt;/strong&gt;  &lt;strong&gt;&lt;a class="mentioned-user" href="https://dev.to/workspace"&gt;@workspace&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a class="mentioned-user" href="https://dev.to/workspace"&gt;@workspace&lt;/a&gt; scans your entire codebase on every message, consuming more than most questions require. Using #file:yourfile.ts targets exactly the context Copilot needs, which produces more focused answers with less back-and-forth and fewer requests spent getting there.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;10. Set a budget alert before your allowance runs out&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;GitHub lets you configure alerts at 75%, 90%, and 100% of any spending threshold you define. Setting a low or zero spending budget with alerts enabled means you get notified well before premium features are cut off — without risking unexpected charges. Check your current usage anytime at &lt;strong&gt;github.com/settings/billing&lt;/strong&gt; or through the Copilot icon in your IDE status bar.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Principle Underneath All of It
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Every tip here points back to the same question worth asking before you open Chat: is&lt;/em&gt; &lt;strong&gt;&lt;em&gt;there a way to get this through autocomplete instead?&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Reference — &lt;a href="https://docs.github.com/en/copilot" rel="noopener noreferrer"&gt;https://docs.github.com/en/copilot&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Most of the time, there is. And building that habit is what separates developers who hit the wall in week one from those who reach month end with room to spare.&lt;/p&gt;
&lt;/blockquote&gt;




</description>
      <category>ai</category>
      <category>development</category>
      <category>softwaredevelopment</category>
      <category>softwaretesting</category>
    </item>
    <item>
      <title>AI Natural Language Tests — Dual Framework Test Automation with Cypress &amp; Playwright</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Sun, 01 Feb 2026 16:55:23 +0000</pubDate>
      <link>https://forem.com/qa-leaders/ai-natural-language-tests-dual-framework-test-automation-with-cypress-playwright-1khp</link>
      <guid>https://forem.com/qa-leaders/ai-natural-language-tests-dual-framework-test-automation-with-cypress-playwright-1khp</guid>
      <description>&lt;h3&gt;
  
  
  AI Natural Language Tests — Dual Framework Test Automation with Cypress &amp;amp; Playwright
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Open-source AI test automation framework with natural language test generation, self-healing, and dual framework support
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;Writing end-to-end tests is one of those things every team knows they should do, but nobody really enjoys doing. You stare at a login page, figure out the selectors, write the steps, handle the waits, and repeat this for every feature. I kept thinking — what if I could just say what I want to test, and let AI handle the rest?&lt;/p&gt;

&lt;p&gt;That’s exactly what I built.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fre19sjdwnfg3xlj0bw42.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fre19sjdwnfg3xlj0bw42.png" width="784" height="718"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Architecture&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  What Is It?
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://github.com/aiqualitylab/ai-natural-language-tests" rel="noopener noreferrer"&gt;&lt;strong&gt;ai-natural-language-tests&lt;/strong&gt;&lt;/a&gt; is an open-source tool that takes a plain English description of a test scenario and generates a fully working Cypress or Playwright test file. No templates. No copy-pasting. You describe the test, point it at a URL, and it writes the code.&lt;/p&gt;

&lt;p&gt;Here’s what a typical command looks like:&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py "Test login with valid credentials" --url https://the-internet.herokuapp.com/login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;That single line does everything — fetches the page, reads the HTML, picks up the right selectors, and generates a complete test file you can run immediately.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Want Playwright instead of Cypress? Just add a flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py "Test login with valid credentials" --url https://the-internet.herokuapp.com/login --framework playwright
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  How It Actually Works
&lt;/h3&gt;

&lt;p&gt;Under the hood, the tool runs a 5-step workflow built with LangGraph:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9yynpcdmfm0ci9rsxkbp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9yynpcdmfm0ci9rsxkbp.png" width="784" height="1029"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Complete Workflow&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Step 1 — It sets up a vector store. Think of this as a memory bank for test patterns.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Step 2 — It fetches the target URL, pulls the HTML, and extracts useful selectors like input fields, buttons, and links.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Step 3 — It searches the vector store for similar tests it has generated before. If you tested a login page last week, it remembers the patterns.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Step 4 — It sends everything to GPT-4 along with a carefully crafted prompt — the description, the selectors, and any matching patterns from history. The AI generates the actual test code.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Step 5 — Optionally, it runs the test right away using Cypress or Playwright.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The interesting part is Step 3. Every test the tool generates gets saved as a pattern. Over time, it builds a library of patterns and uses them to write better tests. The first test for a login page might be decent. The tenth one will be much better because it has learned from all the previous ones.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Why Two Frameworks?
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;I started with Cypress because it’s what most teams I’ve worked with use. But Playwright has been gaining serious traction — especially for teams that need multi-browser testing or prefer TypeScript.&lt;/p&gt;

&lt;p&gt;So in v3.1, I added full Playwright support. The tool uses different prompts for each framework. The Cypress prompt focuses on chaining commands and cy.get() patterns. The Playwright prompt covers locators, async/await, network interception, multi-tab handling, and all the TypeScript-specific patterns.&lt;/p&gt;

&lt;p&gt;You pick the framework. The AI adapts.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  The Part I Didn’t Expect — Failure Analysis
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;While building this, I realized that generating tests is only half the problem. Tests fail. And reading Cypress or Playwright error logs can be painful, especially for someone newer to the frameworks.&lt;/p&gt;

&lt;p&gt;So I added an AI-powered failure analyzer:&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py --analyze "CypressError: Timed out retrying after 4000ms"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;It reads the error, explains what went wrong in plain language, and suggests a fix. You can also point it at a log file. It’s a small feature but it has saved me a surprising amount of time.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Running It in CI/CD
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The tool comes with a GitHub Actions workflow out of the box. You can trigger it manually from the Actions tab — type your test description, provide a URL, pick Cypress or Playwright, and it runs the full pipeline. Generate, execute, and get results — all inside your CI.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fid27xcjb19ddabf6vppe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fid27xcjb19ddabf6vppe.png" width="784" height="1143"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;CI/CD PIPELINE&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This makes it practical for teams that want to try AI-generated tests without changing their existing setup. Just add the workflow and trigger it when you need a new test.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  What I Learned Building This
&lt;/h3&gt;

&lt;p&gt;A few things surprised me along the way:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Prompts matter more than the model.&lt;/strong&gt; I spent more time refining the system prompts than on any other part of the codebase. A well-structured prompt with clear constraints produces dramatically better test code than a vague one, regardless of which GPT model you use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern learning is underrated.&lt;/strong&gt; The vector store approach turned out to be more useful than I expected. When the tool has seen similar pages before, the generated tests are noticeably more accurate. It picks up things like common selector patterns and assertion styles from its history.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Keeping frameworks separate is important.&lt;/strong&gt; Early on, I tried using a single generic prompt for both Cypress and Playwright. The results were mediocre for both. Dedicated prompts for each framework made a huge difference in output quality.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Try It Out
&lt;/h3&gt;

&lt;p&gt;The project is open source and ready to use:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/aiqualitylab/ai-natural-language-tests" rel="noopener noreferrer"&gt;github.com/aiqualitylab/ai-natural-language-tests&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First Release —&lt;/strong&gt;  &lt;a href="https://github.com/aiqualitylab/ai-natural-language-tests/releases/tag/v2026.02.01" rel="noopener noreferrer"&gt;https://github.com/aiqualitylab/ai-natural-language-tests/releases/tag/v2026.02.01&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Setup takes about five minutes — clone the repo, install dependencies, add your OpenAI API key, and you’re generating tests.&lt;/p&gt;

&lt;p&gt;If you work in QA or test automation and you’ve been curious about how AI fits into your workflow, give it a try. I’d love to hear what you think.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Exploring how AI can make quality engineering more practical and less tedious. I write about this stuff regularly at&lt;/em&gt;&lt;/strong&gt; &lt;a href="https://aiqualityengineer.com/" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;AI Quality Engineer&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;&lt;em&gt;.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;




</description>
      <category>softwareengineering</category>
      <category>programming</category>
      <category>javascript</category>
      <category>artificialintelligen</category>
    </item>
    <item>
      <title>The AI QA Engineer’s Decision Framework: When NOT to Use AI in Testing</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Sun, 25 Jan 2026 10:47:51 +0000</pubDate>
      <link>https://forem.com/qa-leaders/the-ai-qa-engineers-decision-framework-when-not-to-use-ai-in-testing-4lng</link>
      <guid>https://forem.com/qa-leaders/the-ai-qa-engineers-decision-framework-when-not-to-use-ai-in-testing-4lng</guid>
      <description>&lt;h4&gt;
  
  
  A Practical Guide for Quality Engineers Who Want Results, Not Hype
&lt;/h4&gt;

&lt;h3&gt;
  
  
  When NOT to Use AI in Testing: A Simple Guide
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Stop. Think. Then Decide.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Big Question
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Everyone talks about using AI in testing. But nobody talks about when to SKIP it.&lt;/p&gt;

&lt;p&gt;This guide helps you decide: &lt;strong&gt;AI or no AI?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;AI testing sounds cool. But it comes with baggage:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;It costs money&lt;/strong&gt;  — AI tools need servers, licenses, and API calls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It needs babysitting&lt;/strong&gt;  — Models drift. Prompts need tuning. Things break in weird ways.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It’s hard to debug&lt;/strong&gt;  — When AI tests fail, figuring out WHY is painful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your team might forget basics&lt;/strong&gt;  — If AI does everything, manual debugging skills fade.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;AI isn’t bad. But it’s not always the answer.&lt;/p&gt;

&lt;h3&gt;
  
  
  7 Times to Skip AI (Use Traditional Testing Instead)
&lt;/h3&gt;

&lt;h3&gt;
  
  
  1. Math and Calculations
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Tax calculators, loan interest, pricing formulas.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why skip AI?&lt;/strong&gt; The answer is either right or wrong. No guessing needed. No patterns to learn.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do this instead:&lt;/strong&gt; Simple data-driven tests. Input goes in. Expected output comes out. Done.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  2. Audit and Compliance Systems
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Banking apps, healthcare records, legal documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why skip AI?&lt;/strong&gt; Auditors want proof. They want to see EXACTLY what you tested. AI is unpredictable — same prompt, different results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do this instead:&lt;/strong&gt; Scripted tests with detailed logs. Every step recorded. Every result traceable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  3. Speed and Load Testing
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Can your app handle 10,000 users at once?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why skip AI?&lt;/strong&gt; You’re measuring app speed. AI adds its own delay. You’d be measuring AI, not your app.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do this instead:&lt;/strong&gt; Use tools built for this — JMeter, k6, Gatling. They’re fast and focused.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  4. Basic CRUD Operations
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Create user. Read user. Update user. Delete user.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why skip AI?&lt;/strong&gt; It’s simple. AI is overkill. Like using a rocket to go to the grocery store.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do this instead:&lt;/strong&gt; Write one test template. Copy it for each operation. Fast and easy.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  5. Screens That Never Change
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Internal admin panels. Old systems nobody touches.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why skip AI?&lt;/strong&gt; AI shines when things CHANGE. Self-healing locators fix moving targets. No movement? No need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do this instead:&lt;/strong&gt; Regular automation. Page Object Model. Set it and forget it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  6. Security Testing
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Finding SQL injection, XSS attacks, login bypasses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why skip AI?&lt;/strong&gt; Security needs creative thinking. Breaking things in new ways. AI follows patterns — hackers don’t.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do this instead:&lt;/strong&gt; Security tools (OWASP ZAP, Burp Suite) plus human testers who think like attackers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  7. Physical Device Testing
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Barcode scanners, payment terminals, IoT sensors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why skip AI?&lt;/strong&gt; AI lives in software. It can’t press physical buttons or read blinking lights.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do this instead:&lt;/strong&gt; Hardware test rigs. Human testers. Real-world verification.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Quick Decision Guide
&lt;/h3&gt;

&lt;p&gt;Ask yourself these 4 questions:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fin57e16hm04f6y9q9giy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fin57e16hm04f6y9q9giy.png" width="800" height="476"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;DECISION TABLE FRAMEWORK&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Before You Buy Any AI Tool, Answer These:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What exact problem am I solving?&lt;/strong&gt; (Not “we want AI” — a real problem)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can a simple script fix this?&lt;/strong&gt; (Seriously, can it?)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How will I know if it worked?&lt;/strong&gt; (What number goes up or down?)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who will maintain it?&lt;/strong&gt; (AI tools need constant care)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I explain it to my boss?&lt;/strong&gt; (If you can’t explain it, don’t buy it)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Simple Truth
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;AI is a tool. Not a magic wand.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Good testers know WHEN to use each tool:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq617lq3te9cpuqutxkx6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq617lq3te9cpuqutxkx6.png" width="800" height="331"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;USAGE CHECKLIST&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  One Page Summary
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;USE AI FOR:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Generating test ideas from requirements&lt;/p&gt;

&lt;p&gt;Handling UI changes automatically&lt;/p&gt;

&lt;p&gt;Analyzing why tests keep failing&lt;/p&gt;

&lt;p&gt;Creating test data variations&lt;/p&gt;

&lt;p&gt;Exploring edge cases&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;SKIP AI FOR:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Exact calculations (math, money, dates)&lt;/p&gt;

&lt;p&gt;Compliance and audit trails&lt;/p&gt;

&lt;p&gt;Performance/load measurements&lt;/p&gt;

&lt;p&gt;Simple CRUD operations&lt;/p&gt;

&lt;p&gt;Stable, unchanging systems&lt;/p&gt;

&lt;p&gt;Security penetration testing&lt;/p&gt;

&lt;p&gt;Physical hardware testing&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Final Word
&lt;/h3&gt;

&lt;p&gt;The smartest move isn’t always the newest tool.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Sometimes a simple script beats a fancy AI.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Know when to use AI. Know when to skip it. That’s real skill.&lt;/strong&gt;
&lt;/h3&gt;




</description>
      <category>qualityassurance</category>
      <category>softwaredevelopment</category>
      <category>artificialintelligen</category>
      <category>testautomation</category>
    </item>
    <item>
      <title>Machine Learning Pipelines Made Easy for Quality Assurance Professionals</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Sat, 10 Jan 2026 19:45:18 +0000</pubDate>
      <link>https://forem.com/qa-leaders/machine-learning-pipelines-made-easy-for-quality-assurance-professionals-12ei</link>
      <guid>https://forem.com/qa-leaders/machine-learning-pipelines-made-easy-for-quality-assurance-professionals-12ei</guid>
      <description>&lt;h4&gt;
  
  
  &lt;em&gt;A very simple guide to how machine learning works&lt;/em&gt;
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;Machine learning looks hard. But it is not.&lt;/p&gt;

&lt;p&gt;If you know QA, you already know the basics.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;ML systems have three parts. We call them FTI:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;F = Feature (clean the data)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;T = Training (teach the model)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I = Inference (use the model)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Let me explain each one.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1: Feature Pipeline
&lt;/h3&gt;

&lt;h3&gt;
  
  
  What does it do?
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;It cleans dirty data.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Simple example:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;You have messy data. Names are written in different ways. Dates are in wrong formats. Numbers have errors.&lt;/p&gt;

&lt;p&gt;This pipeline fixes all that. It makes data clean and ready.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2sjw3nhsg5p6a6vm15j6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2sjw3nhsg5p6a6vm15j6.png" width="800" height="1117"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Feature Pipeline Detail&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  In QA words:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;You never test with bad data. You clean it first. This pipeline does the same thing.&lt;/p&gt;

&lt;p&gt;The clean data goes to a &lt;strong&gt;Feature Store&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Part 2: Training Pipeline
&lt;/h3&gt;

&lt;h3&gt;
  
  
  What does it do?
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;It teaches the model.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Simple example:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;You show the model 1000 pictures of cats. You tell it “this is a cat” each time. The model learns what a cat looks like.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  In QA words:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;You learn from requirements. Then you write test cases. The model learns from data. Then it can make predictions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Picture:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The smart model goes to a &lt;strong&gt;Model Registry&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6t1c9wqdbwdfpkz7me92.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6t1c9wqdbwdfpkz7me92.png" width="800" height="139"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Training Pipeline Detail&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 3: Inference Pipeline
&lt;/h3&gt;

&lt;h3&gt;
  
  
  What does it do?
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;It uses the model to answer questions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Simple example:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Someone shows a new picture. The model says “this is a cat” or “this is not a cat.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  In QA words:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;This is like running tests in production. The model is working and giving answers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wwuf796rcsspkak78gw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wwuf796rcsspkak78gw.png" width="800" height="122"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Inference Pipeline Detail&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Two Important Storage Places
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Feature Store
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Keeps clean data&lt;/p&gt;

&lt;p&gt;Saves old versions&lt;/p&gt;

&lt;p&gt;Everyone uses same data&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Model Registry
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Keeps trained models&lt;/p&gt;

&lt;p&gt;Saves old versions&lt;/p&gt;

&lt;p&gt;You know which model is in production&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Full Picture
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnobw22n1cn7eup1oshh8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnobw22n1cn7eup1oshh8.png" width="800" height="92"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Full FTI Pipeline Overview&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This is Easy for QA
&lt;/h3&gt;

&lt;p&gt;You already know:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;✓ How to check data quality → Test Feature Pipeline&lt;/p&gt;

&lt;p&gt;✓ How to compare old vs new → Test Training Pipeline&lt;/p&gt;

&lt;p&gt;✓ How to test in production → Test Inference Pipeline&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Five Things to Remember
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Three parts.&lt;/strong&gt; Feature, Training, Inference. That’s it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Clean data is key.&lt;/strong&gt; Bad data = bad model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Save everything.&lt;/strong&gt; Keep old data. Keep old models. You can go back if needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test each part.&lt;/strong&gt; Don’t test everything together. Test one part at a time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your skills work here.&lt;/strong&gt; QA testing skills work for ML testing too.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Last Words
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;ML is just &lt;strong&gt;software with a learning step.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You already know how to &lt;strong&gt;test software.&lt;/strong&gt; Now you can &lt;strong&gt;test ML too.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start simple. Ask: &lt;strong&gt;“Show me the three pipelines.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Then test each one.&lt;/p&gt;

&lt;p&gt;You can do this.&lt;/p&gt;
&lt;/blockquote&gt;




</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>qualityassurance</category>
      <category>softwaretesting</category>
    </item>
    <item>
      <title>I Built an AI-Powered Test Data Generator That Analyzes Any URL and Creates Test Data JSON</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Wed, 31 Dec 2025 19:12:47 +0000</pubDate>
      <link>https://forem.com/letsautomate/i-built-an-ai-powered-test-data-generator-that-analyzes-any-url-and-creates-test-data-json-48l2</link>
      <guid>https://forem.com/letsautomate/i-built-an-ai-powered-test-data-generator-that-analyzes-any-url-and-creates-test-data-json-48l2</guid>
      <description>&lt;h4&gt;
  
  
  &lt;em&gt;I got tired of manually inspecting HTML to find selectors. So I taught my framework to do it instead.&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl07bqppbcobwxqacbhu2.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl07bqppbcobwxqacbhu2.gif" width="800" height="900"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Architecture flow&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here’s a question that kept me up at night:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Why am I spending more time finding selectors than writing actual tests?&lt;/p&gt;

&lt;p&gt;I watched myself burn 30 minutes on a simple login test — not writing the test itself, but hunting through DevTools for the right selectors, creating fixture files, and crafting test data that would actually work.&lt;/p&gt;

&lt;p&gt;What if the framework could just… look at the page and figure it out?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  The Problem Nobody Talks About
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Here’s the dirty secret of test automation: &lt;strong&gt;writing the actual test is the easy part.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The hard part? Finding #username vs input[name="user"] vs .login-field. Creating realistic test data. Building fixture files that match the actual form structure.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Every new page means:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Open DevTools&lt;/p&gt;

&lt;p&gt;Inspect elements&lt;/p&gt;

&lt;p&gt;Copy selectors&lt;/p&gt;

&lt;p&gt;Hope they’re stable&lt;/p&gt;

&lt;p&gt;Create JSON fixtures&lt;/p&gt;

&lt;p&gt;Hope nothing changes tomorrow&lt;/p&gt;

&lt;p&gt;Most “AI-powered” testing tools focus on running tests or analyzing failures. But what about the beginning — the tedious setup that drains your time before you write a single assertion?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  The Experiment: Teaching AI to See
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The idea was simple but audacious: &lt;strong&gt;give the AI a URL and let it figure out everything else.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not mock data. Not hardcoded selectors. Real selectors from real HTML.&lt;/p&gt;

&lt;p&gt;Here’s what I wanted:&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py "Test login" --url https://the-internet.herokuapp.com/login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the framework should:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Fetch the actual page&lt;/p&gt;

&lt;p&gt;Analyze the HTML structure&lt;/p&gt;

&lt;p&gt;Extract real, working selectors&lt;/p&gt;

&lt;p&gt;Generate meaningful test cases&lt;/p&gt;

&lt;p&gt;Save everything as a Cypress fixture&lt;/p&gt;

&lt;p&gt;Then generate tests that use that data&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Sounds impossible? I thought so too.&lt;/p&gt;

&lt;h3&gt;
  
  
  How It Actually Works
&lt;/h3&gt;

&lt;p&gt;The magic happens in about 50 lines of Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def generate_test_data_from_url(url: str, requirements: list) -&amp;gt; tuple:
    # Step 1: Fetch the real page
    resp = requests.get(url, timeout=10, headers={'User-Agent': 'Mozilla/5.0'})
    html = resp.text[:5000] # First 5KB is usually enough

    # Step 2: Ask AI to analyze it
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

    prompt = f"""Analyze this HTML and generate test data.

    URL: {url}
    HTML: {html}

    Return JSON with:
    - Real selectors from the HTML
    - Valid test case with working data
    - Invalid test case for error handling
    """

    # Step 3: Parse and save as fixture
    test_data = json.loads(llm.invoke(prompt).content)

    with open("cypress/fixtures/url_test_data.json", 'w') as f:
        json.dump(test_data, f, indent=2)

    return test_data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI doesn’t guess. It reads the actual HTML and extracts what’s really there.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhwsrrmhq11zuycl193gj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhwsrrmhq11zuycl193gj.png" width="800" height="1717"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Complete Workflow&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  What The AI Sees vs What It Returns
&lt;/h3&gt;

&lt;p&gt;When I point it at a login page, here’s the actual flow:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; Just a URL&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;--url https://the-internet.herokuapp.com/login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What the AI analyzes:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;input type="text" id="username" name="username"&amp;gt;
&amp;lt;input type="password" id="password" name="password"&amp;gt;
&amp;lt;button type="submit" class="radius"&amp;gt;Login&amp;lt;/button&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What it generates:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "url": "https://the-internet.herokuapp.com/login",
  "selectors": {
    "username": "#username",
    "password": "#password",
    "submit": "button[type='submit']"
  },
  "test_cases": [
    {
      "name": "valid_test",
      "username": "tomsmith",
      "password": "SuperSecretPassword!",
      "expected": "success"
    },
    {
      "name": "invalid_test", 
      "username": "wronguser",
      "password": "badpassword",
      "expected": "error"
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Real selectors. Actual test data. Zero manual inspection.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Generated Test Uses It All
&lt;/h3&gt;

&lt;p&gt;The framework then generates a Cypress test that consumes this fixture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;describe('Login Tests', function () {
    beforeEach(function () {
        cy.fixture('url_test_data').then((data) =&amp;gt; {
            this.testData = data;
        });
    });

it('should login with valid credentials', function () {
        cy.visit(this.testData.url);
        const valid = this.testData.test_cases.find(tc =&amp;gt; tc.name === 'valid_test');

        cy.get(this.testData.selectors.username).type(valid.username);
        cy.get(this.testData.selectors.password).type(valid.password);
        cy.get(this.testData.selectors.submit).click();

        cy.url().should('include', '/secure');
    });
    it('should show error with invalid credentials', function () {
        cy.visit(this.testData.url);
        const invalid = this.testData.test_cases.find(tc =&amp;gt; tc.name === 'invalid_test');

        cy.get(this.testData.selectors.username).type(invalid.username);
        cy.get(this.testData.selectors.password).type(invalid.password);
        cy.get(this.testData.selectors.submit).click();

        cy.get('#flash').should('contain', 'invalid');
    });
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Notice something? &lt;strong&gt;The selectors come from the fixture, not hardcoded in the test.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If the page changes, update the fixture. Tests stay clean.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Two Ways to Feed Data
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Sometimes you already have test data. Maybe from a previous run. Maybe from your team’s shared fixtures.&lt;/p&gt;

&lt;p&gt;So I added a second option:&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Option 1: AI analyzes live URL
python qa_automation.py "Test login" --url https://example.com/login

# Option 2: Use existing JSON file
python qa_automation.py "Test login" --data cypress/fixtures/my_data.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same test generation. Different data sources. Your choice.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Part That Surprised Me
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;I expected the AI to find basic selectors. What I didn’t expect was how well it understood &lt;strong&gt;context&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When analyzing a registration form, it didn’t just find #email — it generated test data like:&lt;/p&gt;

&lt;p&gt;Valid: &lt;a href="mailto:testuser@example.com"&gt;testuser@example.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Invalid: not-an-email&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For password fields:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Valid: SecurePass123!&lt;/p&gt;

&lt;p&gt;Invalid: 123 (too short)&lt;/p&gt;

&lt;p&gt;The AI understood what kind of data each field expected. Not because I told it — because it read the HTML attributes, labels, and validation patterns.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Gotcha: Fixtures Need function() Syntax
&lt;/h3&gt;

&lt;p&gt;One thing tripped me up for hours. Cypress fixtures with this.testData require a specific pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// WRONG - arrow functions don't have 'this'
describe('Test', () =&amp;gt; {
    beforeEach(() =&amp;gt; {
        cy.fixture('data').then((d) =&amp;gt; { this.testData = d; }); // undefined!
    });
});

// RIGHT - function() preserves 'this'
describe('Test', function () {
    beforeEach(function () {
        cy.fixture('data').then((data) =&amp;gt; { this.testData = data; });
    });

    it('works', function () {
        console.log(this.testData); // actual data!
    });
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The framework now enforces this pattern in generated tests. Lesson learned the hard way.&lt;/p&gt;

&lt;h3&gt;
  
  
  What This Means For Your Workflow
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Before:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Open page in browser&lt;/p&gt;

&lt;p&gt;Inspect elements manually&lt;/p&gt;

&lt;p&gt;Copy selectors to notepad&lt;/p&gt;

&lt;p&gt;Create fixture JSON by hand&lt;/p&gt;

&lt;p&gt;Write test using those selectors&lt;/p&gt;

&lt;p&gt;Fix typos in selectors&lt;/p&gt;

&lt;p&gt;Run test&lt;/p&gt;

&lt;p&gt;Debug why selectors don’t work&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;After:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Run one command with URL&lt;/p&gt;

&lt;p&gt;Framework handles the rest&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s not an exaggeration. The 30-minute login test? &lt;strong&gt;Under 2 minutes now.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Try It Yourself
&lt;/h3&gt;

&lt;p&gt;The framework is open source:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/user/cypress-natural-language-tests
cd cypress-natural-language-tests
pip install -r requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set your API key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export OPENAI_API_KEY=your_key_here
export OPENROUTER_API_KEY=your_openrouter_api_key_here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Generate tests from any URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py "Test the login form" --url https://the-internet.herokuapp.com/login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check what it created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat cypress/fixtures/url_test_data.json
cat cypress/e2e/generated/*.cy.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Bigger Picture
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;We’re at an interesting moment in test automation. The tooling is getting smarter, but&lt;/em&gt; &lt;strong&gt;&lt;em&gt;the real breakthrough isn’t replacing testers — it’s eliminating the tedious parts.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Finding selectors is tedious. Creating fixture files is tedious. Debugging why&lt;/em&gt; &lt;em&gt;#submit-btn worked yesterday but not today is tedious.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Let AI handle tedious. Let humans handle important.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;That’s the framework I’m building.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Follow for more AI + QA experiments:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/aiqualitylab/cypress-natural-language-tests.git" rel="noopener noreferrer"&gt;https://github.com/aiqualitylab/cypress-natural-language-tests.git&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>openai</category>
      <category>llm</category>
      <category>langgraph</category>
    </item>
    <item>
      <title>I Built an AI-Powered Cypress Framework That Analyses Test Failures for Free</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Sun, 28 Dec 2025 14:03:59 +0000</pubDate>
      <link>https://forem.com/qa-leaders/i-built-an-ai-powered-cypress-framework-that-analyses-test-failures-for-free-5f78</link>
      <guid>https://forem.com/qa-leaders/i-built-an-ai-powered-cypress-framework-that-analyses-test-failures-for-free-5f78</guid>
      <description>&lt;h4&gt;
  
  
  Cypress test debugging is painful. This free AI-powered framework analyses failures instantly and tells you exactly what went wrong.
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcbcjpl0coe6p2wprcku.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcbcjpl0coe6p2wprcku.gif" width="900" height="350"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;AI-Powered Cypress Framework That Analyses Test Failures for Free&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Ever stared at a cryptic Cypress error message wondering what broke? 😩 We’ve all been there. That’s why I built something that changed my debugging workflow forever.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4&gt;
  
  
  Introducing &lt;strong&gt;v2.1&lt;/strong&gt; of my Cypress Natural Language Test Framework — now featuring &lt;strong&gt;🔍 AI Failure Analysis&lt;/strong&gt; that costs you absolutely nothing.
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd7yciqspi8tbs2gialcp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd7yciqspi8tbs2gialcp.png" width="800" height="1806"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  😤 The Problem Every QA Engineer Knows
&lt;/h3&gt;

&lt;p&gt;Picture this: Your CI pipeline fails and error be like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CypressError: Timed out retrying after 4000ms: Expected to find element: '#submit-btn', but never found it.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you’re left guessing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🤔 Did the selector change?&lt;/p&gt;

&lt;p&gt;⏳ Is the page loading too slowly?&lt;/p&gt;

&lt;p&gt;✏️ Did someone rename the button?&lt;/p&gt;

&lt;p&gt;⚡ Is it a timing issue?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You spend the next hour digging through logs, comparing commits, and testing locally. Sound familiar?&lt;/p&gt;

&lt;h3&gt;
  
  
  💡 The Solution: AI That Debugs For You
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;With v2.1, debugging becomes a one-liner:&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py --analyze "CypressError: Timed out retrying: Expected to find element: #submit-btn"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;🔍 Analyzing...
REASON: Element #submit-btn not found - selector likely changed during recent UI update
FIX: Use cy.get('[data-testid="submit"]') or add cy.wait() before the click action
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✅ Two lines. Problem identified. Solution provided. Done.&lt;/p&gt;

&lt;h3&gt;
  
  
  🏗️ System Architecture
&lt;/h3&gt;

&lt;p&gt;Here’s how the entire framework fits together:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2AZhfR1pLUFuBdtjCj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2AZhfR1pLUFuBdtjCj.png" width="800" height="3621"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ⚙️ How It Works Under The Hood
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The implementation is surprisingly simple. Here’s the core function:&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def analyze_failure(log: str) -&amp;gt; str:
    response = requests.post(
        url="https://openrouter.ai/api/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {os.getenv('OPENROUTER_API_KEY')}",
            "Content-Type": "application/json"
        },
        json={
            "model": "deepseek/deepseek-r1-0528:free",
            "messages": [{"role": "user", "content": f"Analyze this Cypress test failure. Reply ONLY:\nREASON: (one line)\nFIX: (one line)\n\n{log}"}],
            "max_tokens": 150
        }
    )
    return response.json()["choices"][0]["message"]["content"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;That’s it. About 15 lines of code that leverage OpenRouter’s free tier with DeepSeek R1. 🆓&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  🛠️ Three Ways To Use It
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1️⃣ Direct from command line:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py --analyze "Your error message here"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2️⃣ From a log file:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py --analyze -f cypress-output.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3️⃣ Piped from another command:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat error.log | python qa_automation.py --analyze
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🔄 CI/CD Integration
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The real power comes when you integrate this into your pipeline. Here’s how the updated GitHub Actions workflow looks:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03u2nnc3qchw9iiea2f6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03u2nnc3qchw9iiea2f6.png" width="800" height="1013"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- name: Run Cypress tests
  id: tests
  continue-on-error: true
  run: |
    npx cypress run --spec "cypress/e2e/generated/**/*.cy.js" 2&amp;gt;&amp;amp;1 | tee test-output.log- name: AI Failure Analysis
  if: steps.tests.outcome == 'failure'
  env:
    OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
  run: |
    echo "Analyzing failures with AI..."
    python qa_automation.py --analyze -f test-output.log

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When tests fail, your CI logs now include actionable insights instead of just error dumps. 📋&lt;/p&gt;

&lt;h3&gt;
  
  
  🚀 Setting It Up
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Get your free API key from &lt;a href="https://openrouter.ai/" rel="noopener noreferrer"&gt;openrouter.ai&lt;/a&gt; 🔑&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Add to your .env:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENROUTER_API_KEY=your_key_here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Add requests to requirements.txt (if not already there) 📦&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; Start analyzing 🎉&lt;/p&gt;

&lt;p&gt;That’s the entire setup. No complex configurations. No paid subscriptions.&lt;/p&gt;

&lt;h3&gt;
  
  
  🖥️ Local Development Flow
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;For local development, the flow is just as smooth:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2A7hr1LYYMY2vpxfdY.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F0%2A7hr1LYYMY2vpxfdY.png" width="800" height="3668"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  📦 What’s In v2.1
&lt;/h3&gt;

&lt;p&gt;Here’s everything new in this release:&lt;/p&gt;

&lt;h4&gt;
  
  
  Feature Description
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;🔍 &lt;strong&gt;AI Failure Analyzer&lt;/strong&gt; Instant debugging with free LLM&lt;/p&gt;

&lt;p&gt;🌐 &lt;strong&gt;OpenRouter Integration&lt;/strong&gt; Uses DeepSeek R1 at zero cost&lt;/p&gt;

&lt;p&gt;💻 &lt;strong&gt;CLI Flag&lt;/strong&gt; Simple --analyze command&lt;/p&gt;

&lt;p&gt;📁 &lt;strong&gt;File Input&lt;/strong&gt; Analyze entire log files with -f&lt;/p&gt;

&lt;p&gt;⚙️ &lt;strong&gt;CI/CD Ready&lt;/strong&gt; Updated GitHub Actions workflow&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Combined with v2.0 features:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🤖 Natural language test generation&lt;/p&gt;

&lt;p&gt;🔄 cy.prompt() self-healing tests&lt;/p&gt;

&lt;p&gt;📊 LangGraph workflow orchestration&lt;/p&gt;

&lt;p&gt;📚 Vector store documentation context&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  🌍 Real World Example
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Old approach:&lt;/strong&gt; Manual Investigation 😓&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;New approach:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py --analyze -f nightly-run.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;REASON: Login button selector changed from #login-btn to .auth-button
FIX: Update selector to cy.get('.auth-button') or use data-testid

REASON: API response timeout - server took 6s, test timeout was 4s
FIX: Increase timeout with cy.request({timeout: 10000}) or add retry logic

REASON: Element detached from DOM after React re-render
FIX: Add cy.wait() after state change or use {force: true} option
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🔗 Try It Yourself
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The framework is open source and available now:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;🔗 &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/aiqualitylab/cypress-natural-language-tests" rel="noopener noreferrer"&gt;github.com/aiqualitylab/cypress-natural-language-tests&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Clone it, set up your API keys, and start generating tests and debugging failures with AI.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  💭 Final Thoughts
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;AI shouldn’t just generate code. It should help maintain it too. This failure analyzer is my attempt at closing that loop — from requirements to tests to debugging, all AI-assisted.&lt;/p&gt;

&lt;p&gt;The best part? It’s completely &lt;strong&gt;free&lt;/strong&gt; to use. 🆓&lt;/p&gt;

&lt;p&gt;Give it a try and let me know how much time it saves you! 💬&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;If this helped you, consider ⭐ starring the repo. It helps others discover it.&lt;/em&gt;&lt;/p&gt;




</description>
      <category>llm</category>
      <category>langchain</category>
      <category>ai</category>
      <category>testautomation</category>
    </item>
    <item>
      <title>AI-Powered Cypress Test Generation from Natural Language v2.0 — Now with cy.prompt() Self-Healing</title>
      <dc:creator>Let's Automate 🛡️</dc:creator>
      <pubDate>Sat, 27 Dec 2025 11:46:37 +0000</pubDate>
      <link>https://forem.com/qa-leaders/ai-powered-cypress-test-generation-from-natural-language-v20-now-with-cyprompt-self-healing-5ebe</link>
      <guid>https://forem.com/qa-leaders/ai-powered-cypress-test-generation-from-natural-language-v20-now-with-cyprompt-self-healing-5ebe</guid>
      <description>&lt;h3&gt;
  
  
  AI-Powered Cypress Test Generation from Natural Language — Now with cy.prompt() Self-Healing
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Transform plain English requirements into production-ready Cypress tests using GPT-4, LangChain, and LangGraph — run locally or in CI/CD&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;My Open-source project: &lt;a href="https://github.com/aiqualitylab/cypress-natural-language-tests" rel="noopener noreferrer"&gt;&lt;strong&gt;github.com/aiqualitylab/cypress-natural-language-tests&lt;/strong&gt;&lt;/a&gt;, which utilizes Cypress’s official AI-powered &lt;strong&gt;cy.prompt()&lt;/strong&gt; command introduced at &lt;strong&gt;CypressConf 2025&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo2ppga1md065afnq39qk.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo2ppga1md065afnq39qk.gif" width="720" height="720"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;AI-Powered Cypress Test Generation from Natural Language v2.0 — Now with cy.prompt() Self-Healing&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Testing shouldn’t be complicated. You know what your application should do — why spend hours writing boilerplate test code?&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://github.com/aiqualitylab/cypress-natural-language-tests" rel="noopener noreferrer"&gt;&lt;strong&gt;cypress-natural-language-tests&lt;/strong&gt;&lt;/a&gt; to bridge the gap between your test ideas and working Cypress code. Just describe your test in plain English:&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py "Test user login with valid credentials" --run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; A complete .cy.js file generated and executed automatically!&lt;/p&gt;

&lt;p&gt;And now, with the latest update, the framework also supports &lt;strong&gt;Cypress’s new cy.prompt()&lt;/strong&gt; command for self-healing, AI-powered test execution.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  What’s New: cy.prompt() Integration
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Cypress recently launched cy.prompt() — their official AI command that converts natural language into test steps at runtime. My framework now supports both approaches:&lt;/p&gt;

&lt;p&gt;Mode Description Best For &lt;strong&gt;Generate Mode&lt;/strong&gt; Creates complete .cy.js test files Version control, CI/CD pipelines &lt;strong&gt;cy.prompt() Mode&lt;/strong&gt; Generates tests using cy.prompt() syntax Self-healing tests, rapid prototyping&lt;/p&gt;

&lt;p&gt;You choose what works best for your workflow!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;👆 The complete workflow — from requirements to executed tests&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The framework supports &lt;strong&gt;two execution paths&lt;/strong&gt; :&lt;/p&gt;

&lt;h3&gt;
  
  
  🖥️ Local Machine Flow v/s ⚙️ GitHub Actions CI Flow
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lo22bbwvy5d8ssft8u3.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lo22bbwvy5d8ssft8u3.gif" width="480" height="600"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;🖥️ Local Machine Flow v/s ⚙️ GitHub Actions CI Flow&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Two Powerful Modes
&lt;/h3&gt;
&lt;h3&gt;
  
  
  Mode 1: Traditional Test Generation
&lt;/h3&gt;

&lt;p&gt;Generate standard Cypress test files that you own and version control:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py "Test user login with valid credentials"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;  &lt;strong&gt;01_test-user-login_20241223_102030.cy.js&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;describe('User Login', () =&amp;gt; {
  it('should login successfully with valid credentials', () =&amp;gt; {
    cy.visit('https://the-internet.herokuapp.com/login');
    cy.get('#username').type('tomsmith');
    cy.get('#password').type('SuperSecretPassword!');
    cy.get('button[type="submit"]').click();
    cy.get('.flash.success').should('be.visible');
  });
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Mode 2: cy.prompt() Generation
&lt;/h3&gt;

&lt;p&gt;Generate tests using Cypress’s new AI-powered cy.prompt() command for self-healing capabilities:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py "Test user login" --use-cyprompt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;  &lt;strong&gt;01_test-user-login_20241223_102030.cy.js&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;describe('User Login', () =&amp;gt; {
  it('should login successfully with valid credentials', () =&amp;gt; {
    cy.prompt([
      'Visit the login page at https://the-internet.herokuapp.com/login',
      'Type "tomsmith" in the username field',
      'Type "SuperSecretPassword!" in the password field',
      'Click the login button',
      'Verify the success message is visible'
    ]);
  });
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why cy.prompt()?&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔄 &lt;strong&gt;Self-healing&lt;/strong&gt; : Tests adapt when UI changes&lt;/p&gt;

&lt;p&gt;📝 &lt;strong&gt;Readable&lt;/strong&gt; : Natural language steps in your test files&lt;/p&gt;

&lt;p&gt;🛡️ &lt;strong&gt;Resilient&lt;/strong&gt; : Less maintenance when selectors change&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Quick Start
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Clone the repository
git clone https://github.com/aiqualitylab/cypress-natural-language-tests.git
cd cypress-natural-language-tests

# Set up Python environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Configure OpenAI API key
echo "OPENAI_API_KEY=your_key_here" &amp;gt; .env

# Initialize Cypress
npm install cypress --save-dev
npx cypress open
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Generate Your First Test
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Standard Cypress test
python qa_automation.py "Test user registration flow"

# With cy.prompt() syntax
python qa_automation.py "Test user registration flow" --use -cyprompt

# Generate and run immediately
python qa_automation.py "Test homepage loads correctly" --run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Practical Examples
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Example 1: Multiple Test Requirements
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py \
  "Test successful login with valid credentials" \
  "Test login fails with wrong password" \
  "Test login form shows validation errors for empty fields"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Creates three separate test files — one for each requirement.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 2: With Documentation Context (RAG)
&lt;/h3&gt;

&lt;p&gt;Supercharge test generation with your own documentation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py \
  "Test checkout API according to specifications" \
  --docs ./api-documentation \
  --persist-vstore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The framework indexes your docs into ChromaDB and uses them as context for more accurate test generation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 3: Generate and Execute Locally
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python qa_automation.py "Test user profile update" --run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Generates the test AND runs Cypress immediately. View results in your terminal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 4: CI/CD Integration
&lt;/h3&gt;

&lt;p&gt;Trigger via GitHub Actions to generate tests in your pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- name: Generate Tests
  run: python qa_automation.py "${{ github.event.inputs.requirement }}"

- name: Run Cypress
  run: npx cypress run

- name: Upload Artifacts
  uses: actions/upload-artifact@v3
  with:
    name: cypress-results
    path: |
      cypress/videos
      cypress/screenshots
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why Choose This Framework?
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Feature Benefit &lt;strong&gt;Dual Mode Support&lt;/strong&gt; Standard Cypress OR cy.prompt() — your choice &lt;strong&gt;Complete Test Files&lt;/strong&gt; Version control your generated tests &lt;strong&gt;Documentation-Aware&lt;/strong&gt; RAG integration for accurate, context-rich tests &lt;strong&gt;Local &amp;amp; CI Ready&lt;/strong&gt; Works on your machine and in GitHub Actions &lt;strong&gt;Model Flexibility&lt;/strong&gt; Use GPT-4, GPT-4o-mini, or GPT-3.5-turbo &lt;strong&gt;Open Source&lt;/strong&gt; Full control, no vendor lock-in&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Configuration
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Change AI Model
&lt;/h3&gt;

&lt;p&gt;In qa_automation.py:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;llm = ChatOpenAI(
    model="gpt-4o-mini", # Options: gpt-4, gpt-4o, gpt-3.5-turbo
    temperature=0
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Set Your Application URL
&lt;/h3&gt;

&lt;p&gt;Update the prompt template to target your application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CY_PROMPT_TEMPLATE = """
...
- Use `cy.visit('https://your-app-url.com')` as the base URL.
...
"""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Get Started Now
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;🔗&lt;/strong&gt; &lt;a href="https://github.com/aiqualitylab/cypress-natural-language-tests" rel="noopener noreferrer"&gt;&lt;strong&gt;github.com/aiqualitylab/cypress-natural-language-tests&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/aiqualitylab/cypress-natural-language-tests.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;⭐ Star the repo if you find it useful!&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Natural language test generation is here to stay. With &lt;strong&gt;cypress-natural-language-tests&lt;/strong&gt; , you get:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Two modes&lt;/strong&gt;  — Traditional Cypress or cy.prompt()&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Full ownership&lt;/strong&gt;  — Complete test files you control&lt;br&gt;&lt;br&gt;
&lt;strong&gt;CI/CD ready&lt;/strong&gt;  — Works locally and in GitHub Actions&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Documentation-aware&lt;/strong&gt;  — RAG for accurate test generation&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Open source&lt;/strong&gt;  — No vendor lock-in&lt;/p&gt;

&lt;p&gt;Stop writing boilerplate. Start describing tests in plain English.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;What’s your experience with AI-powered test generation? Drop a comment below!&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;




</description>
      <category>openai</category>
      <category>ai</category>
      <category>softwaretesting</category>
      <category>cypress</category>
    </item>
  </channel>
</rss>
