Forem: Kevin Julián Martínez Escobar

TWD setup is now two Vite plugins and zero app code

Kevin Julián Martínez Escobar — Fri, 08 May 2026 23:19:11 +0000

Setting up TWD used to mean adding a block of dev-only code to your app's entry file — a dynamic import for the runner, a test glob, a service-worker config, and a twd-relay browser client. It worked, but it never really belonged there.

With twd-js@1.8 and twd-relay@1.2, both packages ship Vite plugins. Setup is two entries in vite.config.ts and nothing in main.tsx.

The new setup

vite.config.ts:

import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";
import { twd } from "twd-js/vite-plugin";
import { twdRemote } from "twd-relay/vite";

export default defineConfig({
    plugins: [
        react(),
        twd({
            testFilePattern: "/**/*.twd.test.ts",
            open: false,
            position: "right",
            search: true,
        }),
        twdRemote(),
    ],
});

main.tsx:

import React from "react";
import ReactDOM from "react-dom/client";
import { RouterProvider } from "react-router";
import { router } from "./routes/router";
import "./styles/index.css";

ReactDOM.createRoot(document.getElementById("root")!).render(
    <RouterProvider router={router} />,
);

That's the whole setup. twd() owns the sidebar, glob discovery, and service-worker registration. twdRemote() attaches the relay to the Vite dev server and auto-injects the browser client into index.html. Both plugins use apply: 'serve', so production builds are untouched.
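If you haven't written a dev-only Vite plugin before, the mechanism looks roughly like this. It is a minimal sketch, not the twd-js or twd-relay source. `apply: "serve"` and `transformIndexHtml` are real Vite plugin options, but `MiniPlugin`, `devClientPlugin`, `renderIndexHtml`, and the `/@dev-client.js` path are hypothetical stand-ins so the snippet runs without Vite installed.

```typescript
// Local stand-in for the small part of Vite's Plugin type used here.
type Command = "serve" | "build";

interface MiniPlugin {
  name: string;
  apply?: Command; // the plugin is skipped when the running command differs
  transformIndexHtml?: (html: string) => string;
}

// Hypothetical dev-only plugin that injects a client script into index.html.
function devClientPlugin(scriptSrc = "/@dev-client.js"): MiniPlugin {
  return {
    name: "dev-client",
    apply: "serve",
    transformIndexHtml: (html) =>
      html.replace(
        "</body>",
        `<script type="module" src="${scriptSrc}"></script></body>`,
      ),
  };
}

// Mimics filtering the plugin list by command, then running the HTML hook.
function renderIndexHtml(html: string, plugins: MiniPlugin[], command: Command): string {
  return plugins
    .filter((p) => p.apply === undefined || p.apply === command)
    .reduce((out, p) => (p.transformIndexHtml ? p.transformIndexHtml(out) : out), html);
}

const page = '<html><body><div id="root"></div></body></html>';
const devHtml = renderIndexHtml(page, [devClientPlugin()], "serve");
const prodHtml = renderIndexHtml(page, [devClientPlugin()], "build");

console.log(devHtml.includes("/@dev-client.js")); // true
console.log(prodHtml === page); // true
```

The build output never sees the plugin, which is why nothing in main.tsx needs an import.meta.env.DEV guard.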

What it replaces

For comparison, here's what a TWD entry file looked like a few weeks ago:

if (import.meta.env.DEV) {
    const { initTWD } = await import("twd-js/bundled");
    const tests = import.meta.glob("./**/*.twd.test.ts");
    initTWD(tests, {
        open: false,
        position: "right",
        serviceWorker: true,
        serviceWorkerUrl: "/mock-sw.js",
        search: true,
    });

    const { createBrowserClient } = await import("twd-relay/browser");
    const client = createBrowserClient({
        url: `${window.location.origin}/__twd/ws`,
    });
    client.connect();
}

Two top-level await imports, a glob, a service-worker URL that had to stay in sync with the runner, a WebSocket URL that had to match the relay path, and config repeating defaults. All of it dev-only, all of it sitting above ReactDOM.createRoot.

After the upgrade, that block is gone. No if (import.meta.env.DEV), no dynamic imports, no relay client. The dev-tooling story lives entirely in vite.config.ts.

Why it matters

One source of truth for the wiring. The serviceWorkerUrl, the SW served by the dev server, the WebSocket path used by the relay, and the path the browser client connects to were all strings in different files that had to agree. Now the plugins own them.

No top-level await for tooling. The await import("twd-js/bundled") was loading a chunk that had nothing to do with your app, before React was allowed to mount.

Tooling lives in tooling config. New developers reading main.tsx shouldn't have to mentally if (import.meta.env.DEV)-out a quarter of the file to understand startup. The plugin model is what the rest of the Vite ecosystem already does (@vitejs/plugin-react, Tailwind, TanStack Router devtools), and TWD now matches.

Non-Vite projects

Webpack, Angular CLI, Rollup, esbuild, Rspack — anywhere the Vite plugins don't apply — keep the manual API. initTWD and createBrowserClient stay public exports forever. twdRemote({ autoConnect: false }) is also there as an escape hatch for Vite projects that want to wire the browser client by hand.

Try it

The runner is at https://twd.dev. Upgrade to twd-js@1.8 and twd-relay@1.2, drop the dev-only block from main.tsx, add the two plugins to vite.config.ts, and you're done.

Two test runtimes, two coverage reports, one fragile merge

Kevin Julián Martínez Escobar — Mon, 04 May 2026 21:21:12 +0000

You have unit tests in Vitest (or Jest). You have E2E tests in Playwright. CI runs both. Coverage works for each, until you try to look at a single number.

Then it gets weird.

Two runtimes, two coverage outputs

Unit tests run in Node, instrumented by V8 or istanbul. Playwright runs your real app in a real browser. Each produces its own coverage data. Stitching them together usually means:

nyc merge (or a custom step) combining coverage-final.json files
Reconciling source maps between Vitest's transform pipeline and Playwright's
Hoping both tools agree on file paths

It works, until it doesn't. A path mismatch silently drops files from the merged report. A Playwright run on a different Node version emits slightly different paths. Coverage drops by 12% and nobody knows why.

The deeper issue: you're not really merging coverage. You're merging evidence that two different runtimes touched the same lines. The merge step is a heuristic.
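To make the failure mode concrete, here is a small sketch of a merge keyed on raw file paths. It is not how nyc or any specific tool is implemented, and the `Coverage` shape, the function names, and the sample data are all hypothetical. It only shows one way a path mismatch can distort the merged number without raising an error.

```typescript
// Line hits per file: file path -> line number -> hit count.
type Coverage = Record<string, Record<number, number>>;

// Naive merge keyed on the raw file path string.
function mergeCoverage(a: Coverage, b: Coverage): Coverage {
  const merged: Coverage = {};
  for (const report of [a, b]) {
    for (const [file, lines] of Object.entries(report)) {
      const target = merged[file] ?? (merged[file] = {});
      for (const [line, hits] of Object.entries(lines)) {
        const n = Number(line);
        target[n] = (target[n] ?? 0) + hits;
      }
    }
  }
  return merged;
}

// Percentage of lines with at least one hit, across all files in the report.
function lineCoverage(report: Coverage): number {
  let total = 0;
  let covered = 0;
  for (const lines of Object.values(report)) {
    for (const hits of Object.values(lines)) {
      total += 1;
      if (hits > 0) covered += 1;
    }
  }
  return total === 0 ? 100 : (100 * covered) / total;
}

// Hypothetical data: unit tests hit lines 1-2, the browser run hits lines 3-4.
const unit: Coverage = { "src/cart.ts": { 1: 1, 2: 1, 3: 0, 4: 0 } };
const e2eSamePath: Coverage = { "src/cart.ts": { 1: 0, 2: 0, 3: 1, 4: 1 } };
const e2eOtherPath: Coverage = { "/home/ci/app/src/cart.ts": { 1: 0, 2: 0, 3: 1, 4: 1 } };

console.log(lineCoverage(mergeCoverage(unit, e2eSamePath))); // 100
console.log(lineCoverage(mergeCoverage(unit, e2eOtherPath))); // 50
```

Same tests, same source file, and the number halves because two strings that mean the same file don't compare equal.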

What TWD does differently

TWD runs both styles of test in the same environment: your app's Vite dev server, one browser, one execution context.

A flow test exercises the page through the DOM:

import { twd, userEvent, screenDom } from "twd-js";
import { describe, it } from "twd-js/runner";

describe("checkout", () => {
    it("submits the order", async () => {
        await twd.visit("/checkout");
        await userEvent.click(screenDom.getByRole("button", { name: /pay/i }));
        // ...
    });
});

A unit test imports the function and asserts directly:

import { expect } from "twd-js";
import { describe, it } from "twd-js/runner";
import { normalizeOrder } from "@/utils/normalizeOrder";

describe("normalizeOrder", () => {
    it("defaults quantity to 1 when missing", () => {
        const result = normalizeOrder({ items: [{ sku: "ABC" }] });
        expect(result.items[0].quantity).to.equal(1);
    });
});

Same describe, same it, same expect. Same browser. Same coverage source.

There's no merge step because there's nothing to merge.

When to reach for which

Flow tests are the most important and valuable. They cover real user behaviour: routes, interactions, mutations. They catch the bugs your users would actually hit.

Unit tests fill the gaps flow tests can't reach. A pure utility with seven branches in a switch statement isn't worth seven flow tests, but it's worth covering. Drop it in a unit/ folder, parameterize the branches inline in one it(), done.
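"Parameterize the branches inline" can be sketched as a table of inputs and expected outputs. The `statusLabel` utility and its labels are hypothetical, and the snippet uses a plain loop so it runs on its own. In a TWD unit test the same loop would sit inside a single it() with an expect per row.

```typescript
// Hypothetical pure utility with one branch per order status.
type OrderStatus =
  | "draft" | "paid" | "packed" | "shipped" | "delivered" | "returned" | "cancelled";

function statusLabel(status: OrderStatus): string {
  switch (status) {
    case "draft": return "Not submitted";
    case "paid": return "Payment received";
    case "packed": return "Ready to ship";
    case "shipped": return "On its way";
    case "delivered": return "Delivered";
    case "returned": return "Returned";
    case "cancelled": return "Cancelled";
  }
}

// One table covers all seven branches.
const cases: Array<[OrderStatus, string]> = [
  ["draft", "Not submitted"],
  ["paid", "Payment received"],
  ["packed", "Ready to ship"],
  ["shipped", "On its way"],
  ["delivered", "Delivered"],
  ["returned", "Returned"],
  ["cancelled", "Cancelled"],
];

// Collect the rows whose actual label differs from the expected one.
const failures = cases.filter(([status, expected]) => statusLabel(status) !== expected);
console.log(failures.length); // 0
```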

The rule of thumb:

Prefer flow-based tests for anything user-visible.
Use unit tests for pure functions and edge-case branches that flow tests genuinely can't reach.
Don't duplicate coverage between the two styles.

The real win

The coverage number at the end of a TWD run is one number from one runtime — not two reports that almost agree. If a line is uncovered, your tests didn't exercise it. That's the only reason left.

That's a small thing. Until you spend a day debugging a CI failure that turned out to be a path mismatch in a coverage merge.

If you want to try it, the runner is at https://twd.dev.

Testing Payment Flows Without the Payment SDK

Kevin Julián Martínez Escobar — Sun, 19 Apr 2026 19:04:53 +0000

Payment integrations are one of the hardest things to test in a web app. The SDK renders its own UI, controls its own form fields, and fires callbacks when the user completes a payment. You can't programmatically fill in a credit card number. You can't simulate a declined card. And if the SDK fails to initialize — because of a network issue, a bad API key, or a test environment misconfiguration — your entire test falls apart.

You can mock the SDK's setup endpoint to get the SDK rendering, the form mounting, the session resolving. That covers surface area — but it stops there. It doesn't test what happens after the payment resolves: the API calls, the analytics events, the navigation, the error states. The part that actually matters.

This article shows a different approach: using TWD's component mocking to replace the payment SDK entirely with a simple mock that gives you full control over the payment lifecycle.

Test What You Own. Mock What You Don't.

That's TWD's philosophy, and it's the whole reason component mocking is the right tool here. The payment SDK is someone else's code — its internals and lifecycle are their problem, covered by their test suite. Your responsibility is the seam: the callbacks fired into your app, the API calls they trigger, the analytics events, the UI state. That's where your bugs ship from.

You won't exercise the real SDK in these tests. That's the tradeoff — and it's deliberate. What you gain is the ability to exercise your side of the integration exhaustively: every callback, every branch, every error path. The SDK's correctness is the vendor's concern. The correctness of everything your app does around it is yours, and that's what these tests finally reach.

The Problem

A typical payment component looks like this:

import { useEffect, useRef } from "react";

function PaymentDropIn({ session, clientKey, orderId, cart }) {
  const ref = useRef(null);

  useEffect(() => {
    // An effect callback cannot be async, so the await lives in an inner function.
    async function init() {
      const checkout = await PaymentSDK.init({
        session,
        clientKey,
        onPaymentCompleted: async () => {
          await confirmOrder(orderId);
          await trackPurchase(cart, orderId);
          navigate("/success");
        },
        onPaymentFailed: (result) => {
          trackPaymentError(cart, result.code);
          setError("Payment failed");
        },
      });
      checkout.mount(ref.current);
    }
    init();
  }, []);

  return <div ref={ref} />;
}

Everything is tangled inside one component: the SDK initialization, the business logic, the analytics, the navigation, the error handling. You can't test the onPaymentCompleted callback without actually initializing the SDK. And you can't initialize the SDK without a real (or carefully mocked) payment session.

Step 1: Separate the SDK from the Logic

The fix is architectural. Move the callback logic out of the payment component and into the parent. The payment component becomes a thin SDK wrapper that receives callbacks as props:

import { useEffect, useRef } from "react";

// Thin wrapper: just the SDK
function PaymentDropIn({ session, clientKey, onCompleted, onFailed, onError }) {
  const ref = useRef(null);

  useEffect(() => {
    // An effect callback cannot be async, so the await lives in an inner function.
    async function init() {
      const checkout = await PaymentSDK.init({
        session,
        clientKey,
        onPaymentCompleted: () => onCompleted(),
        onPaymentFailed: (result) => onFailed(result.code),
        onError: (err) => onError(err.message),
      });
      checkout.mount(ref.current);
    }
    init();
  }, []);

  return <div ref={ref} />;
}

// Parent — owns the business logic
function CheckoutPage({ cart, orderId }) {
  const handleCompleted = async () => {
    await confirmOrder(orderId);
    await trackPurchase(cart, orderId);
    navigate("/success");
  };

  const handleFailed = (code) => {
    trackPaymentError(cart, code);
    setError("Payment failed");
  };

  return (
    <PaymentDropIn
      session={session}
      clientKey={clientKey}
      onCompleted={handleCompleted}
      onFailed={handleFailed}
      onError={handleError}
    />
  );
}

This is a good refactor regardless of testing. The parent owns the business logic. The payment component owns the SDK. Clean separation.

Step 2: Wrap for Mocking

TWD provides MockedComponent — a wrapper that lets tests replace a component's children with a mock. Wrap the payment component:

import { MockedComponent } from "twd-js/ui";

function PaymentDropIn(props) {
  return (
    <MockedComponent name="paymentDropIn">
      <PaymentDropInContent {...props} />
    </MockedComponent>
  );
}

In production, MockedComponent is a transparent pass-through — it renders its children. In tests, twd.mockComponent("paymentDropIn", ...) replaces the children with whatever you provide.

One important detail: MockedComponent passes its child's props to the mock component. That's why we need PaymentDropInContent as a separate component that receives all the callback props — so the mock receives them too.
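The pass-through behaviour can be pictured as a small registry lookup. This is a framework-free sketch of the idea, not TWD's implementation: `mockRegistry`, `registerMock`, and `renderMockable` are hypothetical names, and components are modelled as plain functions returning strings rather than React elements.

```typescript
type Props = Record<string, unknown>;
type Component = (props: Props) => string;

// Hypothetical registry: mock name -> replacement component.
const mockRegistry = new Map<string, Component>();

function registerMock(name: string, mock: Component): void {
  mockRegistry.set(name, mock);
}

// Renders the mock when one is registered under `name`, otherwise the real child.
// Either way, the chosen component receives the child's props unchanged.
function renderMockable(name: string, child: Component, childProps: Props): string {
  const mock = mockRegistry.get(name);
  return (mock ?? child)(childProps);
}

const realDropIn: Component = () => "<real payment SDK>";
const props: Props = { onCompleted: () => "completed" };

// No mock registered yet: the real component renders.
const before = renderMockable("paymentDropIn", realDropIn, props);

// After registration, the mock renders and sees the same props.
registerMock("paymentDropIn", (p) => `<mock with props: ${Object.keys(p).join(", ")}>`);
const after = renderMockable("paymentDropIn", realDropIn, props);

console.log(before); // <real payment SDK>
console.log(after); // <mock with props: onCompleted>
```

The second call shows why the callbacks have to be props of the wrapped child. Props are the only thing the mock can see.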

Step 3: Build the Mock

The mock is dead simple. Three buttons — one per payment outcome:

twd.mockComponent("paymentDropIn", ({ onCompleted, onFailed, onError }) => {
  return (
    <div>
      <button onClick={() => onCompleted()}>Pay</button>
      <button onClick={() => onFailed("Refused")}>Fail Payment</button>
      <button onClick={() => onError("SDK crashed")}>Error</button>
    </div>
  );
});

Click "Pay" and the parent's handleCompleted fires — calling confirmOrder, sending the purchase event, navigating to success. Click "Fail Payment" and handleFailed fires — sending the error event, showing the error banner. No SDK involved. Just callbacks.
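Stripped of React, the seam under test is just a set of callbacks and the side effects they trigger. The sketch below assumes a hypothetical `createCheckoutHandlers` factory with injected dependencies. It mirrors handleCompleted and handleFailed from the parent, with the cart argument dropped for brevity. It is an illustration of the idea, not TWD code.

```typescript
// Dependencies the parent would normally import; injected so the sketch runs standalone.
interface CheckoutDeps {
  confirmOrder: (orderId: string) => Promise<void>;
  trackPurchase: (orderId: string) => Promise<void>;
  trackPaymentError: (code: string) => void;
  navigate: (path: string) => void;
  setError: (message: string) => void;
}

// Hypothetical factory mirroring the parent's handlers.
function createCheckoutHandlers(orderId: string, deps: CheckoutDeps) {
  return {
    onCompleted: async (): Promise<void> => {
      await deps.confirmOrder(orderId);
      await deps.trackPurchase(orderId);
      deps.navigate("/success");
    },
    onFailed: (code: string): void => {
      deps.trackPaymentError(code);
      deps.setError("Payment failed");
    },
  };
}

// Recording fakes stand in for the API client, analytics, and router.
const calls: string[] = [];
const handlers = createCheckoutHandlers("order-1", {
  confirmOrder: async (id) => { calls.push(`confirmOrder:${id}`); },
  trackPurchase: async (id) => { calls.push(`trackPurchase:${id}`); },
  trackPaymentError: (code) => { calls.push(`trackPaymentError:${code}`); },
  navigate: (path) => { calls.push(`navigate:${path}`); },
  setError: (message) => { calls.push(`setError:${message}`); },
});

// What the "Fail Payment" and "Pay" buttons do: call the callbacks directly.
handlers.onFailed("Refused");
const completed = handlers.onCompleted().then(() => console.log(calls));
```

The recorded order (confirm, then track, then navigate) is the behaviour the SDK-driven version could never assert on.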

Step 4: Test Everything

Now you can test the full payment lifecycle with standard TWD patterns:

it("should call confirmOrder and navigate to success", async () => {
  await twd.mockRequest("confirmOrder", {
    url: `/api/orders/${orderId}/confirm`,
    method: "PATCH",
    status: 200,
    response: { customer_id: "cust-123", order_count: 3 },
  });

  // ... fill form, submit, wait for payment session ...

  const payButton = await screenDom.findByRole("button", { name: "Pay" });
  await userEvent.click(payButton);

  // Verify the API was called
  const rule = await twd.waitForRequest("confirmOrder");
  expect(rule).to.exist;

  // Verify navigation
  await twd.url().should("contain.url", "/success");
});

it("should fire purchase_error when payment is declined", async () => {
  // ... setup ...

  const failButton = await screenDom.findByRole("button", { name: "Fail Payment" });
  await userEvent.click(failButton);

  const errorEvent = await twd.waitFor(() => {
    const ev = window.dataLayer.find(e => e.event === "purchase_error");
    if (!ev) throw new Error("Event not found");
    return ev;
  });
  expect(errorEvent.error_code).to.equal("Refused");
});

it("should show error banner when confirmOrder fails", async () => {
  await twd.mockRequest("confirmOrderFail", {
    url: `/api/orders/${orderId}/confirm`,
    method: "PATCH",
    status: 500,
    response: { message: "Server error" },
  });

  // ... setup ...

  const payButton = await screenDom.findByRole("button", { name: "Pay" });
  await userEvent.click(payButton);

  const errorBanner = await twd.get("[data-testid='payment-error']");
  errorBanner.should("be.visible");
  await twd.url().should("not.contain.url", "/success");
});
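The second test leans on a retry-until-it-stops-throwing helper. If that pattern is new to you, a generic version can be sketched as below. This is an assumption about the general technique, not TWD's actual twd.waitFor implementation, and the `timeout` and `interval` options and the local `dataLayer` array are hypothetical.

```typescript
// Retries `check` until it stops throwing or the timeout elapses.
async function waitFor<T>(
  check: () => T,
  { timeout = 1000, interval = 20 }: { timeout?: number; interval?: number } = {},
): Promise<T> {
  const deadline = Date.now() + timeout;
  for (;;) {
    try {
      return check();
    } catch (error) {
      // Give up with the last error once the deadline has passed.
      if (Date.now() >= deadline) throw error;
      await new Promise((resolve) => setTimeout(resolve, interval));
    }
  }
}

// Hypothetical analytics queue standing in for window.dataLayer.
const dataLayer: Array<{ event: string; error_code?: string }> = [];
setTimeout(() => dataLayer.push({ event: "purchase_error", error_code: "Refused" }), 50);

const found = waitFor(() => {
  const ev = dataLayer.find((e) => e.event === "purchase_error");
  if (!ev) throw new Error("Event not found");
  return ev;
});

found.then((ev) => console.log(ev.error_code)); // Refused
```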

What This Pattern Gives You

Coverage you couldn't get before:

Analytics events fire with the correct data (payment type, transaction ID, error codes)
The confirmOrder API is called with the right order ID
Navigation to the success page happens after payment, not before
Error banners appear when the API fails
Error banners appear when the payment is declined
Error banners appear when the SDK crashes

Speed: These tests run in ~1 second each. No SDK initialization, no payment session setup, no Adyen/Stripe endpoint mocking.

Reliability: No more flaky tests that break because the payment SDK's test environment is down. The mock is deterministic.

Conclusion

The unlock is component mocking. TWD's MockedComponent lets you replace a third-party SDK in tests with a simple stand-in whose callbacks you fire on demand — so the payment flow, which previously depended on an un-drivable SDK, becomes three buttons and a set of assertions. The SDK never boots. Tests run in a second. The callback flow — API calls, analytics, navigation, error states — is finally exercised.

The thin-wrapper refactor is what makes that possible, but it's the enabler, not the point. Once it's in place, the pattern transfers to any third-party component that fires callbacks: map SDKs, video players, chat widgets, auth flows. Same shape every time — wrap the component, swap it in tests.

Existing tests that mock the SDK's setup endpoint still work; they cover different ground. The component mock picks up where those stop.

More on the feature at twd.dev/component-mocking.

When Your Mocks Lie: Contract Testing with TWD

Kevin Julián Martínez Escobar — Sun, 19 Apr 2026 19:03:50 +0000

Every mock you write is a claim about what your backend returns. The moment the backend changes — a renamed field, a tightened enum, a new required property — that claim becomes a lie. Your tests still pass. Production breaks.

This is mock drift, and it's invisible. You don't find out until a user hits a 500 or an empty UI in prod. The mocks that gave you confidence were the thing misleading you.

TWD's contract testing closes this gap. Every mock response registered in a test gets validated against your OpenAPI spec during the same run that executes the test. A schema mismatch becomes a loud, specific error — in the same output as the test failures. No separate pipeline, no broker, no provider verifier. One command does both.

This article walks through what contract testing in TWD actually does, how to wire it into an existing project, and what the output looks like when it catches real drift.

The problem contract testing solves

Consider a typical mock in a TWD test:

await twd.mockRequest("userList", {
  method: "GET",
  url: "/v1/users",
  response: {
    count: 3,
    next: null,
    previous: null,
    results: [
      {
        id: "a1b2-...",
        name: "Acme Corp",
        balance: "10000.00",
        // ...
      }
    ],
  },
  status: 200,
});

This shape made the test pass three months ago. Since then:

The backend team removed balance from the list endpoint (it's a wallet concept now, served elsewhere).
A new required field external_id was added.
The discount field format tightened from "15" to "15.00" (two decimals).

None of these changes break the test. The component receives exactly the shape the mock provides. The test is green. Everything looks fine.

Meanwhile in production, the real API returns external_id (which a column in the table now expects), omits balance (which a detail drawer is still reading), and sends the discount as "15.00" where the formatter still assumes "15". Bugs ship.

The test was never wrong — it was testing the wrong reality. The mock had drifted from the contract.

What TWD does about it

TWD's contract testing runs as part of npx twd-cli run — the headless runner you'd typically invoke in CI, not the live sidebar you use during local dev. Your inner loop stays fast; drift gets surfaced on every push.

On every call to twd.mockRequest(), the response payload is collected. After tests run, each response is validated against the OpenAPI schema for the endpoint that the mock targets.

The validation uses openapi-mock-validator under the hood and covers what you'd expect from JSON Schema:

Types (string, number, integer, boolean, array, object)
String formats (uuid, email, date-time, uri, and so on)
Numeric bounds, array constraints, enum values
Required fields, additionalProperties
Composition (oneOf, anyOf, allOf)

In practice this means: if your mock returns "id": "user-123" where the spec says "format": "uuid", you hear about it. If your mock omits external_id where the spec marks it required, you hear about it. If your mock sets "status": "pending" where the spec enum only allows ["COMPLETED", "FAILED", "PENDING"], you hear about it.
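Those three checks can be reproduced with a toy validator. The sketch below is deliberately tiny and is not openapi-mock-validator. The `FieldSchema` and `ObjectSchema` types, the `validate` function, and the sample schema are hypothetical, and only required, format: uuid, and enum are handled.

```typescript
// A very small subset of JSON Schema, enough for the three cases above.
interface FieldSchema {
  type: "string";
  format?: "uuid";
  enum?: string[];
}
interface ObjectSchema {
  required: string[];
  properties: Record<string, FieldSchema>;
}

const UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function validate(schema: ObjectSchema, response: Record<string, unknown>): string[] {
  const errors: string[] = [];
  // Required check: every listed key must be present on the response.
  for (const key of schema.required) {
    if (!(key in response)) errors.push(`response.${key}: missing required property "${key}"`);
  }
  // Per-field checks for the properties that are present.
  for (const [key, field] of Object.entries(schema.properties)) {
    const value = response[key];
    if (value === undefined) continue;
    if (typeof value !== "string") {
      errors.push(`response.${key}: expected string`);
      continue;
    }
    if (field.format === "uuid" && !UUID.test(value)) {
      errors.push(`response.${key}: not a valid uuid`);
    }
    if (field.enum && !field.enum.includes(value)) {
      errors.push(`response.${key}: not one of ${field.enum.join(", ")}`);
    }
  }
  return errors;
}

// Hypothetical schema and mock reproducing the three mismatches described above.
const userSchema: ObjectSchema = {
  required: ["id", "external_id", "status"],
  properties: {
    id: { type: "string", format: "uuid" },
    external_id: { type: "string" },
    status: { type: "string", enum: ["COMPLETED", "FAILED", "PENDING"] },
  },
};
const mockResponse = { id: "user-123", status: "pending" };

const errors = validate(userSchema, mockResponse);
console.log(errors.length); // 3
```

Note that "pending" fails the enum check only because of its case, which is exactly the kind of drift a green test hides.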

The key design choice: no extra test-writing effort. You don't author contract tests separately. The mocks you already write double as contract probes. Two signals from one artifact.

Setting it up

Three pieces: get the spec, tell TWD about it, decide how loud to be.

1. Get the OpenAPI spec

Point TWD at an openapi.json somewhere on disk. How it gets there is up to you — a curl against your backend's spec endpoint in CI is the common path. Download fresh on every run so you're always validating against the current contract.

2. Configure TWD

Create twd.config.json at the project root:

{
  "url": "http://localhost:5173",
  "contractReportPath": ".twd/contract-report.md",
  "retryCount": 3,
  "contracts": [
    {
      "source": "./openapi.json",
      "baseUrl": "/",
      "mode": "warn",
      "strict": true
    }
  ]
}

Key fields:

source — path to the OpenAPI JSON.
baseUrl — prefix to strip when matching mock URLs against spec paths. If your mocks call /v1/users and the spec paths are also /v1/..., set "/". If the spec is served under /api and your mocks include that prefix, set "/api".
mode — "warn" or "error". Start with "warn".
strict — whether to reject undocumented response properties.
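The baseUrl rule is easier to reason about with an example. The sketch below shows one plausible way a mock URL could be mapped to a spec path: strip the prefix, then match against path templates. The function names and matching rules are assumptions made for illustration, so TWD's actual matching may differ.

```typescript
// Removes the configured prefix so the mock URL lines up with the spec's path keys.
function stripBaseUrl(url: string, baseUrl: string): string {
  const prefix = baseUrl.replace(/\/+$/, ""); // "/" -> "", "/api/" -> "/api"
  const path = url.split("?")[0]; // ignore the query string
  return prefix !== "" && path.startsWith(prefix) ? path.slice(prefix.length) || "/" : path;
}

// Matches a concrete path against an OpenAPI template such as /v1/users/{user_id}.
function matchesTemplate(path: string, template: string): boolean {
  const actual = path.split("/");
  const expected = template.split("/");
  return (
    actual.length === expected.length &&
    expected.every((part, i) => (/^\{.+\}$/.test(part) ? actual[i] !== "" : part === actual[i]))
  );
}

function findSpecPath(url: string, baseUrl: string, specPaths: string[]): string | undefined {
  const path = stripBaseUrl(url, baseUrl);
  return specPaths.find((template) => matchesTemplate(path, template));
}

const specPaths = ["/v1/users", "/v1/users/{user_id}"];

console.log(findSpecPath("/v1/users", "/", specPaths)); // /v1/users
console.log(findSpecPath("/api/v1/users/42?expand=1", "/api", specPaths)); // /v1/users/{user_id}
```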

3. Decide the mode

This is the one real decision.

"warn" — mismatches appear in the output but the test run still passes. Good posture when you're introducing contract testing into an existing codebase with accumulated drift. You see what's broken without immediately red-gating the team.

"error" — mismatches fail the run. This is where you want to land. It's the only mode that prevents regressions.

A realistic migration path: start in warn to surface the backlog, fix mismatches module by module, then flip to error once you're clean. The flip is the important step — without it, nothing stops new drift from accumulating.
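The difference between the two modes comes down to what decides the exit status. Here is a hedged sketch of that rule as described above. The `RunResult` shape and `exitCode` function are hypothetical, not twd-cli internals.

```typescript
type ContractMode = "warn" | "error";

interface RunResult {
  failedTests: number;
  contractMismatches: number;
}

// Test failures always fail the run; contract mismatches fail it only in "error" mode.
function exitCode(result: RunResult, mode: ContractMode): 0 | 1 {
  if (result.failedTests > 0) return 1;
  if (mode === "error" && result.contractMismatches > 0) return 1;
  return 0;
}

// A run where every test passes but two mocks have drifted from the spec.
const driftOnly: RunResult = { failedTests: 0, contractMismatches: 2 };

console.log(exitCode(driftOnly, "warn")); // 0
console.log(exitCode(driftOnly, "error")); // 1
```

In warn mode the drift is visible but CI stays green, which is why the flip to error is the step that actually prevents regressions.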

The TWD ecosystem

Contract testing isn't a standalone library — it's the seam where the TWD packages meet: mocks authored with twd-js, runs executed by twd-cli, validation handled by openapi-mock-validator, and (if you're also using the AI agent skills) the browser bridge through twd-relay.

If you're starting from zero with TWD, the AI-powered frontend testing series walks through project setup, writing tests, and wiring them into CI. Contract testing slots in once that's working.

The payoff: what the output looks like

This is the part worth showing up for — and it exists only because you're already in the TWD stack. Your mocks run through twd-js. twd-cli already executes them. The validator just reads what's already moving through your tests. No separate contract test suite, no broker to run, no provider verifier to keep in sync.

Run your normal test command:

npx twd-cli run

Alongside the usual pass/fail output for each test, you'll see a per-mock contract status line:

✓ GET /v1/users (200) — mock "userList" — in "User list > should display the table"
✗ GET /v1/users/{user_id} (200) — mock "getUser" — in "User detail"
  → response.external_id: missing required property "external_id"
✗ GET /v1/orders (200) — mock "getOrders" — in "User detail"
  → response.next: missing required property "next"
  → response.previous: missing required property "previous"

And a summary:

Mocks validated: 253 | Errors: 93 | Warnings: 1 | Skipped: 0

Contract report written to .twd/contract-report.md

That second failure line — a required property missing on a test that otherwise passes — is where contract testing earns its keep. Without it, the mock keeps serving a shape the real API no longer returns, and the only person who finds out is a user.

The markdown report is useful for PRs and CI artifacts — it groups failures by endpoint and includes the test name that produced each mock, so tracing a failure back to a specific file is straightforward.

Why this matters more than it looks

Most contract testing tools (Pact being the canonical one) are heavy: brokers, provider verifiers, consumer-driven workflows, separate CI pipelines, coordination between frontend and backend teams. The ceremony is often what kills adoption — teams try it, find it exhausting, and revert to hoping for the best.

TWD's approach gets maybe 80% of the value for 10% of the cost, because it's opportunistic rather than exhaustive. You're not testing every possible response the backend could emit — you're testing the specific responses your app actually depends on (your mocks). That's often the right target: the place where client assumptions are encoded is exactly the place worth validating.

And it's cheap to adopt. No broker, no CI changes beyond one step to download the spec, no coordination with the backend team. A consuming team can turn this on unilaterally in an afternoon and immediately benefit.

The moment the backend ships a breaking change, your next CI run reports it. Not the next deploy. Not the next bug report from a user. The next CI run.

Wiring it into CI

The main change to your workflow is the download step that runs before the tests; the last step prints the report:

- name: Download OpenAPI contract
  run: npm run contract:download

- name: Install service worker
  run: npx twd-js init public --save

- name: Run TWD tests
  run: npx twd-cli run

- name: Contract testing report
  run: cat .twd/contract-report.md

Conclusion

Contract testing isn't the whole pitch — it's one piece of a stack designed to make each part of the testing workflow cheap instead of painful. Adopt TWD and you get:

Tests that run in your real browser, with a live sidebar as you develop.
A CI pipeline that's a few lines of YAML away.
Coverage collected without a separate configuration fight.
Mocks that double as contract probes, validated against your OpenAPI spec on every run.

The opportunity isn't just catching drift. It's that once you're in the TWD stack, everything above comes with it — and each piece is an afternoon of setup, not a quarter of migration.

More details and the full config reference live at twd.dev/contract-testing. The project is on GitHub at BRIKEV/twd. If you find a bug in the validator or want a new format supported, PRs welcome.

Guide to AI-Powered Frontend Testing with TWD

Kevin Julián Martínez Escobar — Sun, 19 Apr 2026 10:54:42 +0000

If you've ever watched an AI assistant generate a test file and thought "that looks right" only to spend the next twenty minutes fixing imports, selectors, and mock shapes — this series is for you.

TWD (Test While Developing) is an in-browser testing library built around a simple idea: tests should run inside your real application, against the real DOM, while you develop. No jsdom. No simulated environments. Just your app, a sidebar showing results in real time, and instant feedback as you code.

Over the past year, TWD has grown into a full ecosystem, and the part that changed everything is the AI workflow: a set of skills for Claude Code that let an AI agent write tests, execute them in your browser, fix failures, set up CI, find gaps in your coverage, grade the quality of your tests, and generate visual documentation for your whole team.

This series walks through each piece, in the order you'd use them.

What the Series Covers

1. Stop Letting AI Write Untestable Code. Add Determinism Back with TWD

The starting point. Run /twd:setup to analyze your project, answer a few questions, and generate .claude/twd-patterns.md — a configuration file that teaches the AI agent your project's testing conventions. Framework detection, API patterns, auth middleware, third-party modules — all captured in one file.

2. Your AI Doesn't Just Write Tests. It Runs Them Too

The core of the workflow. The /twd skill writes tests based on your project patterns, sends them to your browser via WebSocket, reads pass/fail results, and iterates until they're green. No screenshots, no heavy payloads — just a tight write-run-fix loop.

3. From Local Tests to CI in One Command

Once you have tests, /twd:ci-setup detects your project configuration and generates a GitHub Actions workflow using the official twd-cli action. Coverage, contract validation, Puppeteer setup — handled automatically.

4. Your Tests Are Running — But Are They Covering the Right Things?

The /twd:test-gaps skill scans your routes, cross-references them against your test files, and classifies each one as tested, partially tested, or untested. High-risk routes with mutations or permissions are flagged first so you know where to focus.

5. Your Tests Pass. But Are They Good?

Having tests is not the same as having good tests. The /twd:test-quality skill grades each test file across four dimensions: journey coverage, interaction depth, assertion quality, and edge case handling. Each file gets a letter grade and actionable suggestions.

6. Turning Your Test Suite Into a Visual Map Your Whole Team Can Read

The final piece. The /twd:test-flow-gallery skill generates Mermaid flowcharts and plain-language summaries from your test files. New developers can understand coverage without reading code. Product can see which user journeys are validated. QA can spot gaps at a glance.

The Philosophy Behind It

TWD is built on a simple principle: automate what you already verify manually. If you check that a form submits correctly by filling it out in the browser, that's your test. TWD just makes it repeatable.

The AI workflow extends that same idea. Instead of writing tests after the fact, the agent writes them as part of the development process — using your conventions, running them against your real app, and iterating until they pass.

The philosophy hasn't changed. What changed is that now your AI agent tests while developing too.

Getting Started

Install the TWD AI plugin for Claude Code:

claude plugin marketplace add BRIKEV/twd-ai
claude plugin install BRIKEV/twd-ai

Then run /twd:setup in your project to kick things off. The rest of the series follows from there.

Full documentation: twd.dev

Source: github.com/BRIKEV/twd-ai

Turning Your Test Suite Into a Visual Map Your Whole Team Can Read

Kevin Julián Martínez Escobar — Sun, 19 Apr 2026 10:49:42 +0000

You have written the tests. The CI pipeline runs them. The gap analysis has helped you fill the blind spots. Quality checks are passing. The work is solid.

And yet — ask a product manager what your test suite actually covers, and you will get a blank stare. Ask a new developer which user flows are tested, and they will spend an hour reading test files to piece it together. Ask QA to verify the coverage makes sense, and they will ask for a document that does not exist.

This is the last problem the TWD AI workflow solves. And it is not a small one.

The Gap Between Tests and Understanding

Test code is written for machines to execute. It is dense, technical, and full of implementation detail. A test that reads:

await twd.mockRequest('getTodos', { method: 'GET', url: '/api/todos', response: [{ id: 1 }, { id: 2 }], status: 200 });
await twd.mockRequest('createTodo', { method: 'POST', url: '/api/todos', response: { id: 3 }, status: 201 });
await twd.visit('/todos');
await twd.waitForRequest('getTodos');

const user = userEvent.setup();
const titleInput = await screenDom.findByLabelText('Title');
await user.type(titleInput, 'New todo');
const descInput = await screenDom.findByLabelText('Description');
await user.type(descInput, 'A new task');
const dateInput = await screenDom.findByLabelText('Date');
await user.type(dateInput, '2026-04-15');
const createButton = await screenDom.findByRole('button', { name: /create todo/i });
await user.click(createButton);

const rule = await twd.waitForRequest('createTodo');
expect(rule.request).to.deep.equal({ title: 'New todo', description: 'A new task', date: '2026-04-15' });

...tells a developer exactly what is being tested. It tells a product manager nothing. And for a new team member, figuring out what user journeys are covered means reading through dozens of files like this and piecing it together mentally.

The /twd:test-flow-gallery skill generates that picture for you.

What the Skill Produces

Running /twd:test-flow-gallery in Claude Code (with the TWD AI plugin installed) analyzes your TWD test files and generates two things for each test file it finds:

Mermaid flowcharts — one per test case. Each chart uses a consistent visual grammar:

Blue rectangles for user actions (clicks, form inputs, navigation)
Green hexagons for assertions (what the test verifies is true)
Separate subgraphs for API calls made during the test

Business-friendly summaries — plain language descriptions of what each test verifies. No function names, no selector syntax. Just: "A user fills out the create todo form with a title, description, and date, then clicks Create Todo. The form data is sent to the server as a new todo."

[Figure: the flowchart generated from the code above]

The result is a .flows.md file colocated next to each test file, plus a root-level index that gives you a single navigation point across the entire test suite.
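To give a feel for the visual grammar, here is a rough sketch that turns a list of steps into Mermaid text, with rectangles for actions, hexagons for assertions, and a subgraph for API calls. It is not the skill's generator or its exact output. The `Step` type, `toMermaid` function, node ids, and colour values are hypothetical.

```typescript
type Step =
  | { kind: "action"; text: string }
  | { kind: "assertion"; text: string };

// Builds a Mermaid flowchart: rectangles for actions, hexagons for assertions,
// and a separate subgraph listing the API calls.
function toMermaid(steps: Step[], apiCalls: string[]): string {
  const lines = ["flowchart TD"];
  steps.forEach((step, i) => {
    const id = `s${i}`;
    lines.push(step.kind === "action" ? `  ${id}["${step.text}"]` : `  ${id}{{"${step.text}"}}`);
    lines.push(`  class ${id} ${step.kind}`);
    if (i > 0) lines.push(`  s${i - 1} --> ${id}`);
  });
  if (apiCalls.length > 0) {
    lines.push('  subgraph api["API calls"]');
    apiCalls.forEach((call, i) => lines.push(`    a${i}["${call}"]`));
    lines.push("  end");
  }
  // Blue for actions, green for assertions.
  lines.push("  classDef action fill:#dbeafe,stroke:#1d4ed8");
  lines.push("  classDef assertion fill:#dcfce7,stroke:#15803d");
  return lines.join("\n");
}

const chart = toMermaid(
  [
    { kind: "action", text: "Visit /todos" },
    { kind: "action", text: "Fill in title, description, and date" },
    { kind: "action", text: "Click Create Todo" },
    { kind: "assertion", text: "POST body matches the form data" },
  ],
  ["GET /api/todos", "POST /api/todos"],
);

console.log(chart);
```

Pasting the printed text into a mermaid code fence in a Markdown file is enough for GitHub to render it.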

Who Actually Benefits From This

New developers can understand what is covered without reading a single line of test code. On day one, they can open the flow gallery and see the user journeys the team has validated. That is faster onboarding and fewer "wait, is this tested?" conversations in code review.

Product teams finally have visibility into testing. Not a coverage percentage — an actual map of user journeys. When they ask "are we testing the checkout flow?", the answer is a link, not a meeting.

QA engineers can identify gaps at a glance and verify that what is visually described matches what they expect to be covered. They can spot missing edge cases by looking at the flows rather than reading assertions.

Running It

With the TWD AI plugin installed, you run:

/twd:test-flow-gallery

That is it. The skill finds your TWD test files, processes them, and writes the .flows.md files alongside your tests. The root index is placed at a predictable location so you can link to it from your README or project wiki.

The flowcharts use standard Mermaid syntax, which renders natively on GitHub, GitLab, Notion, and most modern documentation tools. No extra dependencies, no build step.

The Complete TWD AI Workflow

This skill is the finale of a six-step workflow that takes you from zero to a fully automated, AI-assisted testing practice:

/twd:setup — Scaffolds the TWD testing environment in your project
/twd (twd skill) — AI agent that writes and runs in-browser tests against live components
/twd:ci-setup — Wires your tests into CI/CD with the headless runner
/twd:test-gaps — Identifies untested user flows and generates missing tests
/twd:test-quality — Reviews your tests for reliability, false positives, and maintenance burden
/twd:test-flow-gallery — Turns your test suite into visual documentation for the whole team

Each step builds on the last. The result is a test suite that is not just green in CI — it is legible, maintainable, and understood by everyone who needs to understand it.

Try It

The TWD AI plugin is open source and available at github.com/BRIKEV/twd-ai. The full TWD documentation, including the philosophy behind test-while-developing, is at twd.dev.

If you have been following this series, you now have the full picture. If you are coming to this article first — the rest of the series walks through each step in detail. Start at the beginning and build the workflow incrementally.

Testing should not be a black box. Your team deserves to see what is covered.

Your Tests Pass. But Are They Good? Grading Test Quality with /twd:test-quality

Kevin Julián Martínez Escobar — Sun, 19 Apr 2026 10:49:29 +0000

The Problem With "We Have Tests"

There is a moment in every project where someone says "we have tests" like it settles the matter. The CI pipeline is green. The coverage number is somewhere north of 70%. Everything is fine.

Until a bug slips through. Not because the tests failed — but because they never really covered what broke.

This is the gap between having tests and having good tests. A test that checks whether a button is visible tells you almost nothing about whether your application works. A test that checks whether clicking "Submit" fires the right API call with the right payload — that test is doing real work.

The /twd:test-quality skill is built for exactly this problem. It reads your existing test files, grades them across four weighted dimensions, and hands you a concrete list of what to improve.

What Gets Graded

Every test file gets scored across four dimensions. Each one targets a distinct failure mode in how developers tend to write tests under time pressure.

Journey Coverage (35%) — This is the heaviest dimension, and for good reason. A test suite full of isolated "does X render?" checks does not tell you whether the user can actually complete a task. Journey coverage looks for complete workflows: does the test cover the sequence of actions a user would take to accomplish something, or does it stop after the first visible element?

Interaction Depth (20%) — Variety matters. If all your tests do the same kind of interaction — say, only clicking buttons — you are missing a significant portion of how real users engage with your UI. This dimension checks for the range of input types and interaction patterns exercised.

Assertion Quality (25%) — This is where most test suites quietly fail. Assertions that check CSS classes or element visibility feel like verification, but they do not confirm that your application's logic is correct. Strong assertions check actual outcomes: API payloads, state changes, content that results from a specific action. Loose assertions let bugs pass silently.

Error and Edge Cases (20%) — The happy path is always tested. What about the unhappy path? Empty states, boundary values, API failures, form validation — these are the scenarios that surface in production and are almost never covered by a first-pass test suite.
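As a rough illustration of how the four weights combine, here is a small sketch. The weights come from the list above; the 0 to 100 scale per dimension and the letter cutoffs in `letterGrade` are assumptions on my part, not the skill's documented thresholds.

```typescript
type Dimension =
  | "journeyCoverage"
  | "interactionDepth"
  | "assertionQuality"
  | "errorAndEdgeCases";

// Weights in percent, taken from the four dimensions described above.
const WEIGHTS: Record<Dimension, number> = {
  journeyCoverage: 35,
  interactionDepth: 20,
  assertionQuality: 25,
  errorAndEdgeCases: 20,
};

// Assumes each dimension has already been scored from 0 to 100.
function overallScore(scores: Record<Dimension, number>): number {
  const dimensions = Object.keys(WEIGHTS) as Dimension[];
  const weighted = dimensions.reduce((sum, d) => sum + scores[d] * WEIGHTS[d], 0);
  return Math.round(weighted / 100);
}

// Hypothetical cutoffs, chosen only so the example is complete.
function letterGrade(score: number): "A" | "B" | "C" | "D" {
  if (score >= 90) return "A";
  if (score >= 75) return "B";
  if (score >= 60) return "C";
  return "D";
}
```

Because journey coverage carries the largest weight, a file with strong assertions but no complete workflows still lands in the lower grades.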

How the Skill Works

Point the skill at your test directory and it will evaluate each file independently. The output is direct: a letter grade (A through D), a weighted overall score, and — for anything below an A — two or three specific, actionable suggestions.

/twd:test-quality src/tests/

A typical output might look like this:

invoice-form.test.js — C (62/100)

Journey Coverage: D — Tests check that fields render, but no test submits
the form and verifies the result.

Assertion Quality: C — Assertions rely on element visibility. No tests
verify the POST payload or the success state.

Suggestions:
1. Add a test that fills the form and submits it, then asserts the API
   received the correct invoice payload.
2. Add a test for the error state when the API returns a 422.
3. Verify the confirmation message content, not just its presence.

The score is not the point. The suggestions are.

From Analysis to Action

Once you have the quality report, the workflow is immediate. You run /twd on the same files — the core TWD test-writing skill — and it uses the suggestions as its implementation brief.

The quality skill diagnoses. The test skill fixes. You do not have to manually translate "assertion quality is weak" into new test code — that handoff happens automatically.

This is the pattern that makes AI-assisted testing practical rather than cosmetic. The AI is not writing tests from scratch based on a vague request. It is working from a structured diagnosis of what is actually missing.

What This Looks Like in Practice

Here is what a realistic before-and-after looks like for a form component:

Before — a typical first-pass test:

it('renders the submit button', async () => {
  await twd.visit('/invoices/new');
  const button = screenDom.getByRole('button', { name: /submit/i });
  expect(button).to.be.visible;
});

After — the same component, improved by the quality feedback:

it('submits a valid invoice and shows confirmation', async () => {
  await twd.mockRequest('createInvoice', {
    method: 'POST',
    url: '/api/invoices',
    response: { id: 'inv_001' },
    status: 201,
  });

  await twd.visit('/invoices/new');
  const user = userEvent.setup();
  await user.type(screenDom.getByLabelText(/amount/i), '1200');
  await user.click(screenDom.getByRole('button', { name: /submit/i }));

  const rule = await twd.waitForRequest('createInvoice');
  expect(rule.request).to.deep.equal({ amount: 1200, currency: 'EUR' });
  await screenDom.findByText(/invoice created/i);
});

The second test is not dramatically more complex. It is just more intentional. It verifies behavior, not presence.

Getting Started

The /twd:test-quality skill is part of the TWD AI plugin for Claude Code. If you have the plugin installed, you can run a quality audit on any test directory immediately.

Start with your most critical feature area. Look at what the grader flags as weak on assertion quality and journey coverage — those two dimensions are usually where the highest-value improvements are hiding.

The tests that catch bugs in production are not the ones you wrote fastest. They are the ones that actually exercise the path that breaks.

Next in the series: /twd:test-flow-gallery, a skill that turns your test suite into Mermaid flowcharts and plain-language summaries the whole team can read.

Your Tests Are Running — But Are They Covering the Right Things?

Kevin Julián Martínez Escobar — Sun, 19 Apr 2026 10:49:04 +0000

You've wired up your test suite. CI is green. You're shipping. And then a bug lands in production on a route nobody thought to test.

It's not that your tests are bad. It's that you didn't know what was missing.

That's the problem /twd:test-gaps is built to solve.

The gap between "tests exist" and "tests are enough"

Most coverage tools tell you about lines and branches. What they don't tell you is: which user-facing routes have zero test coverage? Which pages have a test that visits them but never clicks anything, never submits a form, never triggers a mutation?

There's a difference between a test that loads a page and a test that actually exercises it. A route can show up as "covered" while the core interaction — the form submission, the delete confirmation, the role-based redirect — is completely untested.

/twd:test-gaps makes that distinction explicit.

What the skill actually does

When you run /twd:test-gaps, the TWD plugin does three things in sequence:

Route discovery. It reads your project and finds every route your app exposes. It doesn't require a specific framework — it detects routes from router config files, page component patterns, and URLs referenced in existing test files. Angular, React Router, Vue Router, SolidJS: it handles all of them.

Coverage classification. For each discovered route, it checks the test files and assigns one of three states:

Tested          — has twd.visit() + userEvent interactions (clicks, inputs, submits)
Partially tested — has twd.visit() but missing interaction or mutation coverage
Untested         — no test file references this route at all

The partially-tested category is where this gets valuable. These are routes that feel covered but aren't. A smoke test that visits /settings and checks the heading renders is not the same as a test that changes a password, hits submit, and verifies the API call was made.
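The three states can be sketched with a few lines of text matching. This is a deliberately naive illustration: `classifyRoute` is a hypothetical helper, it works per file rather than per test case, and it assumes interactions are written as `user.click(...)` or `userEvent.click(...)` style calls. The real skill likely reasons about your code in a richer way.

```typescript
type Coverage = "tested" | "partially tested" | "untested";

// Matches common userEvent interactions; userEvent.setup() is intentionally not counted.
const INTERACTION = /\b(user|userEvent)\.(click|dblClick|type|clear|selectOptions|upload)\(/;

function classifyRoute(route: string, testSources: string[]): Coverage {
  // A file "references" the route if it visits it with either quote style.
  const visiting = testSources.filter(
    (source) =>
      source.includes(`twd.visit('${route}')`) || source.includes(`twd.visit("${route}")`),
  );
  if (visiting.length === 0) return "untested";

  // Visiting alone is a smoke test; an interaction is what makes it "tested".
  return visiting.some((source) => INTERACTION.test(source)) ? "tested" : "partially tested";
}
```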

Risk assessment. The skill reads your component code and scores each untested or partially-tested route as HIGH, MEDIUM, or LOW risk based on what it finds:

HIGH: mutations (form submissions, delete actions, state changes), financial handling, permission checks
MEDIUM: complex UI interactions, multi-step flows, conditional rendering logic
LOW: static pages, read-only views, simple display components

The output is a prioritized list. You know exactly where the exposure is.
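If you wanted to reproduce that ordering yourself, a sort along these lines would do it. The `Gap` shape and the tie-breakers (untested before partially tested, then alphabetical) are assumptions for the sake of the example; the article only states that the list is prioritized by risk.

```typescript
type Risk = "HIGH" | "MEDIUM" | "LOW";

// Hypothetical report entry, not the skill's actual output format.
type Gap = {
  route: string;
  coverage: "untested" | "partially tested";
  risk: Risk;
};

const RISK_ORDER: Record<Risk, number> = { HIGH: 0, MEDIUM: 1, LOW: 2 };

function prioritize(gaps: Gap[]): Gap[] {
  // Copy first so the caller's array is not mutated.
  return [...gaps].sort(
    (a, b) =>
      RISK_ORDER[a.risk] - RISK_ORDER[b.risk] ||
      // Assumed tie-breaker: fully untested routes come before partially tested ones.
      Number(a.coverage !== "untested") - Number(b.coverage !== "untested") ||
      a.route.localeCompare(b.route),
  );
}
```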

A real example

Say you have a /checkout route. The skill visits the component, sees a form with payment fields and a submit handler that calls an API. It checks your test files — finds a test that visits the route but only asserts that the page renders. No form interaction. No API mock.

The result:

/checkout — PARTIALLY TESTED — HIGH RISK
  Missing: form submission, mutation mock for POST /api/orders

That's the information you need to act on.

From gap to green

Once you have the report, the next step is straightforward: run /twd on the high-priority routes and let the agent write the missing tests.

The flow is:

Run /twd:test-gaps to get the prioritized gap report
Run /twd on HIGH-risk untested or partially-tested routes
Tests are written, run, and fixed until green
Repeat for MEDIUM-risk routes as capacity allows

You're not guessing anymore. You have a concrete list, sorted by risk, and a tool that can write the tests you point it at.

Why this matters more than line coverage

Line coverage tells you what code ran during your tests. It doesn't tell you whether the right things were asserted, whether the interactions were real, or whether the routes that users actually care about are exercised.

A codebase can have 85% line coverage and a completely untested checkout flow.

/twd:test-gaps focuses on user-facing behavior: the routes, the interactions, the mutations. It asks the question your users would ask — "what happens when I do this?" — and finds the places where no one has answered it yet.

What's next

The next article in this series covers /twd:test-quality — a skill that goes beyond gap detection and grades the quality of your existing tests. Not just whether a route is tested, but whether the test is actually asserting the right things, using the right patterns, and giving you confidence you can rely on.

If you're curious about the TWD plugin and want to try it yourself, the full source is at https://github.com/BRIKEV/twd-ai.

From Local Tests to CI in One Command

Kevin Julián Martínez Escobar — Sun, 19 Apr 2026 10:47:48 +0000

Why Set Up CI Early

Once you have tests worth running, the natural next step is making sure they run on every push. The sooner you do it, the fewer surprises down the road — things like env vars that only exist locally or paths that work on your machine but nowhere else.

If you've been following this series, you just ran /twd to write your first batch of tests. This is the right moment to wire them into CI, while everything is fresh.

What the Skill Actually Does

/twd:ci-setup is part of the TWD AI plugin. Like /twd:setup, it starts with a discovery phase — detecting your project configuration before asking you anything.

The skill inspects your project to figure out:

Your framework and build tooling
The dev server port your app runs on
Your base path, if you have one configured
Whether you have coverage already wired up, and if not, whether you want it

Most of this is inferred automatically. The main thing it asks you about is coverage. Once it has the full picture, it generates a GitHub Actions workflow tailored to your project.

Here is an example of what it produces:

name: TWD Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

permissions:
  pull-requests: write  # only needed if using contract-report

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v5

      - uses: actions/setup-node@v5
        with:
          node-version: 24
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install mock service worker
        run: npx twd-js init public --save

      - name: Start dev server
        run: |
          nohup npm run dev > /dev/null 2>&1 &
          npx wait-on http://localhost:5173

      - name: Run TWD tests
        uses: BRIKEV/twd-cli/.github/actions/run@main
        with:
          contract-report: 'true'

The workflow starts the dev server, waits for it to be ready, and runs your TWD tests using the official twd-cli GitHub Action. That action handles Puppeteer setup, Chrome installation, and optionally posts a contract validation report as a PR comment. If your app runs on a different port or base path, the skill adjusts the configuration accordingly.
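To show what "adjusts the configuration" might mean in practice, here is a tiny sketch of deriving the URL that the wait step polls from a port and a base path. `devServerUrl` is a hypothetical helper written for this article, not something exported by twd-cli or generated by the skill.

```typescript
// Builds the URL a wait step would poll, given the Vite port and base path.
function devServerUrl(port: number = 5173, basePath: string = "/"): string {
  // Normalize so the base path always starts and ends with a slash.
  const withLeading = basePath.startsWith("/") ? basePath : `/${basePath}`;
  const normalized = withLeading.endsWith("/") ? withLeading : `${withLeading}/`;
  // With the default base path, keep the bare origin used in the workflow above.
  return normalized === "/"
    ? `http://localhost:${port}`
    : `http://localhost:${port}${normalized}`;
}
```

With the defaults this yields the same `http://localhost:5173` used in the workflow; a project served from `/app/` on port 3000 would wait on `http://localhost:3000/app/` instead.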

Running It

After you've written tests with /twd, run:

/twd:ci-setup

The skill will work through the detection phase, ask whether you want coverage reporting, then drop the workflow file into .github/workflows/. Push the branch, open a pull request, and your tests run.

That first CI run tends to surface the usual suspects: an env var that exists locally but not in CI, a path assumption that only works on your machine, or a dependency that wasn't properly declared. The skill takes care of the pipeline setup — and CI itself takes care of showing you what was missing.

The Pattern This Follows

If you used /twd:setup to initialize TWD in your project, this feels familiar. Detection first, minimal questions, concrete output. The philosophy is the same: don't make developers manage configuration when the tooling can figure it out.

The two commands are designed to be used in sequence:

/twd — write tests for your components and pages
/twd:ci-setup — make sure those tests run automatically on every push

Neither requires you to know the internals of TWD, GitHub Actions syntax, or coverage tooling. You describe what you want; the skill handles the rest.

What Comes Next

Setting up CI is about making sure your tests run. The next question is a harder one: are you testing the right things? You might have solid coverage on the happy path and nothing on the edge cases that actually break in production.

The next article in this series covers the TWD Test Gap Analysis skill — how it analyzes your existing tests, identifies what is missing, and suggests what to write next. Not just "you have low coverage on this file", but a concrete read of what user flows and failure modes are not being tested.

If you want to follow along or explore the TWD plugin: https://github.com/BRIKEV/twd-ai

Your AI Doesn't Just Write Tests. It Runs Them Too.

Kevin Julián Martínez Escobar — Sun, 19 Apr 2026 10:47:24 +0000

In the previous article, we talked about /twd:setup — the skill that analyzes your project and generates .claude/twd-patterns.md, a file that teaches your AI agent how tests are structured in your codebase. That was about giving the agent context.

This one is about what the agent actually does with it.

Writing Tests That Run in Your Real Browser

Most AI-generated tests run in Node.js — against jsdom, a simulated DOM. That works for a lot of things, but it is not the same as running inside your actual application with real components, real routing, and real network mocks in place.

TWD tests run in the browser. They execute inside your running app, against the real DOM, with your actual component tree mounted. And the /twd skill — part of the TWD AI plugin — takes that a step further. It writes the tests, runs them in your browser, reads the results, and if something fails, it fixes the code and runs again.

Here's the full cycle:

Write — the agent reads your .claude/twd-patterns.md to understand your project's test conventions, then generates a test that follows those patterns
Execute — the test is sent to your browser via twd-relay, a WebSocket relay that connects the agent to the TWD sidebar running in your app
Read results — pass/fail status comes back as plain text with error details (no screenshots, no DOM dumps — just the signal you need)
Fix and re-run — if a test fails, the agent reads the error, adjusts the test, and re-executes

This loop runs automatically. You're not involved until it's done.

How the Relay Works (Without Getting Heavy)

One thing that surprised me about this setup is how lightweight it is.

twd-relay uses a WebSocket connection between the agent and the browser. When the agent wants to run tests, it sends a command through the relay. The browser executes the tests inside the running app — against the real DOM, with your mocked API responses in place, and the real component state.

Results come back as text. Not screenshots. Not serialized DOM trees. Just: did it pass, and if not, what was the error message.

This keeps token usage remarkably low. The agent is essentially getting the same output you'd see in a terminal — concise, structured, actionable.
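To give a feel for how small a text-only report can be, here is a sketch of a formatter. The `TestResult` shape and `formatResults` function are invented for illustration and are not twd-relay's actual message format.

```typescript
// Hypothetical result shape, not the relay's real protocol.
type TestResult = {
  name: string;
  status: "passed" | "failed" | "skipped";
  error?: string;
};

function formatResults(results: TestResult[]): string {
  const count = (status: TestResult["status"]): number =>
    results.filter((r) => r.status === status).length;

  // One summary line, then one line per failure with its error message.
  const summary = `${count("passed")} passed, ${count("failed")} failed, ${count("skipped")} skipped`;
  const failures = results
    .filter((r) => r.status === "failed")
    .map((r) => `FAIL ${r.name}: ${r.error ?? "no error message"}`);

  return [summary, ...failures].join("\n");
}
```

A passing suite collapses to a single line, which is the point: the agent only pays tokens for the failures it has to act on.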

Running It Before You Build

The recommended workflow is test-first. Run /twd before you implement a feature.

/twd Write a test for the checkout form — it should verify that submitting with an empty email shows a validation error

The agent writes the test based on your existing patterns, runs it in the browser, and it fails — because the feature doesn't exist yet. That's expected. The test is now a specification.

You implement the feature. The test passes. You didn't have to think about test structure, selectors, or mock setup. The agent handled that using the patterns it already knows from your project.

What Happens When a Test Won't Pass

Not every test is fixable on the first attempt. Sometimes the agent hits a case it can't resolve — a component that behaves differently than expected, or a pattern that doesn't quite translate.

The /twd skill handles this with a hard limit: if a test still fails after three fix attempts, it's marked as it.skip and left in the file with a comment. It doesn't block the rest of the test run. You can come back to it, investigate the real issue, and decide how to handle it.

This is important for trust. An agent that quietly hides failures is dangerous. One that skips and surfaces them is honest.
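The fix limit is easy to picture as a loop. The sketch below is my own simplified model of it, not the skill's source: `run` and `fix` are stand-ins for "execute the test in the browser" and "let the agent rewrite the test", and the skip comment format is made up.

```typescript
type RunResult = { passed: true } | { passed: false; error: string };

type Outcome = {
  status: "passed" | "skipped";
  source: string;
  fixAttempts: number;
};

function writeWithFixLimit(
  initialSource: string,
  run: (source: string) => RunResult,
  fix: (source: string, error: string) => string,
  maxFixAttempts: number = 3,
): Outcome {
  let source = initialSource;
  let result = run(source);
  let fixAttempts = 0;

  // Keep fixing and re-running until the test passes or the limit is reached.
  while (!result.passed && fixAttempts < maxFixAttempts) {
    source = fix(source, result.error);
    fixAttempts += 1;
    result = run(source);
  }

  if (result.passed) return { status: "passed", source, fixAttempts };

  // Still failing: leave the test in place, skipped, with the last error visible.
  const skipped =
    `// Skipped after ${fixAttempts} fix attempts: ${result.error}\n` +
    source.replace(/\bit\(/, "it.skip(");
  return { status: "skipped", source: skipped, fixAttempts };
}
```

The important property is that the failure never disappears: the skipped test and its error stay in the file for a human to pick up.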

Keeping Your Conversation Clean

The /twd skill runs in a forked context — meaning the test iterations, failed attempts, and fix cycles happen separately from your main conversation. When it finishes, you get a summary of what passed, what was skipped, and what files were created. You don't have to scroll through 30 messages of debugging to see the result.

A Concrete Example

Say you're building a Vue component that fetches and displays user data. You invoke:

/twd Write tests for the UserProfile component

Behind the scenes:

The agent reads .claude/twd-patterns.md — it knows your project conventions, how to mock API endpoints with twd.mockRequest(), and which selectors to use with screenDom
It generates tests that mock the /api/users/:id endpoint, visit the page, and assert the displayed data
It runs them in your browser via the relay
One test fails — the agent used a selector that doesn't match your actual markup
It reads the error, corrects the query, re-runs
All tests pass

Total time: less than two minutes. No manual intervention.

What You Actually Need to Get Started

The TWD sidebar running in your app (from twd-js)
twd-relay running locally
.claude/twd-patterns.md generated by /twd:setup
The /twd skill installed from twd-ai

That's it. The relay handles the browser connection, the patterns file handles the conventions, and the skill handles the rest.

Coming Next: CI Setup

Writing and running tests locally is one half of the equation. The other half is making them part of your CI pipeline — so tests run headlessly on every push, without a browser in sight.

The next article in this series covers /twd:ci-setup, which configures your project to run TWD tests in CI using the headless CLI runner. If you've ever wanted your AI-written tests to gate a deployment, that's the one you want.

Stop Letting AI Write Untestable Code. Add Determinism Back with TWD

Kevin Julián Martínez Escobar — Tue, 17 Mar 2026 22:35:17 +0000

Now that AI is generating a big portion of our frontend code, speed is no longer the main bottleneck.

Confidence is.

We’ve seen this before: when teams wanted to move fast, the real enabler wasn’t “more code” — it was having a solid testing strategy that allowed safe refactoring and iteration.

That hasn’t changed. If anything, it’s more important now.

In previous posts, I talked about the TWD philosophy, but one idea is key here:

Test what you own. Mock what you don’t.

With this mindset, TWD (Test While Developing) lets you create deterministic UI tests, where you fully control the environment, simulate any scenario, and avoid flaky behavior.

The Problem: AI Doesn't Know Your Project

AI can write tests… but not your tests.

It doesn’t know:

how your project is structured
what should be mocked
how your auth works
what “good tests” look like in your codebase

So even if it generates tests, they often:

don’t follow your patterns
mock the wrong things
miss important flows

The Solution: TWD + Skills

To solve this, we introduced TWD skills inside a Claude Code plugin.

These skills give the AI:

context about your project
rules to follow
patterns to reuse

So instead of generic tests, you get tests that actually fit your codebase.

/twd:setup — The Most Important First Step

Everything starts here:

/twd:setup

This is an interactive setup that analyzes your project and creates:

.claude/twd-patterns.md

This file becomes the source of truth for how tests should be written.

What /twd:setup Does

Detects your framework, Vite config, entry points, CSS setup
Asks about:
- auth & permissions
- API structure
- third-party dependencies
- state management
Generates a project-specific testing config
Optionally installs and configures TWD for you
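The detection step is mostly a lookup against package.json. Here is a minimal sketch of that idea, using the dependency-to-framework mapping from the skill prompt reproduced at the end of this article. The `detect` helper and the `PackageJson` type are hypothetical; the skill itself is a prompt that the agent follows, not a function like this.

```typescript
type PackageJson = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

type LookupTable = ReadonlyArray<readonly [string, string]>;

// Package name -> label, mirroring the mapping listed in the skill prompt.
const FRAMEWORKS: LookupTable = [
  ["react", "React"],
  ["react-dom", "React"],
  ["vue", "Vue"],
  ["@angular/core", "Angular"],
  ["solid-js", "Solid"],
];

const STATE_LIBRARIES: LookupTable = [
  ["zustand", "Zustand"],
  ["@reduxjs/toolkit", "Redux"],
  ["redux", "Redux"],
  ["jotai", "Jotai"],
  ["pinia", "Pinia"],
];

// Returns the label of the first table entry found in the project's dependencies.
function detect(pkg: PackageJson, table: LookupTable): string | undefined {
  const installed = { ...pkg.dependencies, ...pkg.devDependencies };
  const match = table.find(([name]) => name in installed);
  return match ? match[1] : undefined;
}
```

Anything the lookup cannot answer (auth, third-party modules, store reset) is exactly what the skill asks you about.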

How It Looks in Practice

You just run:

/twd:setup

The agent then starts analyzing your project.

It will ask the right questions to understand how your app works and detect your frontend configuration:

What should be mocked?
How does auth behave?
Where are your services?

Once completed, it generates your config file:

.claude/twd-patterns.md

This is critical — every future test uses this as context.

If something is outdated, it will even suggest improvements.

And when it's done, it guides you to the next step.

Why This Matters

Without this step, AI-generated tests are:

generic
inconsistent
sometimes useless

With /twd:setup, tests become aligned with your architecture.

The plugin also includes skills for:

automatic test generation
CI setup
test quality analysis
test gap detection
flow visualisation

We’ll cover those in future posts.

Final Thought

AI can make you faster.

But tests are what let you keep that speed without breaking everything.

If you care about frontend quality in the AI era, don’t just generate code:

Generate confidence.

Agent conversation:

TWD setup skill claude-opus-4-6

You

<command-message>twd:setup</command-message>
<command-name>/twd:setup</command-name>

You

Base directory for this skill: [REDACTED]/.claude/plugins/cache/twd-ai/twd/1.1.0/skills/setup

TWD Project Setup

You are configuring TWD (Test While Developing) for this project. Your job is to detect project settings, ask questions for what can't be auto-detected, and generate a .claude/twd-patterns.md configuration file.

Step 1: Auto-Detect Project Settings

Read these files to pre-fill answers (read all in parallel):

package.json — detect framework from dependencies:
- react / react-dom → React
- vue → Vue
- @angular/core → Angular
- solid-js → Solid
- Also detect CSS/component libraries: @mui/material, @chakra-ui/react, antd, @mantine/core, vuetify, primevue, element-plus, @angular/material
vite.config.ts (or .js, .mjs) — detect:
- base field (Vite base path, default /)
- server.port (dev server port, default 5173)
index.html — detect entry point from <script> src attribute
Glob for src/services/, src/api/, src/lib/api — detect API/services folder
Check if public/ directory exists — confirm public folder name
Detect state management from package.json dependencies:
- zustand → Zustand
- @reduxjs/toolkit or redux → Redux
- jotai → Jotai
- pinia → Pinia
Check if .claude/twd-patterns.md already exists — offer to update vs overwrite

Step 2: Ask Questions

IMPORTANT: Use the AskUserQuestion tool for ALL questions. This provides an interactive UI experience. Never dump questions as a plain numbered list in text output.

Present auto-detected values as a summary first, then ask questions in two batches:

Batch 1: Project basics (confirm auto-detected values)

Do NOT ask individual questions for values you already detected. Show a single summary of all detected values and ask "Does anything look wrong?" using AskUserQuestion. The user only needs to respond if something is incorrect. Example:

Here's what I detected:
- Framework: React
- Vite base path: /
- Dev server port: 5173
- Entry point: src/main.tsx
- Public folder: public/
- API services: src/services/
- CSS library: MUI
- State management: Zustand

Does anything look wrong, or should I continue?

Batch 2: Testing concerns (need user input)

After confirming batch 1, use AskUserQuestion for each of these that requires user input:

CSS library docs (only if a CSS library was detected): Where are the docs? (URL, local path, or "skip")
Auth middleware: Does your project have route-based auth/permissions? If yes, briefly describe the pattern.
Third-party modules: Does your project use external services that need mocking in tests? (e.g., Auth0, Stripe, analytics)
- If yes: Which modules and how are they imported?
- The agent needs this to know what to Sinon-stub in tests — "test what you own, mock what you don't"
State reset (only if state management was detected): How do you reset the store? (e.g., useStore.setState(initialState), store.$reset())
- TWD runs without page reloads — store state persists between tests and must be reset in beforeEach

Step 3: Generate `.claude/twd-patterns.md`

Create the .claude/ directory if it doesn't exist, then write .claude/twd-patterns.md with the following sections. Only include sections that are relevant — omit sections that don't apply.

# TWD Project Patterns

## Project Configuration

- **Framework**: FRAMEWORK
- **Vite base path**: BASE_PATH
- **Dev server port**: PORT
- **Entry point**: ENTRY_FILE
- **Public folder**: PUBLIC_DIR

### Relay Commands

```bash
# Run all tests (default — use this if base path is / and port is 5173)
npx twd-relay run

# Run all tests (custom config)
npx twd-relay run --port PORT --path "BASE_PATH__twd/ws"
```

Standard Imports

import { twd, userEvent, screenDom, expect } from "twd-js";
import { describe, it, beforeEach, afterEach } from "twd-js/runner";
// Project-specific imports go here (added by user)

Visit Paths

All twd.visit() calls must include the base path prefix:

await twd.visit("BASE_PATH");
await twd.visit("BASE_PATHsome-page");

Standard beforeEach / afterEach

beforeEach(() => {
  twd.clearRequestMockRules();
  twd.clearComponentMocks();
  Sinon.restore();
  // STORE_RESET (if applicable — e.g., useStore.setState(initialState), store.$reset())
  // AUTH_SETUP (if applicable)
  // THIRD_PARTY_STUBS (if applicable — e.g., Sinon.stub(authModule, 'useAuth').returns(...))
});

afterEach(() => {
  twd.clearRequestMockRules();
});

API Service Types

Service/API types are located in: API_FOLDER

Read files in this folder to understand endpoint URLs and response shapes when writing mock data.

CSS / Component Library

Library: CSS_LIB
Docs: CSS_DOCS_LOCATION

When writing tests, refer to library docs for correct ARIA roles and component structure.

Auth Middleware

AUTH_DESCRIPTION

Route → Permission Mapping

Route	Required Permissions
(to be filled by developer)

Third-Party Modules

"Test what you own, mock what you don't." These external modules should be stubbed in tests:

Module	Import Pattern	Stub Strategy
MODULE_NAME	`import { hook } from 'package'`	`Sinon.stub(moduleObj, 'hook').returns(...)`
(to be filled by developer)

See the test-writing reference for the default-export object pattern required for ESM stubbing.

Portals and Dialogs

Use screenDomGlobal instead of screenDom for elements rendered in portals (modals, dropdowns, tooltips):

import { screenDomGlobal } from "twd-js";
const modal = screenDomGlobal.getByRole("dialog");


### Template rules:
- If base path is `/`, simplify visit paths to just `await twd.visit("/page")`
- If port is `5173` and base path is `/`, use `npx twd-relay run` (no flags)
- Omit the "Auth Middleware" section entirely if no auth
- Omit the "Third-Party Modules" section entirely if no external modules
- Omit the "CSS / Component Library" section if none detected
- Omit the "API Service Types" section if no services folder found
- Omit the `STORE_RESET` comment in beforeEach if no state management library
- Omit the `AUTH_SETUP` comment in beforeEach if no auth middleware
- Omit the `THIRD_PARTY_STUBS` comment in beforeEach if no third-party modules

## Step 4: Optionally Run Setup

After generating the config file, check if TWD is already installed. If not, ask the user if they want to run setup now:

1. `npm install twd-js`
2. `npm install --save-dev twd-relay`
3. `npx twd-js init PUBLIC_DIR --save`
4. Configure entry point — **insert this DEV block BEFORE the existing app mount code** (before `createRoot`, `createApp`, etc.). The import is **`twd-js/bundled`**, NOT `twd-js`:

```typescript
if (import.meta.env.DEV) {
  const { initTWD } = await import('twd-js/bundled');
  const tests = import.meta.glob("./**/*.twd.test.ts");

  initTWD(tests, {
    open: true,
    position: 'left',
    serviceWorker: true,
    serviceWorkerUrl: '/mock-sw.js',
  });

  // Connect twd-relay browser client
  const { createBrowserClient } = await import('twd-relay/browser');
  const client = createBrowserClient({ url: `${window.location.origin}/__twd/ws` });
  client.connect();
}
```

Adjustments: If Vite base is not /, update serviceWorkerUrl to '/BASE/mock-sw.js' and relay URL to `${window.location.origin}/BASE/__twd/ws`.
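As a rough sketch of that adjustment, both values can be derived from the base path instead of being edited by hand. The helper name `twdUrls` is hypothetical; in a Vite app the `base` argument could plausibly come from `import.meta.env.BASE_URL` and `origin` from `window.location.origin`, but they are plain parameters here so the example runs anywhere.

```typescript
// Hypothetical helper: derive the service-worker URL and relay URL from a base path.
function twdUrls(base: string, origin: string) {
  // Normalise "/", "/app", and "/app/" to a prefix without a trailing slash.
  const prefix = base.replace(/\/+$/, "");
  return {
    serviceWorkerUrl: `${prefix}/mock-sw.js`,
    relayUrl: `${origin}${prefix}/__twd/ws`,
  };
}

// Default base: matches the values used in the DEV block.
const rootUrls = twdUrls("/", "http://localhost:5173");

// Non-root base: both URLs gain the base prefix.
const nestedUrls = twdUrls("/app/", "http://localhost:5173");
```

With the default base this yields `/mock-sw.js` and `http://localhost:5173/__twd/ws`; with `/app/` it yields `/app/mock-sw.js` and `http://localhost:5173/app/__twd/ws`.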

5. Add Vite plugins:

import { twdHmr } from 'twd-js/vite-plugin';
import { twdRemote } from 'twd-relay/vite';
import type { PluginOption } from 'vite';

// Add to plugins array:
plugins: [
  // ... other plugins
  twdHmr(),
  twdRemote() as PluginOption,
]

6. Write a first test file

Only run steps the user approves. Show what each step does before executing.

Output

When done, summarize:
- Where the config file was written
- What values were detected vs asked
- What setup steps were completed (if any)
- Next steps for the user (e.g., "Run npm run dev to see the TWD sidebar")

Agent

Setting up TWD for your project. Let me detect your project settings first.

8 messages not shown

Agent

Here's what I detected from your project:

Framework: React 19 + TypeScript
Vite base path: / (default)
Dev server port: 5173 (default)
Entry point: src/main.tsx
Public folder: public/
API services: src/api/ (found todos.ts)
CSS: Tailwind CSS v4 + Radix UI + shadcn/ui pattern (CVA, clsx, tailwind-merge)
State management: Zustand

No auth middleware or third-party modules detected. Let me confirm this with you.

1 message not shown

Agent

Now let me ask about the remaining items that need your input.

1 message not shown

Agent

Got it. Let me generate the configuration file.

2 messages not shown

Agent

Config file written. Now let me check if TWD is already installed.

1 message not shown

Agent

TWD is not installed yet. Let me ask if you'd like to run the setup.

1 message not shown

Agent

Let's start with installing the packages and initializing TWD.

2 messages not shown

Agent

Good. Now let me configure the entry point and Vite plugins.

2 messages not shown

Agent

Now add the Vite plugins.

1 message not shown

Agent

Now let me read the API file to write a meaningful first test.

4 messages not shown

Agent

Now let me create a first test file for the Helloworld page.

1 message not shown

Agent

TWD setup is complete. Here's a summary:

Config file: .claude/twd-patterns.md

Auto-detected values: React 19, Vite defaults (/, 5173), src/main.tsx, public/, src/api/, Tailwind + Radix UI, Zustand

Setup steps completed:
1. Installed twd-js and twd-relay
2. Initialized service worker in public/mock-sw.js
3. Added TWD dev block to src/main.tsx (before app mount)
4. Added twdHmr() and twdRemote() Vite plugins to vite.config.ts
5. Created first test: src/pages/Helloworld/Helloworld.twd.test.ts

Next steps:
- Run npm run dev to start the dev server — you'll see the TWD sidebar on the left
- Run npx twd-relay run in a separate terminal to execute tests from CLI

Are Developers Still Learning, or Just Using AI?

Kevin Julián Martínez Escobar — Tue, 10 Mar 2026 22:10:52 +0000

AI is changing the way we work as developers.

We are moving from writing code to writing specifications. Instead of implementing features ourselves, we rely more on agent modes and tools like Claude Code, Codex, or Gemini CLI.

This works incredibly well. Productivity increases, repetitive work disappears, and we can ship features faster.

But there is something I’ve started noticing.

Developers are not necessarily learning more.
We are just doing the same things faster.

AI Cannot Replace Understanding

When you rely heavily on AI tools, something subtle happens.

You stop exploring new solutions yourself.
You stop experimenting.
You stop learning new paradigms.

Many developers simply apply the same solutions they always used, just faster because AI writes the code.

When a new approach or development practice appears, it often does not get adopted well because the developer does not really understand it.

AI can generate code.

But it cannot give you knowledge you never built.

The Knowledge Gap

A simple example.

I am not an expert in:

- load testing
- monitoring
- observability

I know these areas are important. But if I do not understand the concepts, how am I supposed to apply them correctly?

Even with the best prompts and agents, the result will still be shallow.

AI helps execute ideas. It does not replace learning them.

Where This Shows the Most

I notice this especially in two areas.

- testing
- accessibility

Testing

Recently I have heard things like:

"Why write tests if AI generates almost perfect code?"

or:

"We run Playwright tests with AI after the feature. That is enough."

But testing is not about validating one run. It exists to prevent regressions, catch old bugs, and protect systems as they evolve.

AI can generate tests.

But testing strategy still requires understanding.

Accessibility

Accessibility has long been ignored in many projects.

AI has not changed that.

You can add an accessibility agent, but accessibility is not just adding attributes. It requires understanding how assistive technologies behave, how screen readers navigate, and how keyboard interaction works.

Without that knowledge, accessibility stays superficial.

The New Developer Loop

Something interesting is happening.

Developers spend more time writing Markdown specs for AI than learning new technical concepts.

Instead of experimenting, breaking things, and learning through failure, we optimise prompts and agent workflows.

But not necessarily ourselves.

I Am Also In That Loop

To be fair, I am also part of this.

AI tools are powerful and it is easy to rely on them.

But recently I started asking myself a simple question.

Am I actually improving?

Or am I just getting faster?

So I am trying to go back to learning new concepts, experimenting more, and understanding systems more deeply.

And only after that, delegating to AI.

Ask Yourself This

If AI makes you 10x more productive, ask yourself:

Are you doing something you would never do if you had infinite time?

Are you exploring areas you previously did not understand?

Or are you just shipping the same things faster?

And one more question.

If your productivity increases 10x, do you now have 10x more time?

More time to learn, exercise, spend time with family, or build deeper knowledge?

If the answer is no, then what is AI really doing for you?

Is it helping you become a better developer?

Or just a faster version of the same developer?