DEV Community

Denis Skvortsov

Selective test execution mechanism with Playwright using GitHub Actions

TL;DR

Selective test execution: run only the tests related to the actual code changes.

Saves time and resources: speeds up the process and reduces CI/CD load, especially in cloud environments.

Works for both monorepos and split repositories: the solution fits projects with either a monorepo or separate frontend/backend repositories.

Uses GitHub Actions and Playwright: configures CI/CD to filter tests by tags and run only the relevant ones.

Example implementation: available in a public GitHub repository

Introduction

When I first faced the task of setting up testing, I realized that running the entire test suite after every code change is not just slow - it's wasteful, especially in large projects where microservices, frontend, and backend each have their own test sets. In such setups, running all tests for every commit or PR is unnecessary and significantly slows down the development cycle.

For example, if a frontend developer changes a button or a small UI component, why would we run all backend tests? Or if a backend developer updates a single endpoint, there’s no point in triggering all UI tests - they’re completely unrelated. In those cases, tests become unnecessary overhead that slows down progress.

In some teams, test success is a hard requirement before merging code. If you can’t merge until all tests pass - but your change affects only a small part of the system - triggering all tests becomes a bottleneck.

To solve this, I implemented selective test execution - running only the tests that are actually affected by code changes. This approach helps save both time and infrastructure resources, making the testing and release process faster. In this article, I’ll share my experience and show how to set up such a mechanism using Playwright and GitHub Actions - whether you’re working in a monorepo or with separate frontend/backend repositories.

What problem are we solving?

Running the full e2e test suite on every code change is not always justified. In projects that include frontend, backend and shared modules, this often leads to problems like:

  • Developers waiting for CI feedback longer than it takes to write the fix itself
  • CI infrastructure usage skyrockets (and if you’re in the cloud - so do the costs)
  • Tests are triggered that have nothing to do with the actual changes

Typical scenarios:

  • A frontend developer changes a visual component, but all backend test chains are executed
  • A backend developer tweaks business logic, and the entire UI test suite runs too
  • A shared utility is updated, and suddenly no one knows whether to run a full regression or not

It gets even worse when tests are a mandatory check before merging. Even a minor change can block the entire process - while dozens or even hundreds of unrelated tests are waiting to complete.

My goal wasn’t just to speed up testing. I wanted CI to run only what truly matters. Automatically. Without manual rules or exception lists.

How I implemented it

To demonstrate how you can run only the relevant e2e tests, I put together a small monorepository with a simple but flexible architecture. This isn’t a production-ready setup - it’s a demo project, intentionally simplified to make the mechanism easy to understand and adapt to any real-world structure.

My main goals were:

  • Everything should work transparently - no magic config files
  • The implementation should be reusable - something you can easily apply to another project
  • The structure should remain flexible - easy to extend with new services without rewriting the pipeline

Project structure:

├── frontend/                    # Frontend applications
│   └── apps/                    # Frontend microservices
│       ├── microservice1/
│       ├── microservice2/
│       └── shared/             # Common frontend components

├── backend/                     # Backend services
│   └── apps/                    # Backend microservices
│       ├── microservice3/
│       ├── microservice4/
│       └── microservice5/

├── .github/                     # GitHub Actions
│   ├── workflows/               # CI/CD configuration
│   │   └── e2e-runner.yml       # Selective test runner
│   └── preconditions/           # Reusable actions
│       └── e2e/                 # E2E tests environment setup
│           └── action.yml       # Composite action for environment setup

└── tests/                       # Tests
    └── e2e/                     # E2E tests 

Everything is split into logical areas - just like in real projects. There's frontend, backend, a shared area and a dedicated folder for e2e tests. The entire pipeline runs on GitHub Actions. Dependencies are installed using a reusable precondition action. The core logic is tag-based.

The tests themselves are intentionally primitive - because this article isn’t about test coverage or scenarios, it’s about the mechanism for running only what matters. Everything else is just context.

1. Tagging Tests

To determine which tests should run, I use tags directly in the test definitions. The logic is simple: if a PR changes apps/microservice3, the CI system looks for e2e tests tagged with @apps/microservice3 and runs only those.

Each tag is tied to a specific microservice or module. For example:

  test('Test 3', { tag: '@apps/microservice3' }, async ({ ui }) => {
    await ui.google.goto();
    await ui.google.openAppsMenu();
    await ui.google.assertServiceVisible('Maps');
    await ui.google.assertServiceVisible('Gmail');
  });

If microservice3 is affected, this test will be included. If another service is changed, it will be skipped - even if it's in the same file.

I chose the format @apps/<service-name> for two reasons:

  1. It matches the actual project structure.
  2. It’s easy to extract from file paths using grep and sed.

Another example:

  test('Test 1', { tag: '@apps/microservice1' }, async ({ ui }) => {
    await ui.google.goto();
    await ui.google.openAppsMenu();
    await ui.google.assertServiceVisible('YouTube');
    await ui.google.assertServiceVisible('YouTube Music');
  });

If a test is not tagged, it won’t be picked up during selective test runs. For example:

test('Test 5', async ({ ui }) => {
  await ui.google.goto();
  await ui.google.openAppsMenu();
  await ui.google.assertServiceVisible('Calendar');
});

This test will only run during a full test execution (for example, when shared or tests/e2e changes are detected).

Tagging is a core part of this mechanism. It doesn’t require additional tooling and can be maintained manually if needed.
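
To make the selection rule concrete, here is a minimal sketch of tag-based filtering. It is a simplified model, not Playwright's actual implementation: the `TestCase` shape and the `selectTests` helper are assumptions made for illustration, and the only behavior borrowed from Playwright is that the filter is a regular expression matched against the test title together with its tags.

```typescript
// Simplified model of tag filtering; NOT Playwright's real implementation.
interface TestCase {
  title: string;
  tags: string[]; // e.g. ['@apps/microservice3']
}

// Hypothetical helper: keep tests whose title or tags match the pattern.
function selectTests(tests: TestCase[], pattern: string): TestCase[] {
  const re = new RegExp(pattern);
  return tests.filter((t) => re.test([t.title, ...t.tags].join(' ')));
}

const suite: TestCase[] = [
  { title: 'Test 1', tags: ['@apps/microservice1'] },
  { title: 'Test 3', tags: ['@apps/microservice3'] },
  { title: 'Test 5', tags: [] }, // untagged: only part of a full run
];

const selected = selectTests(suite, '@apps/microservice3').map((t) => t.title);
console.log(selected); // only 'Test 3' is selected
```

The untagged Test 5 never matches a tag pattern, which is why it only runs when no filter is applied.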

Playwright docs on tags: Test annotations

2. Analyzing changes and mapping to tags

The next step is to determine what exactly has changed in the PR and which tests should be triggered. This is done in GitHub Actions using a simple git diff.

What we look for:

  • Changes in paths like apps/microserviceX → means a specific service was affected
  • Changes in shared → means potentially all services are affected
  • Changes in tests/e2e → likely the tests themselves were modified, so the full suite should be executed

Here’s a simplified example from the Find changes step:

changed_apps=$(git diff --name-only origin/main HEAD | grep -E "/apps/[^/]+/" || true)
changed_test_files=$(git diff --name-only origin/main HEAD | grep -E "^tests/e2e/" || true)

Then we normalize the paths. For example, frontend/apps/microservice1/pages/page.tsx becomes apps/microservice1, which maps directly to the tag @apps/microservice1.

full_paths=$(echo "$changed_apps" | sed -E 's#^(.*/apps/[^/]+)/.*#\1#' | sort -u | paste -sd "|" -)
test_paths=$(echo "$full_paths" | tr '|' '\n' | sed -E 's#.*(apps/[^/]+)#\1#' | paste -sd "|" -)

Then what happens:

  • If the changes include shared or tests/e2e, we skip filtering and run all tests
  • If only specific services were changed, we convert the paths into tags, which are passed to Playwright via --grep
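
The same decision can be sketched as a small pure function, which is easier to unit-test than inline bash. This is only an illustrative mirror of the shell steps, not the code used in the workflow: the `computeScope` name is an assumption, and so is the `backend/apps/microservice4/src/handler.ts` path in the usage example.

```typescript
// Hypothetical TypeScript mirror of the bash steps; all names are illustrative.
function computeScope(changedFiles: string[]): string {
  // Any change under tests/e2e/ means the full suite runs (empty scope).
  const testsChanged = changedFiles.some((f) => f.indexOf('tests/e2e/') === 0);
  if (testsChanged) return '';

  const services: string[] = [];
  for (const file of changedFiles) {
    // frontend/apps/microservice1/pages/page.tsx -> apps/microservice1
    const match = /^.*\/(apps\/[^/]+)\//.exec(file);
    if (match && services.indexOf(match[1]) === -1) services.push(match[1]);
  }

  // A change in a shared module also means the full suite runs.
  if (services.some((s) => s.indexOf('shared') !== -1)) return '';

  return services.sort().join('|');
}

console.log(computeScope([
  'frontend/apps/microservice1/pages/page.tsx',
  'backend/apps/microservice4/src/handler.ts', // hypothetical path
]));
// prints "apps/microservice1|apps/microservice4"
```

An empty string means "no filter", so a PR that touches nothing under an apps directory also ends up with an empty scope.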

The result is saved into a GitHub Actions output variable called test_scope, which is used in later steps:

echo "test_scope=$test_paths" >> $GITHUB_OUTPUT

If test_scope ends up empty, it means either:

  • No relevant changes were found
  • Or no tests are tagged to match the changes

In such cases, you can either skip test execution or fall back to running the full suite - depending on your project’s policy.
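
One way to make that policy explicit is to separate "a full run was forced" from "nothing matched". The sketch below is an assumption layered on top of the article's mechanism: the `RunPlan` type, the `planRun` function and the `forceFull` signal are hypothetical names, and the workflow in this article encodes both situations as an empty test_scope.

```typescript
// Three possible outcomes of change detection.
type RunPlan =
  | { kind: 'selective'; grep: string }
  | { kind: 'full' }
  | { kind: 'skip' };

// Hypothetical inputs: `forceFull` is true when shared or tests/e2e changed,
// `emptyPolicy` says what to do when nothing relevant was found.
function planRun(scope: string, forceFull: boolean, emptyPolicy: 'full' | 'skip'): RunPlan {
  if (forceFull) return { kind: 'full' };
  if (scope !== '') return { kind: 'selective', grep: scope };
  // Nothing relevant changed, or nothing is tagged for it: apply the policy.
  return emptyPolicy === 'full' ? { kind: 'full' } : { kind: 'skip' };
}

console.log(planRun('apps/microservice1', false, 'skip').kind); // prints "selective"
console.log(planRun('', false, 'skip').kind); // prints "skip"
```

Carrying a separate forced-full signal would let a docs-only PR skip e2e tests while a shared-module change still triggers the whole suite.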

3. Running tests with grep

Once we’ve determined which parts of the code have changed and built a list of tags, the next step is simply passing those tags to Playwright. For that, we use the built-in --grep flag, which filters the tests by tags.

Example run:

If the PR affects both apps/microservice1 and apps/microservice4, the resulting scope looks like this:

test_scope=apps/microservice1|apps/microservice4

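Then the tests are run like this (the workflow passes test_scope without the @ prefix, which selects the same tests because --grep is a regular-expression search):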

npx playwright test --grep "@apps/microservice1|@apps/microservice4" || true

Why || true?

Because --grep might match no tests, and Playwright exits with an error when it finds none - for example:

  • The service is new and doesn’t have tests yet
  • The change is minor and not yet covered
  • Tests exist but lack proper tags

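In these cases, we don’t want the CI to fail. It’s okay to skip test execution if no relevant tests were found. Failure should only happen when tests exist and they fail - not when there are simply no tests to run.

Keep in mind that || true swallows every non-zero exit code, so a run with genuinely failing tests is reported as green too. If the step has to fail on real test failures, Playwright's --pass-with-no-tests flag is the narrower option: it makes the run succeed only when no tests were found.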
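
The tag pattern passed to --grep can also be built a bit more defensively. The helper below is a hypothetical sketch, not part of the original workflow: it escapes regex metacharacters and adds a lookahead so that @apps/microservice1 does not accidentally match a tag like @apps/microservice10. It assumes the pattern ends up in a JavaScript regular expression.

```typescript
// Hypothetical helper: turn "apps/a|apps/b" into an anchored tag pattern.
function buildGrepPattern(scope: string): string {
  return scope
    .split('|')
    .filter((s) => s.length > 0)
    // Escape regex metacharacters in the service path.
    .map((s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    // The lookahead stops @apps/microservice1 from matching @apps/microservice10.
    .map((s) => '@' + s + '(?![\\w-])')
    .join('|');
}

const pattern = buildGrepPattern('apps/microservice1|apps/microservice4');
console.log(pattern);
// prints "@apps/microservice1(?![\w-])|@apps/microservice4(?![\w-])"
```

When a pattern like this is passed on the command line, single quotes keep the shell from interpreting the special characters.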

When do we run all tests?

If the changes include any of the following:

  • A shared directory (frontend or backend)
  • Files in tests/e2e
  • Or if test_scope is completely empty (depending on your policy)

We simply skip filtering and run the full suite:

if: needs.detect-changes.outputs.test_scope == ''

And then:

npx playwright test

All of this is wrapped inside a pipeline with two jobs: selective-tests and all-tests. Based on the content of test_scope, only one of them runs.

What we get in the end

After implementing selective test execution on GitHub Actions, the benefits were immediately obvious - both in terms of human effort and infrastructure usage.

Faster builds
CI no longer runs the entire test suite for every little change. If only one microservice is affected, only its e2e tests are triggered. As a result:

  • Tests run significantly faster
  • You get feedback almost immediately
  • Releases are no longer blocked by irrelevant checks

Resource savings
If you're running tests in the cloud, CI/CD can get expensive - especially when all tests are triggered for every pull request. Selective execution helps save actual money by avoiding unnecessary resource usage.

Easy to maintain
You don’t need to manually build complex test mappings. The entire system is driven by:

  • A simple tag in each test (@apps/xxx)
  • git diff and a few lines of bash
  • grep

The approach is flexible and adaptable to any structure. You can filter by modules, features, user flows, folders, roles - whatever makes sense in your project.

Works out of the box
This mechanism performs well in:

  • Monorepos
  • Projects with split frontend and backend
  • Any project where e2e tests are a mandatory merge requirement

Want real-world scenarios?
You can explore common situations - shared modules, new services, test changes, combinations - in the project’s README

Example GitHub Actions workflow

name: E2E Tests
on:
  pull_request:
    branches: [ main ]

jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      test_scope: ${{ steps.scope.outputs.test_scope }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Find changes
        id: changes
        run: |
          echo "🔄  Analyzing changes in PR..."
          changed_apps=$(git diff --name-only origin/main HEAD | grep -E "/apps/[^/]+/" || true)
          changed_test_files=$(git diff --name-only origin/main HEAD | grep -E "^tests/e2e/" || true)

          echo "📦 Files changed:"
          [ -z "$changed_apps" ] && echo "  No changes in files" || echo "$changed_apps" | sed 's/^/  /'

          echo "🧪 E2E tests:"
          [ -z "$changed_test_files" ] && echo "  No changes in e2e tests" || echo "$changed_test_files" | sed 's/^/  /'

          echo "✨ Affected services:"
          full_paths=$(echo "$changed_apps" | sed -E 's#^(.*/apps/[^/]+)/.*#\1#' | sort -u | paste -sd "|" -)
          [ -z "$full_paths" ] && echo " No changes in services" || echo "$full_paths" | tr '|' '\n' | sed 's/^/  /'

          test_paths=$(echo "$full_paths" | tr '|' '\n' | sed -E 's#.*(apps/[^/]+)#\1#' | paste -sd "|" -)
          test_files=$(echo "$changed_test_files" | paste -sd "|" -)

          echo "test_paths=${test_paths}" >> $GITHUB_OUTPUT
          echo "changed_test=${test_files}" >> $GITHUB_OUTPUT

      - name: Check shared modules and modified e2e tests
        id: shared
        run: |
          test_paths="${{ steps.changes.outputs.test_paths }}"
          changed_test="${{ steps.changes.outputs.changed_test }}"

          echo "🔄 Checking shared modules and e2e tests:"

          has_shared=$(echo "$test_paths" | tr '|' '\n' | grep -q "shared" && echo "true" || echo "false")
          has_e2e_changes=$([ ! -z "$changed_test" ] && echo "true" || echo "false")

          $has_shared && echo "⚠️ Changes in shared modules detected" && echo "$test_paths" | tr '|' '\n' | grep "shared" | sed 's/^/  /'
          $has_e2e_changes && echo "⚠️ Changes in e2e tests detected" && echo "$changed_test" | tr '|' '\n' | sed 's/^/  /'

          { $has_shared || $has_e2e_changes; } && test_paths="" || echo "✅ No changes in shared modules or e2e tests"

          echo "test_paths=$test_paths" >> $GITHUB_OUTPUT

      - name: Set final scope
        id: scope
        run: |
          test_paths="${{ steps.shared.outputs.test_paths }}"
          echo "🔄 Result:"
          [ -z "$test_paths" ] && echo " ✅ All tests will be run" || {
            echo " ✅ Running tests for:"
            echo "$test_paths" | tr '|' '\n' | sed 's/^/    /'
          }
          echo "test_scope=$test_paths" >> $GITHUB_OUTPUT

  selective-tests:
    needs: detect-changes
    if: needs.detect-changes.outputs.test_scope != ''
    timeout-minutes: 60
    runs-on: ubuntu-latest
    container:
      image: mcr.microsoft.com/playwright:v1.52.0
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/preconditions/e2e
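      - name: Run selective tests
        # --pass-with-no-tests tolerates an empty selection; unlike `|| true`, real test failures still fail the job
        run: npx playwright test --grep "${{ needs.detect-changes.outputs.test_scope }}" --pass-with-no-tests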

  all-tests:
    needs: detect-changes
    if: needs.detect-changes.outputs.test_scope == ''
    timeout-minutes: 60
    runs-on: ubuntu-latest
    container:
      image: mcr.microsoft.com/playwright:v1.52.0
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/preconditions/e2e
      - name: Run all tests
        run: npx playwright test

Can you run only the tests that were changed?

This is a common question that usually comes up first:
"If we already know which files have changed - why not run only the tests that were also modified?"

At first glance, it sounds reasonable. But in practice it’s not always a good idea.

Why I didn’t go down that path

In my setup, I run all tests related to the changed components, not just the .spec.ts files that were edited in the PR. The reason is simple - in some projects, tests are not properly isolated.

A typical example:

  • one test modifies some data,
  • another test relies on the state left behind by the first one,
  • or there’s a shared setUp that affects the behavior of all tests.

This is an anti-pattern, of course, but it's still common - especially in older or fast-growing projects.

If you run only the modified test files, you can easily end up with a green build - even though the logic is actually broken.
That’s why I went with a safer approach:
we filter by the affected areas, but within that area, we run all the tests - even if the test files themselves weren't changed.

What about --only-changed?

Playwright does support an --only-changed flag. It runs the test files that changed relative to a Git ref and, according to the Playwright release notes, also the test files that import any changed file.
It can be helpful as a temporary solution or for small PRs.

But it’s important to understand: this flag only sees the test code and whatever that code imports.
It doesn’t know which application code was changed, nor which tests exercise that code through the browser or the API.

So if you modify an endpoint in backend/apps/microservice3, --only-changed won’t pick that up, because no test file imports that code. A changed helper like auth.ts is covered only when the spec files actually import it.

What you can do if you want to take it further

If your tests are well-isolated and your project architecture is clean, it’s possible to run only those tests that are truly affected by a change.

At the git diff stage, you can track not only changes in app code, but also updates to shared modules or e2e utilities. To identify which tests depend on those changes, you can build a dependency graph between source files and test files - using tools like ts-morph or dependency-cruiser. This graph reveals which .spec.ts files import or transitively rely on the modified code.
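
As a rough illustration of the traversal, assume the import graph has already been extracted (for example by dependency-cruiser or ts-morph) into a plain map from each file to the files it imports. The `ImportGraph` type, the `affectedSpecs` function and every file name except auth.ts are hypothetical; this is a sketch of the idea, not a drop-in tool.

```typescript
// Hypothetical import graph: file -> files it imports directly.
type ImportGraph = Record<string, string[]>;

// Return spec files that import any changed file, directly or transitively.
function affectedSpecs(graph: ImportGraph, changed: string[]): string[] {
  // Invert the graph: file -> files that import it.
  const importers: Record<string, string[]> = {};
  for (const file of Object.keys(graph)) {
    for (const dep of graph[file]) {
      (importers[dep] = importers[dep] || []).push(file);
    }
  }

  // Breadth-first walk from the changed files up through their importers.
  const seen: Record<string, boolean> = {};
  const queue = changed.slice();
  for (const f of queue) seen[f] = true;
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const parent of importers[current] || []) {
      if (!seen[parent]) {
        seen[parent] = true;
        queue.push(parent);
      }
    }
  }

  return Object.keys(seen).filter((f) => /\.spec\.ts$/.test(f)).sort();
}

const graph: ImportGraph = {
  'tests/e2e/login.spec.ts': ['tests/e2e/helpers/auth.ts'],
  'tests/e2e/cart.spec.ts': ['tests/e2e/helpers/cart.ts'],
  'tests/e2e/helpers/cart.ts': ['tests/e2e/helpers/auth.ts'],
  'tests/e2e/helpers/auth.ts': [],
};

console.log(affectedSpecs(graph, ['tests/e2e/helpers/auth.ts']));
// prints both spec files: cart.spec.ts (via helpers/cart.ts) and login.spec.ts
```

The result is only as good as the graph: anything a test reaches through the browser or over HTTP never shows up as an import.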

This approach works best when your test structure is modular and the dependency graph is accurate and regularly maintained. Without that, the risk of silently skipping important tests increases.

That’s why in many cases, it's safer and more predictable to run all tests within the affected scope - even if only a small change was made.
It keeps things simple and reduces the chance of hidden regressions in CI.

Conclusion

Selective execution of e2e tests isn’t a silver bullet - but it’s an effective way to reduce build time, lower CI load, and speed up the development cycle. It’s especially valuable in projects where tests are a required condition for merging, and every minute of CI time matters.

All you need is to:

  • tag your tests appropriately
  • analyze changes in the PR
  • and run tests using the --grep filter

The solution is simple, flexible and reusable. You can apply it in monorepos, in projects with separate frontend and backend repos or even as a shared internal standard within your team.

If you want to take this further - like running only the truly affected tests - that’s also possible if your architecture allows it.

If you’ve solved a similar problem differently - I’d love to hear how you approached it. Feel free to share your solutions.
