<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: 137Foundry</title>
    <description>The latest articles on Forem by 137Foundry (@137foundry).</description>
    <link>https://forem.com/137foundry</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3856342%2F39ac4be7-399f-4f6e-9a32-60abf8a8a324.png</url>
      <title>Forem: 137Foundry</title>
      <link>https://forem.com/137foundry</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/137foundry"/>
    <language>en</language>
    <item>
      <title>7 Free Tools for Measuring and Improving Core Web Vitals</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Wed, 29 Apr 2026 11:01:57 +0000</pubDate>
      <link>https://forem.com/137foundry/7-free-tools-for-measuring-and-improving-core-web-vitals-1cgg</link>
      <guid>https://forem.com/137foundry/7-free-tools-for-measuring-and-improving-core-web-vitals-1cgg</guid>
      <description>&lt;p&gt;Improving Core Web Vitals doesn't require paid software. The most useful tools in this space are free, and most of them come from Google or are open source. The challenge is knowing which tool to reach for at each stage of the process, since each one measures slightly different things and answers different questions.&lt;/p&gt;

&lt;p&gt;This roundup covers the seven free tools worth having in your workflow, what each one is best for, and where each falls short. Used together in the right order, they cover everything from identifying which pages are failing to diagnosing the root cause and verifying the fix.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Google Search Console (Core Web Vitals Report)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://developers.google.com/" rel="noopener noreferrer"&gt;Google Search Console&lt;/a&gt; is the starting point for any Core Web Vitals project. The Core Web Vitals report shows field data from the Chrome User Experience Report, grouped by URL pattern, and categorized into poor, needs improvement, and good. It's the only place to see how your site is actually performing for real visitors using real devices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Identifying which page templates are failing in the field and getting a priority-ranked list of URLs to focus on. The report groups similar URLs (like all blog post pages) together so you can see which template types need the most attention, rather than having to check every URL individually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitation:&lt;/strong&gt; The data is aggregated over a rolling 28-day collection window, so changes can take weeks to be fully reflected. You can't see individual user sessions, and you can't isolate which specific device types or geographies are driving poor scores. It tells you what to fix but not precisely why.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Chrome DevTools (Performance Panel and Web Vitals Extension)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://developer.chrome.com/" rel="noopener noreferrer"&gt;Chrome DevTools&lt;/a&gt; is the most powerful diagnostic tool for Core Web Vitals because it lets you run the page in your own browser, reproduce issues locally, and trace them to specific resources and code paths. No external service required.&lt;/p&gt;

&lt;p&gt;The Performance panel records a timeline of everything the browser does during page load, including resource fetches, JavaScript execution, layout and paint events, and user interactions. The Experience row highlights layout shift events. Long tasks are visible in the Main thread track. CPU and network throttling let you simulate slower devices. The Interactions track, added more recently, records every interaction and breaks it down into input delay, processing time, and presentation delay, which is exactly what you need for INP debugging.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Diagnosing the exact cause of an LCP delay, identifying which elements are shifting for CLS, and profiling long JavaScript tasks or event handlers that cause INP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitation:&lt;/strong&gt; Lab conditions. Your machine's CPU speed and browser state don't match a typical user's device. Installed extensions and cached resources can affect results in ways that don't reflect the real user experience. Run tests in a clean Incognito window with throttling enabled for the most representative lab results.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Lighthouse (via DevTools or CLI)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/GoogleChrome/lighthouse" rel="noopener noreferrer"&gt;Lighthouse&lt;/a&gt; is Google's open-source automated auditing tool. It runs a battery of tests against a page, including Core Web Vitals, accessibility, SEO, and best practices, and produces a scored report with specific, prioritized recommendations. The Lighthouse tab in Chrome DevTools runs it in a few clicks. The CLI version allows it to run in a CI/CD pipeline as a quality gate on every pull request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Getting a structured, actionable report on what to fix, with explanations of each issue and direct links to guidance. Running it in CI catches performance regressions before they reach production. The JSON output format makes it possible to track scores over time programmatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitation:&lt;/strong&gt; Lighthouse runs in lab conditions and is particularly sensitive to CPU load on your local machine. Scores from DevTools can vary significantly between runs if other applications are running. For stable, repeatable scores, use the CLI version in a clean environment with a consistent CPU profile.&lt;/p&gt;
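
&lt;p&gt;If you want to track scores over time, a minimal sketch of the CLI-plus-JSON approach looks like the following. It assumes the &lt;code&gt;lighthouse&lt;/code&gt; CLI is installed globally (for example via npm), and the report keys shown here can shift between Lighthouse versions, so treat it as a starting point rather than a finished script.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch: run the Lighthouse CLI headlessly and pull a few numbers from
# the JSON report so they can be tracked across runs or gated in CI.
# Assumes the "lighthouse" CLI is on PATH; report keys may vary by version.
import json
import subprocess

def audit(url, report_path="lighthouse-report.json"):
    subprocess.run(
        ["lighthouse", url,
         "--output=json",
         f"--output-path={report_path}",
         "--chrome-flags=--headless"],
        check=True,
    )
    with open(report_path) as f:
        report = json.load(f)
    return {
        "performance_score": report["categories"]["performance"]["score"],
        "lcp_ms": report["audits"]["largest-contentful-paint"]["numericValue"],
        "cls": report["audits"]["cumulative-layout-shift"]["numericValue"],
    }

print(audit("https://example.com"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;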

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8ggrnwqtks09uyi78v4.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8ggrnwqtks09uyi78v4.jpeg" alt="toolbox hardware wrench organized bench" width="800" height="1200"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Markus Winkler on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  4. WebPageTest
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.webpagetest.org/" rel="noopener noreferrer"&gt;WebPageTest&lt;/a&gt; runs page load tests from real browsers on real hardware in locations around the world. Unlike local DevTools tests, a WebPageTest run from a specific city on a specific device profile reflects what a user in that location on that device would actually experience. The waterfall chart shows every resource fetch in sequence, making it easy to see exactly what is happening at each millisecond of page load.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Reproducing performance issues that only appear on slower networks or lower-end devices, testing from multiple geographies, and getting a detailed waterfall view of resource loading order and timing. The filmstrip view shows what the page looks like at each second of load, which is useful for identifying the LCP element and confirming when it actually renders.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitation:&lt;/strong&gt; Free tier has rate limits and queued test execution. Not suitable for rapid iterative testing during development. Plan on running a few targeted tests at key milestones rather than using it as a fast feedback loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. GTmetrix
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://gtmetrix.com/" rel="noopener noreferrer"&gt;GTmetrix&lt;/a&gt; runs performance tests from multiple locations and presents a combined score alongside a Lighthouse report and a filmstrip view of the page loading in frames. The comparison feature lets you run tests before and after a change and see the difference side by side. The filmstrip view makes it easy to see exactly when the LCP element paints and whether any layout shifts are visible between frames.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Quickly verifying that a fix improved LCP and getting a visual sense of how the page loads frame by frame. The before/after comparison is particularly useful for communicating performance improvements to stakeholders who want to see a visual difference.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitation:&lt;/strong&gt; Free tier is limited to specific test locations and a limited number of tests per month. For teams that run many tests during an optimization sprint, the free tier can run out quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. HTTP Archive (Research and Benchmarking)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://httparchive.org/" rel="noopener noreferrer"&gt;HTTP Archive&lt;/a&gt; is a nonprofit project that crawls millions of web pages on a regular schedule and archives their performance data. The data is queryable via BigQuery for custom analysis. The site publishes regular Web Almanac reports with detailed analysis of web performance trends, broken down by page type, CMS, and technology stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Benchmarking your site's performance metrics against industry-wide averages, understanding how common specific issues are across the web, and tracking how performance patterns evolve over time. The Web Almanac's annual performance chapter is one of the best publicly available analyses of Core Web Vitals across the web.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitation:&lt;/strong&gt; HTTP Archive data is aggregate and historical. It's a research and benchmarking tool, not a diagnostic tool for your specific site. There's a meaningful learning curve to querying the data effectively via BigQuery.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. web.dev (Learning Resource and Measure Tool)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://web.dev/" rel="noopener noreferrer"&gt;web.dev&lt;/a&gt; is Google's resource for web performance best practices. It hosts a Measure tool for running a Lighthouse audit from a URL, but more importantly it's where Google publishes detailed technical guidance on Core Web Vitals, including how each metric is calculated, what causes failures, and how to fix specific issues. The articles are written by the Chrome team members who define the metrics, which makes them unusually authoritative.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Understanding the "why" behind a score, reading Google's official guidance on LCP, CLS, and INP, and finding detailed case studies from sites that have improved their scores. When you find an issue in DevTools or Lighthouse and need to understand it deeply before writing a fix, web.dev is where you go.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitation:&lt;/strong&gt; It's documentation and guidance, not a diagnostic tool for your specific site. Use it alongside the other tools in this list.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4jnjabaag4lh6cmo24rd.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4jnjabaag4lh6cmo24rd.jpg" alt="library books shelves resources organized" width="800" height="534"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by &lt;a href="https://pixabay.com/users/ninocare-3266770/" rel="noopener noreferrer"&gt;ninocare&lt;/a&gt; on &lt;a href="https://pixabay.com" rel="noopener noreferrer"&gt;Pixabay&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How These Fit Together in Practice
&lt;/h2&gt;

&lt;p&gt;A practical Core Web Vitals workflow uses these tools in a specific sequence. Start with Search Console to find which page templates have field data problems and which URLs to prioritize. Run those specific URLs through WebPageTest or GTmetrix to get a baseline lab measurement. Use Chrome DevTools to drill into the specific resources and code causing the issues on those pages. Use Lighthouse to generate a structured list of recommendations. Use web.dev to read the guidance for each issue type before writing fixes. Then run WebPageTest or GTmetrix again after shipping the fix to confirm the score moved.&lt;/p&gt;

&lt;p&gt;For a step-by-step guide to using these tools in a real production audit, &lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;137Foundry&lt;/a&gt; published a full walkthrough that covers the complete process from field data to fix verification: &lt;a href="https://137foundry.com/articles/how-to-audit-fix-core-web-vitals-production" rel="noopener noreferrer"&gt;How to Audit and Fix Core Web Vitals on a Production Website&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The tooling in this space is genuinely excellent and genuinely free. The bottleneck is rarely tool access. It's knowing which issues to prioritize, how to interpret what the tools are reporting, and how to trace a reported problem to the specific code that's causing it. That part comes from running the process on real sites and building familiarity with what each tool is actually measuring.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Why Cumulative Layout Shift Is Harder to Fix Than It Looks</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Wed, 29 Apr 2026 11:01:52 +0000</pubDate>
      <link>https://forem.com/137foundry/why-cumulative-layout-shift-is-harder-to-fix-than-it-looks-1ki2</link>
      <guid>https://forem.com/137foundry/why-cumulative-layout-shift-is-harder-to-fix-than-it-looks-1ki2</guid>
      <description>&lt;p&gt;Of the three Core Web Vitals, Cumulative Layout Shift often surprises developers. LCP is a loading problem. INP is a JavaScript problem. CLS looks like it should be solved by adding &lt;code&gt;width&lt;/code&gt; and &lt;code&gt;height&lt;/code&gt; to images, and then it turns out your score barely moves after you do that.&lt;/p&gt;

&lt;p&gt;The reason CLS is deceptively hard is that the score accumulates from multiple sources, most of which are not images. A CLS score is the sum of all unexpected layout shifts during a page session, and the sources include web fonts, injected third-party content, dynamic banners, browser-extension interference, and animated elements that change dimensions without user interaction. Fixing images helps. It usually isn't enough on its own.&lt;/p&gt;

&lt;h2&gt;
  
  
  How CLS Is Actually Calculated
&lt;/h2&gt;

&lt;p&gt;The metric multiplies the "impact fraction" (the share of the viewport affected by the elements that moved) by the "distance fraction" (the greatest distance any of them moved, as a fraction of the viewport's largest dimension). A shift that moves a large block of content a short distance can score similarly to one that moves a small element a long distance. Understanding the formula matters because it reveals that small-looking shifts can have significant score impact if they affect large portions of the viewport.&lt;/p&gt;

&lt;p&gt;CLS is also session-windowed. The browser groups shifts into windows of up to 5 seconds, with a maximum gap of 1 second between shifts. The score for a window is the sum of all shifts in that window. Your reported CLS is the worst window's score.&lt;/p&gt;

&lt;p&gt;This windowed approach means that a series of small shifts in quick succession can accumulate into a high CLS score, even if no single shift is dramatic. Pages that load ads or dynamic content in stages often have this pattern: what looks like minor, repeated movement registers as a high score for the worst window.&lt;/p&gt;
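
&lt;p&gt;To make the windowing concrete, here's a small sketch of the scoring logic as described above: each shift contributes its impact fraction multiplied by its distance fraction, shifts are grouped into windows capped at 5 seconds with at most a 1-second gap between them, and the reported CLS is the worst window's sum. This is an illustration of the rules, not browser code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustration of session-windowed CLS scoring (not browser code).
# Each shift is (timestamp_seconds, impact_fraction, distance_fraction).
def cls_score(shifts, max_window=5.0, max_gap=1.0):
    worst = 0.0
    window_sum = 0.0
    window_start = None
    last_ts = None
    for ts, impact, distance in sorted(shifts):
        score = impact * distance  # per-shift layout shift score
        new_window = (
            window_start is None
            or ts - last_ts &gt; max_gap          # gap between shifts too long
            or ts - window_start &gt; max_window  # window duration exceeded
        )
        if new_window:
            window_start = ts
            window_sum = 0.0
        window_sum += score
        last_ts = ts
        worst = max(worst, window_sum)
    return worst

# Three small shifts in quick succession accumulate into one window:
print(cls_score([(0.5, 0.40, 0.10), (0.9, 0.35, 0.08), (1.2, 0.30, 0.05)]))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;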

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2yl3rjxlp0wop1dew76u.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2yl3rjxlp0wop1dew76u.jpg" alt="construction scaffolding building frame assembly" width="800" height="592"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by &lt;a href="https://pixabay.com/users/2211438-2211438/" rel="noopener noreferrer"&gt;2211438&lt;/a&gt; on &lt;a href="https://pixabay.com" rel="noopener noreferrer"&gt;Pixabay&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Sources Most Teams Miss
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Web font loading.&lt;/strong&gt; When &lt;code&gt;font-display: swap&lt;/code&gt; is set, the browser renders text in the fallback font first, then swaps to the web font when it loads. If the web font has different metrics than the fallback (different x-height, character width, or line height), the swap causes a layout shift. This is a common source of CLS on editorial sites and landing pages that use distinctive typography.&lt;/p&gt;

&lt;p&gt;The fix is either to use &lt;code&gt;font-display: optional&lt;/code&gt; (no swap at all; the web font is used only if it's available within the very short initial block period, otherwise the fallback stays for that page view) or to use the CSS &lt;code&gt;size-adjust&lt;/code&gt;, &lt;code&gt;ascent-override&lt;/code&gt;, &lt;code&gt;descent-override&lt;/code&gt;, and &lt;code&gt;line-gap-override&lt;/code&gt; descriptors to align the fallback font metrics to the web font. The latter approach is more complex but preserves the web font experience for users on fast connections.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ads and embeds.&lt;/strong&gt; Ad networks and embedded third-party widgets frequently insert content into reserved or unreserved slots after the page has painted. A cookie consent banner that appears above the page header is a classic example. An ad that loads into a container without a declared minimum height is another.&lt;/p&gt;

&lt;p&gt;For ad slots, always declare a minimum container height in CSS matching the expected ad dimensions. For consent banners and notifications, wrap them in a container with a declared minimum height even if you plan to animate them in. The declared height prevents the shift on insertion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Animations.&lt;/strong&gt; CSS animations that change &lt;code&gt;top&lt;/code&gt;, &lt;code&gt;left&lt;/code&gt;, &lt;code&gt;margin&lt;/code&gt;, or &lt;code&gt;padding&lt;/code&gt; cause layout shifts because those properties affect how the browser positions surrounding elements. Animations that use &lt;code&gt;transform: translate()&lt;/code&gt; do not cause layout shifts because transforms operate in the compositor layer and don't affect layout flow. Review any CSS that animates position-affecting properties and convert them to transform-based equivalents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dynamically loaded content above existing content.&lt;/strong&gt; Any JavaScript that inserts content above already-painted content will cause a layout shift if the existing content moves down. This includes chat widgets that appear in a fixed position but whose initialization changes layout, notification banners inserted at page top, and lazy-loaded content sections above the fold that load after user scroll.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Tools Don't Always Catch Everything
&lt;/h2&gt;

&lt;p&gt;Lab tools like Lighthouse capture CLS from a single simulated page load in a clean browser context. They can't reproduce a shift caused by a browser extension that adds toolbar content. They can't always reproduce shifts from ad networks whose ad content varies by visitor or by geographic location. And they don't simulate the range of user devices and system fonts that affect font rendering and fallback behavior.&lt;/p&gt;

&lt;p&gt;Field data from &lt;a href="https://developers.google.com/" rel="noopener noreferrer"&gt;Google Search Console&lt;/a&gt; and the Chrome User Experience Report captures the actual 75th-percentile CLS across real visits. If your lab CLS score is good but your field data shows poor CLS, the problem is almost certainly coming from a source that varies by visitor, like ads, extensions, or system font differences between operating systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developer.mozilla.org/" rel="noopener noreferrer"&gt;MDN Web Docs&lt;/a&gt; has detailed documentation on the Layout Instability API, which is what the browser uses to measure CLS. Reading the spec-level description of what constitutes a layout shift and what doesn't is useful when you're debugging an unexpected score. Not every visual change counts as a shift; only unexpected changes to the start position of layout-affecting elements in the viewport trigger the metric.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to Start Debugging
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://developer.chrome.com/" rel="noopener noreferrer"&gt;Chrome DevTools&lt;/a&gt; Performance panel shows layout shifts as events in the timeline. Click on any shift event to see which elements moved and what triggered the shift. The "Experience" row in the timeline highlights shift events specifically, and clicking into one shows the affected elements and the computed CLS score contribution from that event.&lt;/p&gt;

&lt;p&gt;For shift events that happen after initial load (caused by ads, dynamic content, or JavaScript-triggered changes), set up a longer recording in DevTools that captures user interaction or page scroll. Some shifts only appear when the user interacts with specific page elements or when a page reaches a certain scroll position. Record a realistic user session rather than just the initial page load.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9fusgt6jniytage7b7h2.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9fusgt6jniytage7b7h2.jpeg" alt="layout grid wireframe digital screen design" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Google DeepMind on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Start with the shifts that appear in the first 2.5 seconds of page load, since those tend to have the highest impact on the reported score. Work through the sources in order: images without explicit dimensions, web font swaps, injected content, and animated elements using layout-affecting properties. Use a real device on a slower network profile to surface shifts that only appear under load conditions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Verifying the Fix
&lt;/h2&gt;

&lt;p&gt;After applying CLS fixes, measure the change in both lab tools and, over time, in field data. For font-related fixes, disable web fonts entirely in DevTools and check whether the content changes visually. If the text barely changes appearance, &lt;code&gt;font-display: optional&lt;/code&gt; is safe to deploy. For ad-slot fixes, use DevTools' network throttle to load the page slowly and observe whether content shifts when ads inject.&lt;/p&gt;

&lt;p&gt;Keep in mind that a CLS fix on one template doesn't mean the same fix applies everywhere. A site with multiple page types may have different CLS sources on article pages versus category pages versus landing pages. Audit each template type separately.&lt;/p&gt;

&lt;p&gt;For a broader look at all three Core Web Vitals metrics and how to address LCP and INP alongside CLS, &lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;137Foundry&lt;/a&gt; put together a complete production audit guide: &lt;a href="https://137foundry.com/articles/how-to-audit-fix-core-web-vitals-production" rel="noopener noreferrer"&gt;How to Audit and Fix Core Web Vitals on a Production Website&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"CLS is the metric where I see teams get stuck longest. They fix the images, run Lighthouse again, see a slightly better number, and assume it's done. But field data still shows poor. The remaining score almost always comes from font swaps, third-party injections, or animations that nobody thought of as layout-affecting." - Dennis Traina, &lt;a href="https://137foundry.com/services" rel="noopener noreferrer"&gt;founder of 137Foundry&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;CLS requires systematic enumeration of every source of visual change on a page, not just the obvious ones. The good news is that once you've worked through that enumeration on a given page template, the fixes usually stay fixed. Unlike INP, which can regress with any new JavaScript, CLS problems are structural and tend to stay solved once the root cause is addressed properly.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>The Real Cost of Silent Data Pipeline Failures</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Tue, 28 Apr 2026 04:18:37 +0000</pubDate>
      <link>https://forem.com/137foundry/the-real-cost-of-silent-data-pipeline-failures-4k3p</link>
      <guid>https://forem.com/137foundry/the-real-cost-of-silent-data-pipeline-failures-4k3p</guid>
      <description>&lt;p&gt;A loud failure - a crash, an error email, an alert firing at 3am - is a recoverable problem. You know something broke, you know when it broke, and you can investigate.&lt;/p&gt;

&lt;p&gt;A silent failure is different. The pipeline runs. No errors are logged. No alerts fire. The data is wrong, or incomplete, or stale, and nobody knows until someone notices that the numbers don't add up. At that point, the first question is "how long has this been happening?" and the answer is almost always longer than you expected.&lt;/p&gt;

&lt;p&gt;This piece is about silent failures: why they happen, what they cost, and how to design pipelines that surface problems rather than hiding them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Data Pipelines Fail Silently
&lt;/h2&gt;

&lt;p&gt;Silent failures have a structural cause: the code treats a missing or incorrect result as a valid outcome.&lt;/p&gt;

&lt;p&gt;The most common pattern is this: a pipeline pulls records from an API, and the API starts returning fewer records than expected - maybe due to a rate limit, a pagination bug, or a filter that was added on the source side. The pipeline processes the records it receives, writes them to the destination, and logs success. From the pipeline's perspective, nothing went wrong. From the business's perspective, 30% of the data from the last two weeks is missing.&lt;/p&gt;

&lt;p&gt;Another common pattern: a field in the source system gets renamed or its format changes. The pipeline's transformation code mapped from the old field name to the destination field. Now the source returns null for that field (the old name doesn't exist anymore), the transformation writes null to the destination, and every record for the last three days has a null value where it should have a meaningful value.&lt;/p&gt;

&lt;p&gt;Both cases represent the same design failure: the pipeline has no way to distinguish between "everything worked correctly" and "something changed and we processed garbage."&lt;/p&gt;

&lt;h2&gt;
  
  
  The Business Costs
&lt;/h2&gt;

&lt;p&gt;The direct cost of a silent data pipeline failure is the bad data that reaches reporting, operations, or downstream systems. But the cost multipliers are significant:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Time to detection.&lt;/strong&gt; Silent failures are found by humans reviewing output, not by automated monitoring. Without monitoring, detection typically takes days or weeks. Every day of latency compounds the amount of data that needs to be corrected.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recovery effort.&lt;/strong&gt; When a pipeline has been silently dropping records for two weeks, recovering requires identifying which records were affected, re-running the pipeline for the affected time window, deduplicating any overlap with records that were correctly written, and verifying the corrected data. This is significantly more expensive than the incremental fix of a loud failure caught immediately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trust erosion.&lt;/strong&gt; After a team discovers that the pipeline has been silently producing wrong data, the standard response is to stop trusting the data source entirely until it's verified. This often means manual data validation work that bypasses the pipeline - which defeats the purpose of automation and creates a parallel data entry problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Decision quality.&lt;/strong&gt; If the bad data reached business decisions before anyone noticed - a performance report, a customer analysis, a budget forecast - those decisions were made on incorrect information. Quantifying this cost is harder, but it's real.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0fv267yznfzi2kictie.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0fv267yznfzi2kictie.jpeg" alt="data pipeline monitoring alert system" width="800" height="534"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by panumas nikhomkhai on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Silent Failures Look Like in Practice
&lt;/h2&gt;

&lt;p&gt;A few real patterns:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The vanishing records case.&lt;/strong&gt; A pipeline extracts orders from an e-commerce platform. The platform adds a new required field to its API response. The pipeline's JSON parser doesn't handle the new field structure, throws an exception in the transformation step, catches it with a broad &lt;code&gt;except Exception&lt;/code&gt; handler, logs a debug message, and skips the record. The pipeline completes with 0 errors and 15% fewer records than yesterday. The monitoring dashboard shows "Pipeline: OK."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The null propagation case.&lt;/strong&gt; A pipeline syncs contact records from a CRM. An admin renames the "Company" field to "Organization" in the CRM. The pipeline's field mapping extracts "Company," which now returns null, and writes it as null to the destination. Every record written after the rename has a null company field. Reports that group by company show an explosion of records with no company associated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The stale data case.&lt;/strong&gt; A pipeline is supposed to run every hour. A deployment changes the scheduling configuration. The pipeline stops running. No records fail - there simply are no new records. Nobody notices for three days because the data isn't wrong, it's just not updating.&lt;/p&gt;

&lt;p&gt;Each of these is detectable with basic monitoring. None of them are detectable without it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We've done data audits on pipelines that have been running for over a year where the team assumed the data was correct because nothing had ever crashed. The combination of no monitoring and optimistic error handling is how you end up with analytics you can't trust and can't recover." - Dennis Traina, &lt;a href="https://137foundry.com/services" rel="noopener noreferrer"&gt;founder of 137Foundry&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Designing for Visibility
&lt;/h2&gt;

&lt;p&gt;The fix is not complex. Five things give you visibility into a data pipeline:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Count records at each stage.&lt;/strong&gt; How many records were extracted? How many passed transformation? How many were successfully loaded? If the ratio is unexpected, alert on it. A 90% drop in extraction volume without a corresponding change in the source system is a problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Track run timing.&lt;/strong&gt; Log start time, end time, and duration for each run. Alert when a run takes significantly longer than the historical average. Alert when a run hasn't started within the expected window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Separate transient and structural errors.&lt;/strong&gt; Transient errors (rate limits, network timeouts) should be retried automatically and logged. They should alert if they exceed a threshold. Structural errors (records that fail transformation due to unexpected field values) should never be swallowed silently. Log the record, the field, and the value. Alert if structural errors exceed zero or a small threshold per run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Validate schema on extraction.&lt;/strong&gt; When extracting data, compare the schema of the API response to a stored baseline. If a field appears that wasn't there before, or a field that was previously present is now absent, log a warning and alert. Schema drift is the most common cause of silent failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Store per-run metrics in a queryable log.&lt;/strong&gt; Write a row to a log table for each run: records extracted, records failed, records loaded, run duration, error count. This gives you a historical record that's useful for diagnosing issues after the fact.&lt;/p&gt;
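
&lt;p&gt;Here's a minimal sketch of what a few of these checks look like once the per-run log exists. The field names and thresholds are illustrative, not prescriptive; the point is that each check is only a few lines of code on top of the metrics you're already writing.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative health checks over a per-run metrics log.
# Field names ("records_extracted", "structural_errors", "completed_at")
# are assumptions matching the log row described above; adapt to your schema.
from datetime import datetime, timedelta, timezone

def check_pipeline_health(runs, expected_interval_hours=1, drop_threshold=0.5):
    alerts = []
    runs = sorted(runs, key=lambda r: r["completed_at"])
    latest, history = runs[-1], runs[:-1]

    # Check 1: extraction volume vs. trailing average.
    if history:
        avg = sum(r["records_extracted"] for r in history) / len(history)
        if avg * drop_threshold &gt; latest["records_extracted"]:
            alerts.append(
                f"extraction volume dropped: {latest['records_extracted']} vs avg {avg:.0f}"
            )

    # Check 2: staleness - no run completed within twice the expected interval.
    now = datetime.now(timezone.utc)
    if now - latest["completed_at"] &gt; timedelta(hours=2 * expected_interval_hours):
        alerts.append("pipeline appears stale: no recent completed run")

    # Check 3: any structural errors recorded in the latest run.
    if latest["structural_errors"]:
        alerts.append(f"{latest['structural_errors']} structural errors in latest run")

    return alerts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;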

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxqe0ezeei5bywqz3w1q.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxqe0ezeei5bywqz3w1q.jpeg" alt="data analytics pipeline logging metrics" width="800" height="530"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Lukas Blazek on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cost of Retrofitting Monitoring
&lt;/h2&gt;

&lt;p&gt;One reason monitoring often gets deferred is that it feels like overhead at the start of a project, when the team is focused on getting the pipeline working at all. The irony is that retrofitting monitoring onto a pipeline that's been running without it is significantly more expensive than building it in from the start.&lt;/p&gt;

&lt;p&gt;Retrofitting requires: understanding the existing behavior well enough to define normal baselines, adding logging infrastructure to code that wasn't designed for it, deploying changes to a running pipeline without disrupting data flow, and verifying that the monitoring correctly reflects actual pipeline state.&lt;/p&gt;

&lt;p&gt;Building monitoring in from the start takes a fraction of that time because the logging points are natural integration points in the code architecture.&lt;/p&gt;

&lt;p&gt;For the practical pipeline architecture that includes monitoring as a first-class concern - alongside idempotent loads, incremental extraction, and error handling - &lt;a href="https://137foundry.com/articles/how-to-build-etl-pipeline-business-data-syncing" rel="noopener noreferrer"&gt;How to Build an ETL Pipeline for Business Data Syncing&lt;/a&gt; covers each piece in sequence.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;https://137foundry.com&lt;/a&gt; works with businesses on data pipeline design and implementation. The &lt;a href="https://137foundry.com/services/ai-automation" rel="noopener noreferrer"&gt;AI automation and data integration services&lt;/a&gt; include both pipeline architecture and the operational monitoring setup that makes pipelines trustworthy rather than just functional.&lt;/p&gt;

&lt;p&gt;For monitoring infrastructure, &lt;a href="https://prometheus.io/" rel="noopener noreferrer"&gt;Prometheus&lt;/a&gt; and &lt;a href="https://grafana.com/" rel="noopener noreferrer"&gt;Grafana&lt;/a&gt; are widely used for pipeline metrics collection and alerting. For orchestration that includes built-in run observability, &lt;a href="https://apache.org/" rel="noopener noreferrer"&gt;Apache Airflow&lt;/a&gt; tracks run history, task durations, and failure states in a web UI. &lt;a href="https://www.python.org/" rel="noopener noreferrer"&gt;Python&lt;/a&gt; with &lt;a href="https://www.sqlalchemy.org/" rel="noopener noreferrer"&gt;SQLAlchemy&lt;/a&gt; is the standard stack for custom pipeline implementation with relational state management.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>programming</category>
    </item>
    <item>
      <title>How to Add Error Handling and Monitoring to a Data Pipeline</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Tue, 28 Apr 2026 04:18:34 +0000</pubDate>
      <link>https://forem.com/137foundry/how-to-add-error-handling-and-monitoring-to-a-data-pipeline-53bg</link>
      <guid>https://forem.com/137foundry/how-to-add-error-handling-and-monitoring-to-a-data-pipeline-53bg</guid>
      <description>&lt;p&gt;Most data pipeline guides cover the happy path: extract data, transform it, load it to the destination. What they skip is everything that happens when the path isn't happy: the API that returns unexpected data, the transformation that fails partway through, the destination write that times out after writing 400 of 500 records.&lt;/p&gt;

&lt;p&gt;This guide is the other half: how to handle errors correctly, how to monitor pipeline health, and how to make your pipeline re-runnable after a failure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Categorize Your Error Types
&lt;/h2&gt;

&lt;p&gt;Before writing error handling code, decide which category an error belongs to. The handling is different for each type.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transient errors&lt;/strong&gt; are temporary conditions that resolve themselves: rate limit exceeded, connection timeout, destination temporarily unavailable. These should be retried automatically with exponential backoff. They should not fail the pipeline on first occurrence. After a configurable number of retries, they should escalate to an alert.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structural errors&lt;/strong&gt; are problems with the data itself: a required field is null, a value doesn't match the expected type, a foreign key doesn't exist in the destination. These records cannot be processed with the current transformation logic. They should be written to a dead-letter log (with the record content, error type, and timestamp) and skipped. The pipeline should continue processing other records. At the end of the run, alert if structural error count exceeds zero or a defined threshold.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fatal errors&lt;/strong&gt; are conditions that make continued execution meaningless: the source system is completely unavailable, authentication has failed, the destination schema has changed in a way that invalidates all records. These should fail the pipeline immediately, log the full context, and alert immediately. Do not attempt to continue.&lt;/p&gt;

&lt;p&gt;The most common mistake is a single broad exception handler that treats structural and fatal errors as if they were transient and recoverable. This produces the silent failure pattern where the pipeline "succeeds" by swallowing exceptions.&lt;/p&gt;
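
&lt;p&gt;The code samples later in this guide assume these categories exist as distinct exception types. A minimal sketch, with class names matching those samples (the exact hierarchy is a design choice, not a requirement):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Minimal sketch of the three error categories as exception classes.
# Names mirror the snippets below; the hierarchy itself is a design choice.
class PipelineError(Exception):
    """Base class for all pipeline errors."""

class TransientError(PipelineError):
    """Temporary condition (rate limit, timeout); safe to retry."""

class StructuralError(PipelineError):
    """A single record can't be processed; dead-letter it and continue."""

class FatalError(PipelineError):
    """Continuing is meaningless (auth failure, schema gone); stop and alert."""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;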

&lt;h2&gt;
  
  
  Step 2: Implement Retry Logic for Transient Errors
&lt;/h2&gt;

&lt;p&gt;A basic exponential backoff implementation for transient errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;with_retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;60.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_attempts&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;TransientError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;max_attempts&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt;
            &lt;span class="n"&gt;delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_delay&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uniform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;max_delay&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Transient error on attempt &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Retrying in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;delay&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The jitter (&lt;code&gt;random.uniform(0, 1)&lt;/code&gt;) prevents multiple concurrent pipelines from all retrying at the same time, which can amplify load spikes on the source system.&lt;/p&gt;

&lt;p&gt;The important detail: define &lt;code&gt;TransientError&lt;/code&gt; as a specific exception class (or a set of HTTP status codes: 429, 503, 502) rather than catching all exceptions. Retrying a &lt;code&gt;ValueError&lt;/code&gt; or &lt;code&gt;KeyError&lt;/code&gt; is not useful and hides bugs.&lt;/p&gt;
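
&lt;p&gt;One way to apply that at the HTTP boundary is to translate only the retryable status codes into &lt;code&gt;TransientError&lt;/code&gt; and let everything else raise normally. The sketch below uses the &lt;code&gt;requests&lt;/code&gt; library as an example client; the status codes are the ones listed above, and the endpoint is a placeholder.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch: map retryable HTTP status codes to TransientError so that
# with_retry() only retries what is actually transient. The requests
# library is used here as an example client; adapt to your own.
import requests

RETRYABLE_STATUS = {429, 502, 503}

def fetch_page(url, params=None):
    response = requests.get(url, params=params, timeout=30)
    if response.status_code in RETRYABLE_STATUS:
        raise TransientError(f"HTTP {response.status_code} from {url}")
    response.raise_for_status()  # non-retryable 4xx/5xx raise immediately
    return response.json()

# Example usage (placeholder endpoint):
# records = with_retry(lambda: fetch_page("https://api.example.com/orders"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;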

&lt;h2&gt;
  
  
  Step 3: Build a Dead-Letter Log for Structural Errors
&lt;/h2&gt;

&lt;p&gt;Records that fail transformation should not be silently dropped. Write them to a dead-letter log table or file with enough context to investigate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dead_letter_log&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;transformed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;transformed&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;StructuralError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;dead_letter_log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error_message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;raw_record&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pipeline_run_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;current_run_id&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;pipeline_run_id&lt;/code&gt; is critical for correlating dead-letter records with a specific run when debugging later.&lt;/p&gt;

&lt;p&gt;At the end of each run, count dead-letter entries created during the run. If the count exceeds your threshold (zero for a stable pipeline, or a small absolute number), include the count in your run summary and trigger an alert.&lt;/p&gt;
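
&lt;p&gt;A sketch of that end-of-run check, where &lt;code&gt;dead_letter_log.count_for()&lt;/code&gt; and &lt;code&gt;alert()&lt;/code&gt; are placeholders for your own log store and notification channel:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# End-of-run check: count this run's dead-letter entries and alert when
# the count exceeds the allowed threshold. count_for() and alert() are
# placeholders for your own log store and notifier.
def check_dead_letters(dead_letter_log, run_id, threshold=0):
    count = dead_letter_log.count_for(run_id)
    if count &gt; threshold:
        alert(
            "Structural errors in pipeline run",
            details={"run_id": run_id, "dead_letter_count": count},
        )
    return count
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;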

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftck2tfx7cey44b1r7fur.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftck2tfx7cey44b1r7fur.jpeg" alt="data pipeline error handling code development" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Markus Spiske on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 4: Add Run-Level Metrics Logging
&lt;/h2&gt;

&lt;p&gt;Write a metrics record at the end of each pipeline run. This is separate from error logging - it captures the overall run health:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;run_metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;run_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pipeline_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;crm_to_warehouse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;started_at&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed_at&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;duration_seconds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;total_seconds&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;records_extracted&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;extracted_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;records_transformed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;transformed_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;records_loaded&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;loaded_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;structural_errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;dead_letter_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transient_errors_retried&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;retry_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dead_letter_count&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;partial&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;metrics_log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_metrics&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this log table, you can query:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"How many records did this pipeline process yesterday vs the historical average?"&lt;/li&gt;
&lt;li&gt;"How many structural errors have accumulated this week?"&lt;/li&gt;
&lt;li&gt;"Which runs took significantly longer than usual?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These queries are the basis for alerting that doesn't require anyone to look at logs manually.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Implement Schema Validation
&lt;/h2&gt;

&lt;p&gt;Schema drift - unexpected changes to the source API's response format - is the most common cause of silent failures that aren't caught by error handling. Add explicit schema validation at the extraction step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;EXPECTED_FIELDS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;created_at&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;updated_at&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="n"&gt;sample_keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;new_fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample_keys&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;EXPECTED_FIELDS&lt;/span&gt;
    &lt;span class="n"&gt;missing_fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EXPECTED_FIELDS&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;sample_keys&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;new_fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;New fields detected in source: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;new_fields&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Schema change detected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new_fields&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_fields&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;missing_fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Expected fields missing from source: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;missing_fields&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;FatalError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Required fields missing: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;missing_fields&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;New fields are a warning (the API added something you might want to include). Missing required fields are a fatal error (the API removed something your transformation depends on).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"A pipeline that validates its input schema and alerts on changes costs almost nothing to build. But it's the single most valuable defensive measure you can add, because schema changes in source systems are the failure mode that catches teams off guard most consistently." - Dennis Traina, &lt;a href="https://137foundry.com/services" rel="noopener noreferrer"&gt;founder of 137Foundry&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Step 6: Make the Pipeline Re-Runnable
&lt;/h2&gt;

&lt;p&gt;A pipeline that can be safely re-run after a partial failure is worth significantly more than one that can't. Two properties make this possible:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Checkpointing.&lt;/strong&gt; Store the high-water mark after each successful batch write. If the pipeline fails, restart from the last checkpoint rather than from the beginning. The checkpoint is typically a timestamp or sequence number stored in a persistent store (a database row, a file, a cache entry).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Idempotent loads.&lt;/strong&gt; Use upsert semantics at the destination rather than insert. An upsert with a unique key (customer ID, order number, record hash) ensures that re-running a batch doesn't create duplicate records. This interacts with checkpointing: if your checkpoint has any overlap window (you re-process the last N records for safety), upserts ensure the overlapping records are updated rather than duplicated.&lt;/p&gt;
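
&lt;p&gt;A minimal sketch of how the two properties work together, using SQLite for illustration - the table layout, checkpoint store, and upsert keys here are assumptions to adapt to your own destination:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative checkpoint + idempotent upsert. The tables and column names are
# placeholders; adapt them to your destination schema and checkpoint store.
import sqlite3

conn = sqlite3.connect("pipeline_state.db")
conn.execute("CREATE TABLE IF NOT EXISTS checkpoints (pipeline TEXT PRIMARY KEY, high_water_mark TEXT)")
conn.execute("""CREATE TABLE IF NOT EXISTS customers (
    id TEXT PRIMARY KEY, email TEXT, status TEXT, updated_at TEXT)""")

def load_batch(pipeline, records):
    if not records:
        return
    # Upsert: re-running the same batch updates rows instead of duplicating them.
    conn.executemany(
        """INSERT INTO customers (id, email, status, updated_at)
           VALUES (:id, :email, :status, :updated_at)
           ON CONFLICT(id) DO UPDATE SET
               email = excluded.email,
               status = excluded.status,
               updated_at = excluded.updated_at""",
        records,
    )
    # Advance the checkpoint only after the batch is safely written, so a
    # failed run restarts from the last committed high-water mark.
    new_mark = max(r["updated_at"] for r in records)
    conn.execute(
        """INSERT INTO checkpoints (pipeline, high_water_mark) VALUES (?, ?)
           ON CONFLICT(pipeline) DO UPDATE SET high_water_mark = excluded.high_water_mark""",
        (pipeline, new_mark),
    )
    conn.commit()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
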

&lt;h2&gt;
  
  
  Setting Up Alerting
&lt;/h2&gt;

&lt;p&gt;With run metrics logging in place, alerting is straightforward. Three alert conditions cover most critical failures:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pipeline didn't run.&lt;/strong&gt; Alert if no run has completed within &lt;code&gt;expected_interval * 1.5&lt;/code&gt; of the last successful run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Record count anomaly.&lt;/strong&gt; Alert if &lt;code&gt;records_extracted&lt;/code&gt; is more than 30% below the historical average for this pipeline and time window.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structural errors above threshold.&lt;/strong&gt; Alert if &lt;code&gt;structural_errors &amp;gt; 0&lt;/code&gt; (or your defined threshold).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These three conditions catch the vast majority of real pipeline failures without requiring manual monitoring.&lt;/p&gt;
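
&lt;p&gt;A hedged sketch of those checks against the run-metrics log - the &lt;code&gt;pipeline_runs&lt;/code&gt; table and its column names follow the metrics fields used above and are assumptions, not a fixed schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative alert checks over an assumed pipeline_runs metrics table.
# Assumes started_at is stored as a Unix timestamp; adjust names, thresholds,
# and the alerting channel (email, Slack, pager) to your own setup.
import sqlite3
import time

conn = sqlite3.connect("pipeline_metrics.db")

def check_alerts(pipeline, expected_interval_seconds):
    alerts = []

    # 1. Pipeline didn't run within expected_interval * 1.5 of the last success.
    last = conn.execute(
        "SELECT MAX(started_at) FROM pipeline_runs WHERE pipeline_name = ? AND status = 'success'",
        (pipeline,),
    ).fetchone()[0]
    if last is None or time.time() - last &amp;gt; expected_interval_seconds * 1.5:
        alerts.append("Pipeline didn't run on schedule")

    # 2. Record count anomaly: latest run more than 30% below the historical average.
    latest, average = conn.execute(
        """SELECT
               (SELECT records_extracted FROM pipeline_runs
                WHERE pipeline_name = ? ORDER BY started_at DESC LIMIT 1),
               (SELECT AVG(records_extracted) FROM pipeline_runs WHERE pipeline_name = ?)""",
        (pipeline, pipeline),
    ).fetchone()
    if latest is not None and average and latest &amp;lt; average * 0.7:
        alerts.append(f"Record count anomaly: {latest} vs average {average:.0f}")

    # 3. Structural errors above threshold in the latest run.
    row = conn.execute(
        """SELECT structural_errors FROM pipeline_runs
           WHERE pipeline_name = ? ORDER BY started_at DESC LIMIT 1""",
        (pipeline,),
    ).fetchone()
    if row and row[0] &amp;gt; 0:
        alerts.append(f"Structural errors: {row[0]}")

    return alerts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
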

&lt;p&gt;&lt;a href="https://137foundry.com/articles/how-to-build-etl-pipeline-business-data-syncing" rel="noopener noreferrer"&gt;How to Build an ETL Pipeline for Business Data Syncing&lt;/a&gt; covers the extraction and load design that this error handling layer builds on top of - incremental extraction, idempotent upserts, and checkpoint management in an integrated design.&lt;/p&gt;

&lt;p&gt;For help building data pipelines with these operational properties from the start, &lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;https://137foundry.com&lt;/a&gt; works with businesses on both architecture and implementation. The &lt;a href="https://137foundry.com/services/data-integration" rel="noopener noreferrer"&gt;data integration services&lt;/a&gt; cover the full pipeline lifecycle, including the monitoring setup that makes pipelines trustworthy in production.&lt;/p&gt;

&lt;p&gt;For pipeline implementation, &lt;a href="https://www.python.org/" rel="noopener noreferrer"&gt;Python&lt;/a&gt; with &lt;a href="https://www.sqlalchemy.org/" rel="noopener noreferrer"&gt;SQLAlchemy&lt;/a&gt; is the standard stack for custom ETL with relational databases. &lt;a href="https://www.postgresql.org/" rel="noopener noreferrer"&gt;PostgreSQL&lt;/a&gt; handles both pipeline operational state (dead-letter tables, run logs) and destination storage. For orchestration-level error handling and retry policies, &lt;a href="https://apache.org/" rel="noopener noreferrer"&gt;Apache Airflow&lt;/a&gt; provides per-task retry configuration and failure branching.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyhs1n1bid5b628x05h1.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyhs1n1bid5b628x05h1.jpeg" alt="data pipeline monitoring infrastructure server" width="800" height="532"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by panumas nikhomkhai on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>programming</category>
    </item>
    <item>
      <title>How to Build a Vendor Scoring Rubric That Stakeholders Actually Trust</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Sat, 25 Apr 2026 11:28:03 +0000</pubDate>
      <link>https://forem.com/137foundry/how-to-build-a-vendor-scoring-rubric-that-stakeholders-actually-trust-4edb</link>
      <guid>https://forem.com/137foundry/how-to-build-a-vendor-scoring-rubric-that-stakeholders-actually-trust-4edb</guid>
      <description>&lt;p&gt;A vendor scoring rubric sounds like the kind of bureaucratic box-ticking that delays decisions rather than producing them. Used correctly, it does the opposite: it forces the committee to agree on criteria before they see the vendors, reduces the influence of whoever talks loudest in the room, and gives minority viewpoints a visible record in the final data.&lt;/p&gt;

&lt;p&gt;The key distinction is whether the rubric was built before the demos or after. A rubric built after the demos is usually rationalization. A rubric built before the demos is a decision tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Identify Your Evaluation Dimensions
&lt;/h2&gt;

&lt;p&gt;Start by listing every dimension that matters for the decision. Group these into functional, operational, and vendor relationship categories:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Functional:&lt;/strong&gt; Does the software do what you need it to do? This includes core features, edge case handling, integration capabilities, and the UX quality for your team's daily workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Operational:&lt;/strong&gt; How does the software fit into your environment? This covers security and compliance, implementation timeline, data migration complexity, support tier availability, and uptime guarantees.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vendor relationship:&lt;/strong&gt; What is the vendor like to work with? This includes response times during the evaluation, flexibility on contract terms, references from current customers, and the stability of the vendor's business.&lt;/p&gt;

&lt;p&gt;For each dimension, list the specific questions or scenarios you'll use to evaluate it. "Security and compliance" is not evaluable. "SOC 2 Type II certified with audit reports available, GDPR-compliant data processing agreement available for review" is evaluable.&lt;/p&gt;

&lt;p&gt;User review platforms like &lt;a href="https://www.g2.com/" rel="noopener noreferrer"&gt;G2&lt;/a&gt; can help you identify evaluation dimensions you might otherwise miss. The most helpful negative reviews on G2 consistently surface the same categories of failure: poor support responsiveness, missing integrations, confusing pricing, and slow implementation timelines. Running through the top negative reviews for vendors you're considering before you finalize your evaluation dimensions often surfaces operational and vendor relationship criteria that don't appear in feature comparison lists but have significant impact on day-to-day experience. What users repeatedly complain about after twelve months of use is worth weighting heavily as an evaluation dimension before any demos.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Assign Weights Before Any Demos
&lt;/h2&gt;

&lt;p&gt;Once you have your evaluation dimensions, assign weights before you see any vendor. Weights represent how much each dimension affects the decision relative to others.&lt;/p&gt;

&lt;p&gt;A simple weighting system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Weight 3:&lt;/strong&gt; Deal-breaker criteria. A vendor that fails this dimension is out.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weight 2:&lt;/strong&gt; Important but negotiable criteria. Strong performance here moves a vendor up significantly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weight 1:&lt;/strong&gt; Nice-to-have criteria. Good to have, but not decision-driving.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the committee disagrees on weights, that's the most important disagreement to resolve before the demos start. A committee member who weights "API access" at 3 and another who weights it at 1 have fundamentally different visions of what the software needs to do. Better to surface and resolve that now.&lt;/p&gt;

&lt;p&gt;Document the final weights with committee sign-off. This step sounds bureaucratic but prevents the weights from being revised retroactively to favor a vendor someone liked.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2p7uuqenpfavmtd2nibu.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2p7uuqenpfavmtd2nibu.jpeg" alt="Evaluation matrix spreadsheet with scoring criteria and weighted columns" width="800" height="534"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Markus Winkler on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Define the Scoring Scale
&lt;/h2&gt;

&lt;p&gt;The scoring scale should be simple enough that committee members can apply it consistently. A three-point scale works well:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;3:&lt;/strong&gt; Exceeds requirements. The vendor handles this better than we expected or needs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2:&lt;/strong&gt; Meets requirements. The vendor handles this adequately for our use case.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1:&lt;/strong&gt; Partially meets requirements. The vendor can address this with workarounds or configuration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;0:&lt;/strong&gt; Does not meet requirements. The vendor cannot address this criterion.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The difference between 1 and 0 is important: 1 means the gap can be closed at acceptable cost; 0 means it can't. A vendor that scores 0 on a Weight-3 criterion is automatically disqualified regardless of how they score elsewhere. Build that logic into your rubric explicitly so there's no ambiguity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Score Independently Before Comparing
&lt;/h2&gt;

&lt;p&gt;After each demo, each committee member should complete their scores independently before the group discussion. Group discussion before independent scoring produces groupthink -- the strongest personality or the most confident speaker dominates.&lt;/p&gt;

&lt;p&gt;Collect all scores before any comparison discussion. When you reveal the aggregate, disagreements become visible: two people scored a criterion 3 and two scored it 1. Those disagreements are where the useful discussion lives. Why does one person think it meets requirements and another think it doesn't? Usually the answer reveals a difference in what each person thought the scenario was testing.&lt;/p&gt;

&lt;p&gt;This step slows down the scoring process by about thirty minutes per vendor. It produces decisions that hold up better under scrutiny, because every committee member can trace the final recommendation back to specific evaluation data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Guard Against Score Adjustments
&lt;/h2&gt;

&lt;p&gt;The most common way rubrics fail is retroactive adjustment. After the demos, someone on the committee decides that a criterion they previously weighted at 1 should actually be a 3, because the vendor they prefer scored poorly on a high-weight criterion and someone else's preferred vendor scored well on a low-weight one.&lt;/p&gt;

&lt;p&gt;The way to prevent this is to have the completed weights and criteria locked by a neutral party (often the evaluation lead) before the demos. Changes to the rubric after any demo require documented justification and committee agreement.&lt;/p&gt;

&lt;p&gt;This sounds overly procedural. In practice, it rarely comes up when everyone knows the lock is in place. The threat of "we'll need to document why we changed this weight after seeing the demos" is usually enough to prevent casual retroactive adjustment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using the Rubric Output
&lt;/h2&gt;

&lt;p&gt;The rubric output is a starting point for the decision, not the decision itself. A vendor that scores 87 isn't definitively better than one that scores 83 -- the margin is too small to be meaningful given the inherent subjectivity in the scores.&lt;/p&gt;

&lt;p&gt;What the rubric output is good for: identifying clear winners, identifying clear losers, and surfacing the criteria where the committee is genuinely split. The rubric should tell you which vendors to eliminate and what the final decision comes down to.&lt;/p&gt;

&lt;p&gt;For close calls, the rubric data supports a structured conversation rather than settling it by fiat. If two vendors are within a few points and the committee is split, the next step is to examine the specific criteria where they differ, not to run the evaluation again.&lt;/p&gt;

&lt;p&gt;The rubric also serves a documentation purpose beyond the immediate decision. A completed rubric shows which criteria and weights the committee agreed on before any demos, the independent scores each evaluator assigned, and the reasoning behind close calls. That record is useful if the decision needs to be justified to a CFO or board, and it's valuable a year later when the winning vendor hasn't delivered as expected and you need to determine whether that was a product failure or an evaluation gap. Saving the rubric as a template also reduces setup time for future evaluations -- most organizations face similar evaluation categories across different software purchases, and a reusable structure with your standard weighting logic is worth maintaining.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://137foundry.com/articles/software-vendor-evaluation-process" rel="noopener noreferrer"&gt;vendor evaluation framework from 137Foundry&lt;/a&gt; covers rubric design alongside requirements definition, short-listing, demo structure, and stakeholder alignment as a complete process. &lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;137Foundry&lt;/a&gt; works with companies on technology initiatives where structured vendor selection is part of a larger implementation project.&lt;/p&gt;

&lt;p&gt;For research on decision-making quality in group procurement processes, &lt;a href="https://hbr.org/" rel="noopener noreferrer"&gt;Harvard Business Review&lt;/a&gt; and &lt;a href="https://www.gartner.com/" rel="noopener noreferrer"&gt;Gartner&lt;/a&gt; both publish on sourcing and vendor selection methodology with evidence on which process structures produce better outcomes.&lt;/p&gt;

</description>
      <category>business</category>
      <category>technology</category>
      <category>productivity</category>
    </item>
    <item>
      <title>12 Free Tools and Resources for Running a Software Vendor Evaluation</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Sat, 25 Apr 2026 11:28:02 +0000</pubDate>
      <link>https://forem.com/137foundry/12-free-tools-and-resources-for-running-a-software-vendor-evaluation-2dbo</link>
      <guid>https://forem.com/137foundry/12-free-tools-and-resources-for-running-a-software-vendor-evaluation-2dbo</guid>
      <description>&lt;p&gt;Software vendor evaluations are process-heavy and produce better decisions when you have the right tools for each stage. Most of what you need is either free or already in your organization's toolbox. Here are twelve resources worth using, organized by evaluation stage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Discovery and Market Research
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. G2
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.g2.com/" rel="noopener noreferrer"&gt;G2&lt;/a&gt; is a software review platform with user reviews, feature comparisons, and market category maps. For the discovery stage, the category pages give you a realistic market overview for most software verticals -- who the major players are, what features are standard, and what users commonly complain about. The user reviews are more candid than vendor marketing and are often more useful than analyst reports for mid-market procurement.&lt;/p&gt;

&lt;p&gt;The "Most Helpful Negative Reviews" filter on each vendor's G2 profile is particularly valuable. Negative reviews from verified users consistently surface the same categories of problems: poor customer support, missing integrations, confusing pricing, slow implementation. Seeing the same criticism from fifteen different reviewers over the past twelve months is a stronger signal than any single reference call a vendor arranges for you. The category comparison tools let you build a side-by-side feature matrix in minutes rather than hours.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Gartner
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.gartner.com/" rel="noopener noreferrer"&gt;Gartner&lt;/a&gt; publishes Magic Quadrant reports that categorize vendors by execution strength and vision in established software categories. The full reports require a paid subscription, but the summary findings are frequently cited in vendor RFP responses and public summaries. For enterprise software procurement, Gartner's category definitions help you understand where a vendor sits in the market and who their primary competition is.&lt;/p&gt;

&lt;p&gt;The Magic Quadrant positioning (Leader, Challenger, Visionary, Niche Player) is useful context but not a substitute for your own evaluation. A Niche Player that specializes in your industry or company size may be a better fit than a Leader built for a different use case. Gartner's positioning reflects execution at scale and vision breadth, not fit for your specific requirements. Use it to build your initial vendor list, then evaluate based on your own criteria.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Harvard Business Review
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://hbr.org/" rel="noopener noreferrer"&gt;Harvard Business Review&lt;/a&gt; publishes research on procurement strategy, vendor management, and technology decision-making. The articles on make-vs-buy decisions and supplier selection are directly applicable to software evaluations. The research is written for business leaders rather than IT specialists, which makes it useful for building cross-functional alignment on procurement criteria.&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements and Scoring
&lt;/h2&gt;

&lt;h3&gt;
  
  
  4. Notion or Confluence (your existing tools)
&lt;/h3&gt;

&lt;p&gt;Most teams already have access to a wiki or collaborative document tool. For vendor evaluation, these tools work well for writing and circulating the requirements document, storing the vendor comparison rubric, and documenting scoring decisions. The value isn't in the tool -- it's in having a single shared location that all stakeholders can reference.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Google Sheets or Excel
&lt;/h3&gt;

&lt;p&gt;A scoring rubric in a spreadsheet is the standard approach for evaluating vendors on a comparable basis. Rows are requirements or evaluation criteria; columns are vendors; cells are scores. The spreadsheet calculates weighted totals automatically. The file lives somewhere the full committee can access, reducing the risk of different people working from different versions of the evaluation.&lt;/p&gt;

&lt;p&gt;For weighted scoring, the formula structure is straightforward: each criterion has a weight (typically 1-3 based on priority) and a score (typically 1-3 based on fit), and the weighted total is the sum of weight-times-score across all criteria.&lt;/p&gt;
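
&lt;p&gt;To make the arithmetic concrete, here is the same calculation in a few lines of Python - the criteria and numbers are made up, and in a spreadsheet this is simply a SUMPRODUCT of the weight and score columns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Made-up example of the weighted-total arithmetic described above.
criteria = [
    # (criterion, weight 1-3, score 0-3)
    ("Core workflow fit", 3, 2),
    ("API access", 2, 3),
    ("Implementation timeline", 1, 1),
]

weighted_total = sum(weight * score for _, weight, score in criteria)
print(weighted_total)  # (3*2) + (2*3) + (1*1) = 13
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
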

&lt;h2&gt;
  
  
  Vendor Research
&lt;/h2&gt;

&lt;h3&gt;
  
  
  6. LinkedIn
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; is useful for two specific evaluation tasks. First, checking the vendor's team size and growth trajectory -- a vendor that was 50 people two years ago and is now 200 is a different business risk than one that was 200 and is now 50. Second, finding current and former customers who might give you a candid reference call, which is more reliable than vendor-provided references.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Crunchbase
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.crunchbase.com/" rel="noopener noreferrer"&gt;Crunchbase&lt;/a&gt; provides funding history, investor profiles, and company data for private companies. For SaaS vendors, knowing how much runway they have and who their investors are helps you assess business continuity risk. A vendor on their last funding round with a high burn rate is a different counterparty risk than one that recently closed a Series B.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ex8zbfz2brsjh99qzmd.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ex8zbfz2brsjh99qzmd.jpg" alt="Business research with laptop showing company data and funding information" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by &lt;a href="https://pixabay.com/users/StartupStockPhotos-690514/" rel="noopener noreferrer"&gt;StartupStockPhotos&lt;/a&gt; on &lt;a href="https://pixabay.com" rel="noopener noreferrer"&gt;Pixabay&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  8. TrustRadius
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.trustradius.com/" rel="noopener noreferrer"&gt;TrustRadius&lt;/a&gt; is a software review site similar to G2 but with a different reviewer community. Checking both sites for a vendor you're seriously evaluating gives you a broader sample of user perspectives. Reviews that appear on both platforms (different users, similar conclusions) are more reliable signals than reviews that appear on only one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Legal and Contract Research
&lt;/h2&gt;

&lt;h3&gt;
  
  
  9. NIST
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.nist.gov/" rel="noopener noreferrer"&gt;NIST&lt;/a&gt; publishes cybersecurity and data protection frameworks that are useful for evaluating a SaaS vendor's security posture. The NIST Cybersecurity Framework is a commonly cited reference for what "good security" looks like, and asking vendors how they map to NIST controls is a quick way to assess whether their security practices are mature or aspirational.&lt;/p&gt;

&lt;p&gt;For practical security evaluation, the most useful question to ask vendors is whether they have a current SOC 2 Type II report available for review. SOC 2 Type II means an independent auditor has verified their security controls over a six-to-twelve month period. Vendors who have only Type I (a point-in-time snapshot) or no SOC 2 at all are at a materially lower security maturity level than vendors with current Type II certification. NIST guidance provides the framework for understanding what those controls cover.&lt;/p&gt;

&lt;h3&gt;
  
  
  10. OECD Guidelines
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.oecd.org/" rel="noopener noreferrer"&gt;OECD&lt;/a&gt; publishes guidelines on data governance and digital trade that are useful context for evaluating vendor contracts that involve data processing, especially for international operations. The guidelines help non-legal stakeholders understand what standard protections look like and what terms are worth negotiating.&lt;/p&gt;

&lt;h2&gt;
  
  
  Post-Selection
&lt;/h2&gt;

&lt;h3&gt;
  
  
  11. 137Foundry Vendor Evaluation Framework
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;full-stack development firm 137Foundry&lt;/a&gt; publishes a complete &lt;a href="https://137foundry.com/articles/software-vendor-evaluation-process" rel="noopener noreferrer"&gt;vendor evaluation framework&lt;/a&gt; covering requirements definition, short-listing, structured demos, scoring, stakeholder alignment, and contract negotiation. The framework is designed for teams that run one to two software evaluations per year and need a repeatable process that doesn't require a full procurement department.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://137foundry.com/services" rel="noopener noreferrer"&gt;services overview&lt;/a&gt; covers the types of engagements where vendor selection happens as part of broader technology initiatives.&lt;/p&gt;

&lt;h3&gt;
  
  
  12. Forrester
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.forrester.com/" rel="noopener noreferrer"&gt;Forrester&lt;/a&gt; publishes research on technology procurement and vendor management, including reports on SaaS contract best practices and vendor relationship management. For teams doing a significant software evaluation for the first time, Forrester's publicly available blog posts and summaries provide useful framing for what to expect in contract negotiations and post-implementation support.&lt;/p&gt;




&lt;p&gt;The most important tool in any vendor evaluation isn't on this list: it's a clearly written requirements document that the full committee agrees on before any vendor conversations begin. Without that, no amount of market research, scoring, or legal review will produce a decision that sticks.&lt;/p&gt;

&lt;p&gt;The sequence matters as much as the tools. Start with G2 and Gartner for market orientation before building a vendor list. Move to LinkedIn and Crunchbase for vendor-specific research before scheduling demos. Use NIST and OECD guidelines when reviewing security and compliance terms before signing. The tools above are useful in context; pulling from them at the wrong stage of the evaluation creates noise rather than clarity. For a sequenced framework that organizes these resources into a repeatable process, the &lt;a href="https://137foundry.com/articles/software-vendor-evaluation-process" rel="noopener noreferrer"&gt;137Foundry vendor evaluation guide&lt;/a&gt; is a practical starting point.&lt;/p&gt;

</description>
      <category>business</category>
      <category>technology</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Step-by-Step Webhook Signature Verification for Any Sender</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Fri, 24 Apr 2026 11:15:03 +0000</pubDate>
      <link>https://forem.com/137foundry/step-by-step-webhook-signature-verification-for-any-sender-2nje</link>
      <guid>https://forem.com/137foundry/step-by-step-webhook-signature-verification-for-any-sender-2nje</guid>
      <description>&lt;p&gt;Webhook signature verification is the first line of defense against forged events. Without it, any HTTP client that knows your endpoint URL can POST fabricated events. The verification process is the same across most webhook senders, even when the specific header names and hash algorithms differ.&lt;/p&gt;

&lt;p&gt;This guide walks through implementing signature verification for a webhook receiver, from parsing the header to computing the HMAC to returning the right response.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Get the Raw Request Body Before Any Parsing
&lt;/h2&gt;

&lt;p&gt;This is the step most implementations get wrong. Signature verification computes an HMAC over the raw request body bytes. The HMAC must be computed before any framework deserialization happens.&lt;/p&gt;

&lt;p&gt;Web frameworks parse JSON bodies automatically. When they do, they may normalize whitespace, change encoding, or reorder keys. Computing the HMAC over the parsed-then-re-serialized body will produce a different hash than the sender computed over the original bytes. The verification will fail even for valid payloads.&lt;/p&gt;

&lt;p&gt;In Python with FastAPI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/webhooks/events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;receive_webhook&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;raw_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;body&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# raw bytes, before any JSON parsing
&lt;/span&gt;    &lt;span class="n"&gt;signature&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;X-Signature-256&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# verify against raw_body, not json.loads(raw_body)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Node.js with Express:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/webhooks/events&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rawBody&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Buffer, not parsed JSON&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;signature&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;x-signature-256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="c1"&gt;// verify against rawBody&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;express.raw()&lt;/code&gt; middleware captures the body as a Buffer instead of parsing it. Without it, &lt;code&gt;req.body&lt;/code&gt; contains the parsed JSON object, which breaks verification.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Parse the Signature Header
&lt;/h2&gt;

&lt;p&gt;Webhook senders typically include both the HMAC hash and a timestamp in the signature header, either as separate headers or combined in a single header. &lt;a href="https://stripe.com" rel="noopener noreferrer"&gt;Stripe&lt;/a&gt; combines them in a single header with a specific format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Stripe-Signature: t=1714000000,v1=abc123...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Parse the header to extract the timestamp and the hash:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;parse_stripe_signature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;parts&lt;/span&gt;

&lt;span class="n"&gt;sig_parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parse_stripe_signature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stripe-Signature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sig_parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;received_hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sig_parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Other senders use a simpler format with the hash prefixed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;X-Signature-256: sha256=abc123...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In that case, strip the prefix from the header value before comparing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;received_hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;signature_header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;removeprefix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sha256=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check the sender's documentation for their specific header format. The concept is the same; only the parsing differs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Compute the Expected HMAC
&lt;/h2&gt;

&lt;p&gt;With the raw body bytes and the shared secret, compute the expected HMAC using the algorithm your sender specifies. Most use SHA-256.&lt;/p&gt;

&lt;p&gt;For Stripe-style signatures where the hash is computed over &lt;code&gt;{timestamp}.{body}&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compute_hmac&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;signed_payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;raw_body&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;signed_payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha256&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For simpler senders where the hash is computed directly over the body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compute_hmac_simple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;raw_body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha256&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Compare Using Constant-Time Equality
&lt;/h2&gt;

&lt;p&gt;Never use &lt;code&gt;==&lt;/code&gt; to compare the expected and received hashes. Standard string equality short-circuits on the first differing character, which creates a timing side channel. An attacker can measure how long the comparison takes to determine how many leading characters of the hash they got right, eventually reconstructing a valid hash without knowing the secret.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;is_valid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compare_digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expected_hash&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;received_hash&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;hmac.compare_digest&lt;/code&gt; takes the same amount of time regardless of where the strings differ. The equivalent in JavaScript uses the &lt;code&gt;crypto&lt;/code&gt; module's &lt;code&gt;timingSafeEqual&lt;/code&gt; function.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Check the Timestamp Window
&lt;/h2&gt;

&lt;p&gt;If the sender includes a timestamp in the signature header, check that it's within an acceptable window (typically five minutes). This prevents replay attacks where an attacker captures a valid signed request and re-sends it later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_timestamp_valid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_age_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;event_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;event_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;max_age_seconds&lt;/span&gt;
    &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;TypeError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the timestamp is more than five minutes old, return 400 and log the event for investigation. The sender should not be sending payloads with stale timestamps; this is a potential replay attempt.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6: Return the Right Status Codes
&lt;/h2&gt;

&lt;p&gt;The response status code tells the sender whether to retry:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;200&lt;/strong&gt;: Event received and accepted. Don't retry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;400&lt;/strong&gt;: Invalid signature or malformed request. Don't retry (retries won't fix a tampered payload).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;500&lt;/strong&gt;: Server error. Retry according to retry policy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Return 400 on signature failures, not 500. A 500 tells the sender to retry, which is wrong behavior for a forged payload. A 400 tells the sender the request was rejected as invalid.&lt;/p&gt;
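
&lt;p&gt;Putting the pieces together, a receiver skeleton that maps verification results to these status codes might look like the following - it reuses the &lt;code&gt;compute_hmac_simple&lt;/code&gt; helper from Step 3, and the header name, secret, and &lt;code&gt;enqueue_for_processing&lt;/code&gt; hand-off are placeholders for your sender's specifics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch only: 400 for verification failures (don't retry), 500 only for real
# server errors (do retry). compute_hmac_simple is defined in Step 3;
# WEBHOOK_SECRET and enqueue_for_processing are placeholders.
import hmac

from fastapi import FastAPI, Request, Response

app = FastAPI()
WEBHOOK_SECRET = "replace-with-your-shared-secret"

@app.post("/webhooks/events")
async def receive_webhook(request: Request):
    raw_body = await request.body()
    received_hash = request.headers.get("x-signature-256", "").removeprefix("sha256=")

    expected_hash = compute_hmac_simple(raw_body, WEBHOOK_SECRET)
    if not received_hash or not hmac.compare_digest(expected_hash, received_hash):
        return Response(status_code=400)  # forged or tampered: don't invite a retry

    try:
        enqueue_for_processing(raw_body)  # hand off; don't process inline
    except Exception:
        return Response(status_code=500)  # genuine server error: sender should retry

    return Response(status_code=200)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
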

&lt;h2&gt;
  
  
  Testing the Verification Logic
&lt;/h2&gt;

&lt;p&gt;The most important tests to write:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Valid signature passes verification&lt;/li&gt;
&lt;li&gt;Tampered body fails verification&lt;/li&gt;
&lt;li&gt;Wrong secret fails verification&lt;/li&gt;
&lt;li&gt;Missing signature header returns 400&lt;/li&gt;
&lt;li&gt;Expired timestamp is rejected&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Use &lt;a href="https://www.postman.com" rel="noopener noreferrer"&gt;Postman&lt;/a&gt; to send test requests with custom signature headers to a running server. &lt;a href="https://ngrok.com" rel="noopener noreferrer"&gt;ngrok&lt;/a&gt; lets you test against a real external sender during development.&lt;/p&gt;
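
&lt;p&gt;The first two tests can be written against the verification function alone, without a running server - this sketch assumes the &lt;code&gt;compute_hmac_simple&lt;/code&gt; helper from Step 3 and a throwaway test secret:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative tests for the verification math itself, independent of any
# framework. compute_hmac_simple comes from Step 3.
import hmac

SECRET = "test-secret"

def test_valid_signature_passes():
    body = b'{"event": "order.created", "id": "evt_1"}'
    signature = compute_hmac_simple(body, SECRET)
    assert hmac.compare_digest(compute_hmac_simple(body, SECRET), signature)

def test_tampered_body_fails():
    body = b'{"event": "order.created", "id": "evt_1"}'
    signature = compute_hmac_simple(body, SECRET)
    tampered = b'{"event": "order.created", "id": "evt_2"}'
    assert not hmac.compare_digest(compute_hmac_simple(tampered, SECRET), signature)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
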

&lt;h2&gt;
  
  
  Handling Senders That Deviate from the Standard Pattern
&lt;/h2&gt;

&lt;p&gt;Not all webhook senders follow the same verification scheme. Most use HMAC-SHA256 with a shared secret, but the payload construction, header format, and timestamp handling vary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Timestamp in the signed payload.&lt;/strong&gt; Stripe constructs the signed payload as &lt;code&gt;{timestamp}.{raw_body}&lt;/code&gt;, not just the raw body. This means you need to extract the timestamp from the signature header, construct the signed string manually before computing the HMAC, and verify that string against the received hash. A receiver that computes the HMAC over just the raw body will fail verification for Stripe webhooks even if the secret is correct.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multiple hash versions.&lt;/strong&gt; Some senders include multiple hash versions in the header (for example, both a v0 and v1 hash). The receiver should check whether any of the provided hashes matches the expected value, rather than requiring a specific version. This allows the sender to rotate their signing scheme without breaking receivers.&lt;/p&gt;
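
&lt;p&gt;A small sketch of that check, reusing the header parsing from Step 2 - the &lt;code&gt;v0&lt;/code&gt;/&lt;code&gt;v1&lt;/code&gt; key convention is an example, not a universal format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Accept the event if any provided hash version matches the expected HMAC.
# sig_parts is the dict produced by parse_stripe_signature in Step 2.
import hmac

def any_version_matches(sig_parts, expected_hash):
    candidates = [value for key, value in sig_parts.items() if key.startswith("v")]
    return any(hmac.compare_digest(expected_hash, candidate) for candidate in candidates)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
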

&lt;p&gt;&lt;strong&gt;No timestamp.&lt;/strong&gt; Some simpler webhook senders include only the HMAC hash with no timestamp. For these, timestamp validation isn't possible, but you should still verify the HMAC. The absence of a timestamp means replay attacks are technically possible, which is worth noting in your integration documentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Custom algorithms.&lt;/strong&gt; A small number of senders use HMAC-SHA1 instead of SHA256. HMAC-SHA1 is not considered broken for this use case, but SHA256 is preferred. Implement the algorithm your specific sender specifies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Verification Failures in Production
&lt;/h2&gt;

&lt;p&gt;A few signature verification failures that appear consistently across production webhook integrations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Middleware that reads the body.&lt;/strong&gt; Some logging middleware or request parsing middleware reads the request body before your verification code runs. If the body is consumed and not replaced with the original bytes, your verification code gets an empty body. Make sure any body-reading middleware restores the raw bytes before the request reaches the webhook handler.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Content-Type normalization.&lt;/strong&gt; Some frameworks normalize or re-parse the body when the Content-Type header is &lt;code&gt;application/json&lt;/code&gt;. Using &lt;code&gt;express.raw()&lt;/code&gt; or the framework equivalent prevents this. Always test signature verification explicitly in your actual framework environment, not just in isolation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Header name case sensitivity.&lt;/strong&gt; HTTP header names are case-insensitive by spec but some implementations are not. If your verification code looks for &lt;code&gt;X-Signature-256&lt;/code&gt; but the sender sends &lt;code&gt;x-signature-256&lt;/code&gt;, some frameworks will pass it through and some won't. Use case-insensitive header lookup.&lt;/p&gt;

&lt;p&gt;For the broader receiver architecture (async processing, idempotency, failure handling), &lt;a href="https://137foundry.com/articles/webhook-receiver-production-guide" rel="noopener noreferrer"&gt;How to Build a Webhook Receiver That Handles Real-World Traffic&lt;/a&gt; covers how signature verification fits into the complete pattern.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;This development service&lt;/a&gt; builds and maintains data integration infrastructure including webhook receivers, event processors, and API integrations for production workloads.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd6cf3a1qbuodg0n0waa3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd6cf3a1qbuodg0n0waa3.jpg" alt="Security key lock mechanism close-up" width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by &lt;a href="https://pixabay.com/users/Hans-2/" rel="noopener noreferrer"&gt;Hans&lt;/a&gt; on &lt;a href="https://pixabay.com" rel="noopener noreferrer"&gt;Pixabay&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>api</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Why Synchronous Webhook Processing Is a Production Trap</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Fri, 24 Apr 2026 11:15:00 +0000</pubDate>
      <link>https://forem.com/137foundry/why-synchronous-webhook-processing-is-a-production-trap-5cp7</link>
      <guid>https://forem.com/137foundry/why-synchronous-webhook-processing-is-a-production-trap-5cp7</guid>
      <description>&lt;p&gt;Most webhook implementations start the same way: the event arrives, the handler parses the payload, does some database work, maybe fires an email, and returns 200. It works in testing. It works in early production with low event volumes. Then it fails in predictable and expensive ways.&lt;/p&gt;

&lt;p&gt;The failure modes of synchronous webhook processing are not edge cases. They're the normal operating conditions of a production webhook integration. Understanding why they fail makes the fix obvious.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Synchronous Processing Looks Like
&lt;/h2&gt;

&lt;p&gt;A synchronous webhook handler processes the event in the same request context where it was received. In pseudocode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /webhooks/events
  1. Parse payload
  2. Verify signature
  3. Query database to get account
  4. Update account records
  5. Send confirmation email
  6. Return 200
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Steps 3, 4, and 5 involve external calls. A database query under load might take 500ms. An email provider having a slow day might take 2 seconds. If anything in steps 3-5 throws an exception, the handler returns a 500.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Retry Problem
&lt;/h2&gt;

&lt;p&gt;Most webhook senders interpret a non-2xx response as delivery failure and schedule a retry. &lt;a href="https://stripe.com" rel="noopener noreferrer"&gt;Stripe&lt;/a&gt; retries webhooks for up to three days, with increasing intervals between attempts. GitHub retries delivery failures. Most enterprise webhook senders follow a similar policy.&lt;/p&gt;

&lt;p&gt;When your synchronous handler returns 500 because the database query timed out, the sender queues a retry. The retry arrives, your database is still under load, and the handler returns 500 again. After several rounds of this, the same event is being retried over and over, each attempt potentially writing partial state to the database before failing.&lt;/p&gt;

&lt;p&gt;The synchronous handler has created a worst-case feedback loop: the database is slow, so the handler fails, so the sender retries, so the database load increases further.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Timeout Problem
&lt;/h2&gt;

&lt;p&gt;Webhook senders enforce delivery timeouts. If your endpoint doesn't respond within their timeout window (often 5-30 seconds), they treat it as a failed delivery and schedule a retry.&lt;/p&gt;

&lt;p&gt;For most simple operations, this isn't a problem. For operations that involve slow downstream services, it is. A third-party API call that normally completes in 1 second might take 15 seconds under load. Your handler, waiting for that call to complete, times out from the sender's perspective before returning a response. The sender retries. You now have the same event being processed twice simultaneously, each racing to write to the same database records.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Idempotency Problem That Synchronous Processing Creates
&lt;/h2&gt;

&lt;p&gt;Synchronous processing combined with retries creates idempotency problems in code that was never designed to handle duplicate events. If your handler does:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;credits&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;
&lt;span class="n"&gt;account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running this twice doubles the credit amount. Running it once is correct. Designing for exactly-once execution is hard when you can't guarantee the sender won't retry.&lt;/p&gt;

&lt;p&gt;Idempotent processing (checking whether an event ID has already been handled before doing any work) is the correct solution. But tacking it onto a synchronous handler doesn't fix the underlying architecture problem. You're still doing work inside the request window, still subject to timeouts, and still returning 500s on failures that cause retries.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Correct Architecture Separates Receiving from Processing
&lt;/h2&gt;

&lt;p&gt;The fix is to separate what happens in the request from what happens after it. The receiver endpoint does three things: verify the signature, store the raw payload, and return 200. Everything else happens in a background worker after the request has been acknowledged.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /webhooks/events
  1. Verify signature -&amp;gt; 400 if invalid
  2. Check idempotency (event_id already seen?) -&amp;gt; 200 immediately
  3. Write raw payload to queue with status "pending"
  4. Return 200

[background worker]
  1. Read "pending" event from queue
  2. Process event (queries, updates, notifications)
  3. Mark event as "processed" or "failed"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The receiver now completes in under 500 milliseconds regardless of what processing involves. The sender gets a 200 immediately after delivery. Retries only happen if the network connection fails before the response, not because processing was slow.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Worker Gets
&lt;/h2&gt;

&lt;p&gt;The worker processes events asynchronously, which changes what's possible. Retrying failed events is now the worker's responsibility, not the sender's. If a database is slow, the worker backs off and retries with exponential delay. If a downstream service is down, the event stays in the queue until the service comes back. No 500s, no sender retries, no feedback loops.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://redis.io" rel="noopener noreferrer"&gt;Redis&lt;/a&gt; works well as the queue layer for this pattern. The receiver appends events to a list or stream. Workers consume from the stream, update event status on completion, and move failed events to a dead-letter queue after exhausting retries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Designing Worker Retry Logic
&lt;/h2&gt;

&lt;p&gt;The worker's retry behavior matters as much as the receiver's architecture. Without explicit retry logic, a single transient failure leaves the event in a failed state permanently.&lt;/p&gt;

&lt;p&gt;A practical worker retry pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pick up the event and attempt processing.&lt;/li&gt;
&lt;li&gt;On success, mark the event as "processed" with a completion timestamp.&lt;/li&gt;
&lt;li&gt;On failure, increment a retry count. If below the threshold, return the event to the queue with an exponential delay. If the retry count exceeds the threshold, move the event to a dead-letter queue and emit an alert.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The delay between retries should grow with each attempt. Flat retry intervals put sustained pressure on a downstream service that's already struggling. Exponential backoff -- retry after 10 seconds, then 100, then 1000 -- gives external services time to recover without exhausting retries immediately. Most production systems cap the maximum interval to avoid events sitting in the queue indefinitely.&lt;/p&gt;
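
&lt;p&gt;A minimal sketch of that retry decision, with the threshold, base delay, and cap as illustrative values, and &lt;code&gt;requeue_with_delay&lt;/code&gt; and &lt;code&gt;move_to_dead_letter&lt;/code&gt; standing in for whatever your queue layer provides:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch: worker-side retry handling with capped exponential backoff.
import time

MAX_RETRIES = 5
BASE_DELAY = 10     # seconds before the first retry
MAX_DELAY = 3600    # cap so an event never waits more than an hour

def handle(event, process, requeue_with_delay, move_to_dead_letter):
    try:
        process(event)
        event["status"] = "processed"
        event["completed_at"] = time.time()
    except Exception as exc:
        event["retry_count"] = event.get("retry_count", 0) + 1
        event["last_error"] = str(exc)
        if event["retry_count"] &amp;gt; MAX_RETRIES:
            move_to_dead_letter(event)        # preserved for inspection and replay
        else:
            delay = min(BASE_DELAY * (10 ** (event["retry_count"] - 1)), MAX_DELAY)
            requeue_with_delay(event, delay)  # 10s, then 100s, then 1000s, capped
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
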

&lt;p&gt;Queue infrastructure handles much of this natively. Redis streams track unacknowledged messages and allow a configurable pending timeout before re-delivery. RabbitMQ's dead-letter exchange can route a message to a retry queue with a delay after a configurable rejection count.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dead-Letter Queue Design
&lt;/h2&gt;

&lt;p&gt;Events that exhaust their retry limit need a place to go that isn't silently deleted. A dead-letter queue preserves events that couldn't be processed after multiple attempts, making them available for inspection and manual replay.&lt;/p&gt;

&lt;p&gt;The minimum useful dead-letter record includes: the original payload, the event source, the retry count, the last error message, and the timestamp of the last attempt. The error message is critical -- without it, debugging what went wrong requires reconstructing the failure from distributed application logs, which is much slower.&lt;/p&gt;

&lt;p&gt;Dead-letter management can be straightforward. A separate database table, a query to list failed events by source and time range, and a replay operation that resets a set of events back to "pending" covers most operational needs. The engineering work is in setting up an alert when dead-letter depth grows past a threshold so the failures are visible before they affect business-critical event types.&lt;/p&gt;
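
&lt;p&gt;To illustrate how small that tooling can be, here is a sketch of the listing and replay operations against a SQLite table; the table and column names are assumptions that mirror the fields described above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch: list and replay dead-lettered events from a relational table.
import sqlite3

conn = sqlite3.connect("webhooks.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS dead_letter (id TEXT PRIMARY KEY, source TEXT, "
    "payload TEXT, retry_count INTEGER, last_error TEXT, last_attempt_at REAL, status TEXT)"
)

def failed_events(source, since):
    return conn.execute(
        "SELECT id, retry_count, last_error, last_attempt_at FROM dead_letter "
        "WHERE source = ? AND last_attempt_at &amp;gt;= ? ORDER BY last_attempt_at",
        (source, since),
    ).fetchall()

def replay(event_ids):
    # Reset the selected events to pending so a worker picks them up again.
    conn.executemany(
        "UPDATE dead_letter SET status = 'pending', retry_count = 0 WHERE id = ?",
        [(event_id,) for event_id in event_ids],
    )
    conn.commit()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
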

&lt;p&gt;Testing the full async flow end-to-end during development is important. Unit tests verify the processing logic in isolation, but they can't replicate the sender's retry timing or the behavior of the real queue consumer. &lt;a href="https://ngrok.com" rel="noopener noreferrer"&gt;ngrok&lt;/a&gt; exposes your local receiver to the actual external sender so you can exercise the complete path including signature verification, queue writes, and worker consumption under realistic delivery conditions.&lt;/p&gt;

&lt;h2&gt;
  
  
  When the Synchronous Approach Is Acceptable
&lt;/h2&gt;

&lt;p&gt;For very simple processing (a webhook that only logs the event to a table) and very small volumes, synchronous processing is fine. The failure modes described here only manifest at meaningful event volumes or when processing involves slow external calls.&lt;/p&gt;

&lt;p&gt;For a complete implementation of the async receiver pattern including signature verification, idempotency, and failure handling, &lt;a href="https://137foundry.com/articles/webhook-receiver-production-guide" rel="noopener noreferrer"&gt;How to Build a Webhook Receiver That Handles Real-World Traffic&lt;/a&gt; covers each component with implementation notes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;This team&lt;/a&gt; at 137Foundry builds data integration infrastructure, including webhook receivers for high-volume event processing environments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F637rsv3wtsjd3kffgqq4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F637rsv3wtsjd3kffgqq4.jpg" alt="Forklift loading warehouse sorting conveyor" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by &lt;a href="https://pixabay.com/users/delphinmedia-348407/" rel="noopener noreferrer"&gt;delphinmedia&lt;/a&gt; on &lt;a href="https://pixabay.com" rel="noopener noreferrer"&gt;Pixabay&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>api</category>
      <category>productivity</category>
    </item>
    <item>
      <title>7 Free UX Tools for Researching and Testing Web Form Design</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Thu, 23 Apr 2026 15:30:19 +0000</pubDate>
      <link>https://forem.com/137foundry/7-free-ux-tools-for-researching-and-testing-web-form-design-35d2</link>
      <guid>https://forem.com/137foundry/7-free-ux-tools-for-researching-and-testing-web-form-design-35d2</guid>
      <description>&lt;p&gt;Designing better forms requires data about how users interact with them. These seven tools help with different parts of that process: understanding where forms fail, testing how users experience them, checking accessibility, and researching what evidence-based form design looks like across products and contexts. All have a meaningful free tier.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Microsoft Clarity
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://clarity.microsoft.com" rel="noopener noreferrer"&gt;Microsoft Clarity&lt;/a&gt; is a free behavioral analytics tool that records user sessions and generates heatmaps of click, scroll, and interaction patterns. For form design, the session recording feature is particularly valuable: you can watch how users interact with specific form fields, where they pause, which fields they re-enter, and at which point they abandon the form.&lt;/p&gt;

&lt;p&gt;Clarity's "rage click" and "dead click" detection automatically flags interactions where users appear frustrated (rapid repeated clicks) or where clicks are not triggering expected responses. Both of these patterns frequently appear in form interaction data and can surface problems with small touch targets, confusing validation states, and non-interactive-looking submit buttons.&lt;/p&gt;

&lt;p&gt;The session recording capability does not capture personally identifiable information or form field contents by default, which makes it safer to use on forms without additional configuration. The free tier includes unlimited session recordings and heatmaps.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Google Analytics 4 (with Event Tracking)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://analytics.google.com" rel="noopener noreferrer"&gt;Google Analytics 4&lt;/a&gt; tracks user behavior across your site and can be configured with custom events to measure form-specific metrics: how many users viewed a form, how many started it, how many completed it, and what percentage abandoned at each step of a multi-step form.&lt;/p&gt;

&lt;p&gt;The funnel analysis feature in GA4 allows you to define a sequence of steps and see the dropout rate at each point. For multi-step forms, this reveals exactly which step drives the most abandonment. For single-page forms with multiple fields, field-level events require manual event implementation, but the resulting data is highly specific to your actual form and users.&lt;/p&gt;

&lt;p&gt;GA4 is free at standard traffic volumes. The event tracking setup for forms requires some JavaScript implementation, but the payoff in diagnostic specificity is significant.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Maze (Free Tier)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://maze.design" rel="noopener noreferrer"&gt;Maze&lt;/a&gt; is an unmoderated user testing platform that lets you create tasks for users to complete, including filling out a prototype or live form, and then analyzes where users get stuck or fail. The free tier includes a limited number of tests per month and access to the core path and mission metrics.&lt;/p&gt;

&lt;p&gt;For form testing, Maze is useful for discovering usability problems before launch by having representative users attempt to complete the form while recording where they hesitate, fail, or succeed. The platform aggregates results across multiple participants and shows paths through the form as a visual flow.&lt;/p&gt;

&lt;p&gt;The unmoderated format means testing can happen asynchronously without requiring you to be present, which makes it practical to run a quick test before shipping a form change.&lt;/p&gt;

&lt;p&gt;For the principles behind what these tools help you identify, the guide at &lt;a href="https://137foundry.com/articles/how-to-design-web-forms-users-complete" rel="noopener noreferrer"&gt;137foundry.com/articles/how-to-design-web-forms-users-complete&lt;/a&gt; covers validation patterns, field count, mobile layout, and error message design in detail.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. WAVE Accessibility Checker
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://webaim.org" rel="noopener noreferrer"&gt;WebAIM&lt;/a&gt; produces WAVE, a browser-based accessibility evaluation tool that checks web pages including forms for accessibility errors and warnings. Running WAVE on a form reveals missing labels, insufficient color contrast, unlabeled form controls, and missing ARIA attributes that would make the form inaccessible to users of assistive technology.&lt;/p&gt;

&lt;p&gt;The browser extension version evaluates pages in their current state, including dynamic states like validation errors, which makes it more useful for form accessibility testing than crawling-based tools that only see the initial page state.&lt;/p&gt;

&lt;p&gt;WAVE is free as both a browser extension and a web-based tool. For teams embedding accessibility checks in a development workflow, the API version allows automated scanning as part of a CI pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Axe DevTools (Free Browser Extension)
&lt;/h2&gt;

&lt;p&gt;The axe DevTools browser extension from &lt;a href="https://www.deque.com" rel="noopener noreferrer"&gt;Deque Systems&lt;/a&gt; performs automated accessibility audits on web pages. Like WAVE, it identifies accessibility violations and provides specific guidance on how to fix them.&lt;/p&gt;

&lt;p&gt;Where axe differentiates itself for development teams is in its integration with the browser DevTools panel, making it easy to inspect specific elements alongside their accessibility issues. The extension is built on the same axe-core rules used by tools like Jest-axe and Playwright's accessibility testing APIs, which means issues found in browser testing with axe are consistent with what automated testing will catch.&lt;/p&gt;

&lt;p&gt;The free extension covers a substantial portion of WCAG 2.1 violations. The paid DevTools Pro version adds guided testing and more comprehensive rule sets.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. The A11y Project Checklist
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://www.a11yproject.com" rel="noopener noreferrer"&gt;A11y Project&lt;/a&gt; maintains a comprehensive checklist of web accessibility requirements organized by WCAG criteria. For form design specifically, the checklist covers labels, error identification, keyboard navigation, focus management, and timeout notifications, all in plain language that is more actionable than reading the WCAG specification directly.&lt;/p&gt;

&lt;p&gt;This is a reference tool rather than a testing tool, but using it as a design checklist before building a form reduces the number of accessibility fixes required after testing. It is particularly useful for designers and developers who are not accessibility specialists and need a clear, prioritized list of what to check.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Nielsen Norman Group Research Reports (Free Articles)
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://www.nngroup.com" rel="noopener noreferrer"&gt;Nielsen Norman Group&lt;/a&gt; makes a substantial portion of its UX research findings freely available in article form. For form design, the NNG article archive covers field ordering, label placement, error message design, mobile form patterns, multi-step form design, and checkout UX in detail backed by usability studies.&lt;/p&gt;

&lt;p&gt;While the full research reports require a subscription or purchase, the free articles provide enough evidence-based guidance to inform most form design decisions. Searching the NNG archive for "form design" or "form usability" returns a large set of relevant articles that can be used as a reference layer alongside your own testing data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ag5g8e9yh2hv7vn2dyf.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ag5g8e9yh2hv7vn2dyf.jpg" alt="person laptop testing interface design form ux" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by &lt;a href="https://pixabay.com/users/Pexels-2286921/" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt; on &lt;a href="https://pixabay.com" rel="noopener noreferrer"&gt;Pixabay&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How These Tools Work Together
&lt;/h2&gt;

&lt;p&gt;Using these tools together covers the full form design and validation cycle. Clarity and Google Analytics provide behavioral data from real users on your live forms. Maze lets you test with representative users before or alongside launch. WAVE and Axe check accessibility compliance at the implementation level. The A11y Project gives you a reference checklist for design decisions. NNG research provides the evidence base for why certain patterns work and others do not.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;UX and web studio 137Foundry&lt;/a&gt; builds and tests forms as part of broader web design and development projects. The &lt;a href="https://137foundry.com/services/web-development" rel="noopener noreferrer"&gt;web development services page&lt;/a&gt; describes how form design and UX testing fit into our project process.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.w3.org" rel="noopener noreferrer"&gt;World Wide Web Consortium&lt;/a&gt; maintains the WCAG accessibility standards that WAVE, Axe, and the A11y Project checklist are built around, and provides the authoritative reference for understanding accessibility requirements at a specification level.&lt;/p&gt;

&lt;p&gt;The most effective approach to form improvement combines at least two of these tools: one that provides behavioral data from real users (Clarity, GA4) and one that provides a way to understand the why behind that behavior (Maze user testing, NNG research). Behavioral data tells you where users stop. User testing and research tell you why. Acting on behavioral data without understanding why the abandonment is happening can lead to fixing symptoms rather than the underlying design problem. The combination of quantitative data and qualitative insight is what produces form improvements that hold up over time rather than winning a single A/B test and then plateauing.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ux</category>
      <category>tools</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How Inline Validation Reduces Form Abandonment and Errors</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Thu, 23 Apr 2026 15:28:31 +0000</pubDate>
      <link>https://forem.com/137foundry/how-inline-validation-reduces-form-abandonment-and-errors-5258</link>
      <guid>https://forem.com/137foundry/how-inline-validation-reduces-form-abandonment-and-errors-5258</guid>
      <description>&lt;p&gt;Form validation is one of the most consequential UX decisions in web development. The same set of validation rules, implemented with two different timing strategies, can produce meaningfully different completion rates. Inline validation, where feedback appears field-by-field as users progress through a form, consistently outperforms submit-and-validate-all patterns for user experience and completion.&lt;/p&gt;

&lt;p&gt;This article covers how inline validation works, when to use it, how to implement it correctly, and the specific patterns that make it effective versus counterproductive.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Submit-Time Validation Creates a Poor Experience
&lt;/h2&gt;

&lt;p&gt;The traditional validation pattern, validate all fields when the user clicks submit, creates several compounding problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error discovery is deferred.&lt;/strong&gt; The user completes the entire form before learning anything is wrong. At that point they have the most invested in the task and the most to lose psychologically if they have to redo work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error location requires searching.&lt;/strong&gt; Validation errors returned after submit are typically shown at the top of the form or highlighted inline, but the user must scroll back through the form to find each highlighted field. On a long form, this requires significant navigation. On mobile, it can feel like starting over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multiple errors appear simultaneously.&lt;/strong&gt; When several fields fail validation at once, users face a list of errors to work through. Each one requires re-reading the instructions, locating the field, and correcting it. The cognitive and emotional cost compounds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;False success signals occur.&lt;/strong&gt; A user who fills in a field incorrectly but receives no feedback until submitting believes the field is fine until the error appears. The correction feels like a reversal rather than a natural part of the process.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Inline Validation Changes
&lt;/h2&gt;

&lt;p&gt;Inline validation checks each field individually after the user leaves it (on blur). Feedback appears immediately below the field while the user is still in the context of that section of the form. Errors are corrected one at a time, at the moment of lowest cost.&lt;/p&gt;

&lt;p&gt;The research on this is consistent. A widely cited usability study and subsequent replications found that inline validation reduced errors by 22%, cut completion time by 42%, and increased satisfaction scores compared to after-submit validation for the same form content. The gains are largest for long forms and forms with complex field requirements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;Web design agency 137Foundry&lt;/a&gt; implements inline validation as the default validation pattern on forms built for client projects. The principle is covered in our broader form design guide at &lt;a href="https://137foundry.com/articles/how-to-design-web-forms-users-complete" rel="noopener noreferrer"&gt;137foundry.com/articles/how-to-design-web-forms-users-complete&lt;/a&gt;, which covers field count, input types, mobile layout, and confirmation experience alongside validation strategy.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Critical Implementation Detail: Validate on Blur, Not on Input
&lt;/h2&gt;

&lt;p&gt;The most common inline validation mistake is triggering validation while the user is still typing (on the &lt;code&gt;input&lt;/code&gt; event). This produces false errors constantly.&lt;/p&gt;

&lt;p&gt;An email field checked on input will show "invalid email" the moment the user types a single character. A user who has not yet typed the @ symbol is not making an error; they are in the middle of typing. Checking at this point creates visual noise and anxiety without providing useful feedback.&lt;/p&gt;

&lt;p&gt;The correct event to validate on is &lt;code&gt;blur&lt;/code&gt;, which fires when the user moves focus out of the field. At that point, the user has finished entering their input and validation feedback is appropriate and timely.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#email&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;blur&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;validateField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For confirm-password or interdependent fields where one field's validity depends on another's value, you may need to re-validate one field when the other changes. For example, confirming that a "confirm password" field matches the password field should re-run when either field changes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#password&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;confirm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#confirm-password&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;confirm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;blur&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;validateMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;confirm&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="nx"&gt;password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;blur&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;confirm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;validateMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;confirm&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Error Message Placement and Content
&lt;/h2&gt;

&lt;p&gt;Error messages should appear immediately below the field they describe, in the reading flow between the field and the next element. They should be visible without scrolling, associated with the field via &lt;code&gt;aria-describedby&lt;/code&gt; for screen reader accessibility, and dismissed automatically when the user corrects the error.&lt;/p&gt;

&lt;p&gt;Message content should be specific about what is wrong and what the correct format is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;label&lt;/span&gt; &lt;span class="na"&gt;for=&lt;/span&gt;&lt;span class="s"&gt;"phone"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Phone number&lt;span class="nt"&gt;&amp;lt;/label&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;input&lt;/span&gt;
  &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"tel"&lt;/span&gt;
  &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"phone"&lt;/span&gt;
  &lt;span class="na"&gt;aria-describedby=&lt;/span&gt;&lt;span class="s"&gt;"phone-error"&lt;/span&gt;
  &lt;span class="na"&gt;aria-invalid=&lt;/span&gt;&lt;span class="s"&gt;"true"&lt;/span&gt;
&lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;p&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"phone-error"&lt;/span&gt; &lt;span class="na"&gt;role=&lt;/span&gt;&lt;span class="s"&gt;"alert"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  Enter a phone number with 10 digits, like 5551234567
&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;aria-invalid="true"&lt;/code&gt; attribute signals to screen readers that the field has an error. The &lt;code&gt;role="alert"&lt;/code&gt; on the error paragraph causes screen readers to announce the message when it appears, without requiring the user to navigate to it. The &lt;a href="https://developer.mozilla.org" rel="noopener noreferrer"&gt;Mozilla Developer Network&lt;/a&gt; provides the full reference for ARIA form patterns, and the &lt;a href="https://www.w3.org/WAI" rel="noopener noreferrer"&gt;Web Accessibility Initiative&lt;/a&gt; documents the accessibility requirements for form error identification.&lt;/p&gt;

&lt;h2&gt;
  
  
  Visual Design of Inline Validation States
&lt;/h2&gt;

&lt;p&gt;Each field should have three visible states beyond the default: active/focused, valid, and error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Active/focused:&lt;/strong&gt; A clear focus ring that meets WCAG 2.1 contrast requirements. Do not remove the native focus ring without providing a visible alternative.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Valid:&lt;/strong&gt; A subtle success indicator, typically a green checkmark or border color change, that appears when the user leaves a field after entering acceptable input. Keep this understated; a form that aggressively celebrates each correct field becomes noisy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error:&lt;/strong&gt; A red border, error icon, and the error message. Red should not be the only indicator (for color-blind users); combine it with an icon and the text message.&lt;/p&gt;

&lt;p&gt;Avoid using placeholder text to communicate required format or examples. Placeholder text disappears when the user starts typing, which means they cannot reference it if they are unsure what to enter. Visible hint text below the label, present before the user interacts with the field, is the correct pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Inline Validation With Automated Tools
&lt;/h2&gt;

&lt;p&gt;Inline validation introduces dynamic content changes to the DOM, which means your standard HTML validation pass may not catch all issues. Testing should cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Keyboard navigation:&lt;/strong&gt; Tab through all fields and verify that error messages appear and are announced correctly without requiring a mouse.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Screen reader testing:&lt;/strong&gt; Use NVDA (on Windows) or VoiceOver (on macOS and iOS) to verify that errors are announced at the right moment and associated correctly with their fields.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated accessibility checks:&lt;/strong&gt; Tools like Axe (from &lt;a href="https://www.deque.com" rel="noopener noreferrer"&gt;deque.com&lt;/a&gt;) and the built-in browser DevTools accessibility panel catch missing ARIA attributes, insufficient color contrast, and unlabeled fields.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Example: programmatically triggering validation for testing&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;runValidationTests&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelectorAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[data-validate]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;field&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;field&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dispatchEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;blur&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://www.a11yproject.com" rel="noopener noreferrer"&gt;A11y Project&lt;/a&gt; maintains a checklist that covers the accessibility requirements for form validation states. &lt;a href="https://webaim.org" rel="noopener noreferrer"&gt;WebAIM&lt;/a&gt; provides additional documentation on accessible form design and testing approaches.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Show Positive Confirmation
&lt;/h2&gt;

&lt;p&gt;Not every field needs a success state. For fields where the validity criteria are simple and familiar (email, phone number, date), a success indicator after the user leaves the field provides reassurance that the input was accepted. For fields with complex or unusual requirements (password strength, specific numeric ranges), the success state after validation is more valuable because it confirms that the requirements were met.&lt;/p&gt;

&lt;p&gt;For password fields, showing strength feedback while the user is typing (on the &lt;code&gt;input&lt;/code&gt; event) is one of the few legitimate exceptions to the blur-validation rule, because the feedback is progressive and genuinely useful during input, not a false error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;updateStrengthMeter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://www.nngroup.com" rel="noopener noreferrer"&gt;Nielsen Norman Group&lt;/a&gt; has published specific research on password field design and strength meter usability that provides useful reference for this specific case.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Inline validation on the blur event, with specific error messages, correct ARIA attributes, and clear visual states, is consistently better than submit-time validation for both users and completion rates. The implementation is straightforward in vanilla JavaScript and can be adapted to any front-end framework. The gains in completion rate, user satisfaction, and error reduction are well-documented and reliably reproducible by applying the pattern correctly.&lt;/p&gt;

&lt;p&gt;For the broader context on how validation fits into a complete form UX strategy, the &lt;a href="https://137foundry.com/services" rel="noopener noreferrer"&gt;137Foundry services page&lt;/a&gt; covers the web design and development work where these patterns are applied in production.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ux</category>
      <category>javascript</category>
      <category>productivity</category>
    </item>
    <item>
      <title>7 Free Tools for Testing and Analyzing HTTP Caching Behavior</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Wed, 22 Apr 2026 11:15:31 +0000</pubDate>
      <link>https://forem.com/137foundry/7-free-tools-for-testing-and-analyzing-http-caching-behavior-2n29</link>
      <guid>https://forem.com/137foundry/7-free-tools-for-testing-and-analyzing-http-caching-behavior-2n29</guid>
      <description>&lt;p&gt;Getting HTTP caching right is mostly a matter of setting the correct headers. But knowing whether you set them correctly requires being able to inspect actual response headers, simulate cache behavior, and verify that resources are being cached and invalidated the way you intend.&lt;/p&gt;

&lt;p&gt;These seven tools let you do that without paying for anything. They cover browser-level inspection, command-line header analysis, performance auditing, and CDN-layer caching behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Chrome DevTools Network Panel
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.google.com/chrome" rel="noopener noreferrer"&gt;Chrome DevTools&lt;/a&gt; is the fastest way to inspect cache headers for any resource a page loads. Open the Network panel, load a page with cache disabled, and click any request to see its response headers.&lt;/p&gt;

&lt;p&gt;The panel shows &lt;code&gt;Cache-Control&lt;/code&gt;, &lt;code&gt;ETag&lt;/code&gt;, &lt;code&gt;Last-Modified&lt;/code&gt;, &lt;code&gt;Expires&lt;/code&gt;, and &lt;code&gt;Vary&lt;/code&gt; headers directly. On subsequent loads with caching re-enabled, the Size column displays "disk cache" or "memory cache" for resources served from cache, and the Status column shows 304 for revalidated resources.&lt;/p&gt;

&lt;p&gt;The Lighthouse tab in DevTools includes a "Serve static assets with an efficient cache policy" audit that lists every resource with a TTL under one week and estimates the bandwidth savings from extending it.&lt;/p&gt;

&lt;p&gt;This should be the first tool in any caching audit workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. curl
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://curl.se" rel="noopener noreferrer"&gt;curl&lt;/a&gt; is the most reliable way to inspect HTTP headers from the command line. It makes actual HTTP requests to your server or CDN and displays the raw response headers.&lt;/p&gt;

&lt;p&gt;To see just the response headers without downloading the body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-I&lt;/span&gt; https://example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To follow redirects and see all response headers along the way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-IL&lt;/span&gt; https://example.com/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-I&lt;/code&gt; flag sends a HEAD request. For resources where HEAD behaves differently from GET, use &lt;code&gt;-X GET --head&lt;/code&gt; instead.&lt;/p&gt;

&lt;p&gt;curl is particularly useful for checking how your CDN modifies cache headers relative to what your origin server sends. Run the same command against the CDN URL and the origin URL directly and compare the output. Differences between the two often explain why a resource appears to cache correctly at the origin but fails to cache at the edge.&lt;/p&gt;
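
&lt;p&gt;The same comparison can be scripted if you would rather not eyeball two curl outputs side by side. A minimal sketch using Python's requests library, with both hostnames as placeholders:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch: diff the caching headers the origin sends against what the CDN returns.
import requests

CACHE_HEADERS = ["Cache-Control", "ETag", "Last-Modified", "Expires", "Vary", "Age"]

def caching_headers(url):
    response = requests.head(url, allow_redirects=True, timeout=10)
    return {name: response.headers.get(name) for name in CACHE_HEADERS}

origin = caching_headers("https://origin.example.com/static/app.css")
edge = caching_headers("https://www.example.com/static/app.css")

for name in CACHE_HEADERS:
    if origin[name] != edge[name]:
        print(f"{name}: origin={origin[name]!r} cdn={edge[name]!r}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
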

&lt;h2&gt;
  
  
  3. WebPageTest
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/WPO-Foundation/webpagetest" rel="noopener noreferrer"&gt;WebPageTest&lt;/a&gt; is a free, open-source performance testing tool that runs synthetic tests from multiple geographic locations. It measures real page load times including the effect of caching on repeat visits.&lt;/p&gt;

&lt;p&gt;The "Repeat View" feature runs the same test twice: once for a first-time visitor and once for a returning visitor who has cached resources. The difference between the two load times tells you how much your current cache configuration is helping for repeat visitors.&lt;/p&gt;

&lt;p&gt;WebPageTest also produces a waterfall chart that shows when each resource was requested, whether it was cached, and what the response headers contained. This is useful for identifying resources that should be cached but are not.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. PageSpeed Insights
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://pagespeed.web.dev" rel="noopener noreferrer"&gt;PageSpeed Insights&lt;/a&gt; is Google's free performance analysis tool that runs &lt;a href="https://developer.chrome.com/docs/lighthouse" rel="noopener noreferrer"&gt;Lighthouse&lt;/a&gt; audits against any public URL. Its "Serve static assets with an efficient cache policy" audit surfaces resources with short or missing cache TTLs and estimates the bandwidth savings from extending them.&lt;/p&gt;

&lt;p&gt;Because PageSpeed Insights runs Lighthouse server-side, results are consistent and reproducible regardless of which device or browser you are testing from. This makes it useful for confirming that a cache configuration change had the intended effect after deployment without relying on local browser state.&lt;/p&gt;

&lt;p&gt;The tool separates lab data from field data. Lab data shows what Lighthouse measured in a controlled test run. Field data draws from the Chrome User Experience Report, giving you a sense of real-world caching performance across actual user visits. For cache header auditing, the lab data section is most directly relevant because it shows exactly which headers each resource returned during the test. For teams that ship frequently, running PageSpeed Insights against a production URL after each deployment is a low-cost check that cache header regressions have not crept in through new asset types or updated server configurations. The audit output names each offending resource alongside its current TTL and a recommended minimum, which maps directly to the &lt;code&gt;Cache-Control&lt;/code&gt; directives you need to adjust.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Redbot
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://redbot.org" rel="noopener noreferrer"&gt;Redbot&lt;/a&gt; is a purpose-built HTTP header analysis tool maintained by the HTTP working group community. You enter a URL and it fetches the resource and analyzes the response headers in detail, explaining what each directive means and flagging problems.&lt;/p&gt;

&lt;p&gt;Redbot explains why a header is or is not correct, not just whether it is present. For developers learning caching behavior, this explanatory output is more useful than a binary pass/fail.&lt;/p&gt;

&lt;p&gt;It checks &lt;code&gt;Cache-Control&lt;/code&gt;, &lt;code&gt;ETag&lt;/code&gt;, &lt;code&gt;Last-Modified&lt;/code&gt;, &lt;code&gt;Vary&lt;/code&gt;, &lt;code&gt;Content-Encoding&lt;/code&gt;, and several other headers, and it follows redirects to check the headers at the final destination.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Fastly's Cache Simulator
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.fastly.com" rel="noopener noreferrer"&gt;Fastly&lt;/a&gt; provides a free cache behavior simulator as part of their developer documentation. It lets you input response headers and see how a CDN interprets them, including which directives control what behavior at the shared cache layer.&lt;/p&gt;

&lt;p&gt;While Fastly is a paid CDN service, the simulator itself is free and useful for understanding CDN-specific behavior independently of which CDN you actually use. Different CDNs have different default behaviors for responses without explicit &lt;code&gt;public&lt;/code&gt; directives or for responses that include &lt;code&gt;Set-Cookie&lt;/code&gt; headers.&lt;/p&gt;

&lt;p&gt;The simulator is particularly useful for verifying how &lt;code&gt;s-maxage&lt;/code&gt; and &lt;code&gt;stale-while-revalidate&lt;/code&gt; behave at the CDN layer before you deploy.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The most common caching audit finding we see is long-lived HTML pages referencing short-lived assets. Two header changes fix the whole pattern." - Dennis Traina, &lt;a href="https://137foundry.com/services" rel="noopener noreferrer"&gt;founder of 137Foundry&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  7. Nginx Logs With Cache Hit Analysis
&lt;/h2&gt;

&lt;p&gt;If you are running &lt;a href="https://www.nginx.com" rel="noopener noreferrer"&gt;Nginx&lt;/a&gt; as a reverse proxy or CDN equivalent, its proxy cache module logs include a &lt;code&gt;$upstream_cache_status&lt;/code&gt; variable that reports whether each request was a HIT, MISS, BYPASS, or EXPIRED in the cache.&lt;/p&gt;

&lt;p&gt;Adding this variable to your access log format gives you a real-time cache hit rate breakdown without any additional tooling:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="k"&gt;log_format&lt;/span&gt; &lt;span class="s"&gt;cache_log&lt;/span&gt; &lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="nv"&gt;$remote_addr&lt;/span&gt; &lt;span class="s"&gt;-&lt;/span&gt; &lt;span class="nv"&gt;$upstream_cache_status&lt;/span&gt; &lt;span class="s"&gt;-&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="nv"&gt;$status&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;access_log&lt;/span&gt; &lt;span class="n"&gt;/var/log/nginx/cache.log&lt;/span&gt; &lt;span class="s"&gt;cache_log&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After collecting a few thousand requests, parsing the log for cache status gives you a practical hit rate for each URL pattern. A consistently low hit rate on resources that should be cacheable points to a configuration problem.&lt;/p&gt;

&lt;p&gt;This approach works for any Nginx-based cache, including Nginx configured as a local caching proxy in front of a Node.js or Python application.&lt;/p&gt;
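
&lt;p&gt;A minimal sketch of the parsing step, assuming the &lt;code&gt;cache_log&lt;/code&gt; format above and grouping hit rate by request path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch: per-path cache hit rate from the cache_log format shown above.
from collections import Counter, defaultdict

hits = defaultdict(Counter)

with open("/var/log/nginx/cache.log") as log:
    for line in log:
        parts = line.split('"')
        if len(parts) &amp;lt; 3:
            continue                                     # malformed or unexpected line
        cache_status = parts[0].split(" - ")[1].strip()  # HIT, MISS, BYPASS, EXPIRED
        path = parts[1].split()[1]                       # second token of "$request"
        hits[path][cache_status] += 1

for path, statuses in sorted(hits.items()):
    total = sum(statuses.values())
    print(f"{path}  {statuses.get('HIT', 0) / total:.0%} hit  ({total} requests)")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
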

&lt;h2&gt;
  
  
  How to Use These Tools Together
&lt;/h2&gt;

&lt;p&gt;A typical caching audit workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use Chrome DevTools to identify resources with missing or short Cache-Control headers&lt;/li&gt;
&lt;li&gt;Verify the exact headers with curl from the command line to confirm what the CDN is passing through&lt;/li&gt;
&lt;li&gt;Run WebPageTest to see the repeat-visit improvement from fixing the headers&lt;/li&gt;
&lt;li&gt;Use Redbot on individual URLs to get detailed explanations for anything unclear&lt;/li&gt;
&lt;li&gt;Use Nginx logs or Fastly's simulator to verify CDN-layer behavior&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The audit is iterative rather than one-time. New resources added after a deployment often inherit whatever default header configuration is in place, which may not match the correct TTL for their type. Reviewing caching headers after each major deployment is a low-effort way to catch these regressions before they compound into a significant performance difference. Automating the check with curl against a known resource list as part of your deployment verification process eliminates the need for manual audits in the first place.&lt;/p&gt;

&lt;p&gt;For a deeper look at the caching patterns behind these checks, the article &lt;a href="https://137foundry.com/articles/http-caching-web-application-guide" rel="noopener noreferrer"&gt;HTTP Caching: A Practical Guide for Web Developers&lt;/a&gt; covers the strategy behind what the tools surface. &lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;137Foundry&lt;/a&gt; includes caching configuration as a standard part of web application delivery. &lt;a href="https://developer.mozilla.org" rel="noopener noreferrer"&gt;MDN's HTTP caching documentation&lt;/a&gt; provides the canonical reference for every header and directive these tools analyze.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw95tc4yvbg8cmi0cfujk.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw95tc4yvbg8cmi0cfujk.jpeg" alt="developer tools browser showing http request headers and cache status" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by svetlana photographer on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How HTTP Caching Works at the Browser, CDN, and Proxy Layer</title>
      <dc:creator>137Foundry</dc:creator>
      <pubDate>Wed, 22 Apr 2026 11:09:56 +0000</pubDate>
      <link>https://forem.com/137foundry/how-http-caching-works-at-the-browser-cdn-and-proxy-layer-j5h</link>
      <guid>https://forem.com/137foundry/how-http-caching-works-at-the-browser-cdn-and-proxy-layer-j5h</guid>
      <description>&lt;p&gt;HTTP caching is not one thing. It is a set of behaviors that happen at different layers of the network stack, each governed by the same response headers but producing different effects depending on which layer is doing the caching.&lt;/p&gt;

&lt;p&gt;Understanding each layer separately makes it much easier to diagnose caching problems, because a bug that looks like "the browser is not caching this" is often actually "the CDN is stripping the header before it reaches the browser." Treating all caching as one system obscures the actual source of the problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Cache Layers
&lt;/h2&gt;

&lt;p&gt;A typical web request passes through three caches on its way from origin server to user.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The browser cache&lt;/strong&gt; is local to the user's device. It stores responses that the server marks as cacheable, keyed by URL. Subsequent requests for the same URL check this cache first. If the stored response is still fresh, the request never leaves the device.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The CDN edge cache&lt;/strong&gt; is a shared cache maintained by your CDN provider at geographic nodes distributed around the world. When a user requests a resource, the CDN node closest to them checks whether it has a cached copy. If it does, it serves the response directly. If not, it fetches from the origin and caches the response for future requests in that region.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Intermediate proxies&lt;/strong&gt; sit between the user and the CDN, or between the CDN and origin, depending on network topology. Corporate networks often include forward proxies that cache responses on behalf of internal users. These proxies also consult Cache-Control directives.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Freshness Works
&lt;/h2&gt;

&lt;p&gt;A cached response has a lifetime. The server signals how long the response should be considered fresh via the &lt;code&gt;max-age&lt;/code&gt; directive in the &lt;code&gt;Cache-Control&lt;/code&gt; header.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Cache-Control: max-age=3600&lt;/code&gt; means the response is fresh for 3,600 seconds after it was received. During that window, caches at any layer can serve the stored response without consulting the server.&lt;/p&gt;

&lt;p&gt;After the window expires, the response is stale. A stale response can still be served in some cases, but the cache should attempt to revalidate it first. Revalidation sends a conditional request to the server: "I have this response from earlier. Has anything changed?"&lt;/p&gt;

&lt;p&gt;The server responds either with &lt;code&gt;304 Not Modified&lt;/code&gt;, which means the cached copy is still valid and can be served, or with a full &lt;code&gt;200 OK&lt;/code&gt; response containing the updated content.&lt;/p&gt;

&lt;p&gt;Freshness applies independently at each cache layer. A browser cache entry might expire before a CDN cache entry for the same resource, or vice versa, depending on how long the resource has been cached at each layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Browser Cache Decides What to Store
&lt;/h2&gt;

&lt;p&gt;The browser caches a response if the response headers permit it. The decision involves several rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The request method must be GET or HEAD. POST responses are not cached.&lt;/li&gt;
&lt;li&gt;The response status must be cacheable (200, 301, 404, and a handful of others are cacheable by default; most other statuses are cached only with explicit directives).&lt;/li&gt;
&lt;li&gt;The response must not include &lt;code&gt;Cache-Control: no-store&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;Cache-Control: private&lt;/code&gt; is present, the response is cached only in the browser, not in shared caches.&lt;/li&gt;
&lt;li&gt;If no Cache-Control header is present, the browser may cache heuristically based on &lt;code&gt;Last-Modified&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When a cached response becomes stale, the browser sends a conditional request. If the response included an &lt;code&gt;ETag&lt;/code&gt; header, the browser sends &lt;code&gt;If-None-Match: "etag-value"&lt;/code&gt;. If the response included &lt;code&gt;Last-Modified&lt;/code&gt;, the browser sends &lt;code&gt;If-Modified-Since: timestamp&lt;/code&gt;. The server responds with 304 if the resource has not changed, allowing the browser to extend the life of its cached copy.&lt;/p&gt;
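
&lt;p&gt;The same exchange can be reproduced outside the browser. A small sketch with Python's requests library, assuming a resource that returns an &lt;code&gt;ETag&lt;/code&gt; (the URL here is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch: the conditional-request dance a browser performs on a stale entry.
import requests

url = "https://example.com/static/app.css"

first = requests.get(url, timeout=10)
etag = first.headers.get("ETag")

if etag:
    revalidation = requests.get(url, headers={"If-None-Match": etag}, timeout=10)
    print(revalidation.status_code)  # 304 means the cached copy is still valid
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
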

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zs0vfqhuj9g6bia8jsg.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zs0vfqhuj9g6bia8jsg.jpeg" alt="browser network waterfall showing cached and uncached resources" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Markus Spiske on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How the CDN Cache Differs From the Browser Cache
&lt;/h2&gt;

&lt;p&gt;CDN caches are shared: they store responses that are served to many different users. This introduces considerations that browser caches do not have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Personalization.&lt;/strong&gt; The CDN must not store responses that differ by user. &lt;code&gt;Cache-Control: private&lt;/code&gt; tells shared caches, including CDNs, not to store the response. Without this directive, a CDN might cache a personalized response and serve it to the wrong user.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cache keys.&lt;/strong&gt; CDNs cache by URL by default. If a response varies by request header (e.g., different responses for mobile vs. desktop based on &lt;code&gt;User-Agent&lt;/code&gt;), the &lt;code&gt;Vary&lt;/code&gt; header must be included so the CDN stores separate entries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TTL differentiation.&lt;/strong&gt; &lt;code&gt;s-maxage&lt;/code&gt; lets you specify a TTL for shared caches independently of the browser TTL. &lt;code&gt;Cache-Control: max-age=60, s-maxage=3600&lt;/code&gt; gives the browser a 1-minute freshness window and the CDN a 1-hour window. This pattern is useful for resources where you want the CDN to cache aggressively but the browser to check for updates more often.&lt;/p&gt;
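
&lt;p&gt;A handler that combines both ideas might look roughly like the sketch below, assuming a Node.js origin. Varying on the full &lt;code&gt;User-Agent&lt;/code&gt; is shown only to match the example above; in practice it fragments shared caches heavily:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch: split browser/CDN TTLs plus a Vary key (Node.js handler, illustrative).
import type { IncomingMessage, ServerResponse } from "node:http";

export function handleLandingPage(req: IncomingMessage, res: ServerResponse) {
  // Browser: fresh for 60 seconds. Shared caches such as CDNs: fresh for 1 hour.
  res.setHeader("Cache-Control", "max-age=60, s-maxage=3600");

  // The body differs by User-Agent, so shared caches must key their
  // entries on that header in addition to the URL.
  res.setHeader("Vary", "User-Agent");

  const ua = String(req.headers["user-agent"]);
  res.setHeader("Content-Type", "text/html");
  if (ua.includes("Mobile")) {
    res.end("mobile markup");
  } else {
    res.end("desktop markup");
  }
}
&lt;/code&gt;&lt;/pre&gt;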

&lt;p&gt;&lt;strong&gt;Purging.&lt;/strong&gt; Unlike browser caches, CDN caches can be cleared server-side. Most CDNs offer an API to purge cached entries by URL, path prefix, or tag. This enables cache invalidation as part of a deployment pipeline.&lt;/p&gt;
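
&lt;p&gt;In a deployment script, that purge step might look something like this sketch. The endpoint, token variable, and request body are hypothetical stand-ins; each CDN has its own purge API, so check your provider's documentation for the real call:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch: purging CDN entries after a deploy. The endpoint, token, and
// request body are hypothetical stand-ins for a real CDN's purge API.
async function purgeUrls(urls: string[]) {
  const response = await fetch("https://api.example-cdn.com/v1/purge", {
    method: "POST",
    headers: {
      "Authorization": "Bearer " + process.env.CDN_API_TOKEN,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ files: urls }),
  });
  if (!response.ok) {
    throw new Error("CDN purge failed with status " + response.status);
  }
}

// Called from the deploy pipeline once new HTML and assets are live.
await purgeUrls([
  "https://www.example.com/",
  "https://www.example.com/pricing",
]);
&lt;/code&gt;&lt;/pre&gt;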

&lt;h2&gt;
  
  
  How Intermediate Proxies Behave
&lt;/h2&gt;

&lt;p&gt;Intermediate proxies follow the same HTTP caching spec as CDNs. They respect &lt;code&gt;Cache-Control: private&lt;/code&gt; to avoid storing personalized responses, and they honor &lt;code&gt;no-store&lt;/code&gt; to skip caching entirely.&lt;/p&gt;

&lt;p&gt;The main practical difference is that you typically cannot predict which proxies a request might pass through, and you cannot purge their caches. A corporate proxy that caches a response with a long TTL will continue serving that response until the TTL expires, regardless of what happens on your origin.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;Cache-Control: no-cache&lt;/code&gt; directive is useful here. It allows responses to be stored in intermediate caches but requires revalidation before serving. This means even a proxy that has cached the response for a long time will check with the server before serving it to a new request.&lt;/p&gt;
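
&lt;p&gt;A sketch of that pattern on an HTML response, assuming a Node.js handler and an illustrative ETag, looks like this. Pairing &lt;code&gt;no-cache&lt;/code&gt; with a validator keeps the forced revalidation cheap:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch: an HTML page that any cache may store but must revalidate
// before serving (Node.js handler; ETag value is illustrative).
import type { IncomingMessage, ServerResponse } from "node:http";

const pageEtag = '"home-2026-04-29"';

export function handleHomePage(req: IncomingMessage, res: ServerResponse) {
  // Browsers, CDNs, and proxies may keep a copy, but none of them may
  // serve it without a successful revalidation against the origin.
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("ETag", pageEtag);

  if (req.headers["if-none-match"] === pageEtag) {
    res.statusCode = 304;
    res.end();
    return;
  }

  res.setHeader("Content-Type", "text/html");
  res.end("home page markup");
}
&lt;/code&gt;&lt;/pre&gt;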

&lt;h2&gt;
  
  
  How Service Workers Interact With HTTP Caching
&lt;/h2&gt;

&lt;p&gt;Service workers add a programmable cache layer between the browser and the network. They can intercept fetch requests, serve responses from their own cache, and bypass HTTP caching entirely.&lt;/p&gt;

&lt;p&gt;A service worker's cache is independent of the browser's HTTP cache. A resource cached by a service worker may be served even when the HTTP cache would have revalidated or rejected the cached copy.&lt;/p&gt;

&lt;p&gt;This means HTTP caching behavior and service worker behavior can conflict. If you are debugging a caching issue in an application that uses a service worker, check whether the service worker is intercepting the request before checking the HTTP cache configuration.&lt;/p&gt;
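
&lt;p&gt;For context, a minimal cache-first fetch handler looks roughly like this. The cache name and the cache-first strategy are illustrative; the point is that a cached match short-circuits the request before any &lt;code&gt;Cache-Control&lt;/code&gt; directive is consulted:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// sw.ts: minimal cache-first service worker sketch. Cache name and
// strategy are illustrative; real apps usually vary strategy per route.
const CACHE_NAME = "app-shell-v1";

self.addEventListener("fetch", function (event: any) {
  event.respondWith(
    caches.open(CACHE_NAME).then(function (cache) {
      return cache.match(event.request).then(function (cached) {
        if (cached) {
          // Served from the service worker's own cache. The HTTP cache
          // and its Cache-Control directives are never consulted.
          return cached;
        }
        // Fall through to the network, which may itself hit the HTTP cache.
        return fetch(event.request).then(function (response) {
          if (response.ok) {
            cache.put(event.request, response.clone());
          }
          return response;
        });
      });
    })
  );
});
&lt;/code&gt;&lt;/pre&gt;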

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymuc3pr26h8k4pg3qfp0.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymuc3pr26h8k4pg3qfp0.jpeg" alt="server and CDN cache architecture diagram concept" width="800" height="532"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Brett Sayles on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Practical Takeaway
&lt;/h2&gt;

&lt;p&gt;Each cache layer operates independently but responds to the same response headers. &lt;code&gt;Cache-Control&lt;/code&gt; is the single header that controls all of them, with directives that target different layers specifically.&lt;/p&gt;

&lt;p&gt;The most reliable approach to multi-layer caching:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;public, max-age=31536000, immutable&lt;/code&gt; for static assets with content-addressed URLs&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;no-cache&lt;/code&gt; for HTML pages so caches may store them but must revalidate with the origin before serving them&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;private, no-store&lt;/code&gt; for personalized API responses&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;s-maxage&lt;/code&gt; to differentiate CDN and browser TTLs when needed&lt;/li&gt;
&lt;li&gt;Add &lt;code&gt;Vary&lt;/code&gt; for any response where content differs by request header&lt;/li&gt;
&lt;/ul&gt;
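
&lt;p&gt;Put together, those rules map onto an origin roughly like the sketch below. The routes and payloads are illustrative; the &lt;code&gt;Cache-Control&lt;/code&gt; values are the point:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch: the rules above applied to three response types
// (Node.js; routes, payloads, and port are illustrative).
import { createServer } from "node:http";

const server = createServer(function (req, res) {
  const url = String(req.url);

  if (url.startsWith("/assets/")) {
    // Content-addressed static asset, e.g. /assets/app.9f8c2.js:
    // cache everywhere for a year and never revalidate.
    res.setHeader("Cache-Control", "public, max-age=31536000, immutable");
    res.end("asset bytes");
    return;
  }

  if (url.startsWith("/api/account")) {
    // Personalized API response: never stored by any cache.
    res.setHeader("Cache-Control", "private, no-store");
    res.setHeader("Content-Type", "application/json");
    res.end(JSON.stringify({ plan: "pro" }));
    return;
  }

  // HTML: any cache may store it, but must revalidate before serving.
  // Pair with an ETag (as shown earlier) to keep revalidation cheap.
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Content-Type", "text/html");
  res.end("page markup");
});

server.listen(3000);
&lt;/code&gt;&lt;/pre&gt;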

&lt;p&gt;For more detail on how these directives interact with CDN behavior and deployment workflows, &lt;a href="https://137foundry.com" rel="noopener noreferrer"&gt;137Foundry&lt;/a&gt; covers caching configuration as part of web application delivery at &lt;a href="https://137foundry.com/services" rel="noopener noreferrer"&gt;our services page&lt;/a&gt;. The full article &lt;a href="https://137foundry.com/articles/http-caching-web-application-guide" rel="noopener noreferrer"&gt;HTTP Caching: A Practical Guide for Web Developers&lt;/a&gt; covers each directive in depth. The &lt;a href="https://httpwg.org" rel="noopener noreferrer"&gt;HTTP caching RFC at the IETF&lt;/a&gt; is the authoritative specification if you need to resolve ambiguous behavior. &lt;a href="https://web.dev" rel="noopener noreferrer"&gt;Google's web.dev caching guide&lt;/a&gt; is the most accessible reference for practical application. For CDN-specific caching behavior and how edge nodes modify response headers, &lt;a href="https://www.cloudflare.com" rel="noopener noreferrer"&gt;Cloudflare's caching documentation&lt;/a&gt; is the most detailed public reference for a widely deployed CDN.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fizj2fdot9vp0lfx5ys2o.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fizj2fdot9vp0lfx5ys2o.jpeg" alt="developer reviewing web performance metrics on dashboard" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Daniil Komov on &lt;a href="https://www.pexels.com" rel="noopener noreferrer"&gt;Pexels&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
