<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem</title>
    <description>The most recent home feed on Forem.</description>
    <link>https://forem.com</link>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed"/>
    <language>en</language>
    <item>
      <title>Synchronization in Node.js: Why Single-Threaded Does Not Mean Safe From Concurrency Problems</title>
      <dc:creator>CodeWithIshwar</dc:creator>
      <pubDate>Mon, 18 May 2026 16:41:53 +0000</pubDate>
      <link>https://forem.com/codewithishwar/synchronization-in-nodejs-why-single-threaded-does-not-mean-safe-from-concurrency-problems-52dm</link>
      <guid>https://forem.com/codewithishwar/synchronization-in-nodejs-why-single-threaded-does-not-mean-safe-from-concurrency-problems-52dm</guid>
      <description>&lt;p&gt;One of the most common misconceptions about Node.js is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Node.js is single-threaded, so synchronization problems cannot happen.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is only partially true.&lt;/p&gt;

&lt;p&gt;Yes, JavaScript execution in Node.js runs on a single thread using the event loop. But modern backend applications deal with asynchronous operations, external services, databases, queues, and distributed systems — all of which introduce concurrency challenges.&lt;/p&gt;

&lt;p&gt;Understanding synchronization in Node.js is essential if you want to build scalable and reliable backend systems.&lt;/p&gt;




&lt;h1&gt;
  
  
  What Is Synchronization?
&lt;/h1&gt;

&lt;p&gt;Synchronization is the process of controlling access to shared resources when multiple operations happen at the same time.&lt;/p&gt;

&lt;p&gt;The goal is to prevent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Race conditions&lt;/li&gt;
&lt;li&gt;Data inconsistency&lt;/li&gt;
&lt;li&gt;Duplicate processing&lt;/li&gt;
&lt;li&gt;Lost updates&lt;/li&gt;
&lt;li&gt;Unexpected behavior&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  Why Synchronization Still Matters in Node.js
&lt;/h1&gt;

&lt;p&gt;Node.js applications commonly handle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple API requests simultaneously&lt;/li&gt;
&lt;li&gt;Concurrent database updates&lt;/li&gt;
&lt;li&gt;Shared cache access&lt;/li&gt;
&lt;li&gt;Async file operations&lt;/li&gt;
&lt;li&gt;Background job processing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even though JavaScript itself runs on one thread, async operations interleave: whenever a handler awaits I/O, the event loop is free to run other handlers before the first one resumes.&lt;/p&gt;

&lt;p&gt;This creates situations where multiple operations interact with the same resource concurrently.&lt;/p&gt;




&lt;h1&gt;
  
  
  Example: Race Condition
&lt;/h1&gt;

&lt;p&gt;Imagine a wallet service:&lt;/p&gt;

&lt;p&gt;Initial balance = ₹1000&lt;/p&gt;

&lt;p&gt;Two requests arrive:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Request A deducts ₹300&lt;/li&gt;
&lt;li&gt;Request B deducts ₹500&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If both requests:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read the same balance&lt;/li&gt;
&lt;li&gt;Update independently&lt;/li&gt;
&lt;li&gt;Save the result&lt;/li&gt;
&lt;/ol&gt;

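&lt;p&gt;…the final balance will be wrong. Both requests read ₹1000, so one saves ₹700 and the other saves ₹500. Whichever write lands last wins, and the other deduction is lost. The correct balance is ₹200.&lt;/p&gt;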

&lt;p&gt;This is called a race condition.&lt;/p&gt;
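&lt;p&gt;The wallet scenario can be reproduced in a few lines. This is a minimal sketch, not production code: the &lt;code&gt;wallet&lt;/code&gt; object, the &lt;code&gt;deduct&lt;/code&gt; helper, and the 10 ms delays are all assumptions standing in for a real database read and write.&lt;/p&gt;

```javascript
// Hypothetical helper: pauses for `ms` milliseconds to mimic I/O latency.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Unsafe read-modify-write: each await lets the other request run in between.
async function deduct(wallet, amount) {
  await delay(10);                 // simulated database read
  const current = wallet.balance;  // both requests can observe 1000 here
  await delay(10);                 // simulated database write
  wallet.balance = current - amount;
}

async function runRace() {
  const wallet = { balance: 1000 };
  await Promise.all([deduct(wallet, 300), deduct(wallet, 500)]);
  return wallet.balance;
}

runRace().then((finalBalance) => {
  // One deduction is lost: the result is 700 or 500, never the correct 200.
  console.log("final balance:", finalBalance);
});
```

&lt;p&gt;No threads are involved. The interleaving comes entirely from the two &lt;code&gt;await&lt;/code&gt; points inside &lt;code&gt;deduct&lt;/code&gt;.&lt;/p&gt;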




&lt;h1&gt;
  
  
  Common Synchronization Techniques in Node.js
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1. Database Transactions
&lt;/h2&gt;

&lt;p&gt;Transactions group several reads and writes into a single unit that either commits completely or rolls back completely.&lt;/p&gt;

&lt;p&gt;Useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Payment systems&lt;/li&gt;
&lt;li&gt;Banking workflows&lt;/li&gt;
&lt;li&gt;Order processing&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  2. Atomic Operations
&lt;/h2&gt;

&lt;p&gt;Databases provide atomic update operations, along with locking mechanisms that protect read-modify-write sequences.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MongoDB &lt;code&gt;$inc&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;PostgreSQL row locking&lt;/li&gt;
&lt;li&gt;Optimistic locking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These reduce concurrency conflicts.&lt;/p&gt;
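&lt;p&gt;Optimistic locking is the easiest of these to sketch without a database. The example below is an in-memory illustration under stated assumptions: &lt;code&gt;createRow&lt;/code&gt;, &lt;code&gt;compareAndSet&lt;/code&gt;, and &lt;code&gt;deductOptimistic&lt;/code&gt; are hypothetical names, and the &lt;code&gt;version&lt;/code&gt; field stands in for a version column checked in an update's WHERE clause.&lt;/p&gt;

```javascript
// Hypothetical helper: pauses for `ms` milliseconds to mimic I/O latency.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// In-memory stand-in for a table row with a version column.
function createRow(balance) {
  return { balance, version: 0 };
}

// Models an update that only applies when the stored version still matches.
// Returns true when the row was updated, false when another writer got there first.
function compareAndSet(row, expectedVersion, newBalance) {
  if (row.version !== expectedVersion) return false;
  row.balance = newBalance;
  row.version += 1;
  return true;
}

// Read, compute, then write only if nobody changed the row; otherwise retry.
async function deductOptimistic(row, amount, maxAttempts = 5) {
  for (let attempt = 1; maxAttempts >= attempt; attempt += 1) {
    const snapshot = { balance: row.balance, version: row.version };
    await delay(10); // simulated latency between read and write
    if (compareAndSet(row, snapshot.version, snapshot.balance - amount)) {
      return attempt;
    }
  }
  throw new Error("too many conflicting updates");
}

async function runOptimistic() {
  const row = createRow(1000);
  await Promise.all([deductOptimistic(row, 300), deductOptimistic(row, 500)]);
  return row.balance;
}

runOptimistic().then((finalBalance) => {
  console.log("final balance:", finalBalance); // 200
});
```

&lt;p&gt;The request that loses the version check re-reads the row and tries again, so both deductions are applied.&lt;/p&gt;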




&lt;h2&gt;
  
  
  3. Redis Distributed Locks
&lt;/h2&gt;

&lt;p&gt;In distributed systems, Redis locks help ensure only one worker processes a task at a time.&lt;/p&gt;

&lt;p&gt;Commonly used in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Payment handling&lt;/li&gt;
&lt;li&gt;Cron jobs&lt;/li&gt;
&lt;li&gt;Distributed workers&lt;/li&gt;
&lt;/ul&gt;
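&lt;p&gt;The semantics of a simple Redis lock can be shown without a Redis server. In the sketch below, a &lt;code&gt;Map&lt;/code&gt; stands in for Redis, and &lt;code&gt;acquireLock&lt;/code&gt; and &lt;code&gt;releaseLock&lt;/code&gt; are hypothetical helpers. The acquire step mirrors what &lt;code&gt;SET key token NX PX ttl&lt;/code&gt; does: it succeeds only when the key is absent or expired.&lt;/p&gt;

```javascript
// In-memory stand-in for a Redis server: key -> { token, expiresAt }.
const store = new Map();

// Models `SET key token NX PX ttlMs`: succeeds only if the key is absent or expired.
// Returns the owner token on success, or null when the lock is held.
function acquireLock(key, ttlMs, now = Date.now()) {
  const existing = store.get(key);
  if (existing !== undefined) {
    if (existing.expiresAt > now) return null; // still held by another worker
  }
  const token = `${now}-${Math.random().toString(36).slice(2)}`;
  store.set(key, { token, expiresAt: now + ttlMs });
  return token;
}

// Releases the lock only if the caller still owns it (the token matches).
function releaseLock(key, token) {
  const existing = store.get(key);
  if (existing === undefined || existing.token !== token) return false;
  store.delete(key);
  return true;
}

const first = acquireLock("cron:nightly-report", 5000);
const second = acquireLock("cron:nightly-report", 5000);
console.log(first !== null, second === null); // true true
releaseLock("cron:nightly-report", first);
```

&lt;p&gt;The expiry keeps a crashed worker from holding the lock forever, and the token keeps a slow worker from releasing a lock that has since passed to someone else. Against a real Redis server, the check-and-delete in &lt;code&gt;releaseLock&lt;/code&gt; has to run atomically, which is commonly done with a small Lua script.&lt;/p&gt;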




&lt;h2&gt;
  
  
  4. Mutexes
&lt;/h2&gt;

&lt;p&gt;Mutexes restrict access to critical sections of code.&lt;/p&gt;

&lt;p&gt;Only one async operation can enter at a time.&lt;/p&gt;
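&lt;p&gt;A minimal promise-based mutex is enough to show the idea. This is a sketch rather than a library API: the &lt;code&gt;Mutex&lt;/code&gt; class, &lt;code&gt;runExclusive&lt;/code&gt;, and &lt;code&gt;deductSafely&lt;/code&gt; are hypothetical names, and the delays stand in for database calls.&lt;/p&gt;

```javascript
// Hypothetical helper: pauses for `ms` milliseconds to mimic I/O latency.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

class Mutex {
  constructor() {
    this.tail = Promise.resolve(); // settles when the last queued task finishes
  }

  // Runs `task` only after every previously queued task has settled.
  runExclusive(task) {
    const result = this.tail.then(() => task());
    this.tail = result.catch(() => {}); // keep the chain usable after a failure
    return result;
  }
}

const mutex = new Mutex();

// The whole read-modify-write sequence runs inside the critical section.
function deductSafely(wallet, amount) {
  return mutex.runExclusive(async () => {
    await delay(10);                 // simulated database read
    const current = wallet.balance;
    await delay(10);                 // simulated database write
    wallet.balance = current - amount;
  });
}

async function runWithMutex() {
  const wallet = { balance: 1000 };
  await Promise.all([deductSafely(wallet, 300), deductSafely(wallet, 500)]);
  return wallet.balance;
}

runWithMutex().then((finalBalance) => {
  console.log("final balance:", finalBalance); // 200
});
```

&lt;p&gt;An in-process mutex like this only protects a single Node.js process. Once several instances run behind a load balancer, the coordination has to move to the database or to a distributed lock.&lt;/p&gt;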




&lt;h2&gt;
  
  
  5. Message Queues
&lt;/h2&gt;

&lt;p&gt;Queues can serialize work: when jobs for a resource go through a single consumer, they are processed one after another rather than concurrently.&lt;/p&gt;

&lt;p&gt;Popular tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;BullMQ&lt;/li&gt;
&lt;li&gt;RabbitMQ&lt;/li&gt;
&lt;li&gt;Kafka&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  Important Takeaway
&lt;/h1&gt;

&lt;p&gt;Single-threaded does NOT mean concurrency-safe.&lt;/p&gt;

&lt;p&gt;As applications scale, synchronization becomes critical for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-traffic APIs&lt;/li&gt;
&lt;li&gt;Financial systems&lt;/li&gt;
&lt;li&gt;Real-time platforms&lt;/li&gt;
&lt;li&gt;Distributed architectures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The real complexity in backend engineering often comes from handling concurrency correctly.&lt;/p&gt;




&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Node.js provides excellent performance through asynchronous and non-blocking architecture.&lt;/p&gt;

&lt;p&gt;But scalable backend systems still require proper synchronization strategies to maintain correctness and reliability.&lt;/p&gt;

&lt;p&gt;Understanding concurrency is what transforms developers into backend engineers capable of designing production-grade systems.&lt;/p&gt;


</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>javascript</category>
    </item>
    <item>
      <title>AI Coding Tools Need Better Boundaries, Not Better Prompts</title>
      <dc:creator>ClickIT - DevOps and Software Development</dc:creator>
      <pubDate>Mon, 18 May 2026 16:41:48 +0000</pubDate>
      <link>https://forem.com/clickit_devops/ai-coding-tools-need-better-boundaries-not-better-prompts-ipk</link>
      <guid>https://forem.com/clickit_devops/ai-coding-tools-need-better-boundaries-not-better-prompts-ipk</guid>
      <description>&lt;p&gt;One thing becoming increasingly obvious with AI-assisted development:&lt;/p&gt;

&lt;p&gt;LLMs are great at generating code.&lt;br&gt;
They’re not great at making architectural decisions.&lt;/p&gt;

&lt;p&gt;A lot of teams are discovering the same pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;rapid prototyping feels amazing,&lt;/li&gt;
&lt;li&gt;shipping gets faster,&lt;/li&gt;
&lt;li&gt;but long-term maintainability starts degrading quietly in the background.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The problem usually isn’t the generated code itself.&lt;/p&gt;

&lt;p&gt;It’s the lack of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;clear contracts,&lt;/li&gt;
&lt;li&gt;deterministic workflows,&lt;/li&gt;
&lt;li&gt;validation layers,&lt;/li&gt;
&lt;li&gt;and shared engineering conventions before generation even starts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without those boundaries, AI tends to optimize for local correctness instead of system consistency.&lt;/p&gt;

&lt;p&gt;That’s why workflows like &lt;strong&gt;Spec-Driven Development (SDD)&lt;/strong&gt; are becoming more relevant as teams integrate AI deeper into production environments.&lt;/p&gt;

&lt;p&gt;Instead of relying on increasingly complex prompts, SDD focuses on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;defining contracts first,&lt;/li&gt;
&lt;li&gt;validating specs before implementation,&lt;/li&gt;
&lt;li&gt;constraining generation scope,&lt;/li&gt;
&lt;li&gt;and treating LLMs more like implementation engines than autonomous architects.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, this tends to produce:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;more predictable outputs,&lt;/li&gt;
&lt;li&gt;cleaner collaboration between engineers,&lt;/li&gt;
&lt;li&gt;and codebases that are actually maintainable months later.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’ve been exploring this topic internally and recently put together a breakdown of how Spec-Driven Development can help create more reliable AI-assisted workflows in real-world engineering environments.&lt;/p&gt;

&lt;p&gt;If the topic sounds interesting, here’s the discussion:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://youtu.be/VFVfnv8u8Fo?si=Px-gu1Dmxo46aedi" rel="noopener noreferrer"&gt;Stop "Vibe Coding" and Start Spec-Driven Development | Part 1&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Curious how other teams here are approaching this shift:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are you introducing stricter boundaries around AI-generated code?&lt;/li&gt;
&lt;li&gt;Have specs become more important in your workflow?&lt;/li&gt;
&lt;li&gt;Or are you still experimenting with prompting strategies first?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Feels like the industry is slowly moving from: &lt;em&gt;“AI can generate code”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;to: &lt;em&gt;“How do we engineer systems around probabilistic generators?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And that’s a much more interesting problem...&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vibecoding</category>
      <category>discuss</category>
    </item>
    <item>
      <title>11 Agentic Testing Tools to Know in 2026</title>
      <dc:creator>Alvin Lee</dc:creator>
      <pubDate>Mon, 18 May 2026 16:40:46 +0000</pubDate>
      <link>https://forem.com/alvinslee/11-agentic-testing-tools-to-know-in-2026-22dg</link>
      <guid>https://forem.com/alvinslee/11-agentic-testing-tools-to-know-in-2026-22dg</guid>
      <description>&lt;p&gt;Agentic testing tools help teams plan, generate, adapt, and run tests with far less manual effort. They’re quickly becoming part of how modern QA scales without slowing delivery.&lt;/p&gt;

&lt;p&gt;One thing to get right from the start is scope. Agentic testing tools vary significantly in what they do, where they fit, and how much strategic impact they have. Some are point solutions that help you author or run tests faster. Others sit inside broader AI-driven quality platforms that prioritize risk, optimize test portfolios, and enforce quality gates across the pipeline.&lt;/p&gt;

&lt;p&gt;This post covers 11 agentic testing tools to know about in 2026. They’re grouped so you can compare them based on scope, strengths, and fit for your organization.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is an agentic testing tool?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;An agentic testing tool is software that uses AI agents to autonomously plan, generate, maintain, and execute tests. It often makes decisions based on context, such as requirements, code changes, risk signals, or past results.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It goes beyond AI-assisted automation by adding initiative and workflow-level decision-making. Instead of only suggesting what to do next, it takes action within defined boundaries.&lt;/p&gt;

&lt;p&gt;Here are 11 agentic testing tools grouped by scope. Each includes a summary and key strengths and considerations. Let’s go!&lt;/p&gt;

&lt;h2&gt;
  
  
  Enterprise AI-driven quality platforms
&lt;/h2&gt;

&lt;p&gt;These platforms extend beyond &lt;a href="https://www.tricentis.com/blog/agentic-test-creation-ai-qtest" rel="noopener noreferrer"&gt;test creation&lt;/a&gt; to orchestrate automation, intelligence, and governance at scale. They are suited for organizations that require stability, risk prioritization, and release confidence across complex environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Tricentis Tosca
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.tricentis.com/products/automate-continuous-testing-tosca" rel="noopener noreferrer"&gt;Tricentis Tosca&lt;/a&gt; is designed for enterprise &lt;a href="https://www.tricentis.com/blog/the-third-era-of-test-automation-ai" rel="noopener noreferrer"&gt;test automation&lt;/a&gt; where stability, scale, and governance matter. In an agentic context, the shift is moving from “write and maintain scripts” to “orchestrate outcomes,” especially across complex apps and high-change environments.&lt;/p&gt;

&lt;p&gt;Tricentis enables &lt;a href="https://www.tricentis.com/solutions/ai-powered-solutions" rel="noopener noreferrer"&gt;AI-driven testing and agentic quality engineering&lt;/a&gt; across your delivery pipeline. It also positions MCP as a way to bridge AI and testing tools through a universal integration approach, which matters if you’re thinking about agentic workflows that span multiple systems.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Suitable for large regression suites and complex end-to-end workflows.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AI-assisted resilience helps reduce long-term maintenance costs.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The highest value shows up when teams commit to governance and standardization (not “ad hoc scripts”).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Adoption typically requires alignment across QA, engineering, and release stakeholders.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. SmartBear
&lt;/h3&gt;

&lt;p&gt;SmartBear is best viewed as a broad testing portfolio vendor that has been positioning around AI across testing workflows.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Covers multiple testing disciplines.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Suitable for consolidated vendor strategies.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;AI depth varies across products.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Portfolio integration matters.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. UiPath Test Suite
&lt;/h3&gt;

&lt;p&gt;UiPath Test Suite extends testing into broader automation ecosystems. In an agentic context, it is relevant for teams that want testing integrated into AI-driven business process automation and orchestration environments.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Aligns testing with broader automation initiatives.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fits organizations standardizing around enterprise automation platforms.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Strongest value when already invested in the UiPath ecosystem.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Organizations must evaluate how deeply autonomous testing workflows integrate with CI/CD.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  AI-native testing platforms
&lt;/h2&gt;

&lt;p&gt;AI-native testing platforms are built with AI at the core of test creation and execution workflows. They aim to reduce friction from requirements to automation and help teams maintain speed and stability as systems evolve.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. ACCELQ
&lt;/h3&gt;

&lt;p&gt;ACCELQ positions itself around AI-powered automation and end-to-end testing acceleration. For agentic buyers, the key question is whether the platform reduces friction from requirements to automation to execution and whether it can keep pace as systems change.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Faster ramp-up for automation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Structured automation workflows.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Like any platform, success depends on fit with your stack and operating model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensure governance and explainability are strong enough for enterprise release standards.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. mabl
&lt;/h3&gt;

&lt;p&gt;mabl is an AI-native testing vendor geared toward continuous testing and reducing maintenance overhead. For agentic tool evaluation, focus on whether AI helps you run reliably at speed, not just generate tests during setup.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;CI/CD integration.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Automation resilience focus.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Primarily web-centric workflows.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Enterprise governance depth varies.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  6. Functionize
&lt;/h3&gt;

&lt;p&gt;Functionize is commonly positioned as AI-forward test automation focused on reducing manual work across authoring, execution, and maintenance. In a practical agentic sense, tools like this aim to do more of the work for you, especially around test upkeep as systems evolve.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Lifecycle focus: value isn’t only authoring, but also keeping tests healthy over time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AI-forward orientation fits teams pushing toward higher autonomy.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Scope depends on team maturity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Organizations may need to evaluate governance needs more deeply.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Point-solution agentic tools
&lt;/h2&gt;

&lt;p&gt;Point-solution agentic tools focus on solving a specific testing bottleneck rather than managing the full quality lifecycle. They are often used to accelerate test authoring, execution, or UI interaction without requiring a broader platform shift.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. testRigor
&lt;/h3&gt;

&lt;p&gt;testRigor is typically associated with natural-language-driven test creation and reducing scripting complexity. For agentic buyers, it often lands in the “make authoring easier” category.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Lower barrier to authoring.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Rapid initial automation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Primarily focused on UI regression.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Potential trade-off between depth and creation speed.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  8. QA Wolf
&lt;/h3&gt;

&lt;p&gt;QA Wolf is often positioned around fast test creation and managed execution models for teams that want results without building everything in-house. In an agentic tooling conversation, this fits as a way to compress time-to-value, especially when internal bandwidth is limited.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Fast time to coverage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Managed execution support.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The operational model differs from in-house-only tools.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Evaluate long-term scaling fit.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  9. Virtuoso QA
&lt;/h3&gt;

&lt;p&gt;Virtuoso is frequently grouped with AI-led UI testing approaches that aim to reduce manual scripting and increase resilience. Its relevance depends on whether it meaningfully adapts and maintains tests as the app changes, not just how quickly it creates them.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Faster UI automation creation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reduced scripting complexity.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Validate the reality of flake handling and maintenance in your environment (dynamic UIs expose gaps quickly).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensure pipeline integration and evidence output meet enterprise needs.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  10. AskUI
&lt;/h3&gt;

&lt;p&gt;AskUI approaches automation through UI perception and interaction. That can matter when you test across varied front ends, remote desktops, or environments where DOM-level automation is not always feasible.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Useful for UI-driven automation challenges.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Works across heterogeneous UI surfaces.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Typically narrower in scope than end-to-end platforms.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Validate stability and evidence outputs for long-running regression usage.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  11. CoTester by TestGrid
&lt;/h3&gt;

&lt;p&gt;CoTester lands in the agentic assistant space for testing workflows. Tools in this category typically let you offload specific tasks, helping your team by generating tests, suggesting validations, or scaling coverage with less effort.&lt;/p&gt;

&lt;h4&gt;
  
  
  Strengths
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Assistant-style support for testing tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Accelerates defined QA activities.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Considerations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Not a full end-to-end platform.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Best as a complementary capability.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How agentic technology applies to modern testing
&lt;/h2&gt;

&lt;p&gt;Agentic testing brings the agent loop into quality workflows. It decides what to test, executes the work, evaluates results, and adjusts based on context.&lt;/p&gt;

&lt;p&gt;Here’s what that looks like in real delivery pipelines:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Planning&lt;/strong&gt;: Interpreting requirements, code changes, and risk signals to select the right tests.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Execution&lt;/strong&gt;: Running tests and collecting evidence.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Adaptation&lt;/strong&gt;: Repairing brittle selectors and managing flakiness as systems change.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Governance&lt;/strong&gt;: Enforcing quality gates based on measurable signals such as coverage and change impact.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Agentic testing is not AI that writes tests. It is AI that runs a quality workflow.&lt;/p&gt;
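&lt;p&gt;The planning step can be made concrete with a small sketch of change-based test selection. Everything here is illustrative: the &lt;code&gt;coverageMap&lt;/code&gt; data, the file names, and the &lt;code&gt;selectTests&lt;/code&gt; function are assumptions, not the behavior of any tool listed above.&lt;/p&gt;

```javascript
// Hypothetical coverage map: test name -> source files that test exercises.
const coverageMap = {
  "checkout.spec": ["src/cart.js", "src/payment.js"],
  "login.spec": ["src/auth.js"],
  "profile.spec": ["src/auth.js", "src/profile.js"],
};

// Returns the tests that touch at least one changed file, sorted by name.
function selectTests(changedFiles, coverage) {
  const changed = new Set(changedFiles);
  return Object.entries(coverage)
    .filter(([, files]) => files.some((file) => changed.has(file)))
    .map(([testName]) => testName)
    .sort();
}

// A change to src/auth.js selects login.spec and profile.spec, and skips checkout.spec.
console.log(selectTests(["src/auth.js"], coverageMap));
```

&lt;p&gt;A real agentic tool would presumably weigh more signals than file overlap, such as risk and past failures, but the principle of running only what a change can affect is the same.&lt;/p&gt;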

&lt;h2&gt;
  
  
  How to choose the right agentic testing tool
&lt;/h2&gt;

&lt;p&gt;Buying decisions usually fail for one of two reasons: teams choose a point tool when they actually need a platform, or they buy a platform when they need quick, targeted relief. Use this checklist to avoid both mistakes.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Start with scope: assistant, point solution, or platform?
&lt;/h3&gt;

&lt;p&gt;Ask one blunt question: Do you need help authoring tests, or do you need help governing release confidence?&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Demand measurable outcomes, not demos
&lt;/h3&gt;

&lt;p&gt;Demos can look impressive, but real value shows up in production metrics. Look for clear improvements in regression time, maintenance effort, flake rate, defect escapes, and coverage visibility. If success cannot be measured, ROI will be hard to prove.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Validate governance: explainability, auditability, control
&lt;/h3&gt;

&lt;p&gt;Agentic systems take action, so your team must understand why. You should be able to explain test selection, recent changes, and the evidence behind a release decision, especially in regulated and enterprise environments.&lt;/p&gt;




&lt;p&gt;If you want agentic testing that scales beyond a single team or application, you need more than a test generator. You need an AI-driven approach that connects automation, intelligence, and governance.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ: Agentic testing tools in 2026
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What makes a testing tool truly agentic?
&lt;/h3&gt;

&lt;p&gt;A testing tool is truly agentic if it can independently plan and execute testing actions based on context, such as code changes, requirements, or risk signals. It does not just suggest next steps. It selects tests after a pull request, generates tests from requirements, repairs broken locators, and enforces quality gates with minimal human input.&lt;/p&gt;

&lt;h3&gt;
  
  
  Are agentic testing tools the same as AI test automation?
&lt;/h3&gt;

&lt;p&gt;No. AI test automation typically assists with parts of automation, such as smarter locators or faster script creation. Agentic testing tools go further by automating decision-making across workflows. They can decide which tests to run for a build, identify untested code changes, and prioritize high-risk areas without manual triage.&lt;/p&gt;

&lt;h3&gt;
  
  
  What results should I expect from agentic testing?
&lt;/h3&gt;

&lt;p&gt;Most teams see measurable improvements in regression cycle time and maintenance effort when agentic workflows are implemented correctly. A realistic benchmark is reducing regression runtime by 30–70% through change-based test selection and cutting maintenance effort by 30–50% through self-healing automation and flake reduction.&lt;/p&gt;

</description>
      <category>testing</category>
      <category>agents</category>
      <category>ai</category>
      <category>tooling</category>
    </item>
    <item>
      <title>I Replaced a Polling Loop With Three React Hooks and a Firestore Rule</title>
      <dc:creator>R.N.Krishnan</dc:creator>
      <pubDate>Mon, 18 May 2026 16:40:45 +0000</pubDate>
      <link>https://forem.com/sukuna_rayomen_3c62288044/i-replaced-a-polling-loop-with-three-react-hooks-and-a-firestore-rule-495m</link>
      <guid>https://forem.com/sukuna_rayomen_3c62288044/i-replaced-a-polling-loop-with-three-react-hooks-and-a-firestore-rule-495m</guid>
      <description>&lt;p&gt;The first version of the VORTEX dashboard polled an API endpoint every five seconds. It worked. It also meant the UI could be up to five seconds behind reality, and every agent write to Firestore required a separate read path just to surface it on screen. I replaced the whole thing with three custom hooks and &lt;code&gt;onSnapshot&lt;/code&gt; listeners. The dashboard has been real-time since, with no polling, no message queue, and no separate read model.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5almmju726vrg0bqhdj9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5almmju726vrg0bqhdj9.png" alt=" " width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Data Model First
&lt;/h2&gt;

&lt;p&gt;Before writing a single hook, I mapped out exactly which Firestore collections existed and who owned them:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Writers&lt;/th&gt;
&lt;th&gt;Readers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;leads&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Agent 1, Agent 7&lt;/td&gt;
&lt;td&gt;Dashboard, Agent 4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;activity_feed&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All agents (append only)&lt;/td&gt;
&lt;td&gt;Dashboard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;product_intelligence&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Agent 6&lt;/td&gt;
&lt;td&gt;Dashboard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;agent_logs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All agents&lt;/td&gt;
&lt;td&gt;Dashboard (Debate Log)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This table is the reason the security rules look the way they do:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// firestore.rules&lt;/span&gt;
&lt;span class="nx"&gt;match&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;activity_feed&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;eventId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;allow&lt;/span&gt; &lt;span class="na"&gt;read&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;allow&lt;/span&gt; &lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;    &lt;span class="c1"&gt;// All agents append&lt;/span&gt;
  &lt;span class="c1"&gt;// No update, no delete — the feed is append-only by design&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;match&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;leadId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;allow&lt;/span&gt; &lt;span class="na"&gt;read&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;auth&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;allow&lt;/span&gt; &lt;span class="na"&gt;write&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;auth&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;activity_feed&lt;/code&gt; collection is append-only deliberately. No agent ever updates or deletes a feed entry. This means the feed is a reliable audit trail of what happened, in order — you can replay it from any point without worrying about entries being mutated after the fact.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Hooks
&lt;/h2&gt;

&lt;p&gt;The entire dashboard data layer is three hooks: &lt;code&gt;useLeads&lt;/code&gt;, &lt;code&gt;useActivityFeed&lt;/code&gt;, and &lt;code&gt;useProductIntel&lt;/code&gt;. Each one owns one collection and one &lt;code&gt;onSnapshot&lt;/code&gt; listener.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// hooks/index.jsx — useLeads&lt;/span&gt;
import { useEffect, useState } from 'react';
import { collection, limit, onSnapshot, orderBy, query } from 'firebase/firestore';
// Assumed to be defined elsewhere in the project: db (the Firestore instance) and INITIAL_LEADS.
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useLeads&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setLeads&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;INITIAL_LEADS&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;leads&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="nf"&gt;orderBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;intent_score&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;desc&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;onSnapshot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;snapshot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fbLeads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;snapshot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;data&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="p"&gt;}));&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fbLeads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;setLeads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fbLeads&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things worth noting here:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The &lt;code&gt;if (fbLeads.length &amp;gt; 0)&lt;/code&gt; guard.&lt;/strong&gt; Without this, an empty Firestore collection on first load would wipe the seed data. The hook falls back to &lt;code&gt;INITIAL_LEADS&lt;/code&gt; — a hardcoded set of mock leads — until real data arrives. This means the dashboard is never blank, even before Firebase is configured.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;onSnapshot&lt;/code&gt; returns its own unsubscribe function.&lt;/strong&gt; Returning it directly from &lt;code&gt;useEffect&lt;/code&gt; means React calls it on unmount, cleaning up the listener automatically. No manual cleanup needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;orderBy('intent_score', 'desc')&lt;/code&gt;&lt;/strong&gt; means the Kanban always shows highest-intent leads first within each column, without any client-side sorting logic.&lt;/p&gt;
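
&lt;p&gt;The guard's behaviour can be isolated as a pure function. This is a sketch only: &lt;code&gt;nextLeads&lt;/code&gt;, &lt;code&gt;seed&lt;/code&gt; and &lt;code&gt;live&lt;/code&gt; are hypothetical names standing in for the state update, &lt;code&gt;INITIAL_LEADS&lt;/code&gt; and the mapped Firestore documents. It also shows one trade-off worth knowing about: once real data has arrived, a later empty snapshot leaves the previous rows on screen.&lt;/p&gt;

```javascript
// Hypothetical pure version of the guard: keep the previous list when the
// incoming snapshot is empty, otherwise take the new one.
function nextLeads(previous, incoming) {
  return incoming.length > 0 ? incoming : previous;
}

const seed = [{ id: 'seed-1', intent_score: 80 }]; // stands in for INITIAL_LEADS
const live = [{ id: 'abc', intent_score: 95 }];    // stands in for mapped Firestore docs

const afterEmpty = nextLeads(seed, []);      // seed data survives an empty snapshot
const afterLive = nextLeads(seed, live);     // real data replaces the seed
const afterWipe = nextLeads(afterLive, []);  // an emptied collection keeps the stale rows

console.log(afterEmpty === seed, afterLive === live, afterWipe === live);
```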

&lt;p&gt;The activity feed hook follows the same pattern, but orders by &lt;code&gt;timestamp&lt;/code&gt; and keeps only the 20 most recent entries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// hooks/index.jsx — useActivityFeed&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useActivityFeed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setEvents&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;INITIAL_EVENTS&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;activity_feed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="nf"&gt;orderBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;timestamp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;desc&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;onSnapshot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;snapshot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fbEvents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;snapshot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;data&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="p"&gt;}));&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fbEvents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;setEvents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fbEvents&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;events&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Twenty events, most recent first. Every agent write to &lt;code&gt;activity_feed&lt;/code&gt; triggers this listener, and the new feed item appears in the UI almost immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Metrics Hook Problem
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;useMetrics&lt;/code&gt; caused the most grief. The original version returned an array:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// The version that broke everything downstream&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;useMemo&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Total Leads&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;trend&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;+12%&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hot Leads Today&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hotLeads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;trend&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;+5&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Six components consumed this hook. Three of them destructured it as an array. Three treated it as a named object — &lt;code&gt;metrics.totalLeads&lt;/code&gt;, &lt;code&gt;metrics.hotLeads&lt;/code&gt;, &lt;code&gt;metrics.conversionRate&lt;/code&gt;. The array-consuming components worked. The object-consuming components silently got &lt;code&gt;undefined&lt;/code&gt; for every value and displayed nothing.&lt;/p&gt;

&lt;p&gt;The fix was making &lt;code&gt;useMetrics&lt;/code&gt; return a proper object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;useMemo&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;totalLeads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hotLeads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;l&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;l&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HOT_LEAD&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;conversionRate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;12.4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;highestScoreLead&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;max&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;l&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;max&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;l&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;intent_score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;max&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;intent_score&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;l&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;max&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="kc"&gt;null&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;totalLeads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;hotLeads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;emailsSent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;312&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;callsPlaced&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;89&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;demosBooked&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;conversionRate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;highestScore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;highestScoreLead&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;intent_score&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;highestScoreLead&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The lesson: if a hook returns structured data that multiple components consume, make it a named object from day one. Arrays are fine for lists. They're not fine for typed data shapes where consumers care about specific fields.&lt;/p&gt;
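
&lt;p&gt;The silent failure is easy to reproduce outside React. A minimal sketch, using hypothetical &lt;code&gt;metricsAsArray&lt;/code&gt; and &lt;code&gt;metricsAsObject&lt;/code&gt; values rather than the real hook output: reading a named field off an array does not throw, it just yields &lt;code&gt;undefined&lt;/code&gt;.&lt;/p&gt;

```javascript
// Hypothetical stand-ins for the two return shapes; not the project's real data.
const metricsAsArray = [
  { label: 'Total Leads', value: 42 },
  { label: 'Hot Leads Today', value: 7 },
];
const metricsAsObject = { totalLeads: 42, hotLeads: 7 };

// Arrays are objects, so reading an unknown property is simply undefined.
const fromArray = metricsAsArray.totalLeads;   // undefined, no error thrown
const fromObject = metricsAsObject.totalLeads; // 42

console.log(fromArray, fromObject);
```

&lt;p&gt;Because nothing throws, the only symptom is an empty card in the UI, which is why the mismatch went unnoticed.&lt;/p&gt;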

&lt;h2&gt;
  
  
  The useCountUp Hook
&lt;/h2&gt;

&lt;p&gt;The sidebar metrics animate from zero to their real value on load. That required a &lt;code&gt;useCountUp&lt;/code&gt; hook — something the codebase was importing but that didn't exist yet.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useCountUp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setValue&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rafRef&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useRef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;performance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tick&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;progress&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;eased&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;progress&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// ease-out cubic&lt;/span&gt;
      &lt;span class="nf"&gt;setValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;eased&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;progress&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;rafRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;requestAnimationFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tick&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="nx"&gt;rafRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;requestAnimationFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tick&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;cancelAnimationFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rafRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ease-out cubic means the number counts up fast at first and slows as it approaches the target. &lt;code&gt;requestAnimationFrame&lt;/code&gt; keeps it tied to the display refresh rate rather than a fixed interval. The &lt;code&gt;rafRef&lt;/code&gt; holds the animation frame ID so the cleanup function can cancel it properly on unmount — without this, switching tabs mid-animation would leave a hanging rAF loop.&lt;/p&gt;
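
&lt;p&gt;To see the curve in numbers, the easing formula from the hook can be pulled out as a pure function. The &lt;code&gt;easeOutCubic&lt;/code&gt; name and the target of 200 are illustrative assumptions, not part of the codebase.&lt;/p&gt;

```javascript
// Same easing formula as in useCountUp, extracted as a pure function.
function easeOutCubic(progress) {
  return 1 - Math.pow(1 - progress, 3);
}

// Hypothetical target of 200, sampled at five evenly spaced progress points.
const target = 200;
const samples = [0, 0.25, 0.5, 0.75, 1].map(
  (p) => Math.round(target * easeOutCubic(p))
);

console.log(samples); // [0, 116, 175, 197, 200]
```

&lt;p&gt;More than half of the distance is covered in the first quarter of the animation, and the last quarter only adds three.&lt;/p&gt;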




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6n7xfuerzj5hohgt4l8l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6n7xfuerzj5hohgt4l8l.png" alt=" " width="800" height="490"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The StrictMode Bug
&lt;/h2&gt;

&lt;p&gt;React's StrictMode runs effects twice in development — mount, unmount, remount. This exposed a bug in the &lt;code&gt;DebateTerminal&lt;/code&gt; component that replays the Hindsight agent log line by line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// The broken version&lt;/span&gt;
&lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;show&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;allLines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nf"&gt;setDisplayed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;allLines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]]);&lt;/span&gt;
    &lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;show&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// recursive — never cleaned up&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;show&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;clearTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// only cancels the first timeout&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cleanup only cancelled the initial 300ms delay. Once &lt;code&gt;show&lt;/code&gt; started calling itself recursively, those timeouts had no handle. In StrictMode, the simulated unmount left the first chain running, then the remount started a second chain. Two parallel loops, both writing to &lt;code&gt;displayed&lt;/code&gt; state with independent &lt;code&gt;idx&lt;/code&gt; counters, producing duplicate and out-of-order lines.&lt;/p&gt;

&lt;p&gt;The fix was storing the ID of whichever timeout is currently pending in a ref:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timeoutRef&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useRef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;show&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;allLines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nf"&gt;setPlaying&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;allLines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="nf"&gt;setDisplayed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;line&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;timeoutRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;show&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isHeader&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="nx"&gt;timeoutRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;show&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;clearTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;timeoutRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// cancels the whole chain&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the cleanup always cancels whichever timeout is currently pending. The chain breaks cleanly on unmount.&lt;/p&gt;

&lt;h2&gt;
  
  
  Seed Data as the Demo Mode
&lt;/h2&gt;

&lt;p&gt;The hooks layer has a deliberate fallback: if Firestore returns empty or throws, the UI renders from &lt;code&gt;INITIAL_LEADS&lt;/code&gt; and &lt;code&gt;INITIAL_EVENTS&lt;/code&gt;. This means the entire dashboard works without a Firebase project configured — useful for demos, useful for development, useful when the backend is down.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;leads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setLeads&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;INITIAL_LEADS&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// fallback always set first&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;onSnapshot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;snapshot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fbLeads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;snapshot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fbLeads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;setLeads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fbLeads&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// only overwrite if real data exists&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;VITE_USE_DEMO_DATA&lt;/code&gt; environment flag extends this further — when set, the Firebase initialization is skipped entirely and the hooks return seed data without attempting any Firestore connection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;onSnapshot&lt;/code&gt; is simpler than it looks.&lt;/strong&gt; It returns its own cleanup function, it handles reconnection automatically, and it pushes updates to all listeners simultaneously. For a dashboard that needs to reflect agent writes in real time, it's the right tool and it requires less infrastructure than a polling setup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Return named objects from data hooks, not arrays.&lt;/strong&gt; The &lt;code&gt;useMetrics&lt;/code&gt; bug would have been caught immediately with TypeScript. Without it, the silent &lt;code&gt;undefined&lt;/code&gt; failures are hard to trace because the component renders without errors — it just shows nothing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;StrictMode is a useful stress test.&lt;/strong&gt; The &lt;code&gt;DebateTerminal&lt;/code&gt; bug only appeared in development because of StrictMode's double-invoke behavior. That's the point — it surfaces cleanup bugs before they reach production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Seed data is infrastructure.&lt;/strong&gt; Having realistic fallback data in the hooks layer means the dashboard is always demonstrable, always developable, and always recoverable. It's not a hack — it's a design decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;The dashboard started as a polling loop hitting a REST endpoint. It's now three hooks, each owning one Firestore collection, each cleaning up after itself on unmount. The real-time behavior came for free once the data model was right. The hard part wasn't the Firestore integration — it was making the hooks clean enough that six different components could consume them without knowing anything about the underlying data source.&lt;/p&gt;

</description>
      <category>database</category>
      <category>javascript</category>
      <category>react</category>
      <category>webdev</category>
    </item>
    <item>
      <title>5 Free Image Compression Tools Compared: Privacy, Speed, and Quality (2026)</title>
      <dc:creator>yangjiaqiang12</dc:creator>
      <pubDate>Mon, 18 May 2026 16:37:02 +0000</pubDate>
      <link>https://forem.com/yangjiaqiang12/5-free-image-compression-tools-compared-privacy-speed-and-quality-2026-305n</link>
      <guid>https://forem.com/yangjiaqiang12/5-free-image-compression-tools-compared-privacy-speed-and-quality-2026-305n</guid>
      <description>&lt;h2&gt;
  
  
  The Test
&lt;/h2&gt;

&lt;p&gt;I tested 5 popular free image compression tools on the same 2MB photo to compare privacy, speed, and output quality. Here are the results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Privacy&lt;/th&gt;
&lt;th&gt;Batch&lt;/th&gt;
&lt;th&gt;WebP&lt;/th&gt;
&lt;th&gt;Output Size&lt;/th&gt;
&lt;th&gt;Time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Squash&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Local&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;td&gt;384KB&lt;/td&gt;
&lt;td&gt;1.2s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Squoosh&lt;/td&gt;
&lt;td&gt;Local&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;td&gt;367KB&lt;/td&gt;
&lt;td&gt;1.8s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TinyPNG&lt;/td&gt;
&lt;td&gt;Upload&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;td&gt;402KB&lt;/td&gt;
&lt;td&gt;4.3s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compressor.io&lt;/td&gt;
&lt;td&gt;Upload&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;td&gt;411KB&lt;/td&gt;
&lt;td&gt;6.1s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optimizilla&lt;/td&gt;
&lt;td&gt;Upload&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;td&gt;426KB&lt;/td&gt;
&lt;td&gt;5.8s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Key Findings
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Privacy-first tools are faster.&lt;/strong&gt; Squash and Squoosh process images locally using the Canvas API. With no network round-trip, they finished in 1.2s and 1.8s, against 4.3s to 6.1s for the upload-based tools: roughly 2-5x faster in this test.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Batch mode matters.&lt;/strong&gt; Squoosh produces the smallest files but processes images one at a time. If you have 20 product photos, that is 20 manual clicks. Squash combines batch processing with local privacy -- a combination no other tool in this comparison offers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WebP is the format to beat.&lt;/strong&gt; Tools supporting WebP output achieved 20-30% smaller files than JPEG-only tools at equivalent quality. WebP browser support is now at 97% globally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Upload-based tools are slower.&lt;/strong&gt; TinyPNG and Compressor.io took 4.3s and 6.1s per image in this test, against under 2s for the local tools. For batch work, the difference adds up quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Privacy Factor
&lt;/h2&gt;

&lt;p&gt;Uploading images to a third-party server is not just a privacy concern -- it is a compliance issue. If you handle client work, medical images, financial documents, or unreleased products, server-based tools are a liability.&lt;/p&gt;

&lt;p&gt;Browser-based tools solve this completely. The image never leaves your device. There is no server to hack, no database to leak, no privacy policy to trust.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Best overall:&lt;/strong&gt; &lt;a href="https://yangjiaqiang12.github.io/squash-image-compressor/" rel="noopener noreferrer"&gt;Squash&lt;/a&gt; -- free, private, batch mode, multi-format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best quality:&lt;/strong&gt; Squoosh -- slightly better compression but no batch mode&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best if you do not care about privacy:&lt;/strong&gt; TinyPNG -- established, reliable, but uploads your files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Try Squash:&lt;/strong&gt; &lt;a href="https://yangjiaqiang12.github.io/squash-image-compressor/" rel="noopener noreferrer"&gt;yangjiaqiang12.github.io/squash-image-compressor&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/yangjiaqiang12/squash-image-compressor" rel="noopener noreferrer"&gt;github.com/yangjiaqiang12/squash-image-compressor&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Support:&lt;/strong&gt; &lt;a href="https://ko-fi.com/squashtools" rel="noopener noreferrer"&gt;ko-fi.com/squashtools&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>design</category>
      <category>tools</category>
      <category>performance</category>
    </item>
    <item>
      <title>Custom behavior without custom code</title>
      <dc:creator>Ian Johnson</dc:creator>
      <pubDate>Mon, 18 May 2026 16:36:06 +0000</pubDate>
      <link>https://forem.com/tacoda/custom-behavior-without-custom-code-4l9g</link>
      <guid>https://forem.com/tacoda/custom-behavior-without-custom-code-4l9g</guid>
      <description>&lt;p&gt;Every successful SaaS product eventually meets the same question: a customer asks for something specific to them, you build it, and now you have a feature in your codebase that's only meant to run for one tenant. A year later, you have a dozen of these. The codebase has if-statements checking tenant IDs, the test suite mocks out customer-specific paths, and the senior engineer who knows which branch belongs to which customer is the only person who can refactor anything.&lt;/p&gt;

&lt;p&gt;There's a better shape, and it doesn't require giving up the per-customer customization. It does require separating, cleanly and firmly, the &lt;em&gt;code&lt;/em&gt; that defines what behaviors are possible from the &lt;em&gt;data&lt;/em&gt; that selects and parameterizes them. This article is about how to do that, where to store the data, and the security cliff you'll fall off if you let the data become code.&lt;/p&gt;

&lt;h2&gt;
  
  
  What not to do
&lt;/h2&gt;

&lt;p&gt;A handful of approaches show up over and over, and each has a fatal flaw:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Separate deployed instances per customer.&lt;/strong&gt; This solves customization by forking the operational surface. Now you have N versions of the database, N sets of background jobs, N deploy pipelines, N versions of every bug fix to roll out. It works for two or three customers and collapses by ten.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conditional code in the backend&lt;/strong&gt; — &lt;code&gt;if tenant_id == "acme": ...&lt;/code&gt;. Cheap on day one, untenable by month six. Every developer has to know the customer landscape to make changes safely. Every refactor is risky in proportion to how many tenants have branches. Customer-specific logic spreads across the codebase by capillary action.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code injected at build time.&lt;/strong&gt; A configuration that produces a different binary per tenant. Has the same operational cost as separate instances, plus the added joy of debugging behavior that depends on what compile-time flag was set. Don't.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern that scales is to keep one codebase, one running cluster, one deploy pipeline — and to let per-tenant behavior live in &lt;em&gt;data&lt;/em&gt; that the code consults. Basically, I am describing multi-tenancy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code defines the possibilities; data selects among them
&lt;/h2&gt;

&lt;p&gt;Identify the points in your system where behavior can vary per tenant. These are extension points: the discount engine, the approval workflow, the export format, the notification rules. At each one, your code defines a small set of behaviors it knows how to perform. Per-tenant data picks which behaviors to use and supplies the parameters.&lt;/p&gt;

&lt;p&gt;Concretely: a class hierarchy. A common shape is a &lt;code&gt;CustomRule&lt;/code&gt; base class with a contract (say, &lt;code&gt;applies(context)&lt;/code&gt; and &lt;code&gt;apply(context)&lt;/code&gt;) and a set of concrete implementations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CustomRule&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;applies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PercentageDiscountRule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CustomRule&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;percent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;min_order&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;percent&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min_order&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;min_order&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;applies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_total&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min_order&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;discount&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_total&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FirstPurchaseDiscountRule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CustomRule&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;applies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_count&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;discount&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A tenant's configuration is then a small declarative description — which rules they have, with what parameters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"discount_rules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"percent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"min_order"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"first_purchase"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At runtime, you load the tenant's config, hydrate it into instances of the right rule classes, and run them. The code knows how to perform every behavior; the data says which behaviors to apply, in what order, with what parameters. To add a new kind of rule, you add a new class. To add a new tenant configuration, you change data — no deploy, no migration, no engineering.&lt;/p&gt;
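
&lt;p&gt;The load-hydrate-run step can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the &lt;code&gt;RULE_TYPES&lt;/code&gt; registry, the &lt;code&gt;hydrate&lt;/code&gt; and &lt;code&gt;run_rules&lt;/code&gt; helpers, and the &lt;code&gt;Context&lt;/code&gt; and &lt;code&gt;Customer&lt;/code&gt; dataclasses are hypothetical names introduced here, and the rule classes are repeated (without the base class) only so the block runs on its own.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    order_count: int = 0


@dataclass
class Context:
    # Hypothetical request context; the article only names the fields used below.
    order_total: float
    customer: Customer = field(default_factory=Customer)
    discount: float = 0.0


class PercentageDiscountRule:
    def __init__(self, percent, min_order):
        self.percent = percent
        self.min_order = min_order

    def applies(self, context):
        return context.order_total >= self.min_order

    def apply(self, context):
        context.discount += context.order_total * (self.percent / 100)


class FirstPurchaseDiscountRule:
    def __init__(self, amount):
        self.amount = amount

    def applies(self, context):
        return context.customer.order_count == 0

    def apply(self, context):
        context.discount += self.amount


# Assumed registry: the only place where a "type" string maps to code.
RULE_TYPES = {
    "percentage": PercentageDiscountRule,
    "first_purchase": FirstPurchaseDiscountRule,
}


def hydrate(config):
    """Turn a tenant's declarative config into rule instances."""
    rules = []
    for entry in config.get("discount_rules", []):
        params = dict(entry)          # copy so the stored config is not modified
        rule_type = params.pop("type")
        rule_class = RULE_TYPES.get(rule_type)
        if rule_class is None:
            raise ValueError(f"unknown rule type: {rule_type!r}")
        rules.append(rule_class(**params))
    return rules


def run_rules(rules, context):
    """Apply every rule that matches, in the order the config lists them."""
    for rule in rules:
        if rule.applies(context):
            rule.apply(context)
    return context
```

&lt;p&gt;With the example configuration above, &lt;code&gt;run_rules(hydrate(config), Context(order_total=200))&lt;/code&gt; would apply both rules to a first-time customer. An unknown &lt;code&gt;type&lt;/code&gt; fails loudly instead of being skipped, which is one possible design choice; silently ignoring it is another.&lt;/p&gt;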

&lt;p&gt;Notice that the &lt;code&gt;apply&lt;/code&gt; methods &lt;em&gt;mutate&lt;/em&gt; the incoming context. If you would rather not mutate, have each rule return its contribution and let the caller combine the results; a reasonable name for that method is &lt;code&gt;result&lt;/code&gt;. Which style you choose comes down to whether you prefer mutable or immutable data. In the context of a web app, you usually &lt;em&gt;do&lt;/em&gt; want mutability (for example, encoding and decoding a value from the database to a particular meaning for a tenant). If the logic grows more complex, you can put it behind a port and unit test it separately.&lt;/p&gt;
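
&lt;p&gt;A non-mutating variant might look like the sketch below. It assumes a frozen &lt;code&gt;Context&lt;/code&gt; dataclass and a &lt;code&gt;total_discount&lt;/code&gt; helper, both hypothetical names used only for illustration; each rule exposes &lt;code&gt;result&lt;/code&gt; in place of &lt;code&gt;apply&lt;/code&gt; and the caller sums the contributions.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    # Frozen: rules can read the context but cannot modify it.
    order_total: float
    order_count: int


class PercentageDiscountRule:
    def __init__(self, percent, min_order):
        self.percent = percent
        self.min_order = min_order

    def applies(self, context):
        return context.order_total >= self.min_order

    def result(self, context):
        # Returns this rule's contribution instead of writing it to the context.
        return context.order_total * (self.percent / 100)


class FirstPurchaseDiscountRule:
    def __init__(self, amount):
        self.amount = amount

    def applies(self, context):
        return context.order_count == 0

    def result(self, context):
        return self.amount


def total_discount(rules, context):
    """Combine the contributions of every rule that applies."""
    return sum(rule.result(context) for rule in rules if rule.applies(context))
```

&lt;p&gt;Because the rules no longer write to shared state, they can be tested one at a time by comparing return values.&lt;/p&gt;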

&lt;p&gt;The shape generalizes: any extension point in your system can have its own base class, its own family of implementations, and its own data schema describing how it's configured per tenant.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the data lives
&lt;/h2&gt;

&lt;p&gt;The configuration has to be persisted somewhere. The options aren't equivalent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In-memory cache.&lt;/strong&gt; Tempting because it's fast, but caches get invalidated, evicted, and reset on deploy. If the cache is the &lt;em&gt;source of truth&lt;/em&gt;, you've lost the data the moment something restarts. Caches belong in front of the source of truth, not in place of it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Files on disk.&lt;/strong&gt; Workable for very small, very stable configurations, but file I/O is slow at scale, file deployment is operational overhead, and "edit a file and redeploy" doesn't fit the case where customer success needs to toggle something for a tenant at 4pm on a Friday.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static configuration baked into the app.&lt;/strong&gt; Fine for values that genuinely never change between deploys. But if the values are tenant-specific, you're back to the "code per customer" problem.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A database.&lt;/strong&gt; If you're already running one — and you almost certainly are — this is the clear winner. Reads are fast (especially with a thin cache in front), updates are transactional, the data sits next to the tenant records it's associated with, and you get backups, replication, and access control for free.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use the database you already have. Don't introduce a new piece of infrastructure for this.&lt;/p&gt;

&lt;h2&gt;
  
  
  A note on schema
&lt;/h2&gt;

&lt;p&gt;Whichever shape you pick, the configuration has to be retrievable by tenant. That means a &lt;code&gt;tenant_id&lt;/code&gt; foreign key, typically a dedicated &lt;code&gt;tenant_configurations&lt;/code&gt; table with &lt;code&gt;tenant_id&lt;/code&gt; referencing &lt;code&gt;tenants&lt;/code&gt;, indexed for fast lookup. The runtime question is always the same: "given the tenant for this request, what's their configuration?" Get that relationship in place first; everything else flows from being able to find the right rules for the right tenant.&lt;/p&gt;

&lt;p&gt;If you're using a relational database, the principled approach beyond that is to model the configuration with normalized tables — a &lt;code&gt;tenant_discount_rules&lt;/code&gt; table with &lt;code&gt;tenant_id&lt;/code&gt;, typed columns for rule type, percent, min_order, and so on, or a polymorphic schema with a separate table per rule type. This is fine, and you may end up there. But I'd push back on starting there.&lt;/p&gt;

&lt;p&gt;For an initial proof of concept, a single table is enough:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;tenant_configurations&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;tenant_id&lt;/span&gt;   &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;tenants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;      &lt;span class="n"&gt;JSONB&lt;/span&gt;  &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'{}'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;jsonb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;updated_at&lt;/span&gt;  &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One row per tenant, the primary key handles the lookup index, no migrations needed when you add a new kind of rule. You fetch the row by &lt;code&gt;tenant_id&lt;/code&gt;, parse the &lt;code&gt;config&lt;/code&gt; JSON, hydrate it into your rule classes, run them. When the configuration stabilizes, when querying &lt;em&gt;into&lt;/em&gt; the configuration becomes important, or when validation needs to live at the database level, that's the moment to normalize. Until then, JSON in a column is the shortest path from idea to working code, and you can refactor toward structure once you know what the structure should be.&lt;/p&gt;
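
&lt;p&gt;The fetch-and-parse step might be sketched like this. To keep the example self-contained it uses Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module with a &lt;code&gt;TEXT&lt;/code&gt; column as a stand-in for the &lt;code&gt;JSONB&lt;/code&gt; column above, which is an assumption made for illustration; &lt;code&gt;load_tenant_config&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
import json
import sqlite3


def load_tenant_config(conn, tenant_id):
    """Fetch one tenant's config row and parse it; missing tenants get {}."""
    row = conn.execute(
        "SELECT config FROM tenant_configurations WHERE tenant_id = ?",
        (tenant_id,),  # parameterized: the tenant id is never spliced into SQL
    ).fetchone()
    return json.loads(row[0]) if row else {}


# In-memory stand-in for the real table (assumption: TEXT instead of JSONB).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tenant_configurations ("
    " tenant_id INTEGER PRIMARY KEY,"
    " config TEXT NOT NULL DEFAULT '{}')"
)
conn.execute(
    "INSERT INTO tenant_configurations (tenant_id, config) VALUES (?, ?)",
    (1, json.dumps({"discount_rules": [
        {"type": "percentage", "percent": 10, "min_order": 100},
    ]})),
)
```

&lt;p&gt;The parsed dictionary is what you would then hydrate into rule classes. Returning an empty configuration for an unknown tenant is one possible default; raising an error may suit some systems better.&lt;/p&gt;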

&lt;h2&gt;
  
  
  The security cliff
&lt;/h2&gt;

&lt;p&gt;There is one thing you must not do, no matter how convenient it looks: &lt;strong&gt;do not store executable code in the configuration, and do not let configuration values be interpreted and run.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That means no &lt;code&gt;eval&lt;/code&gt;, no &lt;code&gt;exec&lt;/code&gt;, no embedded JavaScript or Python or Ruby expressions, no SQL fragments concatenated into queries, no template engines that allow arbitrary function calls. It is tempting (&lt;em&gt;really&lt;/em&gt; tempting) to support a configuration that looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"discount_amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"order.total * 0.1 if customer.tier == 'gold' else 0"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…and &lt;code&gt;eval&lt;/code&gt; that string at runtime. Do not. The moment you do, anyone who can write to that configuration row can execute arbitrary code on your servers, with the privileges of your application. That's not a feature; that's a remote code execution vulnerability you built on purpose. It doesn't matter that the configuration is "only" editable by admins, or "only" through your UI — the surface area expands the moment another bug exposes that table, the moment a credential leaks, the moment an internal account is phished. The configuration becomes the attacker's payload delivery mechanism, and you handed them the loaded gun.&lt;/p&gt;

&lt;p&gt;The correct discipline is strict: &lt;strong&gt;configuration is data&lt;/strong&gt;. It selects between behaviors the code already knows how to perform and supplies typed parameters to them. It never describes a &lt;em&gt;new&lt;/em&gt; behavior. If a customer needs a behavior the code doesn't have, the answer is to add a new rule class, not to let them write logic into a JSON blob.&lt;/p&gt;

&lt;p&gt;This is also what makes the system safe to expose to customer-success people, support engineers, and eventually self-service customers. The blast radius of a misconfigured rule is "the rule doesn't apply" or "the rule applies wrong". Never "the server runs whatever I told it to."&lt;/p&gt;
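
&lt;p&gt;One way to enforce "configuration is data" is to validate every rule entry against a fixed schema before hydrating it. The sketch below is an illustration under assumptions: &lt;code&gt;PARAM_SCHEMAS&lt;/code&gt; and &lt;code&gt;validate_rule&lt;/code&gt; are hypothetical names, and the two rule types are the ones from the earlier example. Nothing in it evaluates a string.&lt;/p&gt;

```python
# Assumed allowlist: rule type -> parameter name -> accepted Python types.
PARAM_SCHEMAS = {
    "percentage": {"percent": (int, float), "min_order": (int, float)},
    "first_purchase": {"amount": (int, float)},
}


def validate_rule(entry):
    """Return (rule_type, params) if the entry is plain typed data, else raise."""
    if not isinstance(entry, dict):
        raise ValueError("rule entry must be an object")

    rule_type = entry.get("type")
    if not isinstance(rule_type, str) or rule_type not in PARAM_SCHEMAS:
        raise ValueError(f"unknown rule type: {rule_type!r}")

    schema = PARAM_SCHEMAS[rule_type]
    params = {key: value for key, value in entry.items() if key != "type"}
    if set(params) != set(schema):
        raise ValueError(f"{rule_type}: expected parameters {sorted(schema)}")

    for name, allowed_types in schema.items():
        value = params[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, allowed_types):
            raise ValueError(f"{rule_type}.{name} must be a number")

    return rule_type, params
```

&lt;p&gt;An entry such as &lt;code&gt;{"type": "percentage", "percent": "order.total * 0.1", "min_order": 100}&lt;/code&gt; is rejected because &lt;code&gt;percent&lt;/code&gt; is a string, so an expression never reaches the rule classes.&lt;/p&gt;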

&lt;h2&gt;
  
  
  The shape, summarized
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Identify per-tenant extension points and write a small base class for each.&lt;/li&gt;
&lt;li&gt;Implement the concrete behaviors as subclasses of that base.&lt;/li&gt;
&lt;li&gt;Store tenant configurations as data; start with a JSON column in a table keyed by &lt;code&gt;tenant_id&lt;/code&gt;, normalize later if it earns it.&lt;/li&gt;
&lt;li&gt;Hydrate the data into classes at runtime; let the classes do the work.&lt;/li&gt;
&lt;li&gt;Never, ever let the data become code.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The principle underneath all of this is that &lt;em&gt;code&lt;/em&gt; is the menu (the list of things your system is capable of doing) and &lt;em&gt;data&lt;/em&gt; is the order. Customers can pick from the menu, in any combination, with any parameters. They cannot rewrite the menu. The chef writes the menu. That's how you keep the kitchen safe.&lt;/p&gt;

</description>
      <category>softwareengineering</category>
      <category>webdev</category>
      <category>designsystem</category>
      <category>backend</category>
    </item>
    <item>
      <title>The Compiler: Heart and Tools of All Software</title>
      <dc:creator>Gideon Towolawi</dc:creator>
      <pubDate>Mon, 18 May 2026 16:36:00 +0000</pubDate>
      <link>https://forem.com/ayndlr/the-compiler-heart-and-tools-of-all-software-5gl8</link>
      <guid>https://forem.com/ayndlr/the-compiler-heart-and-tools-of-all-software-5gl8</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6hyu9w59a4oat024w7a.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6hyu9w59a4oat024w7a.jpg" alt="Ayn Dlr System Engineer" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Compiler: Heart and Tools of All Software
&lt;/h2&gt;

&lt;p&gt;Every program you have ever run — your operating system, your browser, the app that woke you up this morning, the firmware in your coffee machine — was once just text. Human-readable text. Ideas typed by someone who understood a problem well enough to describe its solution.&lt;/p&gt;

&lt;p&gt;But computers do not read ideas. They read instructions. Binary. Electrical signals that mean nothing without precise interpretation.&lt;/p&gt;

&lt;p&gt;The bridge between human intention and machine execution is the &lt;strong&gt;compiler&lt;/strong&gt;. It is the most consequential piece of software ever invented. Without it, computer science as we know it does not exist.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Computer Science Would Be Without Compilers
&lt;/h2&gt;

&lt;p&gt;Imagine a world where every programmer writes raw machine code. Not assembly — actual binary. Opcodes and operands encoded by hand. Every program is a miracle of patience, and every bug is a nightmare of hexadecimal archaeology.&lt;/p&gt;

&lt;p&gt;In this world:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Software development is artisanal&lt;/strong&gt;, not industrial. A single application takes years.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Portability is a myth&lt;/strong&gt;. Every CPU architecture requires rewriting everything from scratch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Abstraction dies&lt;/strong&gt;. There are no functions, no types, no modules — just raw memory and jumps.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security is impossible&lt;/strong&gt;. Human minds cannot track the state of thousands of registers and memory locations simultaneously.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Computer science without compilers is not computer science. It is digital craftsmanship at the limit of human endurance. The compiler is what lets us think in &lt;strong&gt;concepts&lt;/strong&gt; instead of &lt;strong&gt;circuits&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Compiler as a Pipeline of Principles
&lt;/h2&gt;

&lt;p&gt;A compiler is not a single program. It is a &lt;strong&gt;pipeline of transformations&lt;/strong&gt;, each stage reducing complexity and increasing structure. The quality of a compiler depends entirely on the &lt;strong&gt;principles&lt;/strong&gt; baked into each stage.&lt;/p&gt;

&lt;p&gt;Most people know the classical stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Lexer&lt;/strong&gt; — characters → tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parser&lt;/strong&gt; — tokens → syntax tree&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic Analysis&lt;/strong&gt; — syntax tree → validated intermediate representation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimization&lt;/strong&gt; — IR → faster IR&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code Generation&lt;/strong&gt; — IR → machine code&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;But this description misses the point. The stages are not just mechanical steps. They are &lt;strong&gt;guardians of meaning&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 1: The Lexer — Dumb by Design
&lt;/h3&gt;

&lt;p&gt;The lexer is where principles begin. Its job is simple: convert a stream of characters into a stream of tokens. &lt;code&gt;int&lt;/code&gt;, &lt;code&gt;x&lt;/code&gt;, &lt;code&gt;=&lt;/code&gt;, &lt;code&gt;42&lt;/code&gt;, &lt;code&gt;;&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;A bad lexer tries to be smart. It merges &lt;code&gt;=&lt;/code&gt; &lt;code&gt;=&lt;/code&gt; into &lt;code&gt;==&lt;/code&gt;. It strips whitespace because "it doesn't matter." It reconstructs strings and throws away the original quotes.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;principled lexer stays dumb&lt;/strong&gt;. It emits raw tokens with precise spatial information — where each token starts, where it ends, what line, what column. It does not interpret. It does not merge. It does not discard.&lt;/p&gt;

&lt;p&gt;Why? Because &lt;strong&gt;semantics belong to the parser&lt;/strong&gt;. The lexer cannot know whether &lt;code&gt;::&lt;/code&gt; is a scope resolution operator or two separate colons in a ternary expression. It cannot know whether whitespace inside a string literal is significant or decorative. By staying dumb, the lexer preserves &lt;strong&gt;all information&lt;/strong&gt; for downstream stages to make informed decisions.&lt;/p&gt;

&lt;p&gt;The token structure I use reflects this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;Token&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;TokenType&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// what kind of token&lt;/span&gt;
  &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;lexeme&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// the raw text&lt;/span&gt;
  &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// visual line for errors&lt;/span&gt;
  &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;column&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// visual column for errors&lt;/span&gt;
  &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;span_to&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// exclusive byte offset in source&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



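&lt;p&gt;&lt;code&gt;span_to&lt;/code&gt; is the critical field. Combined with the size of &lt;code&gt;lexeme&lt;/code&gt;, it gives each token's exact byte range: the token starts at &lt;code&gt;span_to&lt;/code&gt; minus the lexeme's size and ends just before &lt;code&gt;span_to&lt;/code&gt;. That lets the parser reconstruct multi-token operators, because it can check that one token ends exactly where the next begins. It lets the formatter preserve original spacing. It lets the LSP highlight exact ranges. The lexer does not use this information; it merely &lt;strong&gt;records&lt;/strong&gt; it, faithfully and without interpretation.&lt;/p&gt;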

&lt;p&gt;This is the first principle: &lt;strong&gt;reduce at the right stage, never earlier&lt;/strong&gt;.&lt;/p&gt;
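
&lt;p&gt;To make the idea concrete, here is a minimal sketch of a lexer that stays dumb. It is written in Python for brevity rather than the C++ of the struct above, and it is an illustration under stated assumptions, not a production lexer: the &lt;code&gt;kind&lt;/code&gt; categories and the &lt;code&gt;classify&lt;/code&gt;, &lt;code&gt;lex&lt;/code&gt;, and &lt;code&gt;adjacent&lt;/code&gt; names are invented for the example, and offsets count characters rather than bytes.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str      # hypothetical categories: "word", "number", "space", "punct"
    lexeme: str    # the raw text, copied verbatim from the source
    line: int      # 1-based line of the first character
    column: int    # 1-based column of the first character
    span_to: int   # exclusive end offset (characters here, bytes in the C++ struct)


def classify(ch):
    """Pick a coarse category from a single character; no language knowledge."""
    if ch.isspace():
        return "space"
    if ch.isdigit():
        return "number"
    if ch.isalpha() or ch == "_":
        return "word"
    return "punct"


def lex(source):
    """Split source into tokens without merging operators or dropping whitespace."""
    tokens = []
    pos, line, column = 0, 1, 1
    end_of_source = len(source)
    while pos != end_of_source:
        kind = classify(source[pos])
        end = pos + 1
        if kind == "word":
            while end != end_of_source and (source[end].isalnum() or source[end] == "_"):
                end += 1
        elif kind != "punct":
            # Runs of digits or whitespace stay together; punctuation never does.
            while end != end_of_source and classify(source[end]) == kind:
                end += 1
        lexeme = source[pos:end]
        tokens.append(Token(kind, lexeme, line, column, end))
        for ch in lexeme:  # advance the position counters past this token
            if ch == "\n":
                line, column = line + 1, 1
            else:
                column += 1
        pos = end
    return tokens


def adjacent(left, right):
    """True when right starts exactly where left ends, so a parser may fuse them."""
    return right.span_to - len(right.lexeme) == left.span_to
```

&lt;p&gt;Joining the lexemes of &lt;code&gt;lex(source)&lt;/code&gt; gives back the input unchanged, which is the property a dumb lexer is meant to guarantee. A parser that sees two &lt;code&gt;=&lt;/code&gt; tokens can call &lt;code&gt;adjacent&lt;/code&gt; to decide whether they form &lt;code&gt;==&lt;/code&gt; or are separated by whitespace.&lt;/p&gt;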

&lt;h3&gt;
  
  
  Why Principles Matter More Than Performance
&lt;/h3&gt;

&lt;p&gt;It is tempting to optimize the lexer. Merge tokens early. Strip separators. Compress the token stream. These optimizations feel productive.&lt;/p&gt;

&lt;p&gt;They are traps.&lt;/p&gt;

&lt;p&gt;Every piece of information discarded in the lexer is a piece of information that cannot be recovered in the parser, the semantic analyzer, or the code generator. A stripped space cannot be restored for formatting. A merged &lt;code&gt;==&lt;/code&gt; cannot be split back if the parser needs to report "unexpected token &lt;code&gt;=&lt;/code&gt; after &lt;code&gt;=&lt;/code&gt;". An interpreted string literal loses the original escape sequences.&lt;/p&gt;

&lt;p&gt;The cost of a "smart" lexer is &lt;strong&gt;permanent information loss&lt;/strong&gt;. The cost of a dumb lexer is a slightly larger token stream — trivial to optimize later, impossible to reconstruct if deleted early.&lt;/p&gt;

&lt;p&gt;This principle extends through every compiler stage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Parser&lt;/strong&gt;: Validate syntax strictly, but do not constant-fold yet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic Graph&lt;/strong&gt;: Resolve types and ownership, but do not lower to machine concepts yet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IR&lt;/strong&gt;: Represent semantics faithfully, optimize only when correctness is provable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: Generate code for the target, but never modify semantic truth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each stage has one job. Each stage does that job completely. No stage does another stage's work prematurely.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building Correct by Construction
&lt;/h2&gt;

&lt;p&gt;The compiler is not just a tool. It is a &lt;strong&gt;proof system&lt;/strong&gt;. It proves that your program means what you think it means, that it will not leak memory, that it will not access invalid lifetimes, that it will execute deterministically across architectures.&lt;/p&gt;

&lt;p&gt;This is not about being clever. It is about being &lt;strong&gt;correct by construction&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Comes Next
&lt;/h2&gt;

&lt;p&gt;Over the next weeks, I will document each stage of compiler construction in detail:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why the lexer stays dumb and what that enables&lt;/li&gt;
&lt;li&gt;How the semantic graph builds structure from raw tokens&lt;/li&gt;
&lt;li&gt;What compile-time invariants mean for systems programming&lt;/li&gt;
&lt;li&gt;How to translate semantics into machine resources without losing correctness&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are building compilers, thinking about language design, or simply curious about how software becomes real, &lt;a href="https://ayndlr.substack.com" rel="noopener noreferrer"&gt;subscribe to the newsletter&lt;/a&gt;. I share what I learn, what I get wrong, and how to avoid the traps I fall into.&lt;/p&gt;

&lt;p&gt;The compiler is the heart of software. Understanding it is understanding how we turn thought into action.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Building a systems language that writes like C++ and proves safety like Rust, without the mental overhead. &lt;a href="https://ayndlr.substack.com" rel="noopener noreferrer"&gt;Join the newsletter&lt;/a&gt; for weekly deep-dives on compiler architecture, language design, and systems programming.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>compilers</category>
      <category>programming</category>
      <category>systems</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>How I Built FreeLabTools Using Only Claude and Gemini (And Why It Changes Everything)</title>
      <dc:creator>Free Lab Tools</dc:creator>
      <pubDate>Mon, 18 May 2026 16:35:26 +0000</pubDate>
      <link>https://forem.com/freelabtools/how-i-built-freelabtools-using-only-claude-and-gemini-and-why-it-changes-everything-j15</link>
      <guid>https://forem.com/freelabtools/how-i-built-freelabtools-using-only-claude-and-gemini-and-why-it-changes-everything-j15</guid>
      <description>&lt;p&gt;As a solo developer, the biggest bottleneck isn't usually the ideas; it's the time required for execution. Recently, I wanted to launch a suite of free web tools for developers and creators, but doing it all from scratch would have taken weeks.&lt;/p&gt;

&lt;p&gt;Instead, I decided to run an experiment: building the entire platform using a "tag-team" of Large Language Models (&lt;strong&gt;Claude&lt;/strong&gt; and &lt;strong&gt;Gemini&lt;/strong&gt;). &lt;/p&gt;

&lt;p&gt;The result? &lt;a href="https://freelabtools.com" rel="noopener noreferrer"&gt;FreeLabTools.com&lt;/a&gt; is now live, fully functional, and was built in a fraction of the time. Here is exactly how I did it.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Strategy: Playing to Each AI's Strengths
&lt;/h3&gt;

&lt;p&gt;I quickly realized that treating AI models as generalists is a mistake. To build &lt;a href="https://freelabtools.com" rel="noopener noreferrer"&gt;FreeLabTools&lt;/a&gt; efficiently, I assigned specific roles to each LLM based on their core strengths:&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Claude: The Architect &amp;amp; Lead Coder
&lt;/h4&gt;

&lt;p&gt;I used Claude (specifically Claude 3.5 Sonnet) as my primary software engineer. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What it did:&lt;/strong&gt; Generated the clean, modular JavaScript logic for the tools, handled complex algorithms, and structured the UI using modern CSS/Tailwind.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it shined:&lt;/strong&gt; Claude’s ability to maintain context over long conversations and write production-ready code with minimal bugs is unmatched. It understood the "edge cases" of client-side web tools perfectly.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  2. Gemini: The Researcher, Optimizer &amp;amp; Copywriter
&lt;/h4&gt;

&lt;p&gt;While Claude was busy coding, I used Gemini to handle the broader scope of the project.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What it did:&lt;/strong&gt; Optimized the code for speed, generated SEO-friendly meta descriptions, structured the JSON-LD schema for Google, and helped brainstorm user-friendly UI copy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it shined:&lt;/strong&gt; Gemini’s integration with up-to-date web standards and its fast processing made it the perfect tool for refining, auditing, and preparing the site for launch.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  The Workflow: How They Worked Together
&lt;/h3&gt;

&lt;p&gt;The synergy was surprisingly smooth. I would ask &lt;strong&gt;Claude&lt;/strong&gt; to generate a specific tool (for example, a robust code formatter or a secure password generator). Once the tool was functional, I would feed that code into &lt;strong&gt;Gemini&lt;/strong&gt; with the prompt: &lt;em&gt;“Review this code for performance bottlenecks and suggest SEO metadata for the tool page.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Gemini would often spot tiny optimizations or suggest better accessibility (ARIA) attributes, which I would then feed back to Claude to implement. It felt like managing a highly cooperative two-person dev team.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Takeaways for Solo Devs
&lt;/h3&gt;

&lt;p&gt;If you are planning to build your own SaaS or utility site like &lt;a href="https://freelabtools.com" rel="noopener noreferrer"&gt;FreeLabTools.com&lt;/a&gt;, here is my advice:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Be specific with prompts:&lt;/strong&gt; Don't just say "build a tool." Define the inputs, expected outputs, and constraints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Double-check the math/logic:&lt;/strong&gt; Even though both AIs are incredibly smart, human oversight is still required to test the final output.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automate the boring stuff:&lt;/strong&gt; Let AI handle the boilerplate code so you can focus on user experience and deployment.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What's Next?
&lt;/h3&gt;

&lt;p&gt;Building this project proved to me that the barrier to entry for launching web platforms has completely collapsed. &lt;/p&gt;

&lt;p&gt;I'd love for you to check out the final result at &lt;a href="https://freelabtools.com" rel="noopener noreferrer"&gt;FreeLabTools.com&lt;/a&gt; and let me know what you think. If you have any questions about the specific prompts I used to pair Claude and Gemini, drop a comment below!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Have you tried building a full project using multiple AI models? What was your experience?&lt;/strong&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>LLM Evaluation in CI: Stop Manual Testing Before It Costs You</title>
      <dc:creator>Charlie Hadley</dc:creator>
      <pubDate>Mon, 18 May 2026 16:35:21 +0000</pubDate>
      <link>https://forem.com/hadleyworks/llm-evaluation-in-ci-stop-manual-testing-before-it-costs-you-59i7</link>
      <guid>https://forem.com/hadleyworks/llm-evaluation-in-ci-stop-manual-testing-before-it-costs-you-59i7</guid>
      <description>&lt;h1&gt;
  
  
  LLM Evaluation in CI: Stop Manual Testing Before It Costs You
&lt;/h1&gt;

&lt;p&gt;You ship a prompt change to production. Two hours later, a customer complains your LLM is returning hallucinated data. You roll back. You've lost an hour of revenue and some user trust.&lt;/p&gt;

&lt;p&gt;This happens because you tested the happy path, not the edge cases. LLM systems are probabilistic — the same input doesn't always produce the same output quality.&lt;/p&gt;

&lt;p&gt;The enterprise solution is Braintrust ($249/mo), LangSmith ($99/mo), or Arize. If you're indie, bootstrapped, or pre-PMF, those budgets simply don't exist.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Idea: Eval-as-Code
&lt;/h2&gt;

&lt;p&gt;Instead of vibes-based testing, you define quality as a &lt;strong&gt;rubric&lt;/strong&gt; with concrete attributes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Correctness&lt;/strong&gt; (0–10): Is the answer factually right?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conciseness&lt;/strong&gt; (0–10): Does it avoid unnecessary padding?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hallucination risk&lt;/strong&gt; (0–10): Does it cite things it can't know?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tone&lt;/strong&gt; (0–10): Does it match expected register?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Usefulness&lt;/strong&gt; (0–10): Would a real user find this helpful?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A cheap judge model (GPT-4o-mini at ~$0.0001/call) scores each output against your rubric. You run 50 test cases per eval. Total cost: about £0.20 per full run.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building This in GitHub Actions
&lt;/h2&gt;

&lt;p&gt;Here's the minimal structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LLM Eval&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;eval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v3&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run evals&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python run_evals.py&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.OPENAI_API_KEY }}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Check threshold&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python check_threshold.py --min-score &lt;/span&gt;&lt;span class="m"&gt;7.5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;run_evals.py&lt;/code&gt; script:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Loads your golden dataset (JSON file of input/expected-output pairs)&lt;/li&gt;
&lt;li&gt;Runs your LLM system on each input&lt;/li&gt;
&lt;li&gt;Sends (input, expected, actual) to GPT-4o-mini with your rubric&lt;/li&gt;
&lt;li&gt;Aggregates scores by attribute&lt;/li&gt;
&lt;li&gt;Writes results to &lt;code&gt;eval_results.json&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
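
&lt;p&gt;Steps 4 and 5 could look roughly like the sketch below. It assumes the judge returns one dictionary of attribute scores per test case; the function names and the shape of &lt;code&gt;eval_results.json&lt;/code&gt; (attribute name mapped to mean score) are assumptions for illustration, not a fixed format.&lt;/p&gt;

```python
import json
from collections import defaultdict
from statistics import mean


def aggregate_by_attribute(case_scores):
    """Average each rubric attribute across all test cases."""
    by_attribute = defaultdict(list)
    for scores in case_scores:
        for attribute, value in scores.items():
            by_attribute[attribute].append(value)
    return {
        attribute: round(mean(values), 2)
        for attribute, values in by_attribute.items()
    }


def write_results(case_scores, path="eval_results.json"):
    """Write the per-attribute means to disk and return them."""
    summary = aggregate_by_attribute(case_scores)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(summary, fh, indent=2)
    return summary
```

&lt;p&gt;Keeping the per-attribute means, rather than a single number, makes it easier to see which quality dimension regressed when a PR fails.&lt;/p&gt;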

&lt;p&gt;If aggregate score drops below your threshold, &lt;code&gt;check_threshold.py&lt;/code&gt; exits with code 1 — the PR fails.&lt;/p&gt;
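
&lt;p&gt;A minimal sketch of what &lt;code&gt;check_threshold.py&lt;/code&gt; might contain is shown below. It assumes &lt;code&gt;eval_results.json&lt;/code&gt; maps each rubric attribute to its mean score and that the aggregate is the plain mean of those values; the &lt;code&gt;--results&lt;/code&gt; flag and the &lt;code&gt;shortfall&lt;/code&gt; helper are hypothetical additions for the example.&lt;/p&gt;

```python
import argparse
import json
from statistics import mean


def shortfall(scores, min_score):
    """Return how far the mean attribute score sits below min_score (0.0 if it passes)."""
    aggregate = mean(scores.values())
    return max(0.0, min_score - aggregate)


def main(argv=None):
    """Return the process exit code: 0 when the threshold is met, 1 otherwise."""
    parser = argparse.ArgumentParser(description="Fail CI when eval scores drop")
    parser.add_argument("--min-score", type=float, required=True)
    parser.add_argument("--results", default="eval_results.json")  # assumed flag
    args = parser.parse_args(argv)

    with open(args.results, encoding="utf-8") as fh:
        scores = json.load(fh)  # assumed shape: {"correctness": 8.1, "tone": 7.9}

    gap = shortfall(scores, args.min_score)
    if gap:
        print(f"Aggregate score is {gap:.2f} below the threshold of {args.min_score}")
        return 1
    print("Aggregate score meets the threshold")
    return 0
```

&lt;p&gt;Ending the script with &lt;code&gt;raise SystemExit(main())&lt;/code&gt; under the usual &lt;code&gt;__main__&lt;/code&gt; guard hands that return value to the shell, and a non-zero exit code is what makes the GitHub Actions step fail.&lt;/p&gt;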

&lt;h2&gt;
  
  
  A Real Example From Production
&lt;/h2&gt;

&lt;p&gt;I changed a classification system prompt to improve response formatting. The change looked solid in manual testing on 5 examples. But I accidentally dropped a critical piece of context the model needed for correct classification.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without evals:&lt;/strong&gt; ships to users. Angry support tickets. Rollback. Lost trust.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With evals:&lt;/strong&gt; CI caught it in 4 minutes. PR fails. I fix the prompt. Evals pass. Ship confidently.&lt;/p&gt;

&lt;h2&gt;
  
  
  Golden Datasets: The Hard Part
&lt;/h2&gt;

&lt;p&gt;The hardest part is building your test cases. The key insight: &lt;strong&gt;start with failures, not successes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every time your LLM system makes a mistake:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Save the input&lt;/li&gt;
&lt;li&gt;Write down what the correct output should have been&lt;/li&gt;
&lt;li&gt;Add it to your golden dataset&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After 2–3 weeks of normal usage, you'll have 30–50 meaningful test cases that represent real failure modes — far more valuable than synthetic test cases you invented upfront.&lt;/p&gt;
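
&lt;p&gt;Capturing a failure can be a one-line call if you keep a small helper around. The sketch below is one possible shape, assuming the golden dataset is a JSON array of objects; the &lt;code&gt;record_failure&lt;/code&gt; name and the &lt;code&gt;input&lt;/code&gt;, &lt;code&gt;expected&lt;/code&gt;, and &lt;code&gt;note&lt;/code&gt; keys are hypothetical.&lt;/p&gt;

```python
import json
from pathlib import Path


def record_failure(path, user_input, expected_output, note=""):
    """Append a failing case to the golden dataset and return the new case count."""
    dataset_file = Path(path)
    if dataset_file.exists():
        cases = json.loads(dataset_file.read_text(encoding="utf-8"))
    else:
        cases = []

    # Skip inputs that are already covered so the dataset does not fill with repeats.
    if any(case["input"] == user_input for case in cases):
        return len(cases)

    cases.append({"input": user_input, "expected": expected_output, "note": note})
    dataset_file.write_text(
        json.dumps(cases, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return len(cases)
```

&lt;p&gt;The &lt;code&gt;note&lt;/code&gt; field is a convenient place to record where the failure came from, such as a support ticket, so the test case still makes sense months later.&lt;/p&gt;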

&lt;h2&gt;
  
  
  Multi-Model Comparison
&lt;/h2&gt;

&lt;p&gt;Before committing to an expensive model, run your eval suite across providers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;models&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-3-5-haiku&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-flash-1.5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;run_eval_suite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;golden_dataset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Sort by (score / cost_per_1k_tokens) to find optimal tradeoff
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This stops you from paying for GPT-4o when Claude Haiku scores 92% as well at 20% of the cost.&lt;/p&gt;
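
&lt;p&gt;The sort mentioned in the comment above can be sketched as follows. The model names, scores, and costs are placeholders invented for the example, not benchmark results or real prices, and &lt;code&gt;rank_by_value&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
def rank_by_value(scores, cost_per_1k_tokens):
    """Order model names by eval score per unit of cost, best value first."""
    return sorted(
        scores,
        key=lambda model: scores[model] / cost_per_1k_tokens[model],
        reverse=True,
    )


# Placeholder numbers for illustration only, not real benchmark results or prices.
scores = {"model-large": 9.0, "model-small": 8.3}
cost_per_1k_tokens = {"model-large": 5.0, "model-small": 1.0}

for model in rank_by_value(scores, cost_per_1k_tokens):
    print(model, round(scores[model] / cost_per_1k_tokens[model], 2))
```

&lt;p&gt;A ratio like this is only a starting point: it assumes every cost is non-zero, and you will probably want a minimum acceptable score as well, so that a very cheap but poor model cannot win on price alone.&lt;/p&gt;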

&lt;h2&gt;
  
  
  Cost Optimization
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Batch your calls&lt;/strong&gt;: OpenAI batch API gives 50% discount on async evals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache responses&lt;/strong&gt;: Hash (model + prompt + input) → cache hit avoids re-scoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coarse-to-fine&lt;/strong&gt;: Use a 2-stage system — cheap model filters obvious passes, expensive model only sees borderline cases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run the full suite sparingly&lt;/strong&gt;: Trigger it on PRs to main, or on a weekly schedule, rather than on every commit&lt;/li&gt;
&lt;/ul&gt;
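
&lt;p&gt;The caching idea from the list above can be sketched in a few lines. This is an in-memory illustration only; &lt;code&gt;cache_key&lt;/code&gt;, &lt;code&gt;ScoreCache&lt;/code&gt;, and the &lt;code&gt;score_fn&lt;/code&gt; callback are hypothetical names, and a real CI setup would need to persist the cache between runs.&lt;/p&gt;

```python
import hashlib
import json


def cache_key(model, prompt, user_input):
    """Hash the three parts together; JSON encoding keeps their boundaries distinct."""
    payload = json.dumps([model, prompt, user_input], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ScoreCache:
    """Remember judge scores so unchanged cases are not re-scored."""

    def __init__(self):
        self._scores = {}
        self.misses = 0

    def get_or_score(self, model, prompt, user_input, score_fn):
        key = cache_key(model, prompt, user_input)
        if key not in self._scores:
            # Only a cache miss pays for a call to the judge model.
            self.misses += 1
            self._scores[key] = score_fn(model, prompt, user_input)
        return self._scores[key]
```

&lt;p&gt;Encoding the parts as a JSON list before hashing avoids collisions that plain string concatenation would allow, for example when the end of the prompt and the start of the input shift by one character.&lt;/p&gt;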

&lt;p&gt;A well-optimized setup runs 100 eval cases for under £0.10.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I've Packaged Up
&lt;/h2&gt;

&lt;p&gt;I've turned this into a complete ready-to-use system in &lt;strong&gt;&lt;a href="https://hadleyworks.gumroad.com/l/nyzala" rel="noopener noreferrer"&gt;The Indie Hacker's LLM Eval Playbook&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;6 golden dataset templates&lt;/strong&gt; for common LLM tasks (classification, summarization, retrieval, generation, code review, reasoning)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complete rubric scoring system&lt;/strong&gt; in Python (copy-paste ready)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-model comparison script&lt;/strong&gt; with cost-efficiency ranking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Actions workflow&lt;/strong&gt; — drop it in your repo and it works&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost optimization guide&lt;/strong&gt; with benchmarks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;£29 one-time.&lt;/strong&gt; One avoided production incident pays for it 10× over.&lt;/p&gt;

&lt;p&gt;If you have questions about implementing eval-as-code for your specific use case, drop them in the comments — happy to help.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>ai</category>
      <category>testing</category>
      <category>devops</category>
    </item>
    <item>
      <title>Jenkins as a Code, or how I stopped clicking around in the UI</title>
      <dc:creator>Khachatur Ashotyan</dc:creator>
      <pubDate>Mon, 18 May 2026 16:35:13 +0000</pubDate>
      <link>https://forem.com/lanycrost/jenkins-as-a-code-or-how-i-stopped-clicking-around-in-the-ui-1nko</link>
      <guid>https://forem.com/lanycrost/jenkins-as-a-code-or-how-i-stopped-clicking-around-in-the-ui-1nko</guid>
      <description>&lt;p&gt;I've been running Jenkins in one form or another for years now. Different companies, different sizes of teams, but somehow the same story keeps repeating itself, and at some point I just couldn't take it anymore. So I decided to write down what I went through, what I learned, and where this journey took me. This is Part 1 of what I'm calling &lt;strong&gt;My CI/CD Odyssey&lt;/strong&gt; — a series where I want to share the ideas, the mistakes, and the things that actually worked.&lt;/p&gt;

&lt;p&gt;Future chapters will go deeper into the painful stuff — building macOS workers without losing your mind, using &lt;strong&gt;spot instances as GitHub Actions runners&lt;/strong&gt; to cut costs, and a few other rabbit holes I went into. But before we get there, let's start at the beginning, because the beginning is where most of the pain lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  The "before" picture, and why it hurts
&lt;/h2&gt;

&lt;p&gt;If you've worked with Jenkins for any reasonable amount of time, you probably know this scene: someone opens the Jenkins UI, clicks "New Item", picks a freestyle or pipeline job, fills in twenty-something fields, scrolls past a wall of plugin options, and clicks Save. Then a month later somebody has to figure out why a job behaves differently in &lt;code&gt;dev&lt;/code&gt; than in &lt;code&gt;prod&lt;/code&gt;, and the answer is "because Arthur clicked a different checkbox in February and nobody remembers".&lt;/p&gt;

&lt;p&gt;That was basically my world for a long time. We had multi-tier environments — &lt;code&gt;dev&lt;/code&gt;, &lt;code&gt;stage&lt;/code&gt;, sometimes more — and on top of that, sometimes more than one Jenkins instance per tier. Each one was configured by hand. Plugins installed by hand. Pipelines copy-pasted from one Jenkins to another and edited by hand. Credentials added by hand. Workers attached by hand. Then one day you wake up and realize:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nobody remembers what plugins are installed where.&lt;/li&gt;
&lt;li&gt;The "stage" Jenkins doesn't match production anymore, and you only notice when a pipeline breaks in prod.&lt;/li&gt;
&lt;li&gt;A plugin update on Friday afternoon kills a build, and rolling it back means a human clicking buttons under stress.&lt;/li&gt;
&lt;li&gt;A new team member joins and you spend three days explaining tribal knowledge that should really live in a repo.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point is what really got me. Tribal knowledge is fine when there are two of you. It stops being fine very quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The idea: treat Jenkins like any other piece of code
&lt;/h2&gt;

&lt;p&gt;So I started doing some research, and the direction was pretty obvious in hindsight: if Jenkins is a piece of infrastructure, and we treat infrastructure as code everywhere else (Terraform for cloud, Helm for Kubernetes, Ansible for hosts), then Jenkins itself shouldn't be the special snowflake we manage by hand. The whole controller, all the jobs, all the credentials wiring, the workers — everything should come out of a git repo. End to end.&lt;/p&gt;

&lt;p&gt;The goal I wrote down for myself was something like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I want a Jenkins instance where I can throw away the whole VM, the whole cluster, the whole config, run a pipeline, and ten minutes later have an identical Jenkins back. And I want &lt;code&gt;dev&lt;/code&gt; to be code-to-code identical to &lt;code&gt;prod&lt;/code&gt;, so when I test a plugin upgrade or a pipeline change in dev, I actually know it will behave the same in prod.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you've ever burned yourself on a "but it worked in stage" deploy, you know exactly why that sentence matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  The building blocks
&lt;/h2&gt;

&lt;p&gt;Once I started designing this, the picture broke down into a few moving pieces. None of these are revolutionary on their own — what matters is how they fit together.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu1tg081zv8f8le07bs74.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu1tg081zv8f8le07bs74.png" alt="A three-column architecture diagram: git repo on the left, Jenkins controller and operator in the middle, ephemeral Linux pods + cloud VMs + macOS VMs on the right." width="800" height="473"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. JCasC — Jenkins Configuration as Code
&lt;/h3&gt;

&lt;p&gt;This is the foundation. &lt;a href="https://plugins.jenkins.io/configuration-as-code/" rel="noopener noreferrer"&gt;&lt;strong&gt;JCasC&lt;/strong&gt;&lt;/a&gt; is a Jenkins plugin that lets you define the entire controller config in YAML. System settings, security realm, authorization strategy, clouds, credentials wiring, tools, global libraries — all of it. The controller reads the YAML on boot and configures itself.&lt;/p&gt;

&lt;p&gt;The moment I plugged JCasC in and could rebuild a controller from a YAML file, I knew I wasn't going back. No more "what's installed where". Whatever is in the YAML &lt;em&gt;is&lt;/em&gt; the truth. If it's not in the YAML, it doesn't exist.&lt;/p&gt;

&lt;p&gt;A minimal taste of what that looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jenkins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;systemMessage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Managed&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;by&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;JCasC&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;—&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;do&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;not&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;edit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;UI"&lt;/span&gt;
  &lt;span class="na"&gt;numExecutors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
  &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;EXCLUSIVE&lt;/span&gt;
  &lt;span class="na"&gt;securityRealm&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;github&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;clientID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${GITHUB_CLIENT_ID}&lt;/span&gt;
      &lt;span class="na"&gt;clientSecret&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${GITHUB_CLIENT_SECRET}&lt;/span&gt;
  &lt;span class="na"&gt;clouds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;kubernetes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eks"&lt;/span&gt;
        &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jenkins"&lt;/span&gt;
        &lt;span class="na"&gt;jenkinsUrl&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://jenkins.jenkins.svc.cluster.local:8080"&lt;/span&gt;
&lt;span class="na"&gt;unclassified&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;globalLibraries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;libraries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ci-libs"&lt;/span&gt;
        &lt;span class="na"&gt;defaultVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main"&lt;/span&gt;
        &lt;span class="na"&gt;retriever&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;modernSCM&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;scm&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;git&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;remote&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://github.com/&amp;lt;org&amp;gt;/ci-libs.git"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;About two dozen lines, and the whole controller knows who it is.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Job DSL — jobs from a git repo
&lt;/h3&gt;

&lt;p&gt;JCasC handles the controller, but it doesn't really handle jobs. For that I leaned on the &lt;a href="https://plugins.jenkins.io/job-dsl/" rel="noopener noreferrer"&gt;&lt;strong&gt;Job DSL plugin&lt;/strong&gt;&lt;/a&gt;. Jobs are defined in Groovy files in a git repo, and a small "seeder" job in Jenkins polls the repo, picks up all the DSL files, and recreates jobs from them. If a job is removed from git, it disappears from Jenkins. If a parameter changes in git, it changes in Jenkins on the next seed run.&lt;/p&gt;

&lt;p&gt;This means the Jenkins UI becomes basically read-only from a configuration point of view. Nobody edits a job in the UI anymore — if you do, the seeder will overwrite you on the next run. That's a feature, not a bug.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://jenkinsci.github.io/job-dsl-plugin/" rel="noopener noreferrer"&gt;See the Job DSL API reference for the full declarative API.&lt;/a&gt;&lt;/p&gt;
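
&lt;p&gt;To make the seeder's behaviour concrete, here is a rough Python sketch of the reconciliation it performs on each run. This is not how the Job DSL plugin is implemented; &lt;code&gt;plan_seed_run&lt;/code&gt; and the job names are hypothetical, and it assumes the seed job is configured to delete jobs that are no longer defined in git.&lt;/p&gt;

```python
def plan_seed_run(desired: dict, existing: dict) -> dict:
    """Compare job definitions in git (desired) with jobs in Jenkins (existing).

    Both arguments map a job name to its configuration. Hypothetical helper,
    used only to illustrate the "git is the source of truth" contract.
    """
    # Jobs defined in git but missing from Jenkins get created.
    create = sorted(name for name in desired if name not in existing)
    # Jobs present on both sides whose config differs get overwritten from git.
    update = sorted(
        name for name in desired
        if name in existing and desired[name] != existing[name]
    )
    # Jobs that exist in Jenkins but were removed from git get deleted.
    delete = sorted(name for name in existing if name not in desired)
    return {"create": create, "update": update, "delete": delete}


plan = plan_seed_run(
    desired={"build-api": {"branch": "main"}, "deploy": {"env": "dev"}},
    existing={"build-api": {"branch": "master"}, "old-job": {}},
)
print(plan)
```

&lt;p&gt;Any edit made by hand in the UI shows up as a difference on the next run and is overwritten, which is exactly the read-only behaviour described above.&lt;/p&gt;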

&lt;h3&gt;
  
  
  3. Helm + Kubernetes for the controller
&lt;/h3&gt;

&lt;p&gt;I run the Jenkins controller in &lt;a href="https://kubernetes.io/" rel="noopener noreferrer"&gt;Kubernetes&lt;/a&gt;. &lt;a href="https://github.com/jenkinsci/helm-charts" rel="noopener noreferrer"&gt;Helm chart for the deploy&lt;/a&gt;, persistent volume for the home dir, a sidecar that injects JCasC config from a &lt;code&gt;ConfigMap&lt;/code&gt;. Upgrading Jenkins is just bumping a chart version. Rolling back is rolling back a chart version. Plugin lists are values in a Helm &lt;code&gt;values.yaml&lt;/code&gt; file, version-pinned, and reviewed in a pull request like any other change.&lt;/p&gt;

&lt;p&gt;This is honestly the part that made plugin upgrades stop being scary. They go through a PR. They get tested in &lt;code&gt;dev&lt;/code&gt; first. They get the same review as application code.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Side note: if you'd rather not deal with Helm at all, the community also maintains a &lt;a href="https://github.com/jenkinsci/kubernetes-operator" rel="noopener noreferrer"&gt;&lt;strong&gt;Jenkins Kubernetes Operator&lt;/strong&gt;&lt;/a&gt; that takes a CRD-first approach. I went with Helm for the simpler upgrade story, but the operator is a perfectly reasonable alternative if you're already heavy into the operator pattern.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  4. Packer for worker images
&lt;/h3&gt;

&lt;p&gt;The next big piece is the workers — the actual machines that run your builds. Here I went all-in on &lt;a href="https://www.packer.io/" rel="noopener noreferrer"&gt;&lt;strong&gt;Packer&lt;/strong&gt;&lt;/a&gt;. Every worker image is baked from a Packer template that lives in git: base OS, language runtimes, SDKs, build tools, everything pre-installed. The image gets a version. The version gets pinned in the worker config.&lt;/p&gt;

&lt;p&gt;This was the moment that builds started to feel reproducible. Before Packer, every worker was a slightly different snowflake, hand-installed and slowly drifting. After Packer, every worker that boots from image &lt;code&gt;v1.2.3&lt;/code&gt; is byte-for-byte the same as every other worker booted from image &lt;code&gt;v1.2.3&lt;/code&gt;. If a dependency upgrade breaks something, you know exactly which image introduced it, and you can pin back to the previous one in a one-line PR.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Ephemeral workers — born, used, destroyed
&lt;/h3&gt;

&lt;p&gt;This is the part that connects everything, and honestly the part I'm proudest of. Workers in this setup are &lt;strong&gt;ephemeral&lt;/strong&gt;. Not "long-lived agents we reboot once a week", but actually ephemeral. A pipeline asks Jenkins for a worker, a dedicated job spins one up from a known Packer image, the worker runs the build, and the worker dies. Always. Every build gets a clean environment.&lt;/p&gt;

&lt;p&gt;What spins the worker up depends on the platform, but the pattern is identical across all of them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Linux builds&lt;/strong&gt; — the &lt;a href="https://plugins.jenkins.io/kubernetes/" rel="noopener noreferrer"&gt;Jenkins Kubernetes plugin&lt;/a&gt; schedules a pod in the EKS cluster from a container image we baked. Build finishes, pod is deleted. Lifecycle is seconds to minutes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS EC2 / Azure VMs (Linux and Windows)&lt;/strong&gt; — a dedicated job runs &lt;a href="https://developer.hashicorp.com/terraform" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt; to provision and de-provision instances from Packer-built images.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;macOS VMs&lt;/strong&gt; — same idea, but the underlying virtualization is its own world. We spin up a fresh macOS VM from a Packer-baked image on each build (via Tart on Apple Silicon hosts, or vSphere for older fleets, or &lt;a href="https://github.com/cirruslabs/orchard" rel="noopener noreferrer"&gt;Orchard&lt;/a&gt; for pooled remote Macs), the build runs, and the VM is torn down at the end. macOS is messier and deserves its own post — that's Part 2 — but the contract is the same: &lt;strong&gt;born for one build, destroyed after&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The point is: &lt;strong&gt;every build starts from byte-identical state&lt;/strong&gt;. Not "mostly the same". Not "the same modulo &lt;code&gt;~/.cache&lt;/code&gt;". Identical. If &lt;code&gt;v1.2.3&lt;/code&gt; of an image is what's running, then every build on that image starts from the exact same filesystem snapshot the Packer pipeline produced. There's no human in between leaving footprints.&lt;/p&gt;

&lt;p&gt;That kills a whole category of bugs. No more "leftover state on the agent". No more "this worker has a weird &lt;code&gt;~/.cache&lt;/code&gt; somebody never cleaned up". No more "the disk filled up because of build artifacts from three weeks ago". No more "this only fails on Friday because the agent's been up since Monday and something is leaking". The worker simply doesn't live long enough to accumulate any of that.&lt;/p&gt;

&lt;p&gt;It also makes "build is non-reproducible" investigations a lot shorter. If two builds against the same commit produce different artifacts, the cause is almost never the worker — because the worker is brand new in both cases. That narrows the search dramatically.&lt;/p&gt;

&lt;p&gt;And it turns out to be a beautiful security property too: secrets that get pulled onto a worker disappear with it. There's no long-lived agent holding old tokens. If a credential leaks into a build environment, its blast radius is measured in minutes, not weeks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F65jvcvpkb9y00w4vxxdo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F65jvcvpkb9y00w4vxxdo.png" alt="One build, one worker — three platforms, same contract" width="800" height="742"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Terraform / Terragrunt for everything else
&lt;/h3&gt;

&lt;p&gt;All the things that aren't Jenkins itself — VPCs, IAM, secret stores, the EKS cluster, image galleries — live in &lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt;, organized with &lt;a href="https://terragrunt.gruntwork.io/" rel="noopener noreferrer"&gt;Terragrunt&lt;/a&gt; so the same modules get reused across &lt;code&gt;dev&lt;/code&gt; and &lt;code&gt;prod&lt;/code&gt; with different inputs. Same code, different variables. That's how I get &lt;code&gt;dev&lt;/code&gt; to be code-to-code identical to &lt;code&gt;prod&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you ever want to test how production will behave, just run the same Terraform with &lt;code&gt;ENV=stage&lt;/code&gt; instead of &lt;code&gt;ENV=prod&lt;/code&gt;. Same modules, same versions, just a different namespace. No surprises.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it all clicks together
&lt;/h2&gt;

&lt;p&gt;The flow ends up looking like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Somebody opens a pull request — could be a new job, a plugin bump, a JCasC tweak, a new Packer image.&lt;/li&gt;
&lt;li&gt;CI runs validation: YAML lint, Groovy compile checks, Terraform plan, Packer build for changed images.&lt;/li&gt;
&lt;li&gt;PR gets reviewed and merged.&lt;/li&gt;
&lt;li&gt;On merge, &lt;a href="https://github.com/features/actions" rel="noopener noreferrer"&gt;GitHub Actions&lt;/a&gt; applies infra changes via Terraform, and the Jenkins seeder picks up new DSL files on its next poll.&lt;/li&gt;
&lt;li&gt;Next build that needs a worker pulls the new image. No human in the loop.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's the loop. That's the whole point. The Jenkins UI becomes a window into what the repo says should be running, not the source of truth.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this fixed for me
&lt;/h2&gt;

&lt;p&gt;Here's what I noticed had actually changed:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8w99edkczsvh6cuxit8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8w99edkczsvh6cuxit8.png" alt="Before / After — Jenkins as a Code" width="800" height="279"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No more "works on stage, breaks on prod".&lt;/strong&gt; Because the two are literally the same code with different inputs. If it works on stage, it works on prod, modulo data differences.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugin upgrades stopped being scary.&lt;/strong&gt; They go through a PR. They get tried on &lt;code&gt;dev&lt;/code&gt;. They roll back with &lt;code&gt;git revert&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding got faster.&lt;/strong&gt; New engineers read the repo. They don't have to be told secrets or shown a Jenkins UI tour.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disaster recovery got real.&lt;/strong&gt; I can lose the controller VM, the EKS cluster, even the entire account, and as long as I have the repo I can rebuild.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit trail came for free.&lt;/strong&gt; Every change to any pipeline is a git commit, with an author, a timestamp, and a PR description. No more "who changed this and when".&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I'm still figuring out
&lt;/h2&gt;

&lt;p&gt;I don't want to make this sound like a finished story, because it's not. A few things still keep me up at night:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;macOS workers&lt;/strong&gt; are their own special kind of hell. You can't just spin up a Mac VM in AWS the same way you spin up Linux. There's a whole ecosystem of hypervisors, licensing rules, and hardware constraints to deal with. This deserves its own post — and it's getting one. &lt;strong&gt;Part 2 will be all about macOS workers&lt;/strong&gt;: &lt;a href="https://github.com/cirruslabs/tart" rel="noopener noreferrer"&gt;Tart&lt;/a&gt;, virtualization on Apple Silicon, the trade-offs between self-hosted and cloud-mac providers, and how to make signing and notarization not feel like a horror movie.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GitHub Actions cost&lt;/strong&gt; at scale. Running &lt;strong&gt;spot instances as GitHub Actions runners&lt;/strong&gt; is an easy way to offload certain workloads cheaply and save money, but it's its own rabbit hole: different trade-offs, different failure modes, different cost curves. &lt;strong&gt;Part 3 will cover spot-based GitHub Actions runners&lt;/strong&gt; end to end.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Closing thought
&lt;/h2&gt;

&lt;p&gt;If there's one thing I'd say to anyone reading this who's still managing Jenkins by clicking buttons, it's this: you're not lazy for doing it, you're just paying the cost in places that don't show up on a dashboard. The cost shows up when someone leaves the team, when a plugin update breaks a build at 2am, when a customer-facing deploy fails because &lt;code&gt;stage&lt;/code&gt; lied to you. &lt;strong&gt;Jenkins as a Code doesn't make those costs disappear, but it makes them visible and reviewable.&lt;/strong&gt; And that, honestly, has been worth all the work.&lt;/p&gt;




&lt;h2&gt;
  
  
  Appendix — tools and plugins I leaned on
&lt;/h2&gt;

&lt;p&gt;For anyone who wants to skip straight to the implementations, here's the short list of what's actually wired up in this setup:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Jenkins plugins&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://plugins.jenkins.io/configuration-as-code/" rel="noopener noreferrer"&gt;Configuration as Code (JCasC)&lt;/a&gt; — the controller config in YAML.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://plugins.jenkins.io/job-dsl/" rel="noopener noreferrer"&gt;Job DSL&lt;/a&gt; — jobs defined in Groovy in a git repo.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://plugins.jenkins.io/kubernetes/" rel="noopener noreferrer"&gt;Kubernetes plugin&lt;/a&gt; — ephemeral pod agents in EKS.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://plugins.jenkins.io/workflow-cps-global-lib/" rel="noopener noreferrer"&gt;Pipeline: Shared Groovy Libraries&lt;/a&gt; — the global libraries that hold reusable pipeline code.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Deployment&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/jenkinsci/helm-charts" rel="noopener noreferrer"&gt;Jenkins official Helm chart&lt;/a&gt; — what I use to deploy the controller.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/jenkinsci/kubernetes-operator" rel="noopener noreferrer"&gt;Jenkins Kubernetes Operator&lt;/a&gt; — the CRD-based alternative, if you prefer operators over Helm.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Image building&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.packer.io/" rel="noopener noreferrer"&gt;HashiCorp Packer&lt;/a&gt; — bakes all the worker images (Linux, Windows, macOS).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt; — everything outside Jenkins (VPCs, IAM, secrets, EKS, image galleries).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://terragrunt.gruntwork.io/" rel="noopener noreferrer"&gt;Terragrunt&lt;/a&gt; — keeps the same modules DRY across &lt;code&gt;dev&lt;/code&gt; / &lt;code&gt;stage&lt;/code&gt; / &lt;code&gt;prod&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://kubernetes.io/" rel="noopener noreferrer"&gt;Kubernetes&lt;/a&gt; / &lt;a href="https://aws.amazon.com/eks/" rel="noopener noreferrer"&gt;Amazon EKS&lt;/a&gt; — where the Jenkins controller lives.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://helm.sh/" rel="noopener noreferrer"&gt;Helm&lt;/a&gt; — package manager for the Kubernetes side.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/features/actions" rel="noopener noreferrer"&gt;GitHub Actions&lt;/a&gt; — applies Terraform on merge.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Coming up in later parts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/cirruslabs/tart" rel="noopener noreferrer"&gt;Tart&lt;/a&gt; — macOS VMs on Apple Silicon (Part 2).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/cirruslabs/orchard" rel="noopener noreferrer"&gt;Orchard&lt;/a&gt; — Tart cluster orchestration for macOS fleets (Part 2).&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This is Part 1 of My CI/CD Odyssey. If you want to be pinged when Part 2 drops, follow me here on dev.to. And if you're doing JaaC differently — I'd love to hear about it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>jenkins</category>
      <category>devops</category>
      <category>cicd</category>
      <category>gitops</category>
    </item>
    <item>
      <title>I Built a Debugger for LLM Agents — Here's Why "Observability" Wasn't Enough</title>
      <dc:creator>Raju Shanigarapu</dc:creator>
      <pubDate>Mon, 18 May 2026 16:32:33 +0000</pubDate>
      <link>https://forem.com/raazu_shanigarapu_65af2ba/i-built-a-debugger-for-llm-agents-heres-why-observability-wasnt-enough-4gia</link>
      <guid>https://forem.com/raazu_shanigarapu_65af2ba/i-built-a-debugger-for-llm-agents-heres-why-observability-wasnt-enough-4gia</guid>
      <description>&lt;p&gt;Every time I changed a prompt, I was running a hypothesis test.&lt;/p&gt;

&lt;p&gt;But I had no debugger. No way to pause execution. No structural comparison between "before" and "after." Just two terminal windows and a vague feeling that maybe it was better now.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://github.com/RAJUSHANIGARAPU/agent-lens" rel="noopener noreferrer"&gt;agent-lens&lt;/a&gt; to fix this.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem with "Observability"
&lt;/h2&gt;

&lt;p&gt;Langfuse, LangSmith, Phoenix — these are great tools. They show you what happened. Traces, spans, token counts.&lt;/p&gt;

&lt;p&gt;But none of them answer the question I actually had: &lt;strong&gt;did this change make it better?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That requires something different:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A way to compare two runs structurally&lt;/li&gt;
&lt;li&gt;A record of &lt;em&gt;why&lt;/em&gt; you made the change (the hypothesis)&lt;/li&gt;
&lt;li&gt;A verdict — not just "here are the numbers," but "this was an improvement"&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What agent-lens Does Differently
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Pause a live agent mid-run
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;agent_lens&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;agent_lens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;install&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;          &lt;span class="c1"&gt;# auto-patches OpenAI + Anthropic
&lt;/span&gt;&lt;span class="n"&gt;agent_lens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dashboard&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# localhost:7878
&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@agent_lens.trace&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;my_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open the dashboard, click &lt;strong&gt;Pause&lt;/strong&gt;. The agent blocks at the next LLM call.&lt;/p&gt;
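
&lt;p&gt;For intuition, one way a "block at the next LLM call" gate could work is a shared event that every patched call checks first. This is a minimal sketch under that assumption, not agent-lens's actual implementation; &lt;code&gt;PauseGate&lt;/code&gt; and &lt;code&gt;guarded_call&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import threading


class PauseGate:
    """Hypothetical gate that a patched LLM client checks before each call."""

    def __init__(self):
        self._running = threading.Event()
        self._running.set()  # start in the running state

    def pause(self):
        # Subsequent calls to wait_if_paused() will block.
        self._running.clear()

    def resume(self):
        # Releases every thread currently blocked on the gate.
        self._running.set()

    def wait_if_paused(self, timeout=None):
        # Returns True once running, or False if the timeout expired while paused.
        return self._running.wait(timeout)


def guarded_call(gate, llm_call, *args, **kwargs):
    """Block while the gate is paused, then forward the call unchanged."""
    gate.wait_if_paused()
    return llm_call(*args, **kwargs)


gate = PauseGate()
fake_llm = lambda query: query.upper()  # stand-in for a real client call

gate.pause()
still_paused = not gate.wait_if_paused(timeout=0.01)  # times out while paused
gate.resume()
answer = guarded_call(gate, fake_llm, "hello")
```

&lt;p&gt;Work already in flight is not interrupted; only the next call waits, which matches the "blocks at the next LLM call" behaviour described above.&lt;/p&gt;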

&lt;h3&gt;
  
  
  2. State a hypothesis before you change anything
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST /runs/{run_id}/fork
{
  "span_id": "abc123",
  "edited_messages": [{"role": "system", "content": "Be concise."}],
  "notes": "Hypothesis: shorter system prompt reduces hallucination",
  "expected_output": "concise"
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The note travels with the run forever. Future you can read your reasoning.&lt;/p&gt;
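
&lt;p&gt;If you would rather script the fork than click through the dashboard, the request above can be built with the Python standard library. A sketch only: it assumes the HTTP API is served on the same &lt;code&gt;localhost:7878&lt;/code&gt; address as the dashboard, and &lt;code&gt;build_fork_request&lt;/code&gt; and the run id are hypothetical.&lt;/p&gt;

```python
import json
import urllib.request

BASE_URL = "http://localhost:7878"  # assumption: API shares the dashboard's port


def build_fork_request(run_id, span_id, system_prompt, hypothesis, expected_output):
    """Build (but do not send) a fork request with the hypothesis attached."""
    body = {
        "span_id": span_id,
        "edited_messages": [{"role": "system", "content": system_prompt}],
        "notes": hypothesis,
        "expected_output": expected_output,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/runs/{run_id}/fork",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_fork_request(
    run_id="run-1",  # hypothetical run id
    span_id="abc123",
    system_prompt="Be concise.",
    hypothesis="Hypothesis: shorter system prompt reduces hallucination",
    expected_output="concise",
)
# To actually send it against a running dashboard:
# urllib.request.urlopen(request)
```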

&lt;h3&gt;
  
  
  3. GET /diff — one call, one verdict
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;GET /runs/&lt;span class="o"&gt;{&lt;/span&gt;run_a&lt;span class="o"&gt;}&lt;/span&gt;/diff/&lt;span class="o"&gt;{&lt;/span&gt;run_b&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metrics_delta"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"latency_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1847&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"b"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;820&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"pct_change"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-55.6&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;453&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"b"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;87&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nl"&gt;"pct_change"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-80.8&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"cost_usd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0045&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"b"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.00087&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pct_change"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-80.7&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assertion_result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"expected_output"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"concise"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"passed_in_a"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"passed_in_b"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"verdict"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"improved"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hypothesis confirmed. With numbers.&lt;/p&gt;
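
&lt;p&gt;The numbers in that response are easy to sanity-check by hand. The sketch below recomputes &lt;code&gt;pct_change&lt;/code&gt; as the relative change from run A to run B and derives a verdict from the two assertion flags. It is an illustration, not agent-lens's code: the rounding to one decimal and the verdict labels other than &lt;code&gt;improved&lt;/code&gt; are assumptions.&lt;/p&gt;

```python
def pct_change(a: float, b: float) -> float:
    """Relative change from run A to run B in percent (assumes a != 0)."""
    return round((b - a) / a * 100, 1)


def verdict(passed_in_a: bool, passed_in_b: bool) -> str:
    """Map the two assertion outcomes to a single label."""
    if passed_in_b and not passed_in_a:
        return "improved"
    if passed_in_a and not passed_in_b:
        return "regressed"  # hypothetical label, not confirmed by the source
    return "unchanged"      # hypothetical label, not confirmed by the source


# Metric values taken from the example response above.
run_a = {"latency_ms": 1847, "total_tokens": 453, "cost_usd": 0.0045}
run_b = {"latency_ms": 820, "total_tokens": 87, "cost_usd": 0.00087}

metrics_delta = {
    name: {"a": run_a[name], "b": run_b[name],
           "pct_change": pct_change(run_a[name], run_b[name])}
    for name in run_a
}
result = verdict(passed_in_a=False, passed_in_b=True)
print(metrics_delta, result)
```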




&lt;h2&gt;
  
  
  The Full Flow
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Agent running] → Pause → agent blocks at next LLM call
                              ↓
                    [Edit messages in dashboard]
                              ↓
                    Fork → new run diverges
                              ↓
                    Resume → original continues
                              ↓
              [Two runs. GET /diff. Get verdict.]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No restarts. No re-running preceding steps.&lt;/p&gt;




&lt;h2&gt;
  
  
  Zero Infrastructure
&lt;/h2&gt;

&lt;p&gt;Everything runs locally. SQLite at &lt;code&gt;~/.agent-lens/runs.db&lt;/code&gt;. No Docker. No cloud. No API keys needed to start exploring:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;agentlens-tracer
python examples/07_demo_mock.py  &lt;span class="c"&gt;# runs a full demo with no API key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Works with LangChain and LlamaIndex Too
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_lens.integrations.langchain&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentLensCallbackHandler&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_lens.integrations.llamaindex&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentLensLlamaIndexHandler&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pass as a callback — every LLM call is traced automatically.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;You're not debugging a function. You're debugging a probabilistic system. Every prompt change is a hypothesis test.&lt;/p&gt;

&lt;p&gt;Today you run that test by eyeballing outputs. agent-lens makes it structural, repeatable, and recorded.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Vibes-based prompt engineering is debugging without a debugger.&lt;br&gt;
agent-lens is the debugger.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/RAJUSHANIGARAPU/agent-lens" rel="noopener noreferrer"&gt;https://github.com/RAJUSHANIGARAPU/agent-lens&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Install:&lt;/strong&gt; &lt;code&gt;pip install agentlens-tracer&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Would love to hear how you're currently debugging LLM agents — drop a comment below.&lt;/p&gt;

</description>
      <category>python</category>
      <category>llm</category>
      <category>opensource</category>
      <category>devtool</category>
    </item>
    <item>
      <title>I Built a Free Image Compressor That Never Uploads Your Files</title>
      <dc:creator>yangjiaqiang12</dc:creator>
      <pubDate>Mon, 18 May 2026 16:29:39 +0000</pubDate>
      <link>https://forem.com/yangjiaqiang12/i-built-a-free-image-compressor-that-never-uploads-your-files-5da7</link>
      <guid>https://forem.com/yangjiaqiang12/i-built-a-free-image-compressor-that-never-uploads-your-files-5da7</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;With most online image compressors, your files get uploaded to someone else's computer. TinyPNG does it. Compressor.io does it. Even most so-called "free" tools process your images on their own servers.&lt;/p&gt;

&lt;p&gt;This has always felt wrong to me. Images can contain sensitive stuff -- screenshots of conversations, private photos, business documents, unreleased designs. Why should making them smaller require handing them to a stranger?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Squash
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://yangjiaqiang12.github.io/squash-image-compressor/" rel="noopener noreferrer"&gt;Squash&lt;/a&gt;&lt;/strong&gt; is a free, open-source image compressor that runs &lt;strong&gt;entirely in your browser&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🚫 &lt;strong&gt;No uploads&lt;/strong&gt; -- everything stays on your device&lt;/li&gt;
&lt;li&gt;⚡ &lt;strong&gt;Instant compression&lt;/strong&gt; -- no waiting for network round-trips&lt;/li&gt;
&lt;li&gt;🎨 &lt;strong&gt;Multi-format&lt;/strong&gt; -- JPEG, PNG, WebP support&lt;/li&gt;
&lt;li&gt;📦 &lt;strong&gt;Batch processing&lt;/strong&gt; -- compress multiple images at once&lt;/li&gt;
&lt;li&gt;🎚️ &lt;strong&gt;Quality slider&lt;/strong&gt; -- full control from 1% to 100%&lt;/li&gt;
&lt;li&gt;📐 &lt;strong&gt;Resize&lt;/strong&gt; -- set max dimensions while compressing&lt;/li&gt;
&lt;li&gt;🌓 &lt;strong&gt;Dark mode&lt;/strong&gt; -- works great at night&lt;/li&gt;
&lt;li&gt;💰 &lt;strong&gt;Completely free&lt;/strong&gt; -- no limits, no watermarks&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;Squash uses the browser's Canvas API to decode images, apply compression settings, and re-encode them at your chosen quality level. All the heavy lifting happens on your own device -- not on some server farm.&lt;/p&gt;

&lt;p&gt;The processing pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Load&lt;/strong&gt; -- Image is decoded from file into raw pixel data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resize&lt;/strong&gt; -- If a max width is set, image is scaled down&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encode&lt;/strong&gt; -- Pixel data is re-encoded at the chosen quality and format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Download&lt;/strong&gt; -- The compressed result is ready to save&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No server. No upload. No privacy concern.&lt;/p&gt;
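
&lt;p&gt;The four steps above can be sketched with standard browser APIs. This is a minimal illustration, not Squash's actual source: the names &lt;code&gt;fitWithin&lt;/code&gt; and &lt;code&gt;compressImage&lt;/code&gt; and the shape of &lt;code&gt;options&lt;/code&gt; are assumptions made for this example, and it presumes a browser that supports &lt;code&gt;createImageBitmap&lt;/code&gt;.&lt;/p&gt;

```javascript
// Pure helper: scale (width, height) down so width does not exceed maxWidth.
// Keeps the aspect ratio and never upscales. A maxWidth of 0 means "no limit".
function fitWithin(width, height, maxWidth) {
  if (!maxWidth || maxWidth >= width) {
    return { width, height };
  }
  const scale = maxWidth / width;
  return { width: maxWidth, height: Math.round(height * scale) };
}

// Browser-only sketch of load -> resize -> encode -> download.
// `file` is a File or Blob; `options` is a hypothetical settings object.
async function compressImage(file, options = {}) {
  const { maxWidth = 0, quality = 0.8, type = "image/jpeg" } = options;

  // 1. Load: decode the file into a bitmap.
  const bitmap = await createImageBitmap(file);

  // 2. Resize: draw the bitmap onto a canvas of the target size.
  const size = fitWithin(bitmap.width, bitmap.height, maxWidth);
  const canvas = document.createElement("canvas");
  canvas.width = size.width;
  canvas.height = size.height;
  canvas.getContext("2d").drawImage(bitmap, 0, 0, size.width, size.height);
  bitmap.close();

  // 3. Encode: re-encode the canvas pixels at the chosen format and quality.
  const blob = await new Promise((resolve, reject) => {
    canvas.toBlob(
      (result) => (result ? resolve(result) : reject(new Error("Encoding failed"))),
      type,
      quality
    );
  });

  // 4. Download: return an object URL the page can attach to a download link.
  return { blob, url: URL.createObjectURL(blob) };
}
```

&lt;p&gt;In a page, calling something like &lt;code&gt;compressImage(file, { maxWidth: 1600, quality: 0.7, type: "image/webp" })&lt;/code&gt; on a file chosen through a file input would yield a blob that never leaves the browser. Browsers apply the quality argument only to lossy formats such as JPEG and WebP; for PNG output it is ignored.&lt;/p&gt;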

&lt;h2&gt;
  
  
  Why I Built This
&lt;/h2&gt;

&lt;p&gt;I wanted a tool that respects privacy. Not as a marketing slogan -- as a &lt;strong&gt;technical guarantee&lt;/strong&gt;. Your images cannot be uploaded because there is no backend to receive them: the entire application is static HTML, CSS, and JavaScript served from GitHub Pages.&lt;/p&gt;

&lt;p&gt;The source code is open (MIT license). You can inspect every line. You can host it yourself. You can verify that nothing leaves your browser.&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparison with Other Tools
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Privacy&lt;/th&gt;
&lt;th&gt;Batch Mode&lt;/th&gt;
&lt;th&gt;WebP Support&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Squash&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Local only&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Squoosh&lt;/td&gt;
&lt;td&gt;✅ Local only&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TinyPNG&lt;/td&gt;
&lt;td&gt;❌ Uploads&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;Free (20/day)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compressor.io&lt;/td&gt;
&lt;td&gt;❌ Uploads&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;Free (10MB cap)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The key difference:&lt;/strong&gt; Squash and Squoosh both process images locally, but Squoosh has no batch mode. Of the four tools compared here, Squash is the only one that combines &lt;strong&gt;local processing + batch mode + multi-format + unlimited use&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vanilla HTML/CSS/JavaScript&lt;/strong&gt; -- No frameworks, no dependencies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Canvas API&lt;/strong&gt; -- Browser-native image processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Pages&lt;/strong&gt; -- Free, fast static hosting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero backend&lt;/strong&gt; -- No server, no database, no API calls&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://yangjiaqiang12.github.io/squash-image-compressor/" rel="noopener noreferrer"&gt;Launch Squash&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;📂 &lt;strong&gt;&lt;a href="https://github.com/yangjiaqiang12/squash-image-compressor" rel="noopener noreferrer"&gt;Source Code on GitHub&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you find it useful, consider &lt;a href="https://ko-fi.com/squashtools" rel="noopener noreferrer"&gt;buying me a coffee&lt;/a&gt; ☕ -- it helps keep the project alive and improving.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with vanilla HTML/CSS/JS. No frameworks, no dependencies, no build step. MIT licensed.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>opensource</category>
      <category>privacy</category>
    </item>
  </channel>
</rss>
