<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Drizz</title>
    <description>The latest articles on Forem by Drizz (@drizzdev).</description>
    <link>https://forem.com/drizzdev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F12255%2Fefaf45d8-e9fa-4077-bc5e-9ea962af9f5f.png</url>
      <title>Forem: Drizz</title>
      <link>https://forem.com/drizzdev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/drizzdev"/>
    <language>en</language>
    <item>
      <title>Your 2026 Mobile Stack Is Modern Everywhere Except Testing</title>
      <dc:creator>Jay Saadana</dc:creator>
      <pubDate>Fri, 27 Mar 2026 11:19:50 +0000</pubDate>
      <link>https://forem.com/drizzdev/your-2026-mobile-stack-is-modern-everywhere-except-testing-45gd</link>
      <guid>https://forem.com/drizzdev/your-2026-mobile-stack-is-modern-everywhere-except-testing-45gd</guid>
      <description>&lt;p&gt;I spent 6 months talking to mobile engineers about their tooling. Flutter or React Native on the frontend. Supabase or Firebase on the backend. GitHub Actions for CI/CD. Mixpanel for analytics. Sentry for crash reporting.&lt;/p&gt;

&lt;p&gt;Every layer modern, maintained, actually pleasant to work with.&lt;br&gt;
Then I'd ask about testing. The energy would shift.&lt;br&gt;
Appium suites held together by brittle XPaths and Thread.sleep(). Espresso on Android, XCUITest on iOS: the same user flow, written and maintained twice. Flakiness rates sitting at 15-20%, sometimes spiking to 25% on real devices. One mobile lead estimated $200K/year in engineering time just on test maintenance: not catching bugs, but fixing selectors that broke because someone changed an accessibility label or moved a component one level deeper in the hierarchy.&lt;/p&gt;

&lt;p&gt;Some teams just stopped writing tests altogether. Fell back to manual QA for critical flows. Not because they wanted to, but because the testing experience was so painful that false failures every morning felt worse than no automation at all.&lt;/p&gt;

&lt;p&gt;The numbers tell the same story. I audited the modern mobile stack across 8 layers using adoption data from Stack Overflow's 2025 Developer Survey, Statista, and 40+ engineer conversations. &lt;/p&gt;

&lt;p&gt;Here's what stood out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Flutter (46% market share) and React Native (35%) dominate the frontend; both shipped or had major architecture updates between 2017 and 2024.&lt;/li&gt;
&lt;li&gt;Supabase hit $2B valuation and 1.7M+ developers. 40% of recent YC batches build on it.&lt;/li&gt;
&lt;li&gt;GitHub Actions leads CI/CD for most teams. Bitrise reports 28% faster builds vs. GitHub Hosted Runners for mobile-specific workflows.&lt;/li&gt;
&lt;li&gt;Sentry's AI-powered root cause analysis hits 94.5% accuracy. Crashlytics remains free and solid.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of this is 2019-2024 era tooling. Then there's testing still running on frameworks built in 2011-2012. Appium was created the same year Instagram launched. Think about that for a second.&lt;/p&gt;

&lt;p&gt;The core problem isn't that Appium doesn't work. It's architectural. Selector-based testing couples your tests to implementation details. Your test doesn't say "tap the login button"; it says "find the element at //android.widget.Button[@resource-id='com.app:id/login_btn'] and click it."&lt;br&gt;
Designer renames that ID? Test breaks. A promo banner shifts the layout? Timing error.&lt;br&gt;
Need the same test on iOS? Rewrite it.&lt;/p&gt;
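&lt;p&gt;A toy sketch of that coupling (plain Python, not real Appium code, with a screen modeled as a list of element dicts): an ID-based lookup dies the moment the ID is renamed, while a lookup grounded in the visible text survives the refactor:&lt;/p&gt;

```python
# Toy model of a screen: each element is a dict of attributes.
def find_by_id(screen, resource_id):
    """Selector-style lookup: coupled to an implementation detail."""
    return next((e for e in screen if e.get("id") == resource_id), None)

def find_by_visible_text(screen, text):
    """Vision-style lookup: grounded in what the user actually sees."""
    return next((e for e in screen if e.get("text") == text), None)

before = [{"id": "login_btn", "text": "Login"}]
after = [{"id": "sign_in_btn", "text": "Login"}]  # a dev renamed the ID

assert find_by_id(before, "login_btn") is not None       # passes today
assert find_by_id(after, "login_btn") is None            # breaks tomorrow, app still fine
assert find_by_visible_text(after, "Login") is not None  # still finds the button
```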

&lt;p&gt;None of these failures mean your app is broken. They mean your locator stopped matching. That's busywork, not QA.&lt;/p&gt;

&lt;p&gt;The architectural shift that's closing this gap is Vision AI testing. Instead of querying the element tree, it looks at the rendered screen, the same pixels your user sees. Tools like Drizz identify a "Login" button visually whether the underlying component is a Button, a TouchableOpacity, or a custom View with an onPress handler.&lt;br&gt;
What that looks like in practice: a checkout flow that takes 30+ lines of Java with explicit waits and XPath selectors in Appium becomes 6 lines of plain English. Same coverage. Runs on both platforms without rewriting. And when the UI changes (a button moves, text updates, a component gets refactored), the test keeps passing because it's not tied to the element tree.&lt;/p&gt;
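&lt;p&gt;For a sense of what 6 lines of plain English can look like, here is an illustrative sketch of such a checkout test. These steps are hypothetical and not Drizz's actual syntax:&lt;/p&gt;

```
Launch the app and sign in as the test user
Search for "wireless headphones"
Add the first result to the cart
Open the cart and tap Checkout
Pay with the saved test card
Verify that the order confirmation screen appears
```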

&lt;p&gt;The early numbers from teams running this approach: &amp;lt;5% flakiness vs. the 15-20% industry average. Test creation dropping from hours to minutes. And the part that surprised me most: non-engineers (PMs, designers) actually contributing test cases, because there's no code to write.&lt;/p&gt;

&lt;p&gt;I'm not saying rip out Appium tomorrow. If you've got a stable suite, deep device-level tests (biometrics, sensors, push notifications), or compliance requirements that mandate the W3C WebDriver protocol, Appium is still the right tool. The full post honestly covers where each approach wins.&lt;/p&gt;

&lt;p&gt;But if you're spending more sprint time fixing green-path tests than shipping features, the comparison is worth 10 minutes of your time.&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://bit.ly/4uSv7QL" rel="noopener noreferrer"&gt;Read the full 8-layer stack audit with adoption stats, side by side code comparisons, and the ROI math on what test maintenance is actually costing your team&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your frontend is 2026. Your backend is 2026. Is your testing layer still stuck in 2012?&lt;/p&gt;

</description>
      <category>mobile</category>
      <category>testing</category>
      <category>ai</category>
      <category>android</category>
    </item>
    <item>
      <title>Your Mobile Tests Keep Breaking. Vision AI Fixes That</title>
      <dc:creator>Jay Saadana</dc:creator>
      <pubDate>Mon, 02 Mar 2026 04:31:40 +0000</pubDate>
      <link>https://forem.com/drizzdev/your-mobile-tests-keep-breaking-vision-ai-fixes-that-384f</link>
      <guid>https://forem.com/drizzdev/your-mobile-tests-keep-breaking-vision-ai-fixes-that-384f</guid>
      <description>&lt;p&gt;68% of engineering teams say test maintenance is their biggest QA bottleneck. Not writing tests. Not finding bugs. Just keeping existing tests from breaking.&lt;br&gt;
The problem? Traditional test automation treats your app like a collection of XML nodes, not a visual interface designed for human eyes. Every time a developer refactors a screen, tests break. Even when the app works perfectly.&lt;/p&gt;


&lt;h2&gt;
  
  
  There's a Better Way
&lt;/h2&gt;

&lt;p&gt;Vision Language Models (VLMs), the same AI shift behind ChatGPT but with eyes, are changing the game. Instead of fragile locators, VLM-powered testing agents see your app the way a human tester does.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The results speak for themselves:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;95%+ test stability&lt;/strong&gt; (vs. 70-80% with traditional automation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test creation in minutes&lt;/strong&gt;, not hours&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;50%+ reduction&lt;/strong&gt; in maintenance effort&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual bugs caught&lt;/strong&gt; that locator-based tests consistently miss&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  What Does This Look Like in Practice?
&lt;/h2&gt;

&lt;p&gt;Instead of writing this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;driver.findElement(By.id("login_button")).click()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;you simply write:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tap on the Login button.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI handles the rest: visually identifying elements, adapting to UI changes, and executing actions without a single locator.&lt;/p&gt;




&lt;h2&gt;
  
  
  But Wait, Isn't Every Tool Claiming "AI-Powered" Now?
&lt;/h2&gt;

&lt;p&gt;Yes. And most of them are still parsing the element tree under the hood.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;NLP-based tools&lt;/strong&gt; still generate locator-based scripts. When structure changes dramatically, they break.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-healing locators&lt;/strong&gt; fix minor issues like renamed IDs, but still depend on the element tree.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vision AI&lt;/strong&gt; eliminates locator dependency entirely. Tests are grounded in what's visible, not how elements are implemented.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The difference? Other platforms report 60–85% maintenance reduction. Vision AI achieves near-zero maintenance because tests never relied on brittle selectors in the first place.&lt;/p&gt;




&lt;h2&gt;
  
  
  How VLMs Actually Work
&lt;/h2&gt;

&lt;p&gt;Modern VLMs follow three primary architectural approaches. &lt;strong&gt;Fully integrated models&lt;/strong&gt; like GPT-4o and Gemini process images and text through unified transformer layers, delivering the strongest reasoning at the highest compute cost. &lt;strong&gt;Visual adapter models&lt;/strong&gt; like LLaVA and BLIP-2 connect pre-trained vision encoders to LLMs, striking a practical balance between performance and efficiency. &lt;strong&gt;Parameter-efficient models&lt;/strong&gt; like Phi-4 Multimodal achieve roughly 85–90% of the accuracy of larger VLMs while enabling sub-100ms inference, ideal for edge and real-time use cases.&lt;br&gt;
Under the hood, these models learn through contrastive learning (aligning images and text in a shared embedding space), image captioning, and instruction tuning. CLIP's training on over 400 million image-text pairs laid the foundation for how most VLMs generalise across tasks today.&lt;/p&gt;
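&lt;p&gt;To make the contrastive idea concrete, here is a minimal sketch with made-up 3-D vectors (a real CLIP-style model uses learned encoders and hundreds of dimensions): matching image-text pairs end up with the highest cosine similarity in the shared space:&lt;/p&gt;

```python
import numpy as np

def normalize(x):
    """Project embeddings onto the unit sphere so dot product = cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Tiny made-up embeddings standing in for encoder outputs.
image_embs = normalize(np.array([[1.0, 0.1, 0.0],    # screenshot of a login screen
                                 [0.0, 1.0, 0.1]]))  # screenshot of a checkout screen
text_embs = normalize(np.array([[0.9, 0.2, 0.0],     # "a login button"
                                [0.1, 0.9, 0.2]]))   # "a checkout cart"

# Similarity matrix: rows = images, columns = texts.
sim = image_embs @ text_embs.T
best_caption = sim.argmax(axis=1)  # best-matching text for each image
```

Contrastive training pushes the diagonal of this matrix up and the off-diagonal down; here the first screenshot already aligns with the first caption and the second with the second.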




&lt;h2&gt;
  
  
  The VLM Landscape at a Glance
&lt;/h2&gt;

&lt;p&gt;The space is moving fast. &lt;strong&gt;GPT-4o&lt;/strong&gt; leads in complex reasoning. &lt;strong&gt;Gemini 2.5 Pro&lt;/strong&gt; handles long context up to 1M tokens. &lt;strong&gt;Claude 3.5 Sonnet&lt;/strong&gt; excels at document analysis and layouts. On the open-source side, &lt;strong&gt;Qwen2.5-VL-72B&lt;/strong&gt; delivers strong OCR at lower cost, while &lt;strong&gt;DeepSeek-VL2&lt;/strong&gt; targets low-latency applications. Open-source models now perform within 5–10% of proprietary alternatives, with full fine-tuning flexibility and no per-call API costs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started with VLM-Powered Testing
&lt;/h2&gt;

&lt;p&gt;You don't need to rework your entire automation strategy. Start by identifying 20–30 critical test cases, the ones that break most often and create the most CI noise. Write them in plain English instead of locator-driven scripts. Then plug into your existing CI/CD pipeline (GitHub Actions, Jenkins, and CircleCI are all supported). Upload your APK, configure tests, and trigger on every build. Because tests rely on visual understanding, failures are more meaningful and far easier to diagnose.&lt;br&gt;
If you're curious to go deeper, we've written a more detailed breakdown of how VLMs work under the hood, why Vision AI outperforms most "AI testing" methods, benchmark comparisons, and a practical adoption guide. &lt;a href="https://bit.ly/4tRzcUV" rel="noopener noreferrer"&gt;You can read the full blog here&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  See It in Action
&lt;/h2&gt;

&lt;p&gt;Drizz brings Vision AI testing to teams who need reliability at speed. Upload your APK, write tests in plain English, and get your 20 most critical test cases running in CI/CD within a day.&lt;/p&gt;

&lt;p&gt;No locators. No flaky tests. No maintenance burden.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.drizz.dev/book-a-demo" rel="noopener noreferrer"&gt;Schedule a Demo&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mobile</category>
      <category>productivity</category>
      <category>android</category>
    </item>
  </channel>
</rss>
