<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: ARmedia</title>
    <description>The latest articles on Forem by ARmedia (@sns_ar).</description>
    <link>https://forem.com/sns_ar</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3668324%2F5384dfe2-8fc8-4869-96c1-cc4d869da042.png</url>
      <title>Forem: ARmedia</title>
      <link>https://forem.com/sns_ar</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/sns_ar"/>
    <language>en</language>
    <item>
      <title>Building a Smart Refrigerator with a $15 IoT Camera and SAM 3: Solving the "Warm Water" Problem</title>
      <dc:creator>ARmedia</dc:creator>
      <pubDate>Thu, 18 Dec 2025 07:43:46 +0000</pubDate>
      <link>https://forem.com/sns_ar/building-a-smart-refrigerator-with-a-15-iot-camera-and-sam-3-solving-the-warm-water-problem-1pa3</link>
      <guid>https://forem.com/sns_ar/building-a-smart-refrigerator-with-a-15-iot-camera-and-sam-3-solving-the-warm-water-problem-1pa3</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;I've been battling with AI (Claude) for 14 hours a day. Couldn't be happier.&lt;/p&gt;

&lt;p&gt;— Akio Shiki (@ar_akio) &lt;a href="https://twitter.com/ar_akio/status/1980185472078070040?ref_src=twsrc%5Etfw" rel="noopener noreferrer"&gt;October 20, 2025&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;Black Friday Haul and the Warm Water Problem&lt;/h2&gt;

&lt;p&gt;Hi, I'm Akio, an engineer at an AI startup.&lt;/p&gt;

&lt;p&gt;So, how did everyone spend &lt;strong&gt;Amazon Black Friday&lt;/strong&gt; this year? As usual, I bulk-ordered sparkling water and bottled drinks purely because they were "on sale," and now my entryway is buried under a tower of cardboard boxes.&lt;/p&gt;

&lt;p&gt;And then comes the inevitable: &lt;strong&gt;"I forgot to move them to the fridge, and now I'm stuck drinking warm water."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"Can technology solve this?" ...Half joking, half serious.&lt;/p&gt;

&lt;p&gt;This time, as a PoC (Proof of Concept) with an eye toward future business applications, I built an &lt;strong&gt;object recognition system combining the ultra-cheap IoT microcontroller "ESP32" with Meta's latest model "SAM 3 (Segment Anything Model 3)."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrfgvrebjv3tr9d66cvz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrfgvrebjv3tr9d66cvz.png" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Mountains of bottled water still sitting in cardboard boxes at the office&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Why "water bottles in a refrigerator"? Because hidden within this seemingly trivial problem are &lt;strong&gt;technical challenges that any AI engineer can appreciate&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;Why "Refrigerator × Water Bottles" Is the Perfect PoC&lt;/h2&gt;

&lt;p&gt;This might seem like a joke topic, but from a computer vision (CV) perspective, the inside of a refrigerator is an &lt;strong&gt;S-rank difficulty dungeon&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here are the three reasons I chose this as my PoC target. These same challenges apply directly to industrial robotics and autonomous vehicles:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Difficulty of Transparent/Translucent Objects&lt;/strong&gt;: Water bottles are transparent. Not only does the background show through, but they create complex light reflections from the interior lighting. Traditional CNN-based object detection often fails to capture the contours and processes them as part of the background.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The "Transparent Shelf" Trap&lt;/strong&gt;: Modern refrigerators have transparent plastic or glass shelves. Even powerful segmentation models like SAM can misidentify shelf edges as "object boundaries," or detect objects on lower shelves by seeing through the transparent shelf above.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Challenging Lighting Conditions&lt;/strong&gt;: The back of the fridge is dark; the front is bright. On top of the extreme contrast, bottle shadows fall on transparent shelves, and the system might mistake those "shadows" for actual objects.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7imp0yghx7d6yn8r7cvw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7imp0yghx7d6yn8r7cvw.jpg" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The actual inside of the refrigerator&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In other words: &lt;strong&gt;"If we can accurately segment water bottles in this harsh environment, most object recognition tasks in offices or factories will be a breeze."&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;System Architecture: $15 Edge Device Meets State-of-the-Art AI&lt;/h2&gt;

&lt;p&gt;The setup is extremely simple and low-cost:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Edge (Eyes):&lt;/strong&gt; ESP32S3-CAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brain:&lt;/strong&gt; Local PC server (Ryzen AI Max+ 395 running Meta SAM 3)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network:&lt;/strong&gt; WiFi (HTTP POST)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The ESP32 itself doesn't have the horsepower to run SAM 3. It functions purely as a capture device, sending images to the server. The server handles inference and returns results (inventory count, mask images, etc.).&lt;/p&gt;
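&lt;p&gt;As a rough sketch of the server half of that loop (the &lt;code&gt;/upload&lt;/code&gt; endpoint and the &lt;code&gt;count_bottles&lt;/code&gt; stub are illustrative placeholders, not the actual project code; the real handler would invoke SAM 3 inference):&lt;/p&gt;

```python
# Minimal receiver for the ESP32 -> PC pipeline: the camera POSTs a JPEG,
# the server runs inference and replies with a JSON inventory count.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def count_bottles(jpeg_bytes: bytes) -> int:
    """Placeholder for the SAM 3 call: segment the image, count bottle masks."""
    return 0  # stub; swap in real model inference here

class CaptureHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        image = self.rfile.read(length)                  # raw JPEG bytes from the ESP32
        body = json.dumps({"bottles": count_bottles(image)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):                   # silence per-request logging
        pass

# To serve: HTTPServer(("0.0.0.0", 8000), CaptureHandler).serve_forever()
```

&lt;p&gt;On the ESP32 side, a plain HTTP POST of the JPEG frame buffer to this endpoint is all that's needed.&lt;/p&gt;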

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4q50ps7t20acc4pdswh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4q50ps7t20acc4pdswh.png" alt=" " width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;System architecture diagram&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;The Zero-Tuning Revelation: Fully AI-Driven Results&lt;/h2&gt;

&lt;p&gt;The achievement I want to emphasize most from this PoC isn't the recognition accuracy itself—it's that &lt;strong&gt;we made zero environment-specific customizations&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Typically, for demos like this, there's a temptation to "cheat" (or "optimize") by adjusting lighting or fixing the camera at an angle that's easy to recognize. For this system, we eliminated all of that.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Refrigerator-agnostic:&lt;/strong&gt; Zero calibration for specific refrigerator models or shelf arrangements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Object-agnostic:&lt;/strong&gt; No pre-training on specific water bottle brands or shapes (round vs. square).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Camera position-free:&lt;/strong&gt; No precise adjustment of camera placement or angle.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means the transparent object recognition was achieved &lt;strong&gt;entirely through SAM 3's inference capabilities and automatic adjustments&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In traditional image processing development, you'd need tedious parameter tuning (heuristic craftsmanship) like "for this refrigerator's lighting conditions, the binarization threshold should be around X..." &lt;/p&gt;

&lt;p&gt;But this time, we completely eliminated that human "overfitting to the environment." The fact that a pure foundation model demonstrated this level of environmental adaptability has huge implications for reducing deployment costs and accelerating rollout speed.&lt;/p&gt;

&lt;h2&gt;Results and Future Outlook&lt;/h2&gt;

&lt;p&gt;The result: water bottles inside the refrigerator were segmented with remarkable accuracy, unfazed by transparent shelves or reflections.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8w22uytvggamb02vxevw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8w22uytvggamb02vxevw.jpg" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Water bottles in the refrigerator with clean segmentation masks (color-coded)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Now I might finally avoid the tragedy of having a packed fridge when I don't want a drink, and an empty one when I do... maybe.&lt;/p&gt;

&lt;p&gt;Through this PoC, we confirmed that &lt;strong&gt;foundation models like SAM 3 can be a powerful solution to the classic challenge of "transparent object recognition."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As a startup, we're searching for the seeds of society-changing innovation through the accumulation of experiments like this—experiments that are relatable (and admittedly a bit ridiculous).&lt;/p&gt;

&lt;p&gt;If you have thoughts or feedback on this article, or if you're an engineer thinking "I want to try this with my fridge!"—drop a comment below!&lt;/p&gt;

</description>
      <category>iot</category>
      <category>ai</category>
      <category>showdev</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Running SAM 3 on AMD Ryzen AI Max+ 395: A Complete Guide to Fixing the rocBLAS Error</title>
      <dc:creator>ARmedia</dc:creator>
      <pubDate>Thu, 18 Dec 2025 07:43:37 +0000</pubDate>
      <link>https://forem.com/sns_ar/running-sam-3-on-amd-ryzen-ai-max-395-a-complete-guide-to-fixing-the-rocblas-error-25io</link>
      <guid>https://forem.com/sns_ar/running-sam-3-on-amd-ryzen-ai-max-395-a-complete-guide-to-fixing-the-rocblas-error-25io</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;I've been battling with AI (Claude) for 14 hours a day. Couldn't be happier.&lt;/p&gt;

&lt;p&gt;— Akio Shiki (@ar_akio) &lt;a href="https://twitter.com/ar_akio/status/1980185472078070040?ref_src=twsrc%5Etfw" rel="noopener noreferrer"&gt;October 20, 2025&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Hi, I'm Akio, an engineer at an AI development startup. In my previous article, I introduced SAM 3. This time, I'll share the pitfalls I encountered when running SAM 3 on AMD hardware.&lt;/p&gt;

&lt;p&gt;We're constantly testing the latest AI models and hardware, and right now I have in my hands what can only be described as a monument to AMD engineering: the &lt;strong&gt;Ryzen AI Max+ 395&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhnhbtjx7xp2eoczn08sf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhnhbtjx7xp2eoczn08sf.png" alt=" " width="800" height="515"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.amd.com/ja/products/processors/laptop/ryzen/ai-300-series/amd-ryzen-ai-max-plus-395.html" rel="noopener noreferrer"&gt;AMD Official&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The specs on this machine are, frankly, insane. With high-bandwidth memory and a powerful iGPU, this device &lt;strong&gt;truly shines when running massive LLMs like OpenAI's gpt-oss-120b locally&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But that's not what I'm doing today.&lt;/p&gt;

&lt;p&gt;Today, it's Meta's latest image segmentation model: &lt;strong&gt;SAM 3 (Segment Anything Model 3)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3hq8fm3ncti4anu1e8p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3hq8fm3ncti4anu1e8p.png" alt=" " width="800" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ai.meta.com/sam3/" rel="noopener noreferrer"&gt;Meta Official&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;"Wait, SAM 3? Isn't that lightweight? If you want inference speed, wouldn't an NVIDIA dGPU be a better fit?"&lt;/p&gt;

&lt;p&gt;You're absolutely right. No argument there.&lt;/p&gt;

&lt;p&gt;Running SAM 3 on a Ryzen AI Max+ 395 is, in a sense, &lt;strong&gt;using a sledgehammer to crack a nut&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But you know what? I don't care. The reason is simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"I just wanted to run the hottest new model on AMD's latest hardware."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is a passion project, efficiency be damned. That said, the errors I encountered and the solutions I found should be universally valuable for AMD users. Consider this a definitive guide to conquering the rocBLAS error that virtually every Ryzen AI user will face.&lt;/p&gt;

&lt;h2&gt;The Despair: No Answers Anywhere on the Web&lt;/h2&gt;

&lt;p&gt;My setup: Windows 11, using AMD's AI stack &lt;strong&gt;ROCm (HIP SDK)&lt;/strong&gt; to run SAM 3 on PyTorch.&lt;/p&gt;

&lt;p&gt;Setup went smoothly. Time to run the inference script! ...And the moment I did, my terminal was flooded with merciless error logs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;rocBLAS error: TensileLibrary.dat not found
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ah yes, the classic AMD environment error. "TensileLibrary.dat not found." Translation: &lt;strong&gt;"I can't find the computation library for your GPU (gfx1151), so I can't do any calculations."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because the Ryzen AI Max+ 395 uses the latest architecture, the official libraries haven't fully caught up with the path configurations... a common story with newly released hardware.&lt;/p&gt;

&lt;h3&gt;The Standard "Environment Variable Spoofing" Trick Doesn't Work?&lt;/h3&gt;

&lt;p&gt;Normally in AMD circles, when you hit this error, you use a &lt;strong&gt;workaround: spoofing the environment variable&lt;/strong&gt;. Since gfx1151 is highly compatible with the Radeon RX 7000 series (gfx1100), you can trick the system into thinking "I'm actually gfx1100."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$env:HSA_OVERRIDE_GFX_VERSION = "11.0.0"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This should solve everything... or so I thought. &lt;strong&gt;But this time, it didn't work.&lt;/strong&gt; The error logs stubbornly insisted "I can't find the files for gfx1151" and kept looking in &lt;code&gt;site-packages\_rocm_sdk_libraries_gfx1151\bin&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;Searching the Depths of the Internet, Finding Nothing&lt;/h3&gt;

&lt;p&gt;Even after consulting various AI assistants for solutions, the final verdict was:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔴 Current Status: Local execution is technically impossible (as of December 2025)&lt;br&gt;
&lt;strong&gt;Local execution is technically impossible until AMD officially releases Tensile libraries for gfx1151.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;No way. There has to be a solution. I refused to give up.&lt;/p&gt;

&lt;p&gt;"rocBLAS error gfx1151," "Ryzen AI 300 PyTorch"... I searched Google with every keyword I could think of, dove deep into GitHub Issues and Reddit threads, but &lt;strong&gt;information was shockingly nonexistent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The Ryzen AI Max series (Strix Halo) is so new that apparently no one in the world had established a workaround for this error yet. Just as I was about to resign myself to using this as a dedicated LLM machine, I decided to go back to basics and dig through the library folders on my own PC.&lt;/p&gt;

&lt;h2&gt;The Solution: The Files Were "Hidden" All Along&lt;/h2&gt;

&lt;p&gt;If the web has no answers, look locally. The path indicated in the error logs indeed had no folder. However, when I thoroughly searched through &lt;code&gt;site-packages&lt;/code&gt;—the PyTorch (ROCm version) installation folder—I found an unfamiliar directory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;_rocm_sdk_libraries_custom&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"Custom...?" With a bad feeling, I opened it up and found something surprising.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdotdy35ta3ksdylq3jy3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdotdy35ta3ksdylq3jy3.png" alt=" " width="800" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;gfx1151's TensileLibrary.dat&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Screenshot: gfx1151-related files inside &lt;code&gt;_rocm_sdk_libraries_custom\bin\rocblas\library&lt;/code&gt;)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There it is! &lt;strong&gt;TensileLibrary_lazy_gfx1151.dat&lt;/strong&gt;!&lt;/p&gt;

&lt;p&gt;The RDNA 3.5 library files were included all along. But while PyTorch was &lt;strong&gt;looking for a folder named &lt;code&gt;_rocm_sdk_libraries_gfx1151&lt;/code&gt;&lt;/strong&gt;, the actual files were &lt;strong&gt;isolated deep within &lt;code&gt;_rocm_sdk_libraries_custom&lt;/code&gt;&lt;/strong&gt;. No wonder it couldn't find them.&lt;/p&gt;

&lt;p&gt;It makes sense why there was no information online. This wasn't a configuration error—it was a &lt;strong&gt;folder structure mismatch&lt;/strong&gt;, an extremely analog trap.&lt;/p&gt;

&lt;h2&gt;The Complete Fix: Folder Transplant Surgery&lt;/h2&gt;

&lt;p&gt;Once you know the cause, you just need to put the files where they belong. For AMD Ryzen AI users everywhere, here's the solution—possibly the first public documentation of this fix.&lt;/p&gt;

&lt;h3&gt;Step 1: Rescue the Files from Their Hiding Place&lt;/h3&gt;

&lt;p&gt;Open the following path in File Explorer (adjust the Python environment path for your setup):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;...\site-packages\_rocm_sdk_libraries_custom\bin\rocblas\library
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Copy &lt;strong&gt;all files&lt;/strong&gt; in this directory (&lt;code&gt;.dat&lt;/code&gt; files, &lt;code&gt;.hsaco&lt;/code&gt; files, etc.).&lt;/p&gt;

&lt;h3&gt;Step 2: Create the Correct Folder Structure&lt;/h3&gt;

&lt;p&gt;Go back to the &lt;code&gt;site-packages&lt;/code&gt; root and create a new folder hierarchy that matches what PyTorch expects:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a folder named &lt;code&gt;_rocm_sdk_libraries_gfx1151&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Inside it, create a folder named &lt;code&gt;bin&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;Step 3: Place Files and Rename&lt;/h3&gt;

&lt;p&gt;Paste all the files you copied in Step 1 into the &lt;code&gt;bin&lt;/code&gt; folder you just created.&lt;/p&gt;

&lt;p&gt;As an extra precaution, duplicate &lt;code&gt;TensileLibrary_lazy_gfx1151.dat&lt;/code&gt; and rename the copy to &lt;strong&gt;&lt;code&gt;TensileLibrary.dat&lt;/code&gt;&lt;/strong&gt;.&lt;/p&gt;
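&lt;p&gt;The three steps above can also be scripted. Here's a small sketch; the &lt;code&gt;transplant&lt;/code&gt; helper is my own illustration (folder names taken from this article), and &lt;code&gt;site_packages&lt;/code&gt; must point at your Python environment's &lt;code&gt;site-packages&lt;/code&gt;:&lt;/p&gt;

```python
# Automates the "folder transplant": copy the gfx1151 Tensile files out of
# _rocm_sdk_libraries_custom into the _rocm_sdk_libraries_gfx1151\bin folder
# that PyTorch actually searches, then duplicate the lazy .dat under the
# generic name as an extra precaution.
import shutil
from pathlib import Path

def transplant(site_packages: str) -> Path:
    src = Path(site_packages, "_rocm_sdk_libraries_custom", "bin", "rocblas", "library")
    dst = Path(site_packages, "_rocm_sdk_libraries_gfx1151", "bin")
    dst.mkdir(parents=True, exist_ok=True)        # Step 2: create the expected hierarchy
    for f in src.iterdir():                       # Steps 1 and 3: copy every file over
        if f.is_file():
            shutil.copy2(f, dst / f.name)
    lazy = dst / "TensileLibrary_lazy_gfx1151.dat"
    if lazy.exists():                             # rename a copy to the generic name
        shutil.copy2(lazy, dst / "TensileLibrary.dat")
    return dst
```

&lt;p&gt;Nothing is deleted from the original location, so the operation is easy to revert if AMD later ships a proper fix.&lt;/p&gt;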

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjgoyf4ekgi4wd25zpoou.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjgoyf4ekgi4wd25zpoou.png" alt=" " width="800" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Files successfully moved to the newly created folder&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;The Result: SAM 3 Running Blazingly Fast&lt;/h2&gt;

&lt;p&gt;After fixing the folders, I ran the script again, fingers crossed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96j4yi3ucbptnbdnv9vz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96j4yi3ucbptnbdnv9vz.png" alt=" " width="800" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Successful execution log (the "cuda" device label is expected: PyTorch's ROCm builds reuse the CUDA device API, so an AMD GPU still shows up as "cuda")&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It worked!&lt;/strong&gt; The error completely disappeared, and the integrated GPU was humming along running inference. VRAM usage: 7GB. Single image inference: about 8 seconds. Pretty lightweight performance, I'd say. Real-time video is out of the question, though. (Man, I really want a high-end NVIDIA GPU...)&lt;/p&gt;

&lt;p&gt;The Ryzen AI Max+ 395 is built for much heavier workloads, but there's something satisfying about watching it breeze through a lightweight model like SAM 3. Just confirming that "the latest image models can run on AMD hardware" is a win for today.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The lesson from this troubleshooting adventure: &lt;strong&gt;"Don't just read the error logs—examine the actual folder structure too."&lt;/strong&gt; Basic stuff, right?&lt;/p&gt;

&lt;p&gt;Powerful hardware like the Ryzen AI Max+ 395 is in a transitional period where the software ecosystem (especially Windows ROCm) hasn't caught up with hardware evolution. However, as this case shows, there are many situations where "the files exist, but the paths aren't configured correctly." Don't give up—dig through those directories and you might find the solution.&lt;/p&gt;

&lt;p&gt;To all AMD users struggling with this same error: give this "folder transplant surgery" a try. Here's to comfortable (and slightly overpowered) local AI adventures!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you have feedback on this article or requests for "truly heavy models" you'd like me to test on the Ryzen AI Max+ 395, drop a comment below!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next time, I'll be posting about combining SAM 3 with IoT cameras (ESP32-based), so stay tuned!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>SAM 3 Is Here: Meta's Latest Vision AI Can Now Understand Your Words</title>
      <dc:creator>ARmedia</dc:creator>
      <pubDate>Thu, 18 Dec 2025 07:43:27 +0000</pubDate>
      <link>https://forem.com/sns_ar/sam-3-is-here-metas-latest-vision-ai-can-now-understand-your-words-12ll</link>
      <guid>https://forem.com/sns_ar/sam-3-is-here-metas-latest-vision-ai-can-now-understand-your-words-12ll</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;I've been battling with AI (Claude) for 14 hours a day. Couldn't be happier.&lt;/p&gt;

&lt;p&gt;— Akio Shiki (@ar_akio) &lt;a href="https://twitter.com/ar_akio/status/1980185472078070040?ref_src=twsrc%5Etfw" rel="noopener noreferrer"&gt;October 20, 2025&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffogkw4y4r5g65561cgeq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffogkw4y4r5g65561cgeq.png" alt=" " width="800" height="581"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ai.meta.com/sam3/" rel="noopener noreferrer"&gt;Meta Official Site&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hi there! I'm Akio, an engineer at an AI development startup.&lt;/p&gt;

&lt;p&gt;In November 2025, Meta quietly dropped &lt;strong&gt;SAM 3 (Segment Anything Model 3)&lt;/strong&gt;. Have you had a chance to try it yet?&lt;/p&gt;

&lt;p&gt;"Wait, didn't SAM 2 just come out?" "What's new this time?"&lt;/p&gt;

&lt;p&gt;If you're asking these questions, you're not alone. But here's the thing—SAM 3 isn't just an incremental update or minor accuracy improvement. It represents a fundamental leap toward &lt;strong&gt;true multimodal segmentation&lt;/strong&gt;, making the old "click to segment" workflow feel like ancient history.&lt;/p&gt;

&lt;p&gt;In this article, I'll break down what makes SAM 3 so impressive based on the official GitHub and Hugging Face releases. And in the second half, I'll give you a sneak peek at &lt;strong&gt;our successful local implementation using the AMD Ryzen AI Max+ 395&lt;/strong&gt;—complete with screenshots from our dev environment.&lt;/p&gt;

&lt;h2&gt;A Quick Refresher: What Is SAM?&lt;/h2&gt;

&lt;p&gt;Let's briefly look back at the Segment Anything Model lineage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SAM 1 (2023):&lt;/strong&gt; Introduced as a zero-shot model that could segment any object in an image with just clicks or bounding boxes. It revolutionized segmentation tasks overnight.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SAM 2 (2024):&lt;/strong&gt; Extended capabilities to video, enabling object tracking across frames. This opened up new possibilities for video editing and analysis.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And now, we have &lt;strong&gt;SAM 3&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;What Makes SAM 3 Revolutionary: 3 Key Advancements&lt;/h2&gt;

&lt;p&gt;After diving into the official repository (&lt;a href="https://github.com/facebookresearch/sam3" rel="noopener noreferrer"&gt;facebookresearch/sam3&lt;/a&gt;) and demos, the direction of evolution is crystal clear.&lt;/p&gt;

&lt;h3&gt;1. Just Tell It What You Want: Open Vocabulary Segmentation&lt;/h3&gt;

&lt;p&gt;This is the headline feature—and as an engineer, it's what excites me most.&lt;/p&gt;

&lt;p&gt;Previous SAM versions required you to specify &lt;em&gt;where&lt;/em&gt; to segment (via clicks or bounding boxes). SAM 3 natively understands &lt;strong&gt;text prompts&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For example, given street footage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Type &lt;strong&gt;"red car"&lt;/strong&gt; and it detects and masks every red car in the frame.&lt;/li&gt;
&lt;li&gt;Say &lt;strong&gt;"yellow school bus"&lt;/strong&gt; and it instantly identifies and tracks it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means detection, segmentation, and tracking are now fully unified. You no longer need to tell the AI &lt;em&gt;where&lt;/em&gt; something is—it understands &lt;em&gt;what&lt;/em&gt; you're describing and connects that to the visual information automatically.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1n1oj480j7wxr1ch2g59.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1n1oj480j7wxr1ch2g59.png" width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Type "impala" and it segments only the impalas&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;2. A Unified Vision Foundation Across Images, Video, and 3D&lt;/h3&gt;

&lt;p&gt;SAM 3 completely breaks down the barrier between still images and video.&lt;/p&gt;

&lt;p&gt;Using a shared vision backbone, it performs object detection on individual frames while maintaining consistent tracking across the temporal axis.&lt;/p&gt;

&lt;p&gt;Even more exciting is the &lt;strong&gt;3D reconstruction&lt;/strong&gt; capability. Sometimes called "SAM 3D," this feature enables not just 2D segmentation but also estimation of an object's three-dimensional shape from images or video. This opens up real possibilities for XR (AR/VR) development and robotics applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmzivc8zag7nh0j67t5wq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmzivc8zag7nh0j67t5wq.png" width="800" height="251"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;SAM 3's architecture integrating image, video, and prompt processing&lt;/em&gt;&lt;br&gt;
&lt;em&gt;(Source: &lt;a href="https://github.com/facebookresearch/sam3?tab=readme-ov-file" rel="noopener noreferrer"&gt;Meta AI GitHub&lt;/a&gt;)&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;3. Impressive Efficiency and Optimization&lt;/h3&gt;

&lt;p&gt;More features usually mean heavier models, but SAM 3 bucks this trend with optimized inference efficiency. Benchmarks on Meta's "SA-Co Dataset" show it outperforming previous models in accuracy while being designed with edge device deployment in mind.&lt;/p&gt;

&lt;h2&gt;The Startup Perspective: Why SAM 3 Matters&lt;/h2&gt;

&lt;p&gt;For AI startups like ours, SAM 3 means &lt;strong&gt;dramatically faster development cycles&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Previously, detecting specific objects—say, a particular component in a factory or a specific crop variety—required collecting massive labeled datasets and fine-tuning dedicated models (like YOLO variants).&lt;/p&gt;

&lt;p&gt;With SAM 3's &lt;strong&gt;open vocabulary capability&lt;/strong&gt;, you can simply prompt "damaged component" or "ripe tomato" and get high-accuracy detection and segmentation with zero-shot inference—no additional training required.&lt;/p&gt;

&lt;p&gt;This has the potential to compress PoC timelines from months to &lt;strong&gt;days&lt;/strong&gt;. For startups where speed-to-proposal is everything, this is an incredibly powerful advantage.&lt;/p&gt;

&lt;h2&gt;Sneak Peek: Our Local Implementation&lt;/h2&gt;

&lt;p&gt;Now for the technical deep-dive. Our lab has already completed a local implementation of SAM 3.&lt;/p&gt;

&lt;p&gt;Running on-premises rather than through cloud APIs is crucial for security, latency, and cost considerations.&lt;/p&gt;

&lt;p&gt;Our hardware of choice: the beast that is the &lt;strong&gt;AMD Ryzen AI Max+ 395&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CPU:&lt;/strong&gt; 16-core Zen 5 (Strix Halo)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory:&lt;/strong&gt; 128GB LPDDR5x (8000MT/s)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compute:&lt;/strong&gt; Up to 126 TOPS (CPU + GPU + NPU combined)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Conventional wisdom says running massive models like SAM 3 requires expensive GPU servers like the H100. However, by leveraging the Ryzen AI's unified memory architecture with its generous memory capacity, we've &lt;strong&gt;successfully run SAM 3 smoothly in a local environment without sending any data to the cloud&lt;/strong&gt;.&lt;/p&gt;
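&lt;p&gt;For a back-of-the-envelope feel of why memory capacity matters, weight storage alone scales linearly with parameter count and precision (the 1-billion-parameter figure below is a placeholder for illustration, not SAM 3's actual size):&lt;/p&gt;

```python
def model_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Raw weight storage only -- activations and feature caches
    add more on top of this at inference time."""
    return n_params * bytes_per_param / 1024**3

# A hypothetical 1B-parameter vision model:
print(round(model_memory_gb(1e9, 4), 2))  # fp32 weights -> 3.73 GB
print(round(model_memory_gb(1e9, 2), 2))  # fp16/bf16 halves it -> 1.86 GB
```

&lt;p&gt;With 128GB of unified memory, the weights, the video feature buffers, and any co-located LLM all fit in one pool without PCIe transfers.&lt;/p&gt;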

&lt;p&gt;That said, the Ryzen AI Max+ 395's main strength is its unified memory architecture, which makes it ideal for &lt;strong&gt;memory-hungry workloads like running gpt-oss-120b locally at low cost&lt;/strong&gt;. For use cases like this one, which need a small memory footprint plus raw speed, an NVIDIA GPU (including consumer-grade options) is probably the better fit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvtea9l338f1griwvkbou.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvtea9l338f1griwvkbou.png" width="800" height="294"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Running SAM 3 locally on the Ryzen AI Max+ 395. Inference is impressively fast.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We're currently developing a system that integrates this setup with IoT devices (edge cameras) for real-time "segment by description" detection.&lt;/p&gt;
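&lt;p&gt;The skeleton of that pipeline looks roughly like this (the capture and segmentation calls are injected stubs; names like &lt;code&gt;segment_by_text&lt;/code&gt; are placeholders for whatever camera driver and inference wrapper you actually use):&lt;/p&gt;

```python
def run_edge_loop(capture_frame, segment_by_text, prompt, on_detect,
                  max_frames=3):
    """Skeleton of the edge pipeline: grab a frame, run a text-prompted
    segmentation call (injected, so camera and model are swappable),
    and fire a callback whenever anything matches the prompt."""
    for _ in range(max_frames):
        frame = capture_frame()
        masks = segment_by_text(frame, prompt)
        if masks:
            on_detect(frame, masks)
        # a sleep() here would pace the loop on a real device

# Stub wiring for illustration:
events = []
run_edge_loop(
    capture_frame=lambda: "frame",
    segment_by_text=lambda f, p: ["mask"] if p == "chilled bottle" else [],
    prompt="chilled bottle",
    on_detect=lambda f, m: events.append(m),
)
print(len(events))  # fires once per frame that matched
```

&lt;p&gt;Injecting the camera and model as plain callables keeps the loop unit-testable without any hardware attached.&lt;/p&gt;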

&lt;p&gt;The detailed implementation guide for this &lt;strong&gt;SAM 3 × Ryzen AI Max+ 395 × IoT&lt;/strong&gt; stack—including code and benchmark results—will be covered in an upcoming article. Follow along to stay updated!&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: Vision AI Enters the "Understanding" Phase
&lt;/h2&gt;

&lt;p&gt;SAM 3 isn't just a segmentation tool. It's a model that serves as "eyes" capable of understanding the world through language and perceiving it spatially.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/facebookresearch/sam3" rel="noopener noreferrer"&gt;facebookresearch/sam3&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hugging Face:&lt;/strong&gt; &lt;a href="https://huggingface.co/facebook/sam3" rel="noopener noreferrer"&gt;facebook/sam3&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Demo:&lt;/strong&gt; &lt;a href="https://aidemos.meta.com/segment-anything" rel="noopener noreferrer"&gt;Segment Anything Demo&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I encourage you to try the official demo yourself. Prepare to be impressed by the accuracy. AI is evolving faster than most of us realize.&lt;/p&gt;

&lt;p&gt;If you have thoughts, feedback, or specific requests like &lt;strong&gt;"I'd love to know more about X aspect of the Ryzen AI implementation!"&lt;/strong&gt;—drop a comment below. I'll take your input into account for the next article!&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;We're hiring!&lt;/strong&gt; We're looking for engineers who want to tackle real-world AI implementation with cutting-edge technology. Interested? Check out the link in my profile!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>deeplearning</category>
      <category>news</category>
    </item>
  </channel>
</rss>
