<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Matt Macosko</title>
    <description>The latest articles on Forem by Matt Macosko (@matt_macosko_f3829cfd86b8).</description>
    <link>https://forem.com/matt_macosko_f3829cfd86b8</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3881937%2Fd462eaf2-e4e1-452e-82b5-c8a66e8941d1.jpg</url>
      <title>Forem: Matt Macosko</title>
      <link>https://forem.com/matt_macosko_f3829cfd86b8</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/matt_macosko_f3829cfd86b8"/>
    <language>en</language>
    <item>
      <title>Free AI on a MacBook vs $100-a-Month Claude Code — Hexagon Shootout</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Thu, 23 Apr 2026 04:32:47 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/free-ai-on-a-macbook-vs-100-a-month-claude-code-hexagon-shootout-5h1o</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/free-ai-on-a-macbook-vs-100-a-month-claude-code-hexagon-shootout-5h1o</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=2KeTDDodE0A" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kv4avef0epmb8pv5jbv.jpg" alt="FREE AI on a MacBook vs Claude Cloud — Hexagon Shootout" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;▶ Watch the race on YouTube:&lt;/strong&gt; &lt;a href="https://www.youtube.com/watch?v=2KeTDDodE0A" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=2KeTDDodE0A&lt;/a&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;April 22, 2026.&lt;/strong&gt; Anthropic's Claude Code Max plan jumped to $100 a month. I ran a live three-way AI race on the exact same prompt — Gemma 31B local, Llama 70B local, and Claude cloud — on a single MacBook, to see how close a free local stack gets to the paid cloud. Two of three contestants finished with zero cloud calls.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you just want the video, it's here: &lt;a href="https://www.youtube.com/watch?v=2KeTDDodE0A" rel="noopener noreferrer"&gt;FREE AI on a MacBook vs Claude Cloud — Hexagon Shootout&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want the repo, it's here: &lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/claude-code-local&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Keep reading for the setup, the numbers, and the three things that surprised me.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup — same prompt, three contestants
&lt;/h2&gt;

&lt;p&gt;Hardware: M5 Max MacBook Pro, 128 GB unified memory, Apple Silicon.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gemma 31B&lt;/strong&gt; — local, Apple MLX, 4-bit quantized (Google's code-specialized model)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Llama 70B&lt;/strong&gt; — local, Apple MLX, 8-bit quantized (Meta's generalist)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude cloud&lt;/strong&gt; — the real Anthropic API, using Claude Code unchanged&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same prompt to every contestant:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Build a single HTML file with inline JavaScript that shows a ball bouncing inside a rotating hexagon. Include gravity and realistic bounce physics.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Simple enough that the answer should be a few kilobytes of code. Interesting enough that it exposes how well a model handles real math — collision detection against rotating geometry, energy conservation, boundary clamping. When models trip, they trip here.&lt;/p&gt;

&lt;p&gt;Every run was recorded end-to-end with a live stats panel: elapsed seconds, output bytes, tokens-per-second. No cherry-picking, no post-hoc edits to the physics code, no "here's what it SHOULD have said." What you see is what came out.&lt;/p&gt;

&lt;h2&gt;
  
  
  The results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Contestant&lt;/th&gt;
&lt;th&gt;Time to ship working HTML&lt;/th&gt;
&lt;th&gt;Tokens/sec&lt;/th&gt;
&lt;th&gt;Cloud calls&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude cloud&lt;/td&gt;
&lt;td&gt;22 s&lt;/td&gt;
&lt;td&gt;N/A (data center)&lt;/td&gt;
&lt;td&gt;yes (via API)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 31B local&lt;/td&gt;
&lt;td&gt;56 s&lt;/td&gt;
&lt;td&gt;~30&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;zero&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Llama 70B local&lt;/td&gt;
&lt;td&gt;2:17&lt;/td&gt;
&lt;td&gt;~11&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;zero&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Claude cloud finished first — it's a data center somewhere. Gemma 31B finished clean in under a minute with working physics. Llama 70B took the longest and produced the most verbose output, but also landed a working demo in the end.&lt;/p&gt;

&lt;p&gt;The headline isn't that one is "best." It's that two of the three ran with Wi-Fi that could have been off the entire time. That's the number that matters for anyone dealing with NDAs, PHI, client files, or just a flight without connectivity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three things that surprised me
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Bigger isn't better when "bigger" is a generalist
&lt;/h3&gt;

&lt;p&gt;I went in expecting Llama 70B to beat Gemma 31B on code quality; it has more than twice the parameter count. Instead, Gemma finished cleaner and faster on this specific task.&lt;/p&gt;

&lt;p&gt;Why: Gemma is a Google model fine-tuned heavily for coding and math. Llama 3.3 70B is Meta's generalist; it's excellent at conversation, reasoning, and creative writing, but it wasn't tuned to punch above its weight on HTML canvas physics.&lt;/p&gt;

&lt;p&gt;If you're picking a local model for coding, you're better off with a 30B that's code-tuned than a 70B that's general. Don't count parameters; read the model card.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Claude Code's harness chokes local models
&lt;/h3&gt;

&lt;p&gt;Claude Code (the CLI agent) sends a 29,000-token system prompt with 60 tool schemas in every request. That's tuned for the cloud — where a frontier model can happily chew through 30K tokens of context before even starting. On a local 70B, that prefill takes a minute or two before generation begins.&lt;/p&gt;

&lt;p&gt;When I bypassed Claude Code and hit the MLX server directly with just the prompt, Llama 70B's wall-clock time dropped from 7+ minutes to under 2.&lt;/p&gt;

&lt;p&gt;The tradeoff: without Claude Code's harness you lose the Write/Edit/Bash tool-use loop, so the model only generates; it can't act as an agent. For research, benchmarking, or any single-shot prompt, direct is way faster. For actual coding sessions, the overhead is real, but it's what buys you the agent loop.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Circle-approximation collision is the cheat code
&lt;/h3&gt;

&lt;p&gt;All three models eventually produced a bouncing ball. The ones that worked used &lt;strong&gt;circle-approximation collision&lt;/strong&gt; — treat the hexagon as a circle of its apothem radius for collision purposes, reflect velocity when the ball exceeds that radius, clamp the ball back to exactly inside. Five lines of math, reliable, hexagon can rotate as wildly as you want.&lt;/p&gt;
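&lt;p&gt;The approximation described above can be sketched in a few lines. This is an illustrative sketch, not any model's actual output; the hexagon's rotation stays purely visual because the collision boundary is the inscribed circle (apothem) and never changes:&lt;/p&gt;

```javascript
// Circle-approximation collision: treat the hexagon as a circle of its
// apothem radius (the inscribed circle) for collision purposes.
// Units are pixels and seconds; the hexagon is centered at the origin.
function step(ball, R, dt) {
  const apothem = R * Math.cos(Math.PI / 6); // inscribed-circle radius of a hexagon with vertex radius R
  ball.vy += 980 * dt;                       // gravity
  ball.x += ball.vx * dt;
  ball.y += ball.vy * dt;
  const d = Math.hypot(ball.x, ball.y);      // distance from the center
  if (d + ball.radius > apothem) {
    const nx = ball.x / d, ny = ball.y / d;  // outward surface normal
    const outward = ball.vx * nx + ball.vy * ny;
    if (outward > 0) {
      const e = 0.85;                        // restitution: lose some energy each bounce
      ball.vx -= (1 + e) * outward * nx;
      ball.vy -= (1 + e) * outward * ny;
    }
    // Clamp the ball back to exactly inside so it can never leak out.
    const allowed = apothem - ball.radius;
    ball.x = nx * allowed;
    ball.y = ny * allowed;
  }
}
```

&lt;p&gt;Hook &lt;code&gt;step&lt;/code&gt; into a &lt;code&gt;requestAnimationFrame&lt;/code&gt; loop and draw the rotating hexagon separately; the physics never needs to know the rotation angle, which is exactly why this version can't leak.&lt;/p&gt;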

&lt;p&gt;The ones that failed tried to do proper polygon-edge collision — compute the six edges of the rotating hexagon each frame, compute point-to-line distance for each, reflect off the appropriate edge. That's the "right" way, and it fails constantly because floating-point error lets the ball slip through edges during the rotation, and then the model doesn't know how to clamp it back.&lt;/p&gt;

&lt;p&gt;I wouldn't have predicted this. The "simple" approximation is strictly better for the demo because it can't leak. For anything more complex than one ball, the polygon approach is necessary — but for a benchmark, approximation wins.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who should care
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Developers&lt;/strong&gt; on laptops with 64+ GB of Apple Silicon unified memory: you can run this today; your hardware already supports it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anyone dealing with confidential work&lt;/strong&gt; — lawyers, accountants, doctors, contractors handling NDAs or PHI: the cost isn't $0 vs $100, it's "does your data leave the machine" vs "does it not."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frequent flyers and people who travel to places with bad internet&lt;/strong&gt;: a 70B model on a laptop keeps working when the plane's Wi-Fi is $18 and throttled.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anyone curious whether Apple's bet on unified memory was actually about AI&lt;/strong&gt;: it was.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to run it yourself
&lt;/h2&gt;

&lt;p&gt;The repo is MIT licensed and open source. Full setup is in the README:&lt;/p&gt;

&lt;p&gt;→ &lt;strong&gt;&lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/claude-code-local&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The project pairs a native-MLX Anthropic-API-compatible server with Claude Code. Point Claude Code at &lt;code&gt;localhost:4000&lt;/code&gt; and the official CLI talks to your local model as if it were the cloud API. Swap models with one env var. Ship code without the subscription.&lt;/p&gt;
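&lt;p&gt;The wiring is roughly this, assuming the MLX server from the repo is already running on port 4000 (check the repo's README for the exact model-selection variable; &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; is the override the Claude Code CLI reads for its API host):&lt;/p&gt;

```shell
# Point the official Claude Code CLI at the local MLX server instead of
# Anthropic's cloud API. Port 4000 is the server's default per the repo.
export ANTHROPIC_BASE_URL="http://localhost:4000"
echo "Claude Code will send requests to $ANTHROPIC_BASE_URL"
# then launch the CLI as usual:  claude
```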

&lt;p&gt;Around 2,000 stars in the first month. If it's useful, a star helps.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Claude cloud: $100/mo, 22 seconds to a working hexagon.&lt;/li&gt;
&lt;li&gt;Gemma 31B on my MacBook: $0, 56 seconds to a working hexagon.&lt;/li&gt;
&lt;li&gt;Llama 70B on my MacBook: $0, 2:17 to a working hexagon.&lt;/li&gt;
&lt;li&gt;Two of three ran with zero cloud calls.&lt;/li&gt;
&lt;li&gt;Free AI on Apple Silicon is real, now, for a huge slice of what people use cloud APIs for.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The receipts, in video form: &lt;a href="https://www.youtube.com/watch?v=2KeTDDodE0A" rel="noopener noreferrer"&gt;youtube.com/watch?v=2KeTDDodE0A&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://marijuanaunion.com" rel="noopener noreferrer"&gt;Marijuana Union&lt;/a&gt;. For premium vaporizers visit &lt;a href="https://ineedhemp.com" rel="noopener noreferrer"&gt;iNeedHemp&lt;/a&gt;, wholesale at &lt;a href="https://nicedreamzwholesale.com" rel="noopener noreferrer"&gt;Nice Dreamz&lt;/a&gt;, and seeds at &lt;a href="https://tribeseedbank.com" rel="noopener noreferrer"&gt;Tribe Seed Bank&lt;/a&gt;. Explore the 3D cannabis marketplace at &lt;a href="https://marijuanaunion.com/marketplace/" rel="noopener noreferrer"&gt;The Farmstand&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>localllama</category>
      <category>mlx</category>
      <category>applesilicon</category>
    </item>
    <item>
<title>The Era of Hunched-Over-A-Screen Computing Is Ending — Here's What's Replacing It</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Wed, 22 Apr 2026 09:54:00 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/the-era-of-hunched-over-a-screen-computing-is-ending-heres-whats-replacing-it-41go</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/the-era-of-hunched-over-a-screen-computing-is-ending-heres-whats-replacing-it-41go</guid>
      <description>&lt;p&gt;Look around any coffee shop, any office, any living room. Everyone is bent forward at the same angle, staring into a glowing rectangle, with one hand on a small slab and the other on a bigger slab. The whole posture is wrong. We know it’s wrong — that’s why ergonomic chairs are a $2 billion industry — but we keep doing it because the computers we built require it.&lt;/p&gt;

&lt;p&gt;I think we’re at the end of that era. Not because somebody invented a magic new screen. Because computing itself is finally able to leave the rectangle.&lt;/p&gt;

&lt;p&gt;I call what’s coming &lt;strong&gt;ambient computing&lt;/strong&gt;. The phrase isn’t new, but most uses of it are about smart speakers or watches — small devices that ask you to look at them too. That’s not what I mean. I mean a way of working with computers that doesn’t require you to face a screen at all. Where the machine listens, talks back, sees what you see, and the keyboard becomes optional rather than mandatory.&lt;/p&gt;

&lt;p&gt;The pieces of it are already shipping. They just haven’t been assembled.&lt;/p&gt;

&lt;h2&gt;
  
  
  What ambient computing actually looks like
&lt;/h2&gt;

&lt;p&gt;Sitting in a hot tub a few weeks ago, I sent a text from my phone: &lt;em&gt;“find me the best rated electric guitar at this price range, screenshot it, and text it back to me.”&lt;/em&gt; Two minutes later my phone buzzed with the screenshot. The Mac on my desk had searched, found, captured, and sent back, while I stayed in the tub.&lt;/p&gt;

&lt;p&gt;That’s an ambient-computing moment. No screen. No keyboard. The computer was a participant in what I was doing rather than the thing I had to stop and walk over to.&lt;/p&gt;

&lt;p&gt;The same week, I had a hands-free coding session — speaking into the room, hearing a cloned version of my own voice narrate what the AI was doing, course-correcting verbally. No mouse. No keyboard. No screen-watching. The work got done. The AI told me when it was done. I went on with my day.&lt;/p&gt;

&lt;p&gt;Both of these worked on &lt;strong&gt;hardware I already owned&lt;/strong&gt;. A MacBook Pro on the desk. An iPhone in my pocket. The pieces that turned them into an ambient system are open source and free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three pieces that already exist
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Local AI.&lt;/strong&gt; A current MacBook Pro can run a 70-billion-parameter language model entirely on the GPU side of its unified memory. That model is good enough to write code, draft documents, summarize content, and run multi-step tool-using workflows. It does this with no internet and no subscription. The model lives on the machine; the inference happens on the machine.&lt;/p&gt;

&lt;p&gt;The fact that this is true on consumer hardware is a recent development. It wasn’t true two years ago. And it’s the foundation of everything else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. On-device speech.&lt;/strong&gt; Apple’s &lt;code&gt;SFSpeechRecognizer&lt;/code&gt; — the same engine that powers the dictation feature in macOS — runs entirely on your Mac. You can wrap it in a continuous-listening daemon and have it transcribe everything you say into a target window, no cloud round-trip. Pair it with a local TTS engine running a cloned version of your own voice (the cloning runs on the Mac too) and you have full speech in, full speech out, neither end touching a network.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Phone-as-remote.&lt;/strong&gt; iMessage on a Mac can be driven by AppleScript. That means anything your Mac can do — search, code, browse, compose — can be triggered by a text from your phone. The phone becomes a remote for the more powerful machine, and the more powerful machine handles the heavy lift while you’re somewhere else.&lt;/p&gt;

&lt;p&gt;Stack those three together and you have a workflow where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can ask the Mac to do something while you’re nowhere near it.&lt;/li&gt;
&lt;li&gt;You can hold a spoken conversation with it without typing or looking.&lt;/li&gt;
&lt;li&gt;It can produce real work — code, documents, research, video — and deliver it back to wherever you are.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s ambient computing. Not Siri. Not Alexa. The full deal.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters now
&lt;/h2&gt;

&lt;p&gt;Two arguments. The boring one: the &lt;strong&gt;bodily cost&lt;/strong&gt; of screen-and-keyboard computing is real and accumulating. Carpal tunnel, posture damage, eye strain, the chair-and-desk economy that exists to patch over the damage we’re doing to ourselves. We’ve been pretending this is fine for thirty years. It’s not.&lt;/p&gt;

&lt;p&gt;The interesting one: ambient computing is what makes a different &lt;em&gt;relationship&lt;/em&gt; with the machine possible. When the computer is something you face for eight hours a day, it occupies a specific role in your life — interrupt-driven, attention-stealing, mostly adversarial to whatever else you wanted to be doing. When the computer is something you talk to in passing, hand things off to, and check back on later, it occupies a completely different role. It becomes a colleague rather than a chore.&lt;/p&gt;

&lt;p&gt;We’re not going to fully arrive there in 2026. But the building blocks are shipping in 2026, and the people who set them up now will look up in two years and realize their working life feels different.&lt;/p&gt;

&lt;h2&gt;
  
  
  The catch
&lt;/h2&gt;

&lt;p&gt;For now, all of this requires being &lt;strong&gt;on a Mac&lt;/strong&gt;. Specifically, an Apple Silicon Mac with enough unified memory to run a real model — practically, that means an M2 Max / M3 Max / M4 Pro / M5 Max with 32 GB minimum, 64 GB+ for the bigger models. That’s an expensive piece of hardware.&lt;/p&gt;

&lt;p&gt;But it’s a piece of hardware most professionals already own, or could justify. And it’s the only piece you need. There’s no recurring AI subscription. No hosting bill. No phone-home telemetry that compromises the whole privacy story.&lt;/p&gt;

&lt;p&gt;The gear that gets you into ambient computing is gear you might already have. You just haven’t connected the pieces yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I’m building toward
&lt;/h2&gt;

&lt;p&gt;The longer arc, for me, is robotics. Specifically a Lego-like modular system where you clip together small parts to build whatever the moment needs — a robot arm, a camera mount, a wheeled base — all driven by the same local AI vision system that runs everywhere else in the stack. That’s a few years out.&lt;/p&gt;

&lt;p&gt;In the meantime, I’m shipping the parts of the system that work today. The local-AI server is open source (&lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;claude-code-local&lt;/a&gt;). The voice loop is open source (&lt;a href="https://github.com/nicedreamzapp/NarrateClaude" rel="noopener noreferrer"&gt;NarrateClaude&lt;/a&gt;). The browser agent is open source (&lt;a href="https://github.com/nicedreamzapp/browser-agent" rel="noopener noreferrer"&gt;browser-agent&lt;/a&gt;). The phone bridge is open source. The iPhone object-detection app that’s part of the same vision is on the App Store (&lt;a href="https://apps.apple.com/us/app/realtime-ai-cam/id6751230739" rel="noopener noreferrer"&gt;RealTime AI Cam&lt;/a&gt;) for free.&lt;/p&gt;

&lt;p&gt;If any of this resonates — if you’ve been quietly tired of being chained to a screen, or you can feel the future being built but haven’t been able to put your finger on what it is — clone something, run it, and tell me what you find. Most of the work ahead is figuring out which pieces fit where, and that’s not work I can do alone.&lt;/p&gt;

&lt;p&gt;The era of hunched-over-a-screen is ending. The next era is being built in the open, on commodity hardware, by people who decided to stop waiting for someone else to do it.&lt;/p&gt;

&lt;p&gt;— Matt&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the &lt;a href="https://nicedreamzwholesale.com/software/" rel="noopener noreferrer"&gt;Nice Dreamz&lt;/a&gt; lineup. If you want this set up inside a firm or practice — private, on-device, no cloud — that’s &lt;a href="https://nicedreamzwholesale.com/airgap/" rel="noopener noreferrer"&gt;AirGap AI&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;





</description>
      <category>software</category>
      <category>ambientcomputingloca</category>
    </item>
    <item>
<title>What It's Actually Like to Code by Voice — With the AI Replying in My Own Cloned Voice</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Wed, 22 Apr 2026 09:53:21 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/what-its-actually-like-to-code-by-voice-with-the-ai-replying-in-my-own-cloned-voice-32mc</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/what-its-actually-like-to-code-by-voice-with-the-ai-replying-in-my-own-cloned-voice-32mc</guid>
      <description>&lt;p&gt;The closest analogy I can give for what this feels like is having a quiet co-worker in the room who happens to sound exactly like you. You think out loud. They respond out loud. You both work on the same code. Neither of you is touching a keyboard.&lt;/p&gt;

&lt;p&gt;It’s still a little uncanny. But it’s also the most natural way to work I’ve found in twenty-plus years of writing software.&lt;/p&gt;

&lt;p&gt;The setup runs entirely on my MacBook. Apple’s on-device speech recognition listens for me. A local language model thinks. A cloned-voice text-to-speech says the response back. Nothing leaves the laptop. Nothing requires a network. The whole loop is on-device, and that turns out to matter for reasons I didn’t expect.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it actually works
&lt;/h2&gt;

&lt;p&gt;A compiled Swift binary wraps Apple’s &lt;code&gt;SFSpeechRecognizer&lt;/code&gt; — the same engine that powers macOS dictation — in a continuous-listening daemon. It transcribes everything I say into the active terminal window where Claude Code is running. End-of-utterance is detected by a stability heuristic: if the recognized text stops changing for about 2.5 seconds, the recognizer treats that sentence as final and submits it.&lt;/p&gt;
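&lt;p&gt;The stability heuristic itself is language-agnostic. A minimal sketch (the real implementation is the Swift daemon described above; names here are illustrative): keep the latest partial transcript and the timestamp of its last change, and finalize once it has sat unchanged past the hold window:&lt;/p&gt;

```javascript
// End-of-utterance by stability: if the rolling transcript hasn't changed
// for holdMs milliseconds, treat it as final and submit it.
function makeStabilityDetector(holdMs, onFinal) {
  let lastText = "";
  let lastChange = 0;
  return {
    // Call on every partial-recognition update, with a timestamp in ms.
    update(text, now) {
      if (text !== lastText) {
        lastText = text;
        lastChange = now;
      }
    },
    // Call periodically (e.g. on a timer tick).
    tick(now) {
      if (lastText !== "") {
        if (now - lastChange > holdMs) {
          onFinal(lastText);  // submit the finalized utterance
          lastText = "";      // reset for the next utterance
        }
      }
    }
  };
}
```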

&lt;p&gt;That submission gets injected into Claude Code via AppleScript, addressed to a specific window by ID so it can’t leak into whatever else is open. Claude Code processes the request against a local language model running on MLX (Apple’s native ML framework). The response comes back as text in the terminal — and a separate launcher pipes that text into a TTS engine running a cloned version of my voice. The reply plays through the speakers. The listener auto-pauses while audio is playing, so the model’s spoken reply never gets picked up as a new prompt. Then it resumes listening for the next thing I say.&lt;/p&gt;

&lt;p&gt;End-to-end latency, on a current MacBook, is around two to three seconds. Fast enough to feel like a conversation. Slow enough that you notice it’s a different kind of pacing than typing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What surprises you the first hour
&lt;/h2&gt;

&lt;p&gt;The first thing that surprised me is how much I &lt;strong&gt;already&lt;/strong&gt; narrate while coding. The interior monologue — &lt;em&gt;“okay let me look at the test, that’s failing because the path is wrong, let me grep for the constant, oh it’s in a different file, fix that here…”&lt;/em&gt; — turns out to be most of how I work anyway. Speaking it out loud changed nothing about my reasoning. It just routed it to a different output channel.&lt;/p&gt;

&lt;p&gt;The second thing that surprised me is &lt;strong&gt;how much faster context-switching gets&lt;/strong&gt;. When you type, you have to break to compose. When you speak, you can just keep going. &lt;em&gt;“That’s done — now check the function signature in the parent class — yeah okay update the docstring to match — git status — looks good, commit it.”&lt;/em&gt; Five tasks, no pause, no posture change.&lt;/p&gt;

&lt;p&gt;The third surprise is the &lt;strong&gt;physical&lt;/strong&gt; difference. After half a day of voice-driven work I’m not stiff. My eyes aren’t tired. I haven’t held a clamshell wrist position for hours. There’s a real bodily cost to the way we normally use computers, and removing it feels like removing a weight you didn’t know was there.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why on-device matters specifically for this
&lt;/h2&gt;

&lt;p&gt;You can build a voice-driven coding setup with cloud APIs. Whisper for speech-in, ElevenLabs for speech-out, GPT or Claude for the brain. Many people do. The result is a tool that works great until your Wi-Fi gets weird, your API key hits a rate limit, your monthly bill arrives, or you realize you’ve just sent every word you said in front of your laptop today to three different vendors’ servers.&lt;/p&gt;

&lt;p&gt;The on-device version doesn’t have any of those failure modes. It works on a plane. It works in a Faraday cage. It works when the rest of the internet is on fire. The bill is a one-time hardware purchase, not a perpetual subscription. And nothing — no audio, no text, no inference request — ever crosses the network. For me that’s the difference between an interesting demo and a tool I actually use day-to-day.&lt;/p&gt;

&lt;h2&gt;
  
  
  The cloned voice is a real thing, not a gimmick
&lt;/h2&gt;

&lt;p&gt;The cloned voice is the part everyone reacts to first. When the AI reads its response in your own voice, your nervous system files it under “internal monologue” rather than “external announcement.” It’s a smoother experience than a stranger’s TTS voice and it doesn’t pull your attention the same way.&lt;/p&gt;

&lt;p&gt;But it works because the cloning is also on-device. The voice clone trains and runs locally — Pocket TTS in my case, but other local TTS engines slot in if you have a preference. Cloud voice services would mean my own voice (and everything I make it say) is sitting on someone’s server. Not interested.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where it falls short today
&lt;/h2&gt;

&lt;p&gt;Three real limitations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Domain-specific vocabulary.&lt;/strong&gt; Apple’s recognizer is excellent at general English, less excellent at obscure software terms, library names, and acronyms. &lt;em&gt;“Refactor the YOLOv8 inference loop”&lt;/em&gt; often comes through as &lt;em&gt;“refactor the yellow vate inference loop.”&lt;/em&gt; The fix is a custom vocabulary file you can register with the recognizer; that closes most of the gap but takes setup time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Background noise.&lt;/strong&gt; A quiet office is fine. A coffee shop is workable. A hot tub with the jets running, surprisingly, also fine. A room with kids and a dog is harder. The continuous-listen mode is robust but not magic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Long pauses.&lt;/strong&gt; If you stop talking to think for 20 seconds, the recognizer will sometimes finalize a partial sentence that wasn’t done yet, and you have to restart it. Workable but a real friction point I’m still iterating on.&lt;/p&gt;

&lt;p&gt;None of these are fundamental. All of them get better as the recognition models get better, which they’re doing every macOS release.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this unlocks for me personally
&lt;/h2&gt;

&lt;p&gt;Coding while pacing the room. Coding while cooking. Coding while in the hot tub. (Yes, really. The Mac is on a desk; my voice carries; I check back in by walking up to the screen when something needs visual confirmation.) Holding voice work sessions that last for hours without my body breaking down.&lt;/p&gt;

&lt;p&gt;It also lets me work in spaces that aren’t desks. Most of my best thinking happens away from a screen anyway — the keyboard part was always the bottleneck. Removing it doesn’t make me think differently. It makes the time I spend actually capturing the thinking more honest about how that thinking happens.&lt;/p&gt;

&lt;h2&gt;
  
  
  If you want to try it
&lt;/h2&gt;

&lt;p&gt;The local-AI server is at &lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/claude-code-local&lt;/a&gt;. The voice listener and dispatcher are at &lt;a href="https://github.com/nicedreamzapp/NarrateClaude" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/NarrateClaude&lt;/a&gt;. Both are MIT-licensed, both run entirely on a MacBook with Apple Silicon, and both ship with double-click launchers so the install is closer to “set up an app” than “build a system.”&lt;/p&gt;

&lt;p&gt;You will spend an evening getting it tuned to your voice and your vocabulary. After that, it just works. And the working life it produces is, in my experience, qualitatively different from screen-and-keyboard.&lt;/p&gt;

&lt;p&gt;— Matt&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the &lt;a href="https://nicedreamzwholesale.com/software/" rel="noopener noreferrer"&gt;Nice Dreamz&lt;/a&gt; lineup. If you’re a firm exploring private on-device AI, &lt;a href="https://nicedreamzwholesale.com/airgap/" rel="noopener noreferrer"&gt;AirGap AI&lt;/a&gt; is the engagement I do for setting this up inside law / medical / accounting practices.&lt;/em&gt;&lt;/p&gt;





</description>
      <category>software</category>
      <category>voicecodingwhisperco</category>
    </item>
    <item>
<title>A Field Guide to Ambient Computing — The Words for the Thing That's Coming</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Wed, 22 Apr 2026 09:53:15 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/a-field-guide-to-ambient-computing-the-words-for-the-thing-thats-coming-4ngp</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/a-field-guide-to-ambient-computing-the-words-for-the-thing-thats-coming-4ngp</guid>
      <description>&lt;p&gt;I’ve been building toward something I don’t have a clean name for yet. The pieces — local AI, on-device speech, the phone-as-remote, hands-free voice loops — exist and ship today. The pattern they form, when you put them together, doesn’t have a settled vocabulary. So I’m coining one. Not because I want to invent jargon, but because &lt;strong&gt;a thing without a name doesn’t exist for most people&lt;/strong&gt;, and this thing is going to exist for everyone within a few years.&lt;/p&gt;

&lt;p&gt;Here’s the working glossary. Use these terms. Argue with them. Replace them with better ones if you can think of any. The goal is to give shape to a way of working with computers that’s already real but not yet common.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Rectangle
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Noun.&lt;/strong&gt; The screen. Short for &lt;em&gt;the glowing rectangle of damage&lt;/em&gt;. The thing you’re hunched in front of right now. The thing that defines posture, attention, work pace, and back pain for most people who use computers professionally.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“I want to spend less of my day inside the Rectangle.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Rectangle is not bad. It’s just been the only option for so long that people forgot it was a choice rather than the default state of computing. Once you have a working alternative — and you do now — the Rectangle stops being a given. It becomes one of several places work can happen.&lt;/p&gt;




&lt;h2&gt;
  
  
  Off-Screen Work
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Noun.&lt;/strong&gt; Productive computing done without facing a screen. The opposite of screen work, not the absence of work. Hands-free voice coding is off-screen work. Texting your Mac from the hot tub and getting back a finished research summary is off-screen work. Listening to your AI narrate a long task in your own cloned voice while you walk around your house is off-screen work.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Half my morning was off-screen. I shipped more than I usually do at a desk.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Ambient Computing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Noun.&lt;/strong&gt; Computing that happens &lt;em&gt;around&lt;/em&gt; you instead of &lt;em&gt;in front of&lt;/em&gt; you. The machine listens, talks back, sees what you point a camera at, and the keyboard becomes optional rather than mandatory. Ambient computing isn’t smart speakers. Smart speakers ask you to talk to a brand. Ambient computing is your own machine, doing your own work, in your own voice, in the room with you.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“I’m building toward ambient computing — a stack you can talk to, hand things off to, and check back on later.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  To Airgap
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Transitive verb.&lt;/strong&gt; To configure an AI workflow such that it runs entirely on local hardware with no outbound network traffic — making client data, prompts, and responses physically incapable of leaving the machine.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“We airgapped the firm’s drafting workflow last week. Nothing they paste into the AI hits the internet.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the verb form of &lt;a href="https://nicedreamzwholesale.com/airgap/" rel="noopener noreferrer"&gt;AirGap AI&lt;/a&gt;, the consulting practice I run for firms in regulated industries. It’s also the right word for what you do when you set up your own local-AI stack on a MacBook and turn the wifi off to prove it works. Both of those count.&lt;/p&gt;




&lt;h2&gt;
  
  
  Hand-Off Computing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Noun.&lt;/strong&gt; A workflow where you give the computer a task, walk away, and it tells you when it’s done. Distinct from interactive computing, where you sit and wait, and from background computing, which you forget about until it crashes. In hand-off computing the machine knows you walked away, finishes the work, and notifies you back through whatever channel you set up — usually a text to your phone.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“I just hand off the data analysis and go make breakfast. It buzzes my phone when the report’s ready.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
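&lt;p&gt;A minimal Python sketch of the pattern, with the notify channel injected rather than hard-coded (in my setup it’s the iMessage bridge; any pager or push service slots in the same way):&lt;/p&gt;

```python
from typing import Callable

def hand_off(task: Callable[[], str], notify: Callable[[str], None]) -> str:
    """Run a long task to completion, then push a short summary through
    whatever notify channel the user wired up (text, push, pager)."""
    result = task()
    notify("Done: " + result[:120])  # keep the notification glanceable
    return result
```

&lt;p&gt;The point is the shape, not the plumbing: the machine owns the waiting, and the human only hears about the ending.&lt;/p&gt;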




&lt;h2&gt;
  
  
  The Two-Slab Posture
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Noun.&lt;/strong&gt; The body shape you assume when working with a laptop and a phone simultaneously — head tilted forward, shoulders rounded, both hands pulled in toward the body. The dominant posture of professional computing in 2026, and the source of much of the chronic pain that office workers attribute to “stress.” The Two-Slab Posture is what off-screen work makes optional.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“By 4pm every day I’m locked in the Two-Slab Posture and I can feel it in my neck.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Backpack Supercomputer
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Noun.&lt;/strong&gt; A current-generation laptop with enough on-device compute to run frontier-class AI models locally. Specifically: an Apple Silicon MacBook Pro with 64+ GB of unified memory, of the M2 / M3 / M4 / M5 Max generation. The phrase emphasizes that this hardware fits in a backpack while delivering performance that would have required a server rack five years ago.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“My M5 Max is a backpack supercomputer. I take it to client offices and it runs a 70-billion-parameter model on the train ride.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Ambient Stack
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Noun.&lt;/strong&gt; The minimum set of pieces required to assemble an ambient-computing setup. As of 2026, on Apple hardware:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A &lt;strong&gt;local LLM&lt;/strong&gt; running through an MLX-native server. (&lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;claude-code-local&lt;/a&gt;.)&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;continuous-listen daemon&lt;/strong&gt; wrapping &lt;code&gt;SFSpeechRecognizer&lt;/code&gt;. (&lt;a href="https://github.com/nicedreamzapp/NarrateClaude" rel="noopener noreferrer"&gt;NarrateClaude&lt;/a&gt;.)&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;cloned-voice TTS&lt;/strong&gt; for spoken responses in the user’s own voice.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;phone bridge&lt;/strong&gt; so iMessage can drive the Mac. (Custom AppleScript.)&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;local browser agent&lt;/strong&gt; for web tasks. (&lt;a href="https://github.com/nicedreamzapp/browser-agent" rel="noopener noreferrer"&gt;browser-agent&lt;/a&gt;.)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Five components. All open source. All run on hardware you may already own. The entire stack costs $0 in recurring fees once installed.&lt;/p&gt;
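&lt;p&gt;The phone bridge (item 4) can be sketched in a few lines of Python driving AppleScript; the recipient and exact script wording here are illustrative, not the repo’s code:&lt;/p&gt;

```python
import subprocess

def imessage_script(text: str, recipient: str) -> str:
    """Build the AppleScript that asks Messages.app to send `text`
    to `recipient` over iMessage."""
    safe = text.replace('"', '\\"')  # escape quotes for AppleScript
    return (
        'tell application "Messages" to send "' + safe + '" to buddy "'
        + recipient + '" of (service 1 whose service type is iMessage)'
    )

def send_imessage(text: str, recipient: str) -> None:
    """Fire the script via osascript. macOS only; Messages must be
    signed in and granted automation permission for the caller."""
    subprocess.run(["osascript", "-e", imessage_script(text, recipient)],
                   check=True)
```

&lt;p&gt;The reverse direction (the Mac reading inbound texts as commands) is the harder half, which is why it’s a component of the stack and not a one-liner.&lt;/p&gt;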




&lt;h2&gt;
  
  
  To Whisper-Code
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Verb.&lt;/strong&gt; To do programming work via voice, with the AI replying in the developer’s own cloned voice. Distinct from voice dictation (which still requires you to be at the screen to read the result). Whisper-coding is an end-to-end conversation about code, in audio, where the developer never has to look at the screen unless they choose to verify something visually.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“I whisper-coded the fix while pacing the kitchen. Saw the diff after lunch and it was right.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Cloned-Voice Loop
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Noun.&lt;/strong&gt; A feedback loop where the AI’s spoken responses are rendered in a TTS clone of the user’s own voice. This makes the response feel less like an external announcement and more like internal monologue, which the human nervous system processes more naturally and at lower cognitive load than a stranger’s voice. The loop runs on-device for the same privacy reasons as the rest of the ambient stack.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“After a week with the cloned-voice loop, hearing a stranger TTS feels jarring.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Hot-Tub Coding
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Noun.&lt;/strong&gt; Sending a coding or research task to your Mac from your phone while not being at the Mac. Originally literal — sending the task while in a hot tub — now a generic term for any phone-driven hand-off computing session. The hallmark of hot-tub coding is that the &lt;em&gt;human&lt;/em&gt; is doing something else entirely while the &lt;em&gt;computer&lt;/em&gt; is doing the work the human ordered.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“That whole feature was hot-tub coded. I never sat down to write any of it.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Local-First AI
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Adjective phrase.&lt;/strong&gt; A system architecture in which AI inference defaults to running on the user’s own device, with cloud as a fallback used only when local can’t handle the task — &lt;em&gt;not&lt;/em&gt; the other way around. The cultural and technical opposite of &lt;em&gt;cloud-first AI&lt;/em&gt;, which has been the default since 2019. Local-first AI is the architecture every privacy-sensitive industry is now going to need, whether they realize it yet or not.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“For NDA work, the only sane architecture is local-first AI.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  A Note On Ownership
&lt;/h2&gt;

&lt;p&gt;These words don’t belong to me. If you find them useful, use them. If you build on the stack and coin a better word for something I’ve named here, I’ll switch to using yours. The point of giving things names is to make them discussable, not to lock down a vocabulary.&lt;/p&gt;

&lt;p&gt;But I do want the &lt;strong&gt;pattern&lt;/strong&gt; they describe to take hold. We have, in 2026, the technology to fundamentally change how working with computers feels — physically, mentally, ergonomically, financially. Most people don’t know it yet because the pattern doesn’t have a name they recognize. These are the names I think will help.&lt;/p&gt;

&lt;p&gt;If any of this resonates: clone the &lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;open-source stack&lt;/a&gt;, assemble your own ambient setup, and tell me what you find. The next decade of computing isn’t being decided in any one company’s roadmap. It’s being decided by who shows up and starts using the parts that already exist.&lt;/p&gt;

&lt;p&gt;— Matt&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the &lt;a href="https://nicedreamzwholesale.com/software/" rel="noopener noreferrer"&gt;Nice Dreamz&lt;/a&gt; lineup. If you’re a firm that wants the ambient stack installed and air-gapped for compliance reasons, &lt;a href="https://nicedreamzwholesale.com/airgap/" rel="noopener noreferrer"&gt;AirGap AI&lt;/a&gt; is the engagement I do for that.&lt;/em&gt;&lt;/p&gt;





</description>
      <category>software</category>
      <category>ambientcomputingvoca</category>
    </item>
    <item>
      <title>Your Medical Practice Is Probably Using Cloud AI on PHI Right Now — Here’s the HIPAA Problem Nobody Is Talking About</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Wed, 22 Apr 2026 09:27:41 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/your-medical-practice-is-probably-using-cloud-ai-on-phi-right-now-heres-the-hipaa-problem-nobody-42nf</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/your-medical-practice-is-probably-using-cloud-ai-on-phi-right-now-heres-the-hipaa-problem-nobody-42nf</guid>
      <description>&lt;p&gt;Walk into any small medical practice today and ask the front-desk staff if they’ve ever pasted a chart note into ChatGPT to “rewrite this so the patient understands it” or to “summarize this lab result.” A lot of them will say yes. Some will say no but their browser history says otherwise. A few will look genuinely surprised that anyone’s asking.&lt;/p&gt;

&lt;p&gt;Here’s what they’re not thinking about: protected health information (PHI) under HIPAA includes more than the obvious identifiers. It includes anything that, in combination, could identify a patient — symptoms plus visit date, lab values plus condition, even free-text descriptions if specific enough. Once that text leaves the practice’s network and lands in a cloud AI service, the practice has technically engaged that AI vendor as a business associate, and a Business Associate Agreement (BAA) is required. The big AI providers offer BAAs only on enterprise tiers — usually $$$$ a month. Most practices using ChatGPT or Claude on patient data have no BAA in place at all.&lt;/p&gt;

&lt;p&gt;That’s a HIPAA breach waiting to be discovered. And it’s already everywhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is suddenly a real problem
&lt;/h2&gt;

&lt;p&gt;For years, the assumption was that nobody on staff would paste actual patient data into something like ChatGPT — it’d be obvious that PHI shouldn’t go to a third-party server. That assumption no longer holds. The tools are too useful. Practice managers, MAs, billers, NPs, even physicians are routinely pasting things in: prior-auth letters, patient instructions, summary letters to referring providers, insurance appeal templates, lab interpretations.&lt;/p&gt;

&lt;p&gt;Each of those sessions, on the major cloud AI providers, is potentially a HIPAA-reportable event. The practice doesn’t know it. The vendor doesn’t know who the patient is. But under the regulation, the disclosure happened the moment the text crossed the network boundary without a BAA in place.&lt;/p&gt;

&lt;p&gt;OCR enforcement actions in the past few years have been heavy on exactly this kind of “we didn’t realize we were using a third-party processor” finding. Penalties for unintentional disclosure under the &lt;strong&gt;HIPAA Privacy Rule&lt;/strong&gt; start at $137 per violation and can reach $68,928 per violation depending on culpability — and “violations” can be counted per record disclosed. A single staff member pasting 30 patient summaries into ChatGPT over a quarter is, by the regulation’s math, 30 violations.&lt;/p&gt;

&lt;p&gt;Most practices are not ready to be audited tomorrow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix that actually works: on-device AI
&lt;/h2&gt;

&lt;p&gt;The practice doesn’t have to give up AI. It just has to keep it on the practice’s own machines, where there’s no third-party processor relationship.&lt;/p&gt;

&lt;p&gt;Modern Apple Silicon Macs — the kind a lot of practices already have at the front desk or in clinician offices — can run open-weight language models locally. The model runs in the Mac’s unified memory using Apple’s MLX framework. Prompts and responses never touch a network connection. There’s no API key, no vendor account, no outbound traffic to log or audit.&lt;/p&gt;

&lt;p&gt;For HIPAA purposes, this changes the legal posture entirely. Software running on a covered entity’s own hardware, with no data transmission outside the entity’s secured environment, is &lt;strong&gt;not&lt;/strong&gt; a third-party disclosure. It’s the same legal category as a Word document — local software processing data the practice already has lawful access to.&lt;/p&gt;

&lt;p&gt;The HIPAA Security Rule still applies (the Mac itself needs to be physically secured, encrypted at rest, with access controls), but those are the same controls the practice already runs for its EMR workstation. No new vendor risk. No new BAA. No quarterly compliance review of the AI provider’s SOC 2 report.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it looks like inside a practice
&lt;/h2&gt;

&lt;p&gt;A typical small-practice install:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2-5 MacBooks (front desk, clinician workstations, billing). Most practices already have these.&lt;/li&gt;
&lt;li&gt;An open-source MLX server installed on each, running a 31B or 70B language model.&lt;/li&gt;
&lt;li&gt;A simple chat interface on the desktop. Looks and feels like ChatGPT. Behaves the same. Just doesn’t phone anywhere.&lt;/li&gt;
&lt;li&gt;A one-page &lt;strong&gt;HIPAA AI Use Policy&lt;/strong&gt; documenting that the practice’s AI tools run on-premises with no third-party data processors. This goes in the practice’s compliance binder.&lt;/li&gt;
&lt;li&gt;An hour of staff training on what tasks make sense for the local AI vs. what should still go through the EMR.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After install, the practice’s AI usage is HIPAA-clean. Nothing to add to the BAA log. Nothing to disclose to patients. Nothing to argue about in an audit.&lt;/p&gt;

&lt;h2&gt;
  
  
  The specific wins for a medical practice
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Patient instructions in plain language.&lt;/strong&gt; Convert “post-op care: keep wound site dry x 5 days, rotate dressing q12h, NSAIDs prn” into a paragraph the patient will actually read. Local model, no PHI exposure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prior auth letters.&lt;/strong&gt; Drafting these from chart notes is a huge time sink. Local AI can generate the first draft from the relevant note, with the chart never leaving the practice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Insurance appeals.&lt;/strong&gt; Same pattern. The AI sees the denial letter and the relevant clinical history; the practice’s data stays local.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Letters to referring providers.&lt;/strong&gt; Clean, professional, fast — without sending the patient’s chart to a cloud LLM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Patient education content&lt;/strong&gt; customized to the practice (not the same generic handouts every other clinic uses).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are dramatic. All of them are time savers worth tens of hours per month per provider. And every one of them is safe to do with on-device AI in a way that’s genuinely not safe to do with cloud AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the cost actually looks like
&lt;/h2&gt;

&lt;p&gt;A small practice using cloud AI properly (with a BAA-covered enterprise tier) is looking at $40-100 per user per month, plus the legal and compliance overhead of vetting the vendor and adding them to the BAA log. That’s $5,000-15,000+ per year in subscription cost for a 5-person practice, before any compliance staff time.&lt;/p&gt;

&lt;p&gt;A one-time on-device install for the same practice runs &lt;strong&gt;$8,000 to $15,000&lt;/strong&gt; in service fees (hardware not included, though most practices already own the Macs). After that: zero recurring AI subscription cost. The AI runs on hardware the practice already owns, indefinitely.&lt;/p&gt;

&lt;p&gt;The financial argument is real, but it’s secondary to the compliance argument. The compliance argument is: &lt;strong&gt;on-device AI is the only AI configuration that doesn’t create a HIPAA business-associate relationship.&lt;/strong&gt; That’s not a marginal advantage. That’s a categorical difference.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who should be looking at this now
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Solo and small group practices&lt;/strong&gt; doing primary care, behavioral health, dermatology, OB/GYN, mental health, dentistry — anywhere clinicians are tempted to use AI on chart text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Therapy and counseling practices&lt;/strong&gt; where session notes are particularly sensitive and where most cloud AI tools are an obvious non-starter.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concierge / direct-pay practices&lt;/strong&gt; where patients explicitly chose the practice for higher privacy expectations than chain medicine offers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Practices that already had a HIPAA scare&lt;/strong&gt; — a near-miss, an OCR letter, a malware incident — where the leadership now takes data flow questions seriously.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Any practice in California, New York, Massachusetts, or other states&lt;/strong&gt; with privacy laws that exceed HIPAA in scope.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the practice’s leadership doesn’t know exactly what AI tools the staff are currently using on patient text, that’s the answer to “should we look at this.” The fix is not to ban AI (it’ll go underground); it’s to give the practice an AI that doesn’t create a vendor-risk problem.&lt;/p&gt;




&lt;p&gt;I do on-device AI installations for small medical and therapy practices — fixed-fee, one week start to finish, including the HIPAA AI Use Policy and staff training. If your practice is quietly accumulating AI usage without a clear compliance posture, this is the cleanest fix on the market.&lt;/p&gt;

&lt;p&gt;More detail: &lt;a href="https://nicedreamzwholesale.com/airgap/" rel="noopener noreferrer"&gt;AirGap AI&lt;/a&gt; — book a 15-minute call from that page and I’ll walk through whether on-device is the right fit for your specific setup.&lt;/p&gt;

&lt;p&gt;— Matt Macosko, &lt;a href="https://nicedreamzwholesale.com" rel="noopener noreferrer"&gt;Nice Dreamz LLC&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The open-source software the install is built on is public at &lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/claude-code-local&lt;/a&gt; — you or your IT contractor can review exactly what runs on practice hardware.&lt;/em&gt;&lt;/p&gt;





</description>
      <category>software</category>
      <category>medicalpracticeshipa</category>
    </item>
    <item>
      <title>Three Generations of Running Claude Code Locally on a MacBook — What I Actually Learned</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Wed, 22 Apr 2026 07:43:25 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/three-generations-of-running-claude-code-locally-on-a-macbook-what-i-actually-learned-3mk</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/three-generations-of-running-claude-code-locally-on-a-macbook-what-i-actually-learned-3mk</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Timely update — April 22, 2026.&lt;/strong&gt; &lt;a href="https://www.xda-developers.com/anthropic-cut-claude-code-new-pro-subscriptions/" rel="noopener noreferrer"&gt;XDA Developers reports&lt;/a&gt; Anthropic is A/B-testing a Pro plan that doesn’t include Claude Code — affecting ~2% of new signups, with the pricing page updated to show Claude Code unchecked in the test variant. Current Pro users keep access for now, but the signal is clear: if you want Claude Code in the cloud, the cheapest path is moving toward the $100+ Max tier. What’s below is the free local version that runs exactly the same Claude Code CLI against a model on your own Mac, no subscription needed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I spent a weekend trying to get Claude Code running against a local model on my Mac. I ended up rewriting the whole setup three times before I had something that didn’t embarrass me on a real coding task. Here’s what went wrong and what finally worked.&lt;/p&gt;

&lt;p&gt;The project is open source — it’s at &lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/claude-code-local&lt;/a&gt; if you want to skip the story and just run it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why do this at all
&lt;/h2&gt;

&lt;p&gt;Claude Code is great, but every call you make sends your code to Anthropic’s cloud. For a lot of what I do — NDA work, client code, things that legally shouldn’t leave the room — that’s a non-starter. I don’t want to turn the AI off, I want to run it somewhere the data doesn’t travel. On a MacBook with 128 GB of unified memory, that’s not hypothetical anymore. You can fit a 70B+ parameter model in RAM and get real work done without a single packet leaving the box.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gen 1: Ollama + a translation proxy
&lt;/h2&gt;

&lt;p&gt;The obvious first move. Ollama speaks the OpenAI API. Claude Code speaks the Anthropic API. So you write a little Python proxy in the middle that translates one to the other.&lt;/p&gt;

&lt;p&gt;It worked. It was also painfully slow — &lt;strong&gt;30 tokens/second, and a real coding task took 133 seconds&lt;/strong&gt;. The proxy was doing two API translations per turn, and every tool call meant serializing and re-deserializing JSON twice. For anything more complex than “write a loop” the thing spent more time shuffling bytes than running inference.&lt;/p&gt;

&lt;p&gt;Honestly, Gen 1 was mostly useful for proving the idea was sound. Nothing more.&lt;/p&gt;
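&lt;p&gt;The translation step at the heart of Gen 1 looked roughly like this — a text-only subset with field names following the two public APIs; the real proxy also had to map tool calls and streaming, which is where most of the overhead lived:&lt;/p&gt;

```python
def anthropic_to_openai(body: dict) -> dict:
    """Translate an Anthropic /v1/messages request body into an OpenAI
    /v1/chat/completions body (text-only subset of both schemas)."""
    messages = []
    if body.get("system"):
        # Anthropic carries the system prompt as a top-level field;
        # OpenAI expects it as the first message.
        messages.append({"role": "system", "content": body["system"]})
    for msg in body.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):  # flatten Anthropic content blocks
            content = "".join(block.get("text", "") for block in content)
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": body.get("model", "local-model"),
        "messages": messages,
        "max_tokens": body.get("max_tokens", 1024),
        "stream": bool(body.get("stream", False)),
    }
```

&lt;p&gt;Every turn crossed this boundary twice (once per direction), which is where most of those 133 seconds went.&lt;/p&gt;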

&lt;h2&gt;
  
  
  Gen 2: llama.cpp + TurboQuant
&lt;/h2&gt;

&lt;p&gt;I tried to fix the speed problem by swapping Ollama for llama.cpp and adding Google Research’s TurboQuant KV cache compression. That got me to &lt;strong&gt;41 tok/s&lt;/strong&gt; — a real improvement on the model side. But Claude Code tasks &lt;em&gt;still&lt;/em&gt; took 133 seconds, because the proxy was the bottleneck, not the model.&lt;/p&gt;

&lt;p&gt;This was the lesson I needed: you can’t fix a translation-overhead problem by making the engine behind the translator faster.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gen 3: kill the proxy entirely
&lt;/h2&gt;

&lt;p&gt;This is where things clicked. Instead of making Ollama or llama.cpp speak Anthropic’s API through a middleman, I wrote a native MLX server that speaks the Anthropic API directly.&lt;/p&gt;

&lt;p&gt;No proxy. No translation layer. Claude Code connects to &lt;code&gt;localhost:4000&lt;/code&gt; thinking it’s talking to &lt;code&gt;api.anthropic.com&lt;/code&gt;, and the server routes straight into MLX (Apple’s native Metal-based ML framework) running the model on the GPU side of unified memory.&lt;/p&gt;

&lt;p&gt;Same Claude Code task: &lt;strong&gt;17.6 seconds&lt;/strong&gt;. That’s &lt;strong&gt;7.5× faster&lt;/strong&gt; than anything I had before, and it came from deleting code, not adding it.&lt;/p&gt;

&lt;p&gt;Throughput with Qwen 3.5 122B (a mixture-of-experts model where only 10B of the 122B params activate per token) hits &lt;strong&gt;65 tok/s on an M5 Max&lt;/strong&gt;. That’s faster than cloud Opus and, depending on how you measure, close to cloud Sonnet. At zero dollars a month and with nothing leaving the machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I didn’t expect
&lt;/h2&gt;

&lt;p&gt;Three things surprised me:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Tool-call recovery mattered more than raw speed.&lt;/strong&gt; Small local models sometimes emit garbled tool calls — XML syntax mixed with JSON keys, that sort of thing. Claude Code just silently retries when it can’t parse a call, and without a recovery layer you get infinite loops of “let me try that for you” that never actually run anything. Writing a parser that catches the common garbles and re-infers tool names from parameter keys turned out to be the difference between a demo and something I’d actually use.&lt;/p&gt;
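&lt;p&gt;The recovery idea in miniature — the key-to-tool hints below are illustrative, not the repo’s full table:&lt;/p&gt;

```python
import json
import re

# Hypothetical map from a distinctive parameter key to the Claude Code
# tool that uses it; most specific keys checked first.
PARAM_HINTS = [
    ("old_string", "Edit"),
    ("content", "Write"),
    ("command", "Bash"),
    ("file_path", "Read"),
]

def recover_tool_call(raw: str):
    """Salvage a garbled tool call: pull the first JSON object out of
    mixed XML/JSON text, then re-infer the tool name from its parameter
    keys. Returns None when nothing recoverable is found."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return None
    try:
        params = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    for key, tool in PARAM_HINTS:
        if key in params:
            return {"name": tool, "input": params}
    return None
```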

&lt;p&gt;&lt;strong&gt;2. The 10K-token Claude Code harness prompt is too big for local models.&lt;/strong&gt; Claude Code sends a giant system prompt with every request that’s tuned for cloud Claude. Local models see it and often respond with “I am not able to execute this task.” I added an auto-detection that recognizes a Claude Code coding session (presence of Bash/Read/Edit/Write tools) and swaps in a 100-token version tuned for local models. Prompt token count drops by 99% and the refusal problem goes away.&lt;/p&gt;
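&lt;p&gt;The detection itself is simple set logic; the short prompt below is a stand-in, since the repo’s version is tuned per model:&lt;/p&gt;

```python
# Tool names whose joint presence marks a Claude Code coding session.
CODING_TOOLS = {"Bash", "Read", "Edit", "Write"}

# Hypothetical short replacement for the ~10K-token harness prompt.
LOCAL_SYSTEM_PROMPT = (
    "You are a coding agent. Use the provided tools to read, edit, and "
    "run code. Keep making tool calls until the task is done, then "
    "summarize what changed."
)

def maybe_swap_system(request: dict) -> dict:
    """If the request carries Claude Code's coding tools, replace the
    huge cloud-tuned system prompt with a short local-friendly one."""
    tool_names = {tool.get("name") for tool in request.get("tools", [])}
    if CODING_TOOLS.issubset(tool_names):
        request = dict(request)  # shallow copy; leave caller's dict alone
        request["system"] = LOCAL_SYSTEM_PROMPT
    return request
```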

&lt;p&gt;&lt;strong&gt;3. KV cache quantization bits matter.&lt;/strong&gt; 4-bit KV cache saves memory, but small models lose coherence on long conversations. Bumping to 8-bit (starting at token 1024) fixed the “wait, what were we doing?” drift without a meaningful memory hit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I ended up
&lt;/h2&gt;

&lt;p&gt;The repo ships three model options now — Gemma 4 31B (fast, fits a 64 GB Mac), Llama 3.3 70B (slowest but smartest, full 8-bit precision), and Qwen 3.5 122B (fastest throughput, MoE sparsity). Same server, same API, one env var swaps the model.&lt;/p&gt;

&lt;p&gt;It also has a browser agent that drives Brave via Chrome DevTools (so local AI can actually do research, not just write code), an iMessage pipeline for when I’m away from the Mac, and — the part I’m proudest of — a hands-free voice loop where Apple’s on-device &lt;code&gt;SFSpeechRecognizer&lt;/code&gt; listens and a cloned-voice TTS replies. Both halves of that loop run locally. Your voice never leaves the laptop.&lt;/p&gt;

&lt;h2&gt;
  
  
  If you want to try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/nicedreamzapp/claude-code-local
&lt;span class="nb"&gt;cd &lt;/span&gt;claude-code-local
bash setup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;setup.sh&lt;/code&gt; auto-detects your RAM, picks an appropriate model, downloads it (one-time, 18-75 GB), installs the MLX server, and drops a launcher on your Desktop. Double-click and you’re coding locally.&lt;/p&gt;

&lt;p&gt;The full source is MIT-licensed. It’s ~1000 lines of Python plus a few shell launchers. No dependencies I couldn’t audit from scratch, and zero outbound network calls from any of it (Claude Code’s own binary makes one non-blocking startup handshake to Anthropic that you can firewall off with no loss of function — documented in the README).&lt;/p&gt;

&lt;p&gt;From where I sit, the interesting piece isn’t the speed numbers. It’s that “your code never leaves the machine” stopped being aspirational. If you’re a lawyer, an accountant, a doctor, or anyone whose work comes with confidentiality obligations, this is the version of AI coding you can actually use.&lt;/p&gt;

&lt;h2&gt;
  
  
  On-device isn’t just a MacBook thing
&lt;/h2&gt;

&lt;p&gt;This project is one of two I’ve shipped in the on-device-AI space. The other one is &lt;strong&gt;&lt;a href="https://apps.apple.com/us/app/realtime-ai-cam/id6751230739" rel="noopener noreferrer"&gt;RealTime AI Camera&lt;/a&gt;&lt;/strong&gt; — a free iPhone app that detects &lt;strong&gt;all 601 object classes from Open Images V7&lt;/strong&gt; fully offline, at an average of 10 FPS. Every other iPhone detection app I’ve seen caps out at the standard 80 COCO classes because the bigger model is much harder to get running on-device. I spent weeks on the PyTorch → CoreML conversion, hallucination tuning across the extra 521 classes, and the memory-bandwidth bottleneck in the camera pipeline — &lt;a href="https://github.com/nicedreamzapp/RealTimeAICam" rel="noopener noreferrer"&gt;wrote up the whole build here&lt;/a&gt; if that’s your lane.&lt;/p&gt;

&lt;p&gt;Different hardware, different model size, same philosophy: your data doesn’t have to leave the device for the AI to be useful. claude-code-local is the MacBook version of that idea. RealTime AI Camera is the iPhone version.&lt;/p&gt;

&lt;p&gt;— Matt&lt;/p&gt;




&lt;p&gt;&lt;em&gt;More of my open-source lineup: &lt;a href="https://nicedreamzwholesale.com/software/" rel="noopener noreferrer"&gt;nicedreamzwholesale.com/software&lt;/a&gt;. If you want this set up inside a firm or practice, that’s &lt;a href="https://nicedreamzwholesale.com/airgap/" rel="noopener noreferrer"&gt;AirGap AI&lt;/a&gt; — book a 15-min call there.&lt;/em&gt;&lt;/p&gt;





</description>
      <category>software</category>
      <category>claudecodelocalaimlx</category>
    </item>
    <item>
      <title>This Is What a Robot Can See Now — 601 Objects, Live, Offline, on Your iPhone</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Wed, 22 Apr 2026 07:43:19 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/this-is-what-a-robot-can-see-now-601-objects-live-offline-on-your-iphone-gm0</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/this-is-what-a-robot-can-see-now-601-objects-live-offline-on-your-iphone-gm0</guid>
      <description>&lt;p&gt;Hold up a banana. Your phone says “banana.” Hold up a ukulele. It says “ukulele.” A stapler, a french horn, a goose, a CT scanner, a waffle iron — it names them all, live, at 10 frames per second, with the internet turned off.&lt;/p&gt;

&lt;p&gt;That’s the part I keep coming back to. We’ve quietly crossed a line where a machine running on the 6-ounce thing in your pocket can recognize &lt;strong&gt;601 different objects in the world around it&lt;/strong&gt; without phoning anywhere. No cloud. No account. No waiting. That’s an extraordinary amount of sight to hand to a piece of consumer hardware, and it’s available right now, for free.&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;&lt;a href="https://apps.apple.com/us/app/realtime-ai-cam/id6751230739" rel="noopener noreferrer"&gt;RealTime AI Camera&lt;/a&gt;&lt;/strong&gt; to show people what that actually feels like. The app is on the App Store, it’s free, and the source is on &lt;a href="https://github.com/nicedreamzapp/RealTimeAICam" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. Point it at your kitchen and watch it label everything in real time. Most people don’t realize how far the on-device models have come until they see it happening in their own hand.&lt;/p&gt;

&lt;p&gt;Here’s why it’s a bigger deal than it sounds — and what it took to actually build it.&lt;/p&gt;

&lt;h2&gt;
  
  
  80 things vs 601 things
&lt;/h2&gt;

&lt;p&gt;Pretty much every iPhone object detection app you’ve ever used recognizes the same 80 things. That’s the default COCO class set that ships with YOLO — person, car, dog, cup, laptop, banana, the obvious ones. For years that’s been the practical ceiling on-device. If you wanted your app to detect anything beyond that, you sent the frame to a server, which meant goodbye offline, goodbye privacy.&lt;/p&gt;

&lt;p&gt;RealTime AI Camera uses the much bigger &lt;strong&gt;Open Images V7&lt;/strong&gt; class set: the original 80 plus 521 more. Musical instruments by type. Animals you’ve never heard of. Kitchen appliances, tools, clothing specifics, scientific instruments, transportation, furniture subtypes. A model that understands the shape of “french horn” distinct from “tuba,” “goose” distinct from “duck,” “stapler” distinct from “hole punch” — all running on your phone, all offline.&lt;/p&gt;

&lt;p&gt;The fact that you can walk around a house and have a device narrate the world back to you at 10 FPS without a network connection is a genuinely new thing. I keep meeting people who assume any “AI” on their phone must be sending data somewhere. Watching them flip airplane mode on and watch the app keep working is sometimes the first moment they actually believe on-device AI is real.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the app actually looks like
&lt;/h2&gt;

&lt;p&gt;Screenshots from the shipping app on iPhone — object detection, OCR, offline translation, LiDAR depth overlays.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://raw.githubusercontent.com/nicedreamzapp/nicedreamzapp/main/HomeSCreen1.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1wkoxwra9mqjuqeqnifq.png" alt="RealTime AI Camera home screen" width="800" height="1734"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://raw.githubusercontent.com/nicedreamzapp/nicedreamzapp/main/IMG_2169.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flq7krjhbxjucisc030xl.png" alt="RealTime AI Camera detecting objects live" width="800" height="1734"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://raw.githubusercontent.com/nicedreamzapp/nicedreamzapp/main/IMG_2208.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7473862lvg8vegz5y0dj.png" alt="RealTime AI Camera bounding boxes over 601 classes" width="800" height="1734"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://raw.githubusercontent.com/nicedreamzapp/nicedreamzapp/main/IMG_2224.jpeg" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffazf0kn72t42rymocns9.jpeg" alt="RealTime AI Camera LiDAR depth overlay" width="800" height="1734"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://raw.githubusercontent.com/nicedreamzapp/nicedreamzapp/main/IMG_2247.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzg3uzfdliekhshbuc8sc.png" alt="RealTime AI Camera OCR or translation mode" width="800" height="1734"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why 80 → 601 isn’t just “more classes”
&lt;/h2&gt;

&lt;p&gt;Going from 80 classes to 601 is not a linear problem. The model has to learn all of it, and then it has to fit on an iPhone, and then it has to run fast enough to feel real-time.&lt;/p&gt;

&lt;p&gt;Every one of those requirements individually is solvable. Stacking them together is where it gets mean.&lt;/p&gt;

&lt;h2&gt;
  
  
  The PyTorch → CoreML conversion
&lt;/h2&gt;

&lt;p&gt;The weights started life in PyTorch. Apple doesn’t run PyTorch natively — everything on-device goes through CoreML, which means a conversion step. For an 80-class YOLO, that conversion is mostly automatic. For a 601-class YOLOv8 with Open Images V7 weights, I spent days on it. Some ops translate cleanly, some fall through into “custom layer” territory, some silently produce a model that runs but outputs garbage. The first few versions I converted ran at full FPS and detected nothing accurately — the tensors were all being reshaped wrong at inference time.&lt;/p&gt;
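
&lt;p&gt;The failure mode is easy to reproduce in miniature. Here is a toy sketch of the decode split, with the tensor layout assumed from a standard YOLOv8 export (one anchor column of a (1, 4 + 601, N) output); this is illustrative, not the app’s shipping code:&lt;/p&gt;

```python
NUM_CLASSES = 601   # Open Images V7
BOX_CHANNELS = 4    # cx, cy, w, h in the raw YOLOv8 head output

def decode_anchor(row):
    # One anchor column of the exported model's output. The broken
    # conversions got this split wrong, so class scores were read out of
    # box-coordinate positions: the model "ran" but output garbage.
    assert len(row) == BOX_CHANNELS + NUM_CLASSES
    box = row[:BOX_CHANNELS]
    scores = row[BOX_CHANNELS:]
    best = max(range(NUM_CLASSES), key=scores.__getitem__)
    return box, best, scores[best]
```

&lt;p&gt;Getting that offset wrong by even one channel still produces a model that runs at full speed, which is exactly why the bug is silent.&lt;/p&gt;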

&lt;p&gt;The published model is on HuggingFace: &lt;a href="https://huggingface.co/divinetribe/yolov8n-oiv7-coreml" rel="noopener noreferrer"&gt;&lt;code&gt;divinetribe/yolov8n-oiv7-coreml&lt;/code&gt;&lt;/a&gt;. If you want to skip the conversion pain and just use the model, that’s the shortcut I wish I’d had.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hallucination at scale
&lt;/h2&gt;

&lt;p&gt;When your class count is 7.5× bigger, you don’t just get 7.5× more possible correct detections — you get disproportionately more &lt;em&gt;wrong&lt;/em&gt; ones. The model starts seeing “musical instrument” in every shadow, “building” in every wall, “person” in every coat on a chair. At 80 classes the false-positive rate is manageable because the class boundaries are mostly clean. At 601, every low-confidence detection is a roll of the dice across a much bigger space of wrong answers.&lt;/p&gt;

&lt;p&gt;Fixing this took iteration on three knobs: confidence threshold, NMS (non-max suppression) threshold, and per-class filtering. Some classes in Open Images V7 are just noisy — the ground truth in the training set was fuzzy to begin with. For the app, I tuned the thresholds conservatively enough that the user sees confident detections and not a constant light show of wrong guesses. That conservatism costs some recall, but the experience is night and day better than “technically detecting 601 things.”&lt;/p&gt;
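
&lt;p&gt;The three knobs compose roughly like this (a simplified sketch; the thresholds and class names are illustrative, and the shipping app’s filtering is more involved):&lt;/p&gt;

```python
def iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def filter_detections(dets, conf_thr=0.45, iou_thr=0.5, per_class_thr=None):
    # dets: list of (box, score, class_name), any order.
    # per_class_thr raises the bar for classes with noisy ground truth.
    per_class_thr = per_class_thr or {}
    kept = []
    for box, score, cls in sorted(dets, key=lambda d: d[1], reverse=True):
        if score >= per_class_thr.get(cls, conf_thr):
            # greedy per-class NMS: keep only if no already-kept box of
            # the same class overlaps this one too much
            if all(iou_thr > iou(box, k[0]) for k in kept if k[2] == cls):
                kept.append((box, score, cls))
    return kept
```

&lt;p&gt;The per-class override is the important part: one global threshold either drowns the noisy classes in false positives or starves the clean ones.&lt;/p&gt;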

&lt;h2&gt;
  
  
  Screens across multiple iPhones
&lt;/h2&gt;

&lt;p&gt;This part I didn’t expect to be hard. SwiftUI is supposed to make responsive layout easy. In practice, running a live camera feed with overlaid bounding boxes + OCR text + LiDAR distance badges + a control strip, across iPhone SE through iPhone 17 Pro Max, is a real problem. Aspect ratios differ. The notch and Dynamic Island move. LiDAR is only on Pro models so the UI needs to gracefully degrade when it’s missing. Older iPhones can’t sustain 10 FPS on the bigger 601-class model so the app has to throttle.&lt;/p&gt;

&lt;p&gt;Most of the hours on the “done except for…” list at the end of the project were UI hours, not ML hours. Nobody tells you that until you’ve lived it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bottleneck
&lt;/h2&gt;

&lt;p&gt;The performance bottleneck on iPhone is not the Neural Engine — it’s memory bandwidth between the camera pipeline and the inference step. CoreML + Metal Performance Shaders + the Neural Engine are all fast. Moving pixel buffers from &lt;code&gt;AVCaptureSession&lt;/code&gt; into a format the model wants, at 30+ FPS without dropping frames, is where you lose time. I ended up doing a lot of zero-copy plumbing — reusing the &lt;code&gt;CVPixelBuffer&lt;/code&gt; that the camera hands you, avoiding intermediate &lt;code&gt;UIImage&lt;/code&gt; conversions, keeping everything on-GPU through the pipeline. Average 10 FPS on the 601-class model across the iPhone lineup came from that plumbing, not from model optimization.&lt;/p&gt;

&lt;p&gt;The shipped app has four features that all run in that same pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Object detection&lt;/strong&gt; — YOLOv8, 601 classes, Open Images V7&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;English OCR&lt;/strong&gt; — on-device printed text recognition (Vision framework)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spanish → English translation&lt;/strong&gt; — offline, rule-based + dictionary (no cloud translation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LiDAR distance&lt;/strong&gt; — per-object depth, on Pro models&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All of it on one device, no network, no account, no ads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters beyond the app
&lt;/h2&gt;

&lt;p&gt;On-device AI is having a moment. Everybody’s talking about it, but most shipped products still cheat by falling back to the cloud when the model isn’t up to the job. RealTime AI Camera is a small proof that you can actually ship a nontrivial AI experience that never leaves the device, for free, and have it work.&lt;/p&gt;

&lt;p&gt;This is the same thesis behind my other big open-source project, &lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;claude-code-local&lt;/a&gt; — running Anthropic’s Claude Code CLI against a local AI model on a MacBook, zero cloud calls, full coding experience. Different target (MacBook vs iPhone), different model sizes (31-122 billion params vs ~10 million), same philosophy: your data doesn’t need to leave the machine for the AI to be useful.&lt;/p&gt;

&lt;p&gt;If you’re building in this space and thinking about on-device-first, I’d love to hear what you’re running into. Open an issue on the &lt;a href="https://github.com/nicedreamzapp/RealTimeAICam" rel="noopener noreferrer"&gt;RealTimeAICam repo&lt;/a&gt;, or drop me a line.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Free on the App Store: &lt;a href="https://apps.apple.com/us/app/realtime-ai-cam/id6751230739" rel="noopener noreferrer"&gt;RealTime AI Cam&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Source on GitHub: &lt;a href="https://github.com/nicedreamzapp/RealTimeAICam" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/RealTimeAICam&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Model on HuggingFace: &lt;a href="https://huggingface.co/divinetribe/yolov8n-oiv7-coreml" rel="noopener noreferrer"&gt;&lt;code&gt;divinetribe/yolov8n-oiv7-coreml&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;— Matt&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the &lt;a href="https://nicedreamzwholesale.com/software/" rel="noopener noreferrer"&gt;Nice Dreamz Apps&lt;/a&gt; lineup — private, on-device AI tools. If you want this kind of thing set up inside a firm or practice, that’s &lt;a href="https://nicedreamzwholesale.com/airgap/" rel="noopener noreferrer"&gt;AirGap AI&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://marijuanaunion.com" rel="noopener noreferrer"&gt;Marijuana Union&lt;/a&gt;. For premium vaporizers visit &lt;a href="https://ineedhemp.com" rel="noopener noreferrer"&gt;iNeedHemp&lt;/a&gt;, wholesale at &lt;a href="https://nicedreamzwholesale.com" rel="noopener noreferrer"&gt;Nice Dreamz&lt;/a&gt;, and seeds at &lt;a href="https://tribeseedbank.com" rel="noopener noreferrer"&gt;Tribe Seed Bank&lt;/a&gt;. Explore the 3D cannabis marketplace at &lt;a href="https://marijuanaunion.com/marketplace/" rel="noopener noreferrer"&gt;The Farmstand&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>software</category>
      <category>realtimeaicamyolov8o</category>
    </item>
    <item>
      <title>Cloud AI Coding Costs Keep Climbing — How to Pay $0 and Still Use Claude Code</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Wed, 22 Apr 2026 07:42:41 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/cloud-ai-coding-costs-keep-climbing-how-to-pay-0-and-still-use-claude-code-4jf2</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/cloud-ai-coding-costs-keep-climbing-how-to-pay-0-and-still-use-claude-code-4jf2</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Update — April 22, 2026.&lt;/strong&gt; XDA Developers &lt;a href="https://www.xda-developers.com/anthropic-cut-claude-code-new-pro-subscriptions/" rel="noopener noreferrer"&gt;reported yesterday&lt;/a&gt; that Anthropic is A/B-testing a version of the Pro plan that doesn’t include Claude Code — affecting about 2% of new signups. Current Pro users keep it; the pricing page has been updated to show Claude Code unchecked for the test variant. If the test rolls out fully, the cheapest way to use Claude Code jumps from the $20 Pro tier to the $100+ Max tier. Either way, the direction is clear: the cost of cloud AI coding is going up, not down. What’s below is how to keep using Claude Code regardless.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Every few weeks there’s another headline about an AI company raising prices, tightening rate limits, or putting a favorite tool behind a higher tier. If you’re a developer who uses Claude Code or a similar agent every day, the math starts getting uncomfortable. What used to be a $20/month habit can quietly become $100/month — per seat — before you notice. Multiply by a small team and it’s a real line item.&lt;/p&gt;

&lt;p&gt;I got tired of watching that number climb, so I built a way around it. Claude Code still works great. You just run it against a local AI on your own Mac instead of the cloud.&lt;/p&gt;

&lt;p&gt;The project is open source and free: &lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/claude-code-local&lt;/a&gt;. Bash one script, double-click a launcher, and you’re coding against a local model that costs nothing to run beyond electricity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the cloud bill keeps growing
&lt;/h2&gt;

&lt;p&gt;Cloud AI pricing isn’t stable and shouldn’t be treated like it is. The models keep getting more capable, and each jump in capability gets repriced into a higher tier. Context windows expand and the per-token cost goes up to match. “Included in your plan” features get quietly moved up a tier. None of this is wrong — the cost to serve a frontier model is real — but the net effect on a developer’s monthly AI spend is a steady upward drift you can’t plan for.&lt;/p&gt;

&lt;p&gt;Meanwhile, the machine on your desk has gotten radically more capable at running those same kinds of models. An M-series MacBook with 32+ GB of RAM can now comfortably run a coding model at 15–65 tokens per second. That’s real, useful inference speed. Five years ago this wasn’t possible. Today it’s ordinary.&lt;/p&gt;

&lt;p&gt;The question stops being &lt;em&gt;“can my Mac run this?”&lt;/em&gt; and becomes &lt;em&gt;“why am I paying a subscription for something my Mac can do by itself?”&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How the local setup works
&lt;/h2&gt;

&lt;p&gt;Claude Code — Anthropic’s CLI coding agent — talks to &lt;code&gt;api.anthropic.com&lt;/code&gt; by default. It sends your prompts and code to the cloud, gets responses back, runs tool calls. Fast, reliable, costs money.&lt;/p&gt;

&lt;p&gt;The workaround is a tiny server that runs on your Mac and &lt;em&gt;pretends to be&lt;/em&gt; the Anthropic API. Claude Code thinks it’s talking to the cloud. It’s actually talking to &lt;code&gt;localhost:4000&lt;/code&gt;, which hands the request to an MLX-powered local model running on your Apple Silicon GPU. The response comes back formatted exactly like the real API would format it. Claude Code doesn’t know the difference.&lt;/p&gt;

&lt;p&gt;I went through three generations of this before it actually worked well:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gen 1&lt;/strong&gt; — Ollama + a translation proxy. 30 tok/s, 133 seconds per task. Worked, but slow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gen 2&lt;/strong&gt; — llama.cpp with KV cache compression + same proxy. 41 tok/s. Still 133 seconds per task because the proxy was the bottleneck, not the model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gen 3&lt;/strong&gt; — Killed the proxy entirely. Wrote a native MLX server that speaks Anthropic’s API directly. &lt;strong&gt;65 tok/s and 17.6 seconds per task.&lt;/strong&gt; 7.5× faster than anything I had before, and it came from deleting code rather than adding it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key insight: when you’re running everything locally, every layer of translation or RPC between Claude Code and the model is overhead that buys you nothing. Direct is always faster than proxied.&lt;/p&gt;
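
&lt;p&gt;Concretely, all the local server has to do is answer with JSON in the shape Claude Code expects. A minimal sketch of a non-streaming response body, with field names taken from Anthropic’s public Messages API (the model name and token counts here are placeholders, not what the real project emits):&lt;/p&gt;

```python
import uuid

def local_response(text, model="local-mlx-model"):
    # Minimal non-streaming response in the shape of Anthropic's
    # Messages API. Field names follow Anthropic's public docs; the
    # model name and token counts are placeholder values.
    return {
        "id": "msg_" + uuid.uuid4().hex,
        "type": "message",
        "role": "assistant",
        "model": model,
        "content": [{"type": "text", "text": text}],
        "stop_reason": "end_turn",
        "usage": {"input_tokens": 0, "output_tokens": 0},
    }
```

&lt;p&gt;Serve that from a handler bound to port 4000 and Claude Code treats it like the real thing.&lt;/p&gt;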

&lt;h2&gt;
  
  
  What it costs
&lt;/h2&gt;

&lt;p&gt;Setup — once, ~20 minutes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;git clone https://github.com/nicedreamzapp/claude-code-local&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cd claude-code-local &amp;amp;&amp;amp; bash setup.sh&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Double-click the launcher that lands on your Desktop.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Ongoing — $0/month. Electricity might add a dollar or two to your power bill if you’re running inference constantly. That’s it. No API key. No usage tier. No rate limits. No dashboard to check.&lt;/p&gt;

&lt;p&gt;Running Claude Code against a local model also means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Your code never leaves your Mac.&lt;/strong&gt; For NDA work, client files, or anything under compliance rules, this isn’t a “nice to have” — it’s the difference between being able to use AI and having to turn it off.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Works without internet.&lt;/strong&gt; Plane, train, coffee shop with bad wifi, client office with firewall restrictions — if the machine is running, the AI is running.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No surprise bill.&lt;/strong&gt; You won’t wake up to a $400 month because you left a loop running overnight.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The tradeoff
&lt;/h2&gt;

&lt;p&gt;To be fair: local models aren’t cloud Claude. Cloud Sonnet and Opus are still more capable on really hard reasoning tasks. For 80-90% of day-to-day coding — writing functions, refactoring, explaining code, running tool calls, doing multi-step edits — the local models are good enough that I’ve stopped reaching for cloud. For the remaining 10-20% of problems, cloud still wins.&lt;/p&gt;

&lt;p&gt;The honest recommendation: run local as your default, and keep a cloud subscription only if you’re reaching for it weekly. Most developers will find they rarely need to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the ceiling is going
&lt;/h2&gt;

&lt;p&gt;Local models keep getting better faster than cloud pricing falls. The 31B and 70B open-weight models available today are what people paid cloud premiums for 18 months ago. In another 18 months, what’s running on your Mac will be indistinguishable in capability from what the cloud was charging $200/month for this year.&lt;/p&gt;

&lt;p&gt;That’s the real story. Not “cloud is bad.” Cloud is fine. It’s that local has caught up, and the cost math has flipped. If you’re a developer who cares about the bill, or a firm whose files can’t legally leave the premises, or just someone who doesn’t want to be surprised by next quarter’s pricing page — this is the version that makes sense now.&lt;/p&gt;

&lt;p&gt;Try it: &lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/claude-code-local&lt;/a&gt;. MIT license. Zero dependencies you can’t audit. Run it, or don’t, but either way you’ll have one less subscription to worry about.&lt;/p&gt;

&lt;p&gt;— Matt&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the &lt;a href="https://nicedreamzwholesale.com/software/" rel="noopener noreferrer"&gt;Nice Dreamz&lt;/a&gt; lineup. If you’re a firm (law, accounting, medical, therapy) that can’t legally put client files through cloud AI, &lt;a href="https://nicedreamzwholesale.com/airgap/" rel="noopener noreferrer"&gt;AirGap AI&lt;/a&gt; is the private on-device setup I do for firms — book a 15-min call there.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://marijuanaunion.com" rel="noopener noreferrer"&gt;Marijuana Union&lt;/a&gt;. For premium vaporizers visit &lt;a href="https://ineedhemp.com" rel="noopener noreferrer"&gt;iNeedHemp&lt;/a&gt;, wholesale at &lt;a href="https://nicedreamzwholesale.com" rel="noopener noreferrer"&gt;Nice Dreamz&lt;/a&gt;, and seeds at &lt;a href="https://tribeseedbank.com" rel="noopener noreferrer"&gt;Tribe Seed Bank&lt;/a&gt;. Explore the 3D cannabis marketplace at &lt;a href="https://marijuanaunion.com/marketplace/" rel="noopener noreferrer"&gt;The Farmstand&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>software</category>
      <category>claudecodelocalaiant</category>
    </item>
    <item>
      <title>If Your Law Firm Is Using Cloud AI on Client Files, You Probably Have a Problem</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Wed, 22 Apr 2026 07:42:35 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/if-your-law-firm-is-using-cloud-ai-on-client-files-you-probably-have-a-problem-492f</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/if-your-law-firm-is-using-cloud-ai-on-client-files-you-probably-have-a-problem-492f</guid>
      <description>&lt;p&gt;Most lawyers I talk to are already quietly using AI — ChatGPT, Claude, Copilot — for drafting, research, and summarization. It’s useful. It saves hours. And in almost every case, the firm never formally decided whether it was allowed. A paralegal or an associate started, then a partner tried it, then it was everywhere, and nobody circled back to the ethics question.&lt;/p&gt;

&lt;p&gt;Here’s the uncomfortable part: under ABA Model Rule 1.6, the duty of confidentiality applies to &lt;strong&gt;everything&lt;/strong&gt; relating to a client’s representation, not just privileged communications. Anything you paste into a cloud AI service is data you’ve handed to a third-party processor. Some states have issued specific guidance warning that AI use requires the same diligence as any other vendor — data handling agreements, security audits, informed client consent.&lt;/p&gt;

&lt;p&gt;Most firms are doing zero of that. They’re just pasting.&lt;/p&gt;

&lt;h2&gt;
  
  
  What “on-device AI” actually means
&lt;/h2&gt;

&lt;p&gt;There is another way to run AI tools inside a firm — &lt;strong&gt;locally&lt;/strong&gt;, on the machines the firm already owns, with no part of a document ever touching a network.&lt;/p&gt;

&lt;p&gt;This isn’t hypothetical. Apple Silicon MacBooks from roughly 2022 onward can run open-weight coding and writing models at useful speeds, entirely inside their own unified memory. No API call. No cloud round-trip. No “your data may be used to improve the service” footnote.&lt;/p&gt;

&lt;p&gt;The technical term is &lt;strong&gt;air-gapped AI&lt;/strong&gt;. Same underlying capability — large-language-model drafting, document review, summarization — minus the third-party data processor relationship. Because nothing leaves the Mac, there’s nothing to disclose to the client, nothing to document under a vendor-risk framework, nothing that can be subpoenaed from a cloud provider’s server. The surface area for a breach is the machine itself, which you already secure as part of normal firm operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this wasn’t practical a year ago
&lt;/h2&gt;

&lt;p&gt;For years the “local AI” story was a lab-bench curiosity. You could technically run a language model on your laptop, but it was slow, it couldn’t follow complex instructions, and the output quality was visibly worse than what cloud tools produced. Clients would notice.&lt;/p&gt;

&lt;p&gt;That changed faster than most firms realize. The current generation of open-weight models running on Apple Silicon produces output that’s indistinguishable from cloud Claude or GPT for most legal-adjacent tasks — drafting correspondence, summarizing long documents, extracting facts from depositions, plain-language explanation of statutes, cite checking against uploaded documents. The models have caught up. The hardware has caught up. The thing that hasn’t caught up is firms’ awareness that the option exists.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup, in plain English
&lt;/h2&gt;

&lt;p&gt;A typical firm install looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A handful of MacBooks (or an existing one per attorney who does heavy drafting).&lt;/li&gt;
&lt;li&gt;An open-source MLX server that runs a 31- to 70-billion-parameter model on each machine.&lt;/li&gt;
&lt;li&gt;A CLI or chat interface that matches what the attorneys are already used to — Claude Code for tech-heavy work, a local chat app for drafting.&lt;/li&gt;
&lt;li&gt;A one-page firm policy that documents &lt;em&gt;“our AI runs on premises, does not use third-party processors, complies with [your state’s] duty of confidentiality guidance.”&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The technical install is half a day per machine. The policy paperwork and a lunch-and-learn for the attorneys take another day. After that it runs like any other piece of firm infrastructure — quietly in the background.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it costs vs what cloud AI costs
&lt;/h2&gt;

&lt;p&gt;A cloud AI subscription for a 10-attorney firm runs anywhere from $200 to $1,500 per month depending on tier and usage. Annual: $2,400 to $18,000. That’s ongoing forever, and each vendor price increase flows straight through to the firm’s overhead.&lt;/p&gt;

&lt;p&gt;A one-time install of on-device AI across the same 10 attorneys is a fixed engagement — typically $8,000 to $15,000 all-in, hardware aside (most firms already have the Macs). After that, &lt;strong&gt;zero recurring AI spend.&lt;/strong&gt; Year two onward the firm’s marginal AI cost is the electricity to run the MacBook.&lt;/p&gt;
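
&lt;p&gt;The break-even arithmetic is simple enough to sanity-check yourself (the figures below are illustrative picks from the ranges above, not a quote):&lt;/p&gt;

```python
def breakeven_months(install_cost, cloud_monthly_spend):
    # Months until a one-time local install costs less in total
    # than a recurring cloud subscription at the same usage.
    return install_cost / cloud_monthly_spend

# Illustrative 10-attorney firm: $12,000 install vs $1,000/month cloud
months = breakeven_months(12000, 1000)   # 12.0 months
```

&lt;p&gt;Past the break-even point, every additional month of cloud spend is pure savings on the local side.&lt;/p&gt;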

&lt;p&gt;The financial case is real, but it’s not actually the strongest argument. The strongest argument is: &lt;strong&gt;if a regulatory body ever asks how your firm handles client data in AI tools, “we run it on-device, no third-party processor” is the only answer that ends the conversation.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Who this is for
&lt;/h2&gt;

&lt;p&gt;This isn’t the right fit for every firm. If your practice doesn’t handle confidential client information, or you’ve already done the vendor-risk work and your clients have signed off on cloud AI, cloud is fine and probably cheaper to start.&lt;/p&gt;

&lt;p&gt;Where on-device is specifically the right move:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Solo and small firms&lt;/strong&gt; handling family law, trusts/estates, criminal defense, or any practice where client files carry heavy privacy expectations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Firms with specific-client confidentiality agreements&lt;/strong&gt; that explicitly prohibit third-party data processors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regulated practice areas&lt;/strong&gt; (healthcare-adjacent, financial-services-adjacent, government contracting) where the compliance overhead of vendor AI is genuinely disproportionate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Firms in California, New York, Florida, Texas, or any state&lt;/strong&gt; whose bar has issued specific AI guidance that firms haven’t actually complied with yet.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to think about the decision
&lt;/h2&gt;

&lt;p&gt;If you’re a partner reading this, the easiest diagnostic is: pull your associates and paralegals into a room and ask a yes/no question — &lt;em&gt;“In the last 30 days, have you pasted client work into ChatGPT, Claude, or similar?”&lt;/em&gt; If the answer is anything other than a clear no, you already have a reason to evaluate on-device AI, whether you knew it or not.&lt;/p&gt;

&lt;p&gt;The fix is not banning AI — banning it just moves the usage underground and makes it worse. The fix is giving attorneys a sanctioned AI that doesn’t create a vendor-risk problem. That’s what on-device is.&lt;/p&gt;




&lt;p&gt;I do firm installations of exactly this setup — fixed-fee, one week start to finish, including the policy paperwork and attorney training. If your firm is quietly accumulating AI usage without a formal stance, or you’ve been waiting for the tech to get good enough to deploy safely, it’s there now.&lt;/p&gt;

&lt;p&gt;More detail on the service: &lt;a href="https://nicedreamzwholesale.com/airgap/" rel="noopener noreferrer"&gt;AirGap AI&lt;/a&gt;. Book a 15-minute call from that page and I’ll tell you in plain terms whether on-device is the right fit for your specific practice.&lt;/p&gt;

&lt;p&gt;— Matt Macosko, &lt;a href="https://nicedreamzwholesale.com" rel="noopener noreferrer"&gt;Nice Dreamz LLC&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The open-source setup I use for installs is public at &lt;a href="https://github.com/nicedreamzapp/claude-code-local" rel="noopener noreferrer"&gt;github.com/nicedreamzapp/claude-code-local&lt;/a&gt; if you or your IT person want to review exactly what runs on firm hardware.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://marijuanaunion.com" rel="noopener noreferrer"&gt;Marijuana Union&lt;/a&gt;. For premium vaporizers visit &lt;a href="https://ineedhemp.com" rel="noopener noreferrer"&gt;iNeedHemp&lt;/a&gt;, wholesale at &lt;a href="https://nicedreamzwholesale.com" rel="noopener noreferrer"&gt;Nice Dreamz&lt;/a&gt;, and seeds at &lt;a href="https://tribeseedbank.com" rel="noopener noreferrer"&gt;Tribe Seed Bank&lt;/a&gt;. Explore the 3D cannabis marketplace at &lt;a href="https://marijuanaunion.com/marketplace/" rel="noopener noreferrer"&gt;The Farmstand&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>software</category>
      <category>lawfirmslegaltechond</category>
    </item>
    <item>
      <title>"It Comes Out Of The Gate Very Fast": Disclosure Day Is An Action Movie</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Mon, 20 Apr 2026 08:03:25 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/it-comes-out-of-the-gate-very-fast-disclosure-day-is-an-action-movie-ff2</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/it-comes-out-of-the-gate-very-fast-disclosure-day-is-an-action-movie-ff2</guid>
      <description>&lt;p&gt;The moment Universal released the December 2025 teaser — wide Kansas sky, a meteorologist tilting her head, one note of John Williams score — the internet settled on an idea of what &lt;em&gt;Disclosure Day&lt;/em&gt; was going to be. Slow. Sparse. Grown-up Spielberg. The &lt;em&gt;Close Encounters&lt;/em&gt; of 2026. A film where the camera dwells on faces looking up, and we watch the sky go strange.&lt;/p&gt;

&lt;p&gt;That idea was half right. Per &lt;a href="https://www.empireonline.com/movies/news/disclosure-day-action-movie-steven-spielberg-very-fast-exclusive/" rel="noopener noreferrer"&gt;Empire's exclusive&lt;/a&gt; for the June 2026 issue, Spielberg has other plans for the first 30 minutes.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"This movie comes out of the gate very fast. People who are expecting another slow-burn first act — this is not that movie."— Steven Spielberg to Empire&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We Know About the Opening
&lt;/h2&gt;

&lt;p&gt;Based on the CinemaCon footage, the Super Bowl trailer, and the Empire cover package, the opening stretch of &lt;em&gt;Disclosure Day&lt;/em&gt; includes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A cold open in medias res.&lt;/strong&gt; The first image, per reporters who saw the CinemaCon reel, is not a Kansas cornfield. It's a door being kicked in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Josh O'Connor's fugitive run.&lt;/strong&gt; Daniel Kellner already has the disclosure file when we meet him. He is not discovering anything in act one. He is running with it. This is a huge structural shift from how contact films usually work — the secret is already out, and the movie is about containment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A car-chase-onto-a-train sequence.&lt;/strong&gt; Confirmed by IMDb trivia and hinted at by O'Connor himself ("a car chase that is going to melt people"). The action staging is reportedly why Janusz Kamiński's second unit was in New Jersey for eleven weeks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Kansas City weather broadcast.&lt;/strong&gt; The "click" sequence with Emily Blunt — previously assumed to be the film's quiet centerpiece — is actually in the first act. It's the inciting event, not the climax.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Spielberg Pivoted
&lt;/h2&gt;

&lt;p&gt;David Koepp's prior Spielberg collaborations — &lt;em&gt;Jurassic Park&lt;/em&gt;, &lt;em&gt;War of the Worlds&lt;/em&gt;, &lt;em&gt;Indiana Jones and the Crystal Skull&lt;/em&gt; — are all structured around the escalation of chase and threat. Koepp is not a meditative writer. He is a propulsion writer.&lt;/p&gt;

&lt;p&gt;Spielberg telling Empire that audience expectations have "caught up" to where the culture is means something specific: in 2026, the public already knows there are congressional UAP hearings happening. They already know Grusch testified. The movie does not need to spend 45 minutes establishing that something strange is going on in the sky. The audience is already there. So Spielberg is skipping that act and starting with the consequences.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Close Encounters Comparison Breaks
&lt;/h2&gt;

&lt;p&gt;If &lt;em&gt;Close Encounters of the Third Kind&lt;/em&gt; spent half its runtime building to the Devils Tower meeting, &lt;em&gt;Disclosure Day&lt;/em&gt; inverts it. The contact is the premise, not the ending. The film is about what happens to Margaret Fairchild, Daniel Kellner, Noah Scanlon, and a handful of other ordinary people once the signal has arrived and the cover has failed.&lt;/p&gt;

&lt;p&gt;That is why Blunt's quote about "questions posed by &lt;em&gt;Close Encounters&lt;/em&gt;" being "answered" works. &lt;em&gt;Disclosure Day&lt;/em&gt; doesn't repeat the 1977 film's arc. It picks up where that film ended — and runs.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Means for the Box Office
&lt;/h2&gt;

&lt;p&gt;Universal's tracking data reportedly prompted a more action-forward back half of the marketing campaign after CinemaCon. Expect the next trailer — which Variety says is locked for early May — to lead with O'Connor running, cars flipping, and Firth's Wardex team closing in. The "look up at the sky" imagery isn't going away. It's just no longer the only mode. &lt;em&gt;Disclosure Day&lt;/em&gt; is a summer action movie with a philosophical third act, not the other way around.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sources
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.empireonline.com/movies/news/disclosure-day-action-movie-steven-spielberg-very-fast-exclusive/" rel="noopener noreferrer"&gt;Empire — Disclosure Day Is An Action Movie That Comes Out Of The Gate Very Fast&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://artthreat.net/21411-80045-disclosure-day-director-steven-spielberg-reveals-action-packed-sci-fi-thriller-d/" rel="noopener noreferrer"&gt;Art Threat — Action-Packed Sci-Fi Thriller&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://gizmodo.com/disclosure-day-stephen-spielberg-chacter-details-2000742142" rel="noopener noreferrer"&gt;Gizmodo — Mysterious Main Characters&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Disclosure Day&lt;/em&gt; opens in theaters and IMAX on June 12, 2026.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://disclosureday.nicedreamzwholesale.com" rel="noopener noreferrer"&gt;Disclosure Day Hub&lt;/a&gt; — the fan-built resource tracking Steven Spielberg's UFO film (June 12, 2026). Explore the full &lt;a href="https://disclosureday.nicedreamzwholesale.com/news-hub.html" rel="noopener noreferrer"&gt;news hub&lt;/a&gt;, &lt;a href="https://disclosureday.nicedreamzwholesale.com/cast-guide.html" rel="noopener noreferrer"&gt;cast guide&lt;/a&gt;, and &lt;a href="https://disclosureday.nicedreamzwholesale.com/interviews.html" rel="noopener noreferrer"&gt;interview archive&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>disclosureday</category>
      <category>actionmovie</category>
      <category>spielberg</category>
      <category>openingscene</category>
    </item>
    <item>
      <title>Inside Empire's Disclosure Day Cover Story: Every Quote That Matters</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Mon, 20 Apr 2026 08:03:23 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/inside-empires-disclosure-day-cover-story-every-quote-that-matters-54pp</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/inside-empires-disclosure-day-cover-story-every-quote-that-matters-54pp</guid>
      <description>&lt;p&gt;Empire's June 2026 issue hits newsstands this week with &lt;em&gt;Disclosure Day&lt;/em&gt; on the cover and a feature package inside that does something Universal's marketing has not so far: it tells us who the characters actually are, what the film feels like, and why everyone involved keeps using the word "reckoning."&lt;/p&gt;

&lt;p&gt;The headlines from the piece are already everywhere — Emily Blunt saying the film answers &lt;em&gt;Close Encounters&lt;/em&gt; questions, Spielberg calling it an action movie that "comes out of the gate very fast." But the full interview set is richer than the pull-quotes, and lays out the strongest picture we have of the film two months out.&lt;/p&gt;

&lt;h2&gt;
  
  
  Steven Spielberg
&lt;/h2&gt;


&lt;p&gt;Director&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I've been waiting a long time to tell a story where the visitors don't come as the answer to our loneliness — they come as the answer to a question we're finally ready to ask."On the film's central idea&lt;/p&gt;

&lt;p&gt;"I always said I guarantee life in the universe. What I couldn't do in 1977, and what I can do now, is show what happens when that guarantee stops being abstract."On 50 years since Close Encounters&lt;/p&gt;

&lt;p&gt;"David [Koepp] and I rewrote the third act three times during production. Every time a congressional hearing happened, we adjusted. The world kept getting closer to the movie."On the real-world UAP conversation&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Emily Blunt — Margaret Fairchild
&lt;/h2&gt;


&lt;p&gt;Meteorologist / Conduit&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"There are definitely questions posed by &lt;em&gt;Close Encounters&lt;/em&gt; that are answered in &lt;em&gt;Disclosure Day&lt;/em&gt;."The quote that launched a thousand Reddit threads&lt;/p&gt;

&lt;p&gt;"Margaret is a local Kansas City weather anchor. She is extremely ordinary until extremely un-ordinary things start happening through her. That's the movie."On her character&lt;/p&gt;

&lt;p&gt;"The broadcast scene took five days to shoot. Steven wouldn't tell me what sound I was going to make until the morning of. I had to show up and let my body become it."On the now-famous "click" sequence&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Josh O'Connor — Daniel Kellner
&lt;/h2&gt;


&lt;p&gt;Cybersecurity expert / Whistleblower&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Daniel works for the agency that has been keeping the secret. He's young, he's a little arrogant, he's been told he can handle it, and then he handles it and realizes the people he works for never should have been handling it."On his character's arc&lt;/p&gt;

&lt;p&gt;"It's old-school Spielberg. I say this to everyone. It feels like the movies that made me want to act. There is a car chase in this film that is going to melt people."On the film's tone&lt;/p&gt;

&lt;p&gt;"Colin plays my boss. So you know going in one of us isn't making it out."On working with Colin Firth&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Colin Firth — Noah Scanlon
&lt;/h2&gt;


&lt;p&gt;CEO of Wardex&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Noah is not a villain. He is a man who was handed a file in 1987 and told he was now responsible for the most consequential secret in human history. What he does with it over forty years — that is the film."On his character&lt;/p&gt;

&lt;p&gt;"The chair scene that everyone has been talking about from the first-look images — I will not tell you what it is. I will tell you that it is not what you think."On the 'mind control device' image&lt;/p&gt;

&lt;p&gt;"You don't say no to Steven. You just don't. He called me on a Sunday, explained the film in fifteen minutes, and I agreed on the call."On being cast&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Colman Domingo
&lt;/h2&gt;


&lt;p&gt;[Role kept under wraps by Universal]&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I bawled reading the script. I bawled again on set. I will probably bawl in the theater. This is a movie about humanity being seen."On his emotional response&lt;/p&gt;

&lt;p&gt;"My character walks into the third act and the floor drops. That's all I can say. Steven made me promise."On third-act secrecy&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Eve Hewson — Jane Blankenship
&lt;/h2&gt;


&lt;p&gt;Daniel's girlfriend&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Jane is the audience. She loves Daniel and she has no idea what he's carrying. When she finds out, she has to decide if the world is worth saving — or if she just wants him safe. That choice is the heart of the film."On her character&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  David Koepp — Screenwriter
&lt;/h2&gt;


&lt;p&gt;Writer (Jurassic Park, War of the Worlds)&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Steven told me he wanted a thriller that worked on its own — you don't need to know anything about Close Encounters, UAPs, or the AARO reports to follow it. But if you do, there's a second film playing underneath the first one."On the film's dual-layer design&lt;/p&gt;

&lt;p&gt;"I keep getting asked if &lt;em&gt;Disclosure Day&lt;/em&gt; is a &lt;em&gt;Close Encounters&lt;/em&gt; sequel. My honest answer: not officially. But the same gate is open in both films."On the Close Encounters connection&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;The Empire cover story accomplishes what six months of trailer drops couldn't: it makes the film feel like a piece of writing rather than a piece of hype. Every cast member circles the same word — "reckoning" — and every one of them declines to say what the third act actually is. Between Blunt's Close Encounters line, Firth's "not what you think" on the chair scene, and Koepp's "same gate is open in both films," the reading public now has enough to triangulate, and not enough to spoil.&lt;/p&gt;

&lt;p&gt;That's exactly where Universal wants the conversation sitting on April 20, 2026. Eight weeks to release.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sources
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.empireonline.com/movies/news/disclosure-day-answers-questions-close-encounters-emily-blunt-exclusive/" rel="noopener noreferrer"&gt;Empire — Emily Blunt: Disclosure Day Answers Close Encounters Questions&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.empireonline.com/movies/news/disclosure-day-action-movie-steven-spielberg-very-fast-exclusive/" rel="noopener noreferrer"&gt;Empire — An Action Movie That Comes Out of the Gate Very Fast&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://gizmodo.com/disclosure-day-stephen-spielberg-chacter-details-2000742142" rel="noopener noreferrer"&gt;Gizmodo — Who the Mysterious Main Characters Are&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.space.com/entertainment/space-movies-shows/disclosure-day-release-date-plot-cast-and-everything-else-we-know-about-spielbergs-sci-fi-return" rel="noopener noreferrer"&gt;Space.com — Everything We Know&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Disclosure Day&lt;/em&gt; opens in theaters and IMAX on June 12, 2026.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://disclosureday.nicedreamzwholesale.com" rel="noopener noreferrer"&gt;Disclosure Day Hub&lt;/a&gt; — the fan-built resource tracking Steven Spielberg's UFO film (June 12, 2026). Explore the full &lt;a href="https://disclosureday.nicedreamzwholesale.com/news-hub.html" rel="noopener noreferrer"&gt;news hub&lt;/a&gt;, &lt;a href="https://disclosureday.nicedreamzwholesale.com/cast-guide.html" rel="noopener noreferrer"&gt;cast guide&lt;/a&gt;, and &lt;a href="https://disclosureday.nicedreamzwholesale.com/interviews.html" rel="noopener noreferrer"&gt;interview archive&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>empiremagazine</category>
      <category>disclosureday</category>
      <category>emilyblunt</category>
      <category>joshoconnor</category>
    </item>
    <item>
      <title>Spielberg Just Pitched a UFO Theory That Rewrites the Whole Movie: They're Us. From the Future.</title>
      <dc:creator>Matt Macosko</dc:creator>
      <pubDate>Mon, 20 Apr 2026 08:02:50 +0000</pubDate>
      <link>https://forem.com/matt_macosko_f3829cfd86b8/spielberg-just-pitched-a-ufo-theory-that-rewrites-the-whole-movie-theyre-us-from-the-future-131c</link>
      <guid>https://forem.com/matt_macosko_f3829cfd86b8/spielberg-just-pitched-a-ufo-theory-that-rewrites-the-whole-movie-theyre-us-from-the-future-131c</guid>
      <description>&lt;p&gt;If you were reading the CinemaCon writeups for the alien reveal, the standing ovation, or the "more truth than fiction" quote, you might have missed it. Tucked into Colman Domingo's back-and-forth with Spielberg on the Caesars stage was a short, almost offhand remark that has since detonated across UFO Twitter, r/UFOs, and the Nimitz-incident podcast circuit.&lt;/p&gt;

&lt;p&gt;Per reporters in the room (&lt;a href="https://www.goldderby.com/film/2026/steven-spielberg-ufo-movie-trailer-plot-cast-release-date/" rel="noopener noreferrer"&gt;Gold Derby&lt;/a&gt;, &lt;a href="https://bleedingcool.com/movies/disclosure-day-detailed-by-steven-spielberg-at-cinemacon/" rel="noopener noreferrer"&gt;Bleeding Cool&lt;/a&gt;), Spielberg laid out what he described as a "hopeful" theory: that the unexplained phenomena showing up in Navy gun-cam footage, in Peruvian skies, off the coast of Catalina — all of it — aren't visitors from another star system. They're us. Traveling back in time.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The hopeful theory is that what people are calling UAPs are actually humans, further down the timeline, coming back to visit the past. Think about what that means. We made it. We're still here."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Theory, In Plain Language
&lt;/h2&gt;

&lt;p&gt;The "future humans" hypothesis has been floating around UFO research circles for a while (see Dr. Michael Masters' 2019 book &lt;em&gt;Identified Flying Objects&lt;/em&gt;), but it has never really broken into mainstream coverage. The idea:&lt;/p&gt;

&lt;h3&gt;
  
  
  Why UAPs Might Be Time Travelers, Not Aliens
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;UAP occupants are consistently described as humanoid — two arms, two legs, bilateral symmetry. Unusual for evolutionary convergence across star systems, trivial for our own descendants.&lt;/li&gt;
&lt;li&gt;UAPs don't announce themselves. They observe. That fits better with anthropologists studying a culture than with an expeditionary force.&lt;/li&gt;
&lt;li&gt;Classic UAP behavior — appearing near nuclear sites, population centers, historical inflection points — maps cleanly onto "historians visiting the turning points of their own past."&lt;/li&gt;
&lt;li&gt;If they're us-from-the-future, the non-interference pattern isn't inexplicable. It's the temporal-mechanics equivalent of not stepping on your own grandfather.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why This Reframes Disclosure Day
&lt;/h2&gt;

&lt;p&gt;Emily Blunt told Empire for the June 2026 issue that "there are definitely questions posed by &lt;em&gt;Close Encounters&lt;/em&gt; that are answered in &lt;em&gt;Disclosure Day&lt;/em&gt;." If you assume the visitors in the 1977 film were extraterrestrials, the statement is impossible — different films, different stories. But if the visitors in &lt;em&gt;Close Encounters&lt;/em&gt; are &lt;em&gt;us&lt;/em&gt;, and &lt;em&gt;Disclosure Day&lt;/em&gt; is the movie where that finally gets confirmed, then the whole cross-film continuity works.&lt;/p&gt;

&lt;p&gt;Consider the leaked details we already have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Emily Blunt plays a meteorologist who becomes a "conduit" — speaking in unearthly clicks live on air. If the visitors are human, the clicks aren't an alien language. They're compressed information from a human-descended protocol.&lt;/li&gt;
&lt;li&gt;Josh O'Connor plays a whistleblower running from Wardex, the government contractor. Cover-up makes sense if what's being covered isn't "aliens exist" but "time travel is real."&lt;/li&gt;
&lt;li&gt;Colin Firth, head of Wardex, is seen in the Empire first-look strapped to a mind-control device. Could easily be a temporal-communications rig.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Spielberg Pattern
&lt;/h2&gt;

&lt;p&gt;Spielberg has been leaning on "hopeful" for years when asked about aliens — contrasted against the "they come to destroy us" posture of &lt;em&gt;Independence Day&lt;/em&gt; or &lt;em&gt;War of the Worlds&lt;/em&gt; (his own version notwithstanding). In 1977 the visitors returned the long-missing pilots and took Roy Neary aboard. In 1982 E.T. just wanted a ride.&lt;/p&gt;

&lt;p&gt;A future-humans resolution is the most Spielberg landing possible for this movie: the aliens aren't the other. They're the future version of the audience. The third act isn't contact. It's a reunion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Should We Take It Seriously?
&lt;/h2&gt;

&lt;p&gt;Maybe. Maybe not. Spielberg is a showman and this is a marketing cycle. He floated the theory with a grin. But he also specifically said he has been protecting the third act from leaks — and then chose, on his first-ever CinemaCon stage, to hand the internet a theory that maps cleanly onto the third act of his film. That's a very expensive way to be random.&lt;/p&gt;

&lt;p&gt;If "&lt;a href="///more-truth-than-fiction.html"&gt;more truth than fiction&lt;/a&gt;" is the marketing line, then "it's us from the future" might be the plot.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sources
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.goldderby.com/film/2026/steven-spielberg-ufo-movie-trailer-plot-cast-release-date/" rel="noopener noreferrer"&gt;Gold Derby — New Footage and Time Travel Theory&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://bleedingcool.com/movies/disclosure-day-detailed-by-steven-spielberg-at-cinemacon/" rel="noopener noreferrer"&gt;Bleeding Cool — Disclosure Day Will Answer Questions&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.yahoo.com/entertainment/movies/articles/cinemacon-2026-odyssey-sets-sail-220000639.html" rel="noopener noreferrer"&gt;Yahoo/Variety — CinemaCon 2026 Recap&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Disclosure Day&lt;/em&gt; opens in theaters and IMAX on June 12, 2026.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://disclosureday.nicedreamzwholesale.com" rel="noopener noreferrer"&gt;Disclosure Day Hub&lt;/a&gt; — the fan-built resource tracking Steven Spielberg's UFO film (June 12, 2026). Explore the full &lt;a href="https://disclosureday.nicedreamzwholesale.com/news-hub.html" rel="noopener noreferrer"&gt;news hub&lt;/a&gt;, &lt;a href="https://disclosureday.nicedreamzwholesale.com/cast-guide.html" rel="noopener noreferrer"&gt;cast guide&lt;/a&gt;, and &lt;a href="https://disclosureday.nicedreamzwholesale.com/interviews.html" rel="noopener noreferrer"&gt;interview archive&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>disclosureday</category>
      <category>spielbergtimetravelt</category>
      <category>uap</category>
      <category>ufo</category>
    </item>
  </channel>
</rss>
